Spatio-Temporal Distribution of Dissolved Inorganic Nitrogen in the Changshan Islands Archipelago Based on a Multiple Weighted Regression Model Considering Spatial Characteristics

Abstract: Ammonia nitrogen (NH4-N), nitrite nitrogen (NO2-N), and nitrate nitrogen (NO3-N) are important nutrients for maintaining the ecological balance of archipelago waters. Obtaining the concentrations of the three nitrogen compounds simultaneously allows a comprehensive analysis of nitrogen cycling in archipelago waters, which benefits the ecological protection of both agriculture and fisheries. Existing studies have usually considered a single nitrogen compound or dissolved inorganic nitrogen (DIN), which can identify the water quality but cannot comprehensively judge the water purification situation or the toxicity of the nitrogen compounds in the water. In addition, when an inversion model is constructed, the specific bands of the remote sensing imagery used in training/learning are related directly to the measured values, ignoring the fact that these bands contain different amounts of information on the water quality parameters, which affects the fitting accuracy. Furthermore, the existing empirical models and machine learning models have not yet been applied to high-resolution inversion in archipelago waters with active fishing activities. In view of this, we constructed a multiple weighted regression model considering spatial characteristics (S-WSVR) to simultaneously retrieve the distribution of NH4-N, NO2-N, and NO3-N in archipelagic waters. Considering the complexity of the spatial distribution of the three nitrogen compounds in mesoscale archipelagic waters, longitude and latitude were added to the experimental dataset as spatial features to fit the nonlinear spatial relationships. Meanwhile, a multivariate weighting module based on the Mahalanobis distance was integrated to calculate the contribution of the characteristic bands and improve the inversion accuracy. The S-WSVR model was applied in the waters of the Changshan Islands, China, with a retrieval resolution of 30 m, and the r-values of the three nitrogen compounds reached 0.9063, 0.8900, and 0.9755, respectively. Notably, the sum of the three nitrogen compounds has an r-value of 0.9028 when compared with the measured DIN. In addition, we obtained the Landsat 8 characteristic bands for the three nitrogen compounds and plotted the spatial distributions of the nitrogen compounds in spring and autumn from 2013 to 2022. The spatio-temporal variations show that the three nitrogen compounds are controlled by human activities and river inputs, and that the anoxic discharge of the Yalu River has a strong influence on the NO2-N content. The accurate estimation in this study can therefore provide scientific support for the protection of sensitive archipelago ecosystems.


Introduction
Islands account for only 5% of the world, but they support more than 20% of the biodiversity and provide abundant marine resources for human beings [1,2]. Under the influence of human activities like urban expansion, industrial land, agriculture and fisheries, and tourism activities, the seawater quality of island archipelagos can become seriously polluted, which aggravates the vulnerability and instability of archipelago ecosystems [3][4][5][6][7][8][9]. Dissolved inorganic nitrogen (DIN) is the main index for seawater pollution in archipelago ecosystems [10]. Excess DIN content may lead to seawater acidification [11] and eutrophication [12], which can cause the poisoning of marine organisms [13,14]. Therefore, it is of great significance to monitor the DIN content in the seawater of archipelago ecosystems, to protect marine organisms and maintain the ecological environment.
Traditional seawater DIN monitoring uses ship-borne or buoy equipment to measure the contents of ammonia nitrogen (NH4-N), nitrite nitrogen (NO2-N), and nitrate nitrogen (NO3-N) and then sums the contents of the three inorganic nitrogen forms/species [15,16]. This method is limited by factors such as spatial location, weather, manpower, and equipment cost. As a result, large-scale, high-density, and high-frequency monitoring cannot be performed, and it is difficult to obtain a continuous DIN distribution over wide areas [16,17]. With the development of remote sensing technology, the spectral information received by remote sensing satellites can be used to establish a physical model, an empirical model, or a semi-analytical model for surface seawater quality, and the results have the advantages of high spatial coverage, spatio-temporal continuity, and low cost [18,19]. Nitrogen has a weak optical response [20], so many researchers have used empirical models to estimate the content of individual nitrogen compounds or of mixed nitrogen in surface water.
Isenstein and Park [21] and Li et al. [22] developed linear regression equations for the inversion of total nitrogen in small water bodies such as lakes and reservoirs, but the R² was only 0.75. Yu et al. [23] divided a large sea area into three smaller regions and constructed a stepwise regression model for each region, obtaining an R² fitting accuracy greater than 0.77 for all three regions. However, in recent years, researchers have found that the relationship between dissolved nutrients (or a mixture of nitrogen forms) and band reflectance in surface water is nonlinear, so the linear fitting accuracy is not satisfactory, whereas machine learning models have achieved good fitting results [18]. For example, Huang et al. [19] used a back propagation neural network (BPNN) model to invert the DIN in Shenzhen Bay in China, obtaining a fitting accuracy R² of 0.9; Guo et al. [17] used a variety of machine learning methods to explore the total nitrogen of small lakes, and the highest fitting accuracy R² reached 0.88; Wang et al. [24] established a support vector machine regression model for NH4-N content in small rivers, and the fitting accuracy R² was as high as 0.98; and Vakili et al. [25] constructed an artificial neural network (ANN) model for reservoir total nitrogen, and the fitting accuracy R² reached 0.93. These machine learning models perform well in small water bodies, but because of the limited spatial coverage and quantity of the known data, they cannot resolve the complex differences in optical properties between different regions of medium- and large-scale water bodies [26]. In view of this, researchers have attempted to add geographical information to the inversion of large-scale sea areas to capture the spatially nonlinear distribution of water quality. For example, Du et al. [27] proposed a geographically neural network weighted regression (GNNWR) model to evaluate the water quality of the Zhejiang coastal sea in China, obtaining a model accuracy R² of greater than 84%. Du et al. [28] used a geographically and cycle-temporally weighted regression (GCTWR) model to explore the spatially continuous distribution of chlorophyll in coastal waters, and the fitting accuracy R² was 0.8721. However, these large models cannot capture small- and medium-scale variations in content, and they require a large volume of measured data. Island sea areas are mesoscale water bodies, lying between small- and large-scale water bodies. On the one hand, the water quality of mesoscale water bodies resembles that of large-scale sea areas, in that it is affected by the horizontal and vertical movement of the ocean currents and by unbalanced primary productivity, so the spatial distribution can be complex in different regions; simple spatial regression models struggle to fit the complex spatial nonlinearity of large-scale water quality [27,28]. On the other hand, the volume of measured data in mesoscale research areas is often small, making it difficult to establish suitable water quality inversion models for mesoscale sea areas [29,30].
At the same time, the existing inversion models involve simple band selection or simple band combination. For example, Huang et al. [19] directly utilized the b (coastal), b (blue), b (green), b (red), and b (NIR) bands of Landsat 8 and the b (blue), b (green), b (red), and b (NIR) bands of Landsat 5; Guo et al. [17] adopted the Sentinel-2 B3, B4, B5, B6, B7, and B8 bands; Wang et al. [24] used the R1, R2, R3, and R4 bands of SPOT-5; Du et al. [28] used the Moderate Resolution Imaging Spectroradiometer (MODIS) B1, B3, B5, and B7 bands; Vakili et al. [25] used Landsat 8 Band 4 and the band ratio of Band 3/Band 2; and Torbick et al. [31] used the Landsat Thematic Mapper (TM)3, TM1/TM3, and TM3/TM1 bands. However, the reflectance of remote sensing bands can show the phenomenon of different spectra for the same object, even if the same characteristic bands with high correlation have different degrees of nitrogen information. Simple band selection and band combination only consider the high contribution degree and do not consider the differences between the contribution degrees of these characteristic bands during the inversion process, resulting in a reduction in the fitting accuracy. Therefore, it is necessary to analyze the contribution of the related bands to nitrogen and use a weighted calculation where the band information with a large contribution and low error is retained to participate in the fitting regression calculation.
What is more, NH4-N, NO2-N, and NO3-N pose different risks to marine organisms: NH4-N can lead to asphyxia, acidosis, and decreased blood oxygen in fish [32][33][34]; NO2-N can lead to serious electrolyte imbalances in aquatic animals [35,36]; while the toxicity of NO3-N to aquatic organisms is extremely low and almost negligible [12]. When the dissolved oxygen (DO) content in seawater is normal, nitrification will occur, and NH4-N will be converted into NO2-N and then into NO3-N, and the water quality of the seawater will gradually improve. When the DO content is insufficient, denitrification will occur, and the water quality cannot be improved with the increase in NO2-N content [35,37,38]. To date, the continuous distribution of NH4-N, NO2-N, and NO3-N has not been mapped synchronously in the existing research. The individual nitrogen compounds or total DIN content can neither allow for a comprehensive analysis of the toxicity of seawater nor judge the water purification situation in detail. Therefore, it is necessary to monitor the content of NH4-N, NO2-N, and NO3-N synchronously, which can not only determine whether the DIN content exceeds the standard but also allow a more thorough judgment on the specific situation of water environmental pollution at that time.
In this paper, we propose a multiple weighted regression model considering spatial characteristics (S-WSVR) to synchronously monitor the contents of the three inorganic nitrogen forms/species in mesoscale archipelago waters. The model considers the nonlinear spatial distribution of the nitrogen compounds and uses more of the effective information in the selected feature bands. In the S-WSVR model, the spatial features and correlated bands are taken as the input parameters, and a multivariate weighting module based on the Mahalanobis distance is added to help the support vector regression (SVR) model calculate the weight relationship between the input bands and the target parameters; the regression relationship is then established by finding the optimal hyperplane. In this study, the coastal area of the Changshan Islands, China, was selected as the experimental area, and 320 sets of measured NH4-N, NO2-N, and NO3-N data from 2013 to 2022 were collected. The band reflectance of the medium spatial resolution Landsat 8 multi-spectral remote sensing imagery was matched to the measured data according to time and spatial scale. Three experimental datasets of NH4-N, NO2-N, and NO3-N and their correlated bands, each containing spatial characteristics, were established. Finally, the NH4-N, NO2-N, and NO3-N regression models were obtained by training and testing the data at an 8:2 ratio. In addition, we mapped the spatial and temporal distribution of NH4-N, NO2-N, and NO3-N in the archipelagic waters of the study area during the spring and autumn based on the designed experimental regression model. We also analyzed the distribution of the three types of nitrogen using the "spatial quantification of the relationship between human activities and marine ecosystems" (SQRHM) model and dissolved oxygen data.

Water 2023, 15, 3176

Study Area
The Changshan Islands are the largest island group in the Yellow Sea, and they are located between 38°55′ and 39°41′ north and 122°04′ and 123°32′ east, covering an area of 10,324 km². The Changshan Islands have the typical characteristics of an archipelago ecosystem. Figure 1 shows the location of the study area. The northwest of the Changshan Islands is connected to the Liaodong Peninsula, and the whole area is rich in marine resources. In recent years, with the development of large-scale economic activities such as tourism, transportation, industry, and aquaculture, land-based and island-derived pollutants have entered the archipelago area. The main pollution sources for DIN in the area are marine aquaculture [18], industrial discharge [39,40], port seawater pollution [41], tourist garbage [42], and river discharge [43]. In order to study the influence of each driving factor on marine water quality and refine the spatial and temporal distribution of the water quality changes, it was necessary to analyze the sensitivity of the ocean water quality conditions and influencing factors over a wide range, with a long time series and high precision.

Measured Data
In this study, the NH4-N, NO2-N, and NO3-N contents were investigated in the whole coastal area of the Changshan Islands from 2013 to 2022. The data sampling was carried out using continuous measurement, i.e., the data were sampled twice at the same point each month, once in the spring tide and once in the neap tide, and 25 samples of NH4-N, NO2-N, and NO3-N were collected at one-hour intervals in each sampling task. The data were analyzed in the laboratory, and the average value of the 50 samples for the two tides at the same point was taken as the measured value of the month at that point. The DIN detection instruments and analysis methods for the measured data are listed in Table 1. There were 320 measured sets of data for each inorganic nitrogen form over the 10 years, and the data were randomly and evenly distributed within the waters of the Changshan Islands. The maximum value of NH4-N was 0.318 mg/L, and the minimum value was 0.006 mg/L. The maximum value of NO3-N was 0.164 mg/L, and the minimum value was 0 mg/L. The maximum value of NO2-N was 0.015 mg/L, and the minimum value was 0 mg/L.
The remote sensing data were Landsat 8 multi-spectral remote sensing images with a spatial resolution of 30 m, which can meet the needs of mesoscale water quality inversion. Compared with large-scale water quality monitoring, this resolution improves the accuracy of spatial matching with the measured data and also allows the use of more accurate location coordinates in the model calculation. The remote sensing satellite data were obtained from the United States Geological Survey EarthExplorer website (https://earthexplorer.usgs.gov/, accessed on 1 May 2022). We selected the images whose acquisition time was closest to the survey time of the measured data, and the path/row numbers were (119, 32) and (119, 33). The Operational Land Imager (OLI) sensor has a total of nine bands, and Band 1-Band 7 were selected in this study, which are the coastal, blue, green, red, near-infrared, short-wavelength infrared 1, and short-wavelength infrared 2 bands, respectively. The band range was 0.433 µm to 2.300 µm. The panchromatic Band 8 for enhanced resolution and the cirrus Band 9 for cloud detection were not used in this study. In addition, all the above data are Landsat 8 Collection 2 Level-2 (L2) products, which do not require radiometric calibration or atmospheric correction.
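Although Level-2 products need no further radiometric or atmospheric correction, their surface reflectance bands are stored as scaled integers, so a conversion step is usually needed before band values can be compared with measured concentrations. The following is a minimal sketch of that conversion, using the scale factor and offset published by the USGS for Collection 2 Level-2 surface reflectance. The function name and the example digital numbers are hypothetical, and the paper does not state how this step was carried out.

```python
import numpy as np

# Scale factor and offset published by the USGS for Landsat Collection 2
# Level-2 surface reflectance bands (stored as unsigned 16-bit integers).
SR_SCALE = 0.0000275
SR_OFFSET = -0.2
SR_FILL = 0  # fill value marking pixels without valid data


def dn_to_reflectance(dn):
    """Convert Level-2 digital numbers to surface reflectance.

    Fill pixels are returned as NaN so they can be skipped later.
    """
    dn = np.asarray(dn, dtype=np.float64)
    reflectance = dn * SR_SCALE + SR_OFFSET
    reflectance[dn == SR_FILL] = np.nan
    return reflectance


# Hypothetical digital numbers for one water pixel across Band 1-Band 7.
example_dn = np.array([8000, 8500, 9000, 8200, 7600, 7400, 7300])
example_reflectance = dn_to_reflectance(example_dn)
```

Returning NaN for fill pixels is a design choice of this sketch; it keeps masked pixels from being mistaken for very dark water.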

Experimental Procedure
As shown in Figure 2, the experiments of this study were divided into three parts: data preprocessing, model training, and plotting the distribution of the three nitrogen compounds.

• Step 1: data preprocessing. The image acquired closest in time to the measured data collection was selected (the image revisit period is 16 days), and the remote sensing reflectance of the nearest image raster cell was extracted based on the spatial location coordinates of the three nitrogen samples (the spatial resolution is 30 m). The bands with a high correlation with NH4-N, NO2-N, and NO3-N were identified using Pearson correlation coefficients and used as the characteristic bands. The measured values, spatial characteristics (longitude and latitude), and characteristic bands of the three inorganic nitrogen forms/species were combined into a sample dataset. The sample dataset was then processed by removing outliers in IBM SPSS Statistics 22 and by nondimensionalization. This step only rescales the data features, removing the influence of differing measurement units on the data, while keeping the coefficient of variation and the degree of mutual influence between parameters unchanged [44,45].

• Step 2: model training. The sample datasets of all three nitrogen compounds were divided into training and test sets in the ratio of 8:2 [19], with the characteristic bands and spatial features as the input parameters and the actual measured values of the inorganic nitrogen forms/species as the target parameters. The regression results for NH4-N, NO2-N, and NO3-N were obtained by training the S-WSVR model, where the characteristic bands were used to calculate the weights in the multivariate weighting module. The regression results were evaluated using five accuracy indicators, and the training results for the three nitrogen compounds at the same point were summed and compared with the actual DIN values to verify the model accuracy.

• Step 3: mapping the distribution of the three nitrogen compounds. The image data from the spring and autumn of 2013-2022 were selected, and the images with the same acquisition time and path/row numbers (119, 32) and (119, 33) were stitched and cropped. The characteristic bands and spatial feature information of the processed images were read and input into the trained regression model to calculate the concentrations of the three nitrogen compounds, and the calculated results were divided into 10 classes to plot the spatial and temporal distributions of NH4-N, NO2-N, and NO3-N.
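The three steps above can be sketched in a few helper functions. This is a minimal illustration under stated assumptions, not the authors' implementation: the rescaling divides each column by its mean (one pure rescaling that leaves the coefficient of variation unchanged), the 8:2 split is a seeded random shuffle, and the 10 classes are assumed to be equal-width. All function names are hypothetical.

```python
import numpy as np


def rescale_by_mean(columns):
    """Divide each column by its mean (assumes non-zero column means).

    A pure rescaling like this removes the measurement unit while
    leaving each column's coefficient of variation unchanged.
    """
    columns = np.asarray(columns, dtype=np.float64)
    return columns / columns.mean(axis=0)


def split_8_to_2(n_samples, seed=0):
    """Return shuffled index arrays for an 8:2 train/test split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(0.8 * n_samples))
    return order[:n_train], order[n_train:]


def to_classes(values, n_classes=10):
    """Assign each value to one of n_classes equal-width classes (0-based)."""
    values = np.asarray(values, dtype=np.float64)
    edges = np.linspace(values.min(), values.max(), n_classes + 1)
    # Use interior edges only, so the maximum falls in the last class.
    return np.digitize(values, edges[1:-1])


# With the 320 measured sets described in the text, the split sizes are:
train_idx, test_idx = split_8_to_2(320)
```

With 320 samples this split yields 256 training and 64 test samples. Other class boundaries (for example quantiles) would work equally well for plotting.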
Model Method

Support Vector Regression
Previous studies have shown that SVR models perform well when samples are limited [17,46]. SVR relies on kernel functions, sparse solutions, and Vapnik-Chervonenkis (VC) theory to control the margin and the number of support vectors [47,48]. A linear regression hyperplane is then found in the high-dimensional feature space to solve the nonlinear supervised learning problem. In this study, the input spatial information and the selected feature bands were considered to have a nonlinear relationship with the target parameters NH4-N, NO2-N, and NO3-N. Assuming a training sample of a certain inorganic nitrogen form/species $N = \{(x_1, n_1), (x_2, n_2), \ldots, (x_m, n_m)\}$, $n_i \in \mathbb{R}$, the SVR model was constructed as follows:

$$f(x) = \omega^{T} x + b \tag{1}$$

where $\omega$ is a normal vector, which determines the characteristic hyperplane direction; $b$ is the displacement constant; $x$ is the input relevant parameter; $n$ is the measured value of the inorganic nitrogen form/species; $m$ is the sample size; and $f(x)$ is the estimate of the inorganic nitrogen form/species. The goal of SVR is to minimize the "distance" to the sample points farthest from the hyperplane. Unlike other regressions, SVR tolerates a deviation $\varepsilon$ between $f(x)$ and $n$ and seeks a maximum "interval band", so that more sample points are located within the band. Losses are computed only for sample points outside the interval band, where the absolute value of the deviation of $f(x)$ from $n$ is greater than $\varepsilon$. Relaxation variables are introduced to cope with nonlinearities and outliers: points are allowed to exist outside the interval band, but the resulting losses should be as small as possible. The soft interval SVR model can be denoted as follows:

$$\min_{\omega, b, \xi_i, \xi_i^{*}} \; \frac{1}{2}\|\omega\|^{2} + C \sum_{i=1}^{m} \left(\xi_i + \xi_i^{*}\right) \quad \text{s.t.} \quad f(x_i) - n_i \le \varepsilon + \xi_i, \;\; n_i - f(x_i) \le \varepsilon + \xi_i^{*}, \;\; \xi_i \ge 0, \;\; \xi_i^{*} \ge 0, \;\; i = 1, \ldots, m \tag{2}$$

where $C$ is the regularization constant that provides a balance between the smoothness of the fitting function and the bias of the training data; $\xi_i$ and $\xi_i^{*}$ are non-negative relaxation variables; $\varepsilon$ represents the allowable fit tolerance; and $m$ indicates the actual number of data.
At the same time, by introducing Lagrange multipliers and a nonlinear kernel function, the original space of the input data is mapped to a higher-dimensional feature space, in which the optimal feature hyperplane is obtained. The SVR inversion model for the inorganic nitrogen forms/species, after introducing the Lagrange multipliers, can be denoted as follows:

$$f(x) = \sum_{i=1}^{m} \left(\alpha_i^{*} - \alpha_i\right) \kappa(x, x_i) + b \tag{3}$$

where $\alpha_i^{*}$ and $\alpha_i$ are Lagrange multipliers greater than or equal to 0, and $\kappa(x, x_i)$ is the kernel function.
The kernel function learning method extends linear learning to nonlinear learning by "kernelization", for which the commonly used functions include the linear kernel, polynomial kernel, sigmoid kernel, and radial basis function (RBF) kernel. The RBF kernel was selected as the kernel function for the quantitative inversion of seawater nitrogen compounds in this study, and its formula is as follows:

$$\kappa(x, x_i) = \exp\left(-\frac{\|x - x_i\|^{2}}{2\sigma^{2}}\right) \tag{4}$$

where $\sigma$ is the Gaussian kernel bandwidth.
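To make the roles of the RBF kernel, the interval band, and the Lagrange multipliers concrete, the sketch below evaluates each of them with NumPy. It is an illustration only: the function names are hypothetical, the dual coefficients are assumed to come from an already trained model, and no training (optimization) step is shown.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian kernel matrix K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    a = np.atleast_2d(a).astype(np.float64)
    b = np.atleast_2d(b).astype(np.float64)
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def epsilon_insensitive_loss(predicted, measured, epsilon):
    """Zero inside the interval band, |f(x) - n| - epsilon outside it."""
    residual = np.abs(np.asarray(predicted, dtype=np.float64)
                      - np.asarray(measured, dtype=np.float64))
    return np.maximum(0.0, residual - epsilon)


def svr_predict(x_new, support_x, dual_coef, b, sigma):
    """Evaluate f(x) = sum_i (alpha*_i - alpha_i) k(x, x_i) + b.

    dual_coef holds the differences (alpha*_i - alpha_i), assumed to be
    supplied by a trained model; this function does not fit anything.
    """
    kernel = rbf_kernel(x_new, support_x, sigma)
    return kernel @ np.asarray(dual_coef, dtype=np.float64) + b
```

In practice the multipliers would be obtained from a quadratic programming solver or an existing SVR library rather than set by hand.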

Multivariate Weighting Module
Each characteristic band contains a different amount of information about the target inorganic nitrogen form/species, so the contribution of the characteristic bands to the target parameters varies in the fitting calculation process. The traditional algorithms based on single bands and band combinations tend to lose important information in the inversion, so we added a multivariate weighting module to the basic model. The weighting assigns higher weights to the bands with less intra-class variation and more inter-class variation, to retain more effective information and improve the fitting accuracy. The multivariate weighting module is based on the statistical idea of using the Mahalanobis distance metric to find the central distance between samples and thereby determine the inter-sample similarity. Compared with other similarity measures, the Mahalanobis distance is calculated from the overall samples and can take into account the correlation between individual samples. It is also independent of the measurement scale [49][50][51]. The Mahalanobis distance can therefore handle characteristic bands that are not independent and identically distributed, and it removes the need to reconcile the different measurement units of the three nitrogen compounds and the band reflectance when determining the similarity between the nitrogen compounds and their characteristic bands. For these reasons, we chose to calculate the similarity between the three nitrogen compounds and their associated bands using the Mahalanobis distance and to set the weights accordingly. The formula for the Mahalanobis distance used in calculating the weights between the inorganic nitrogen forms/species and the associated bands is as follows:

$$D_M(I, N) = \sqrt{(I - N)^{T}\, \Sigma^{-1}\, (I - N)} \tag{5}$$

where $I$ represents the selected bands with a high correlation with the inorganic nitrogen form/species; $N$ is the actual measured value of the collected inorganic nitrogen sample; $n$ is the number of characteristic bands of the inorganic nitrogen form/species; and $\Sigma$ is the covariance matrix of the estimated parameter and all the associated bands, which measures the correlation between variables in a multidimensional dataset (Equation (6)). Its diagonal entries are the variances (var) and its off-diagonal entries are the covariances (cov) of these variables.
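The building blocks of the weighting module can be sketched as follows. The covariance matrix and the Mahalanobis distance follow their standard definitions; the final rule that turns distances into weights (inverse distance, normalized to sum to 1) is purely an assumption made for illustration, since the text does not spell out that mapping. All function names are hypothetical.

```python
import numpy as np


def covariance_matrix(target, bands):
    """Covariance of [target, band_1, ..., band_n].

    `target` has one value per sample; `bands` has one row per sample
    and one column per characteristic band.
    """
    stacked = np.column_stack([target, bands])
    return np.cov(stacked, rowvar=False)


def mahalanobis(u, v, cov):
    """sqrt((u - v)^T cov^{-1} (u - v)).

    The pseudo-inverse is used so a singular covariance does not fail.
    """
    diff = np.asarray(u, dtype=np.float64) - np.asarray(v, dtype=np.float64)
    return float(np.sqrt(diff @ np.linalg.pinv(cov) @ diff))


def distances_to_weights(distances):
    """Hypothetical rule: weight proportional to 1 / distance, summing to 1."""
    inverse = 1.0 / np.asarray(distances, dtype=np.float64)
    return inverse / inverse.sum()


# Hypothetical per-band distances for three characteristic bands.
example_weights = distances_to_weights([0.5, 1.0, 2.0])
```

Under this assumed rule, a band that lies closer to the target in the Mahalanobis sense receives a larger weight. With an identity covariance matrix the distance reduces to the ordinary Euclidean distance, which is a convenient sanity check.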

S-WSVR
The multiple weighted regression model considering spatial characteristics (S-WSVR) builds a regression relationship on the basis of the SVR model, with the spatial features (longitude and latitude) and the feature bands as the model input parameters. It incorporates a multivariate weighting module to calculate the weighting coefficients for the feature bands and uses the inorganic nitrogen concentrations as the output parameters. The algorithm inputs the spatial features into the model and gives the feature bands unequal weights in the regression fitting process, thereby retaining more useful information and improving the model fitting capability. Figure 3 shows the model structure of S-WSVR, whose inputs are LON, LAT, and WBAND, where LON and LAT stand for the geographical longitude and latitude, respectively, and WBAND represents the product of the feature band and the Mahalanobis distance weight.
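As a rough illustration of how these inputs fit together, the sketch below assembles the matrix [LON, LAT, WBAND] from coordinates, band reflectance, and per-band weights. The function name and the example numbers are hypothetical, and the weights are assumed to have been computed beforehand by the weighting module.

```python
import numpy as np


def build_swsvr_inputs(lon, lat, bands, weights):
    """Return [LON, LAT, WBAND], where WBAND = band reflectance * band weight.

    `bands` has one row per sample and one column per characteristic
    band; `weights` has one entry per characteristic band.
    """
    bands = np.asarray(bands, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    if bands.shape[1] != weights.shape[0]:
        raise ValueError("one weight per characteristic band is required")
    wband = bands * weights  # broadcasts the weights across all samples
    return np.column_stack([lon, lat, wband])


# Hypothetical example: two samples, two characteristic bands.
features = build_swsvr_inputs(
    lon=[122.5, 123.0],
    lat=[39.0, 39.2],
    bands=[[0.02, 0.04], [0.03, 0.05]],
    weights=[0.5, 2.0],
)
```

The resulting matrix could then be passed to any SVR implementation with an RBF kernel, one model per nitrogen compound.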

Characteristic Bands and Weights
The Pearson correlation coefficient from multivariate statistics was used to determine the correlation between the Landsat 8 band reflectance and the three nitrogen compounds, and the whole process was carried out in SPSS statistical software. The Pearson correlation coefficient is widely used to measure the degree of linear correlation between two variables. To avoid the loss of information and to ensure the correct selection of correlated bands, it was used as the judgment indicator when selecting the correlated bands. The Landsat 8 bands with a high correlation were B2, B3, B4, B5, B6, and B7 for NH4-N; B3, B6, and B7 for NO2-N; and B5, B6, and B7 for NO3-N. These bands were taken as the characteristic bands. Table 2 lists the Pearson correlation coefficients and their significance for NH4-N, NO2-N, NO3-N, and the band reflectance.
Notes: * correlation is significant at the 0.05 level; ** correlation is significant at the 0.01 level.
The S-WSVR model calculation was implemented in PyCharm Community Edition 2021.3.3. The longitude, latitude, and characteristic bands of the training set of the three nitrogen compounds were input into the algorithm, and the weighting coefficients of the characteristic bands of the three nitrogen compounds were obtained using the multivariate weighting module. As a result, the relevant bands carry unequal weights in the subsequent fitting regression. The weighting coefficients of the feature bands of the three nitrogen compounds, calculated by Equation (5), are listed in Table 3.
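The band selection described above was carried out in SPSS; a comparable calculation can be sketched in a few lines of NumPy. In this illustration the selection rule is a simple threshold on the absolute correlation, which is an assumption (the paper judges bands by correlation and significance level), and the function names, band names, and sample values are hypothetical.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def select_bands(bands, target, names, threshold):
    """Keep the bands whose |r| with the target reaches the threshold."""
    bands = np.asarray(bands, dtype=np.float64)
    selected = []
    for column, name in zip(bands.T, names):
        if abs(pearson_r(column, target)) >= threshold:
            selected.append(name)
    return selected


# Hypothetical reflectance columns for three bands and four samples.
example_target = [1.0, 2.0, 3.0, 4.0]
example_bands = [[2.0, 1.0, 8.0],
                 [4.0, -1.0, 6.0],
                 [6.0, 1.0, 4.0],
                 [8.0, -1.0, 2.0]]
chosen = select_bands(example_bands, example_target, ["B2", "B3", "B4"], 0.5)
```

Using the absolute value keeps strongly negative correlations, which carry as much information as positive ones.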

Regression Results
The weighting coefficients calculated by the multivariate weighting module were multiplied by the values of the characteristic bands, and these weighted bands, together with the spatial features, were used for training, with the concentration values of the three inorganic nitrogen forms/species in the training set as the learning target. A genetic algorithm was used for the optimization process to avoid overfitting during training. The concentrations of the three inorganic nitrogen forms/species for the test set were finally output, and the regression fitting results were evaluated against the measured values using the Pearson correlation coefficient (r), root-mean-square error (RMSE), mean square error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The model evaluation formulas are as follows:

$$r = \frac{\sum_{i=1}^{n} (\hat{N}_i - \bar{\hat{N}})(N_i - \bar{N})}{\sqrt{\sum_{i=1}^{n} (\hat{N}_i - \bar{\hat{N}})^{2}} \sqrt{\sum_{i=1}^{n} (N_i - \bar{N})^{2}}} \tag{8}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{N}_i - N_i)^{2}} \tag{9}$$

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{N}_i - N_i)^{2} \tag{10}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left|\hat{N}_i - N_i\right| \tag{11}$$

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left|\frac{\hat{N}_i - N_i}{N_i}\right| \tag{12}$$

where $\hat{N}_i$ is the predicted value for the three inorganic nitrogen forms/species, $N_i$ is the measured value, $\bar{N}$ is the average of the measured values, $\bar{\hat{N}}$ is the average of the predicted values, and $n$ is the number of test samples. The regression results for the NH4-N, NO2-N, and NO3-N model training are shown in Figure 4. From Table 4, it can be found that the OLR models have the worst performance, and that the WSVR and S-WSVR models, which include the multivariate weighting module, improve on the SVR and S-SVR fitting results (r), respectively, with a minimum improvement of 0.0222 and a maximum improvement of 0.2160. This indicates that the multivariate weighting module can effectively retain more of the useful information in the feature bands, so that the relevant bands improve the accuracy of the fitting process. In addition, the GWR, S-SVR, and S-WSVR training results are better, indicating that the regression relationship between the water quality values and the reflectance of the selected bands is significantly nonstationary in space.
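The five accuracy indicators named above can be computed with a short helper such as the one below. This is a sketch using the standard definitions of r, RMSE, MSE, MAE, and MAPE; the function name is hypothetical. One assumption is worth flagging: because the measured NO2-N and NO3-N minima are 0 mg/L, MAPE is undefined for those samples, so this sketch excludes zero-valued measurements from the MAPE average. How the paper handled them is not stated.

```python
import numpy as np


def evaluate(predicted, measured):
    """Return r, RMSE, MSE, MAE, and MAPE (in percent) as a dictionary."""
    p = np.asarray(predicted, dtype=np.float64)
    m = np.asarray(measured, dtype=np.float64)
    err = p - m
    pc = p - p.mean()
    mc = m - m.mean()
    mse = float(np.mean(err ** 2))
    # Assumption: skip zero-valued measurements, where MAPE is undefined.
    nonzero = m != 0
    mape = float(np.mean(np.abs(err[nonzero] / m[nonzero])) * 100.0)
    return {
        "r": float((pc @ mc) / np.sqrt((pc @ pc) * (mc @ mc))),
        "RMSE": float(np.sqrt(mse)),
        "MSE": mse,
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": mape,
    }


# Hypothetical predictions and measurements for three test samples.
scores = evaluate([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

RMSE and MAE share the unit of the measurements (mg/L here), whereas r and MAPE are unitless, which is why the two groups can rank the same models differently.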
Table 5 shows that there is a strong correlation between longitude and latitude and the three nitrogen compounds. After adding spatial features to the input parameters, S-SVR and S-WSVR show a substantial improvement in fitting accuracy over SVR and WSVR, with a minimum improvement of 0.3833 and a maximum improvement of 0.4659, which shows that the addition of spatial features can improve the fitting accuracy. The S-WSVR model takes both the spatial information and the band weighting into account, and the comparison with the other three models shows that it has the best performance and the highest fitting accuracy. Compared with the original SVR model, the r-values are increased by 0.4353, 0.4535, and 0.5210, and the RMSE, MSE, MAE, and MAPE values are reduced. The experiments show that the S-WSVR model can obtain good fitting results for all three nitrogen compounds in this mesoscale archipelagic sea area.
To further validate the model performance, the DIN concentration, i.e., the sum of the predicted NH4-N, NO2-N, and NO3-N values, was calculated and compared with the measured DIN values. Figure 5 and Table 6 show the DIN fitting accuracy evaluation indices. The sum of the predicted values of the three nitrogen compounds, compared with the actual DIN, has an r-value of 0.9028 and an RMSE of 0.0990 mg/L. Ranked by r-value, the results are NO3-N > NH4-N > DIN > NO2-N; ranked by error, they are NH4-N > NO2-N > NO3-N > DIN. The r-value of DIN is therefore not as high as that of NO3-N and NH4-N, but its RMSE is the smallest, which suggests that the S-WSVR model does not overfit when predicting the three nitrogen compounds.
The sum of the predicted results of the three nitrogen compounds correlates well with the DIN, so that the DIN content can be judged by the predicted results of the S-WSVR model.
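The five evaluation indices named in this section can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions of r, RMSE, MSE, MAE, and MAPE, which are assumed here to match the paper's evaluation equations; the function name `evaluate` and the sample concentrations are hypothetical and are not taken from the study's data.

```python
import math

def evaluate(predicted, measured):
    """Return r, RMSE, MSE, MAE, and MAPE (in percent) for paired samples.

    Standard definitions are assumed. MAPE divides by the measured value,
    so every measured value must be nonzero.
    """
    n = len(measured)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - m for p, m in zip(predicted, measured)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 / n * sum(abs(e / m) for e, m in zip(errors, measured))

    # Pearson correlation between predicted and measured values.
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("r is undefined when either series is constant")
    r = cov / math.sqrt(var_p * var_m)

    return {"r": r, "RMSE": rmse, "MSE": mse, "MAE": mae, "MAPE": mape}

# Hypothetical concentrations in mg/L, for illustration only.
measured_din = [0.040, 0.052, 0.061, 0.075]
predicted_din = [0.042, 0.050, 0.065, 0.071]
print(evaluate(predicted_din, measured_din))
```

The same function could be applied to the summed predictions of the three nitrogen compounds and the measured DIN. Because NO2-N can reach 0 mg/L in this study area, a real evaluation would need a rule for zero-valued measurements before computing MAPE.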

Representation of the Weights in the Model
The goal of the SVR model is to find the optimal hyperplane in high-dimensional space with the smallest "distance" to the farthest sample points. Under an equal-weight calculation, the sample points falling into the "interval band" lose some correct information. By assigning weight coefficients to the correlated bands, more correlated band features can be brought into the "interval band", more useful information can be obtained in the SVR model calculation, and the sample points can be stretched closer to the optimal hyperplane [52,53]. In addition, a comparison of Tables 2 and 3 shows that the larger the r-value, the smaller the weight, and vice versa. To explain this phenomenon, we selected some NO3-N (z), B5 (x), and B6 (y) data for regression, and assigned a larger weight to B6 (y) in order to compare the role of band weighting in the regression model. Figure 6 shows the distance of the data points from the hyperplane, where the data points in Figure 6b are closer to the hyperplane. In essence, the weighting is a stretching of the B6 data that makes B5 more favorable to model convergence and thereby reduces the loss.

Figure 7 shows the projection of the points in the hyperplane on the B5-o-NO3-N plane and the B6-o-NO3-N plane, respectively. After weighting B6, the r-value of the B5-o-NO3-N plane increases and the RMSE decreases, i.e., the correlation of B5 with NO3-N increases. In contrast, the r-value of the B6-o-NO3-N plane decreases and the RMSE increases, i.e., the correlation of B6 with NO3-N decreases, which is due to the large weight given to B6. In summary, when using the distance weighting idea in the SVR prediction model, weighting the bands with a small degree of correlation can improve the prediction accuracy of the model [54].
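The "stretching" effect of a band weight can be illustrated with a small numerical sketch. It assumes an RBF kernel, which is a common choice for SVR but is not stated in this section, and the function names, the weight of 2, and the reflectance values are hypothetical. Multiplying a band by a weight w multiplies that band's contribution to the squared distance between two samples by w², which changes how similar the kernel considers them.

```python
import math

def weighted_sq_distance(x, y, weights):
    """Squared Euclidean distance after scaling each feature by its weight."""
    return sum((w * (a - b)) ** 2 for w, a, b in zip(weights, x, y))

def rbf_kernel(x, y, gamma=1.0, weights=None):
    """RBF kernel on weighted features (assumed kernel, for illustration)."""
    if weights is None:
        weights = [1.0] * len(x)
    return math.exp(-gamma * weighted_sq_distance(x, y, weights))

# Two hypothetical samples described by (B5, B6) reflectance values.
sample_a = [0.10, 0.20]
sample_b = [0.12, 0.26]

equal = rbf_kernel(sample_a, sample_b, gamma=50.0)
stretched = rbf_kernel(sample_a, sample_b, gamma=50.0, weights=[1.0, 2.0])

# With B6 stretched, the same B6 difference counts four times as much,
# so the two samples look less similar to the kernel.
print(equal, stretched)
```

In this sketch the weight acts purely as a rescaling of one input axis before the kernel is evaluated, which is one simple way to read the statement that weighting is a stretching change to the B6 data.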

Spatio-Temporal Distribution of the Three Nitrogen Compounds
Since the sea surface can be ice-covered in the winter and the sky is often cloudy in the summer in the study area, we selected high-quality images from the spring and autumn. After data preprocessing, the images were input into the trained regression models for the three inorganic nitrogen forms. The spatial and temporal distributions of NH4-N, NO2-N, and NO3-N in the spring and autumn from 2013 to 2022 are shown in Figure 8, and the variation trends are similar to the results obtained by Li et al. [55] and Yang et al. [56]. Following Camargo et al. [13], we classified the distribution results into 10 classes of inorganic nitrogen toxicity. The first nine classes for NH4-N and NO2-N are of low toxicity, but the 10th class is more toxic [32]. NO3-N has no toxicity but was also divided into 10 classes from the minimum to the maximum value. In 2013-2022, the average concentration of NH4-N was 0.040 mg/L in the spring and 0.045 mg/L in the autumn. The average concentration of NO2-N was 0.003 mg/L in the spring and 0.004 mg/L in the autumn. The average concentration of NO3-N was 0.062 mg/L in the spring and 0.069 mg/L in the autumn. The average concentrations of the three nitrogen compounds are thus higher in the autumn than in the spring, which is mainly due to the influence of biological activities. Plankton growth is active in the spring and weakens in the autumn. As a result, nitrogen consumption is high in the spring, so that the seawater has a higher content of the three nitrogen compounds in the autumn than in the spring.
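The division of NO3-N into 10 classes from the minimum to the maximum value can be sketched as follows. Equal-width intervals are an assumption made for illustration, since the text does not give the class boundaries; the function name and the 0 to 0.18 mg/L range used in the example are likewise illustrative.

```python
def equal_interval_class(value, vmin, vmax, n_classes=10):
    """Map a concentration to a class index from 1 to n_classes.

    Equal-width classes between vmin and vmax are assumed; values at or
    beyond the limits are clamped to the first or last class.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    if value <= vmin:
        return 1
    if value >= vmax:
        return n_classes
    fraction = (value - vmin) / (vmax - vmin)
    return min(int(fraction * n_classes) + 1, n_classes)

# Illustrative range of 0 to 0.18 mg/L and a few hypothetical pixel values.
for concentration in (0.01, 0.062, 0.069, 0.17):
    print(concentration, equal_interval_class(concentration, 0.0, 0.18))
```

The toxicity classes for NH4-N and NO2-N follow the thresholds of the cited reference rather than equal intervals, so this sketch applies only to the NO3-N case described above.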
The lowest NH4-N content was 0 mg/L in both the spring and autumn of 2013. From the spring of 2014 onward, the lowest value was no longer 0 mg/L, indicating that the whole sea area had become polluted by NH4-N. From 2014, the Changhai county government vigorously developed island tourism and mariculture activities. The NH4-N concentration in the whole sea area in the spring and autumn has increased since 2014, but marine organisms in the aquaculture area need to absorb a lot of nitrogen to grow in the spring, so the area with a higher NH4-N concentration in the spring has shrunk. NO2-N is an intermediate product of the nitrification of NH4-N under conditions of sufficient oxygen, and NO3-N can also undergo denitrification to produce NO2-N in an anaerobic environment [38]. NO2-N reached a maximum of 0.015 mg/L in the spring of 2019, while the minimum value of 0 mg/L occurred each year. The formation of NO3-N has a lag, and human activities increased after 2014. In 2013-2014, human activities in the offshore waters of the Liaodong Peninsula were weak, and NO3-N in the spring went through a period of low concentration. On the other hand, the DIN discharged from the Yalu River is five times higher than that of other rivers flowing into the North Yellow Sea [56], and NO3-N is the main form present.
The volume of river water flowing into the sea is larger in the spring than in the autumn due to the influence of rainfall, so the NO3-N concentration on the east side of the study area is generally higher in the spring than in the autumn. In summary, human activities in the study area intensified after 2014 and NH4-N pollution increased over 2013-2022, but the NO2-N content was relatively stable and the NO3-N content was no greater than 0.18 mg/L, indicating that the seawater still maintained a stable capacity for self-cleaning and the seawater environment was in a relatively balanced state.
In addition to the temporal changes, the spatial distribution of the three nitrogen compounds shows a certain pattern. The southwestern part of the study area is influenced by human activities on the one hand and by water exchange near the Bohai Strait on the other, making its NH4-N content higher than in the other areas. The central offshore part features a marine reserve and a large volume of algae farming, making the NH4-N content in this area the lowest. The northeastern part is affected by the discharge of the Yalu River, and the NH4-N content increases slightly. The central distant-sea part lies in open water away from the sources of pollution, but with the changes in ocean currents, it is also slightly polluted by NH4-N. The NO2-N concentration decreases gradually from the northeast to the southwest, mainly because the organic matter content of the Yalu River is particularly high and the oxygen consumption of suspended bacteria and organisms is also very high; the lower the dissolved oxygen content, the higher the NO2-N concentration [14,35]. In the southwest of the study area, the NO3-N content decreases outward from the residential islands, and in the east and northeast, the NO3-N concentration increases under the influence of the Yalu River discharge. The concentration decreases in the central offshore area, which has fewer human activities, and is lower still in the central distant sea area, away from frequent human activities.
In the NO3-N spatial and temporal variation maps for spring 2014, autumn 2014, autumn 2015, and autumn 2017, the study area on the east side shows a higher concentration of NO3-N. NO3-N is the end product of inorganic nitrogen purification, and NO3-N accumulates in the ocean until nitrogen-fixing organisms solidify NO3-N to further purify the water. The accumulated NO3-N concentration changes with the movement of the current. Figure 9 shows the superimposed flow field and NO3-N concentration in the spring of 2014 and 2019 in the eastern area. When the flow velocity decreases, the NO3-N flow will be weakened, and the NO3-N concentration will temporarily increase. Meanwhile, the NO3-N will disperse as the flow velocity increases and the flow direction changes, but this transient concentration increase still does not cause harm to the marine environment.

Influencing Factors for the Three Nitrogen Compounds
From the previous section, it can be inferred that human activities are the main influencing factors for NH4-N and NO3-N, while the DO content is the main influencing factor for NO2-N, so in this section, we analyze these two important influencing factors in depth. In order to better elaborate the influence of human activities on the three nitrogen compounds, Figure 10 divides the whole study area into four zones:

• A is the southwest zone of the study area, which has a dense population and is close to the inshore area in the south of Liaodong Peninsula.
• B is the offshore area in the middle of the study area, where there are several marine reserves and reservation areas and large areas of shellfish and algae farming.
• C is the northeast zone of the study area, which is also one of the areas closest to the discharge of the Yalu River.

As shown in Figure 10, there are 49 sea area functional zones (AFZs) in the study area, including reserved areas, port and shipping areas, industrial and urban sea use areas, marine protected areas, mineral and energy areas, tourism and recreation areas, agriculture and fishery areas, and special use areas.
The SQRHM model can quantitatively analyze the degree of impact of human activities on marine ecosystems. In the model equations, $I$ is the impact of a human activity on each spatial location, $m$ is the number of spatial locations of the human activity, $i$ is the action point (the center of the human activity area), $F_i$ is the intensity of the $i$th action point of the human activity, $D_i$ is the maximum impact distance of the $i$th action point, $d_i$ is the distance between the unit point and action point $i$, $I_c$ denotes the combined impact of multiple human activities on the cell site, $s$ is the number of types of human activity present, $j$ denotes the $j$th human activity, $I_j$ is the impact of the $j$th human activity on the cell site, and $W_j$ is the weight of the $j$th human activity in the comprehensive evaluation [57]. Seven evaluation indicators were selected for the evaluation of the degree of impact of human activities on the marine ecosystem in the Changshan Islands sea area: reclaimed land area, aquaculture raft area, tourism development area, port navigation area, mineral resources area, marine protected area, and reserved area. The reclaimed land area is the cumulative reclaimed land area from 2013 to 2022, and the total reclaimed area is less than 10 km². The aquaculture raft area, as shown in Figure 11, is the area of aquaculture rafts in 2022. The other data were calculated according to the planned area, and we referred to the results of Li et al. [57] to assign the role intensity and weight in Table 7. The final results were classified into four levels from weak to strong, as shown in Table 8. The final degrees of human activity impact in 2022 in areas A-D were, respectively, very strong impact, strong impact, and medium impact.
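The structure described by these symbols can be sketched as follows. The SQRHM equations themselves are not reproduced in this text, so the linear distance decay $F_i (1 - d_i / D_i)$ used for a single activity is an assumption made purely for illustration, as are the function names, the planar coordinates, and all numeric values. The combined impact is written as a weighted sum of the per-activity impacts, which is how the definitions of $I_j$ and $W_j$ read.

```python
import math

def single_activity_impact(action_points, location):
    """Impact I of one activity type at a location.

    action_points: list of (x, y, F, D) tuples, where F is the intensity
    and D the maximum impact distance of an action point.
    ASSUMPTION: impact decays linearly with distance and is zero beyond D.
    """
    total = 0.0
    for x, y, intensity, max_distance in action_points:
        d = math.hypot(location[0] - x, location[1] - y)
        if d < max_distance:
            total += intensity * (1.0 - d / max_distance)
    return total

def combined_impact(impacts, weights):
    """Combined impact I_c as the weighted sum of per-activity impacts I_j."""
    if len(impacts) != len(weights):
        raise ValueError("one weight is required per activity")
    return sum(w * i for i, w in zip(impacts, weights))

# Hypothetical example: two activity types acting on one grid cell.
cell = (0.0, 2.0)
tourism = [(0.0, 0.0, 10.0, 4.0)]    # one action point, F = 10, D = 4
port = [(3.0, 2.0, 6.0, 5.0)]        # one action point, F = 6, D = 5
impacts = [single_activity_impact(tourism, cell),
           single_activity_impact(port, cell)]
print(combined_impact(impacts, weights=[0.6, 0.4]))
```

In an actual application, the intensities and weights would come from Table 7 and the resulting values would be grouped into the four levels of Table 8.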
Although both areas A and B are strongly affected by human activities, their nitrogen concentrations differ. The main human activity in area B is floating raft aquaculture in the surface layer, where the large number of cultured shellfish and algae can absorb NH4-N for growth. In addition, area B contains many marine protected areas and reserved areas with better protection of seawater quality, and although land reclamation has been carried out in area B, its extent and influence are relatively small, so NH4-N pollution in area B is low. Area A, on the other hand, has been greatly impacted by tourism development and pollution from harbor shipping, so its NH4-N pollution is more serious. In addition, the NO3-N concentration is highest in the northeast of area A and the southwest of area B, i.e., centered on the residential islands, which are the areas where human activities have the strongest impact on the ecosystem. These areas are also the most sensitive to seawater pollution. Area C is strongly affected by human activities, and the NH4-N and NO3-N concentrations show a certain increase. Area D is less affected by human activities, and NH4-N and NO3-N pollution decrease with distance from human activities. In summary, human activities have a strong influence on NH4-N and NO3-N, and the more frequent the human activities, the stronger the pollution.

Dissolved Oxygen Environment
The Yalu River, with an annual runoff of 32.76 billion m³, is the largest seaward river in the entire North Yellow Sea, and studies have shown that hypoxia often occurs at the mouth of the river due to its high oxygen consumption [58]. However, this phenomenon usually occurs at the bottom of the water body, while the DO content in the surface layer is more adequate than in the bottom layer. To verify that the spatial and temporal distribution of the NO2-N concentration corresponds to the DO content, the average of the measured DO data in the surface layer of the seawater in areas A, B, and C in the spring and autumn of 2018-2020 was recorded.
As shown in Figure 12, the DO content in the surface seawater is ranked as A > B > C, which is the result of the unique geographical characteristics of the study area. Area A is closer to the open sea than area C, and the wind and waves there are stronger. As a result, the DO content in area A is higher than that in area C. Area C is closer to the mouth of the Yalu River and its surface water is more seriously polluted, so the NO2-N concentration is ranked A < B < C. Meanwhile, the DO content in the spring is higher than that in the autumn, showing obvious seasonal characteristics. The spring flood of the Yalu River makes the river flow greater in the spring than in the autumn of the same year, and the greater river flow in the spring leads to a higher DO content; conversely, the DO content in the autumn is lower. In summary, the DO content of the surface water of the Changshan Islands is positively correlated with the distance to the mouth of the Yalu River, while the NO2-N concentration is negatively correlated with the DO content. Therefore, to control the NO2-N concentration within a certain range, the discharge of the Yalu River needs to be regulated.

Conclusions
In this paper, a multiple weighted regression (S-WSVR) model taking into account spatial information has been proposed to monitor the continuous distribution of NH4-N, NO2-N, and NO3-N in the surface layer of the seawater in a mesoscale archipelagic environment. The model smooths the spatial complexity of the mesoscale water quality distribution and retains more useful information in the characteristic bands. Based on the Mahalanobis distance and mathematical and statistical analysis, the contribution of the different characteristic wavebands was calculated, and the spatial information was used as one of the input parameters. The accuracy of the experimental results for NH4-N, NO2-N, and NO3-N was better than that of the original model, with r-values of 0.9063, 0.8900, and 0.9755 and RMSEs of 0.2097 mg/L, 0.1230 mg/L, and 0.1573 mg/L, respectively. Compared with the original model, this represents an increase in r-value of 0.4353, 0.4535, and 0.5210 and a decrease in RMSE of 0.0487 mg/L, 0.2977 mg/L, and 0.1571 mg/L, respectively. The accuracy was improved the most by the spatial information, while the multivariate weighting resulted in a smaller improvement, which shows that the three nitrogen compounds are nonlinear and heterogeneous in spatial distribution and that the contributions of the characteristic bands in the calculation are different. Moreover, the inversion results for the three nitrogen compounds were summed and compared with the measured DIN concentration, obtaining an r-value of 0.9028. In addition, we obtained the Landsat 8 characteristic bands associated with the three nitrogen compounds: B2, B3, B4, B5, B6, and B7 for NH4-N; B3, B6, and B7 for NO2-N; and B5, B6, and B7 for NO3-N.
Furthermore, we input the 2013-2022 images into the S-WSVR model to plot the spatial and temporal distributions of NH4-N, NO2-N, and NO3-N. The distribution patterns of the three nitrogen compounds in the Changshan Islands area were then analyzed from the spatio-temporal distribution map. It was found that the Changshan Islands sea area has been polluted by DIN in the whole area since 2014, but the seawater is still in a relatively healthy state. The SQRHM analysis revealed that the intensity of human activities had a greater impact on NH4-N and NO3-N. Human daily life on the islands, tourism development, and harbor shipping were the main sources of pollution, and shellfish and algae on surface culture floating rafts were conducive to nitrogen purification. In addition, the water body discharged by the Yalu River is anoxic, and the closer the sea area is to the estuary of the Yalu River, the lower the DO content and the higher the NO2-N content. Overall, the pollution concentration of inorganic nitrogen decreases with the distance