Hyperspectral Remote Sensing of Phytoplankton Species Composition Based on Transfer Learning

Phytoplankton species composition research is key to understanding phytoplankton ecological and biogeochemical functions. Hyperspectral optical sensor technology allows us to obtain detailed information about phytoplankton species composition. In the present study, a transfer learning method to invert phytoplankton species composition using in situ hyperspectral remote sensing reflectance and hyperspectral satellite imagery is presented. By transferring the general knowledge learned from the first few layers of a deep neural network (DNN) trained on a general simulation dataset, and updating the last few layers with an in situ dataset, the requirement for large numbers of in situ samples for training the DNN to predict phytoplankton species composition in natural waters was lowered. This method was established from in situ datasets and validated with datasets collected in different ocean regions in China with considerable accuracy (R² = 0.88, mean absolute percentage error (MAPE) = 26.08%). Application of the method to Hyperspectral Imager for the Coastal Ocean (HICO) imagery showed that spatial distributions of dominant phytoplankton species and associated compositions could be derived. These results indicate the feasibility of species composition inversion from hyperspectral remote sensing and highlight the advantages of transfer learning algorithms, which can bring broader application prospects for phytoplankton species composition and phytoplankton functional type research.


Introduction
Marine phytoplankton plays a crucial role in aquatic ecosystems [1]. It contributes to primary production, affects the abundance and diversity of marine organisms [2], and exerts influence on climate processes [3]. Therefore, a better understanding of phytoplankton greatly enhances our understanding of the carbon and nitrogen cycles in marine ecosystems [4,5]. Phytoplankton functional types are species groups with specific roles in biogeochemical cycles [6]. Because the physiological processes of different phytoplankton species and associated compositions are distinct from each other, each individual phytoplankton species and its associated composition information are fundamental to its functional type [7]. In addition, they are also indicative of variability in phytoplankton species diversity in the ocean [8]. With the development of satellite-based ocean color remote sensing, which provides the advantages of wide-range, long-term coverage, high efficiency, and low cost [9], phytoplankton species, composition, and related research from space have been carried out [10,11].
The inherent optical properties of different phytoplankton species or groups have been studied [12,13], and numerous efforts have been made to understand phytoplankton communities in aquatic systems with the advancement of ocean color remote sensing. Much research has been

Hyperspectral Radiometric Measurements
In situ radiometric measurements, including the downwelling spectral irradiance (E d ), sky incoming spectral radiance (L s ), and total radiance of the water (L tot ), were carried out using the Hyperspectral Surface Acquisition System (HyperSAS, Satlantic corporation, Bellevue, WA, USA) by closely following NASA protocols [46]. The L tot and L s sensors were pointed at the sea and sky, respectively, at the same nadir and zenith angles between 30° and 50°, with an optimum of 40°. To minimize the sun glint effect, the azimuthal angle of the sensors was set to be within 90°-180° away from the sun, with an optimum of 135° [47,48]. R rs (λ) was then calculated according to Equation (1):

$$R_{rs}(\lambda) = \frac{L_{tot}(\lambda) - \rho_{sky}(\lambda)\,L_{s}(\lambda)}{E_{d}(\lambda)} \qquad (1)$$

where λ is the wavelength and ρ sky (λ) is the sky radiance spectral reflectance at wavelength λ. The sun glint correction was applied to R rs according to Busch et al. [49], and the corrected R rs spectra were interpolated to 1 nm intervals between 370 and 858 nm.
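The above-water reflectance calculation can be sketched in a few lines. This is a minimal illustration, not the processing code used in the study: the function name, the three-band input values, and the use of a single scalar ρ sky (0.028, a value commonly quoted for this viewing geometry under low wind) are all assumptions made here for the example, and the sun glint correction of [49] is not included.

```python
def remote_sensing_reflectance(l_tot, l_sky, e_d, rho_sky):
    """Above-water R_rs per band: (L_tot - rho_sky * L_s) / E_d."""
    if not (len(l_tot) == len(l_sky) == len(e_d)):
        raise ValueError("all spectra must share the same wavelength grid")
    return [(lt - rho_sky * ls) / ed for lt, ls, ed in zip(l_tot, l_sky, e_d)]


# Illustrative (made-up) values for three bands. Radiances and irradiance
# are in consistent units, so R_rs comes out in sr^-1.
l_tot = [0.0120, 0.0150, 0.0090]
l_sky = [0.0500, 0.0450, 0.0300]
e_d = [1.20, 1.30, 1.10]

# ASSUMPTION: a wavelength-independent rho_sky; the text allows rho_sky(lambda).
rrs = remote_sensing_reflectance(l_tot, l_sky, e_d, rho_sky=0.028)
```

A wavelength-dependent ρ sky would simply replace the scalar with a list of the same length as the spectra.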

Taxonomic Species Identification
Samples were fixed with formaldehyde (5%) immediately after collection and transferred back for the laboratory analyses. In the laboratory, phytoplankton cells were first concentrated with 100 mL settlement columns for 24 to 48 h, then identified and counted using an inverted microscope (Olympus corporation, Tokyo, Japan) [50,51]; for the method, we referred mainly to Utermöhl [52]. A total of 242 phytoplankton species were identified, including 129 diatoms, 97 dinoflagellates, 5 chlorophytas, 5 chrysophytas, 4 cyanophytas, 1 xanthophyta, and 1 euglenophyta (the classification was mainly based on http://www.algaebase.org/); after species identification and cell counting, the composition of each species in each station was calculated.

Validation Dataset
For this study, data at 20 stations were selected as the validation dataset (Figure 1), including hyperspectral R rs data and concurrent phytoplankton species composition data. In the algorithm validation stage (Section 4.2), by taking the preprocessed spectral data into the NN TL , we then acquired the 26 most abundant phytoplankton species, which together accounted for more than 90% of the cell abundance of all species. A detailed description of the 26 phytoplankton species is given in Table 1, sorted alphabetically by species name.

Eleven phytoplankton species, namely, three dinoflagellates (Prorocentrum dentatum, zooxanthella, and Karenia mikimotoi), six diatoms (Skeletonema costatum, Thalassiosira weissflogii, Chaetoceros debilis Cleve, Phaeodactylum tricornutum, Chaetoceros curvisetus, and Cyclotella cryptica), one cryptophyta (Heterosigma akashiwo), and one chlorophyte (Nannochloris sp.), which were frequently observed in Chinese ocean regions [7,8,15], were cultured in a laboratory incubator. The temperature was set at 18-20 °C, and the light intensity was 2500 lx with a 12 h light/12 h dark cycle. The a ph (λ) spectra of the phytoplankton species were measured using a Lambda-1050 UV/Vis Spectrophotometer (PerkinElmer corporation, Boston, MA, USA). The chlorophyll a concentration (C ph ) was measured using an F-2500 Fluorescence Spectrophotometer (Hitachi corporation, Tokyo, Japan). The representative mass-specific absorption spectra (normalized at 440 nm) of the 11 species are shown in Figure 2.

Figure 2. Representative mass-specific absorption spectra (normalized at 440 nm) of 11 species commonly observed in Chinese ocean regions.

R rs Simulation Dataset
Semi-analytical models were used to generate the R rs simulation dataset (Equations (2)-(10) in Table 2). The mass-specific absorption spectrum of mixed algae, a * ph_mix (λ), was calculated using Equation (2), where n1 is the total number of phytoplankton species, a * ph_i (λ) is the mass-specific absorption spectrum of the ith species (Section 2.2.1), and w i is the composition of the ith species. The w i sum to 1 and can be interpreted as the contribution of each species to the total a ph (λ). The range of C spm was set to 0.1-200 g/m 3 , the range of C ph to 0.1-50 µg/L, and the range of a g (440) to 0.01-0.5 m −1 . After the variables were set, the R rs spectra of mixed algae were calculated using Equations (2)-(10). In total, 200,000 R rs spectra of mixed algae were generated over 370-858 nm at 1 nm intervals, consistent with the in situ measurements.

Table 2. R rs simulation formulas based on semi-analytical models in this study.
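The mixing step described for Equation (2), a weighted sum of species spectra with weights that sum to 1, can be illustrated as follows. This is only a sketch under stated assumptions: the text does not say how the w i were drawn, so the uniform-then-normalize sampling is hypothetical, and the four-band placeholder spectra stand in for the measured 370-858 nm spectra.

```python
import random


def random_composition(n_species, rng):
    """Draw non-negative weights w_i that sum to 1 (sampling scheme assumed)."""
    raw = [rng.random() for _ in range(n_species)]
    total = sum(raw)
    return [r / total for r in raw]


def mix_absorption(weights, species_spectra):
    """Weighted sum over species: sum_i w_i * a*_ph_i(lambda), band by band."""
    n_bands = len(species_spectra[0])
    return [
        sum(w * spec[b] for w, spec in zip(weights, species_spectra))
        for b in range(n_bands)
    ]


rng = random.Random(0)
# Hypothetical placeholder spectra: 11 species x 4 bands.
species_spectra = [[rng.uniform(0.01, 0.05) for _ in range(4)] for _ in range(11)]
weights = random_composition(11, rng)
a_mix = mix_absorption(weights, species_spectra)
```

Because the weights form a convex combination, each band of the mixed spectrum lies between the smallest and largest species value for that band.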

Satellite Data
One cloud-free HICO image [61] over the Changjiang Estuary, acquired on 28 March 2012 (H2012088004724.L1B_ISS), was obtained from the ocean color website (http://oceancolor.gsfc.nasa.gov/) (Figure 1). The L1B (Level 1B: top-of-atmosphere radiance) data were processed using SeaDAS software (https://seadas.gsfc.nasa.gov/; version 7.4, NASA, Washington, D.C., USA): a geometric correction was performed first, and an atmospheric correction was then made using the approach in [62] to generate ocean color R rs . The R rs spectrum of each pixel was interpolated to 1 nm intervals between 370 and 858 nm. Thereafter, we smoothed the R rs spectra with a locally weighted scatterplot smoothing (LOWESS) filter [63], and the 2nd derivative spectra of the smoothed R rs spectra were calculated with a band separation of 27 nm [64] and normalized by the L2-norm (Section 3.2.1).

Introduction of Transfer Learning
Transfer learning is often used when the training and validation data do not lie in the same feature space or follow the same distribution [65]. Taking a remote sensing application as an example, Jean et al. [66] used high-resolution satellite images and convolutional neural networks to predict poverty in five African countries; because training data on socioeconomic indicators were limited, nighttime light intensities were used as a data-rich proxy through transfer learning, and the results showed that the method is feasible. In the present study, we built a large and general R rs dataset (200,000 simulated spectra) with 11 common species, and the data volume was adequate to train a reasonable neural network; meanwhile, the phytoplankton species diversity was much higher in the field R rs measurements (e.g., 242 phytoplankton species were identified during the five cruise investigations from 2015 to 2018). The parameter-transfer approach was used: the first few layers of the neural network trained on the simulation dataset were frozen, and the last few layers were updated with the in situ dataset. This is possible because, for the vast majority of DNNs, the first few layers learn features that are general and appear not to be specific to a particular dataset or task, whereas the features transition from general to specific in the last few layers [67].

Transfer Learning for Deep Neural Network Construction
As described in Section 2.2.2, a large general simulation dataset was first generated, and a DNN was subsequently trained using this simulation dataset. However, the trained neural network was not directly applicable for the in situ dataset because of the differences in the numbers, as well as the types of phytoplankton species between the simulation and the in situ datasets. To cope with this challenge, we used the transfer learning technique to transfer part of the knowledge learned from the simulation dataset to the prediction model for the in situ dataset.

Preprocessing for Input Data
Several preprocessing procedures were applied to the input data. First, a LOWESS filter (fraction: 0.1; weight function: quadratic function; iterations: 2) was applied to the raw R rs spectra to minimize random noise [63]. Second, to enhance detailed information about small spectral variations, the 2nd derivative spectra of the smoothed R rs spectra (R″ rs ) were calculated according to Equation (11) [68]:

$$R''_{rs}(\lambda_j) \approx \frac{R_{rs}(\lambda_i) - 2R_{rs}(\lambda_j) + R_{rs}(\lambda_k)}{(\Delta\lambda)^2} \qquad (11)$$

where ∆λ = λ k − λ j = λ j − λ i is the band separation, which was set to 27 nm in this study according to Torrecilla et al. [64]. Additionally, to emphasize the shape of the spectra rather than their magnitude, each derivative spectrum was normalized by its L2-norm.
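The derivative and normalization steps can be sketched as below. This is an illustrative sketch rather than the study's code: the LOWESS smoothing is omitted (the input is assumed to be already smoothed), the function names are hypothetical, and the synthetic Gaussian-shaped spectrum is made up. On a 1 nm grid, a band separation of 27 nm corresponds to 27 samples.

```python
import math


def second_derivative(rrs, sep):
    """Finite-difference 2nd derivative with band separation `sep` (in samples)."""
    return [
        (rrs[j - sep] - 2.0 * rrs[j] + rrs[j + sep]) / float(sep * sep)
        for j in range(sep, len(rrs) - sep)
    ]


def l2_normalize(values):
    """Scale a spectrum to unit Euclidean length so only its shape remains."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / norm for v in values]


# Synthetic smooth spectrum on the 370-858 nm grid at 1 nm steps (489 bands).
wavelengths = list(range(370, 859))
rrs = [0.005 + 0.003 * math.exp(-((w - 560) / 40.0) ** 2) for w in wavelengths]

features = l2_normalize(second_derivative(rrs, sep=27))
```

The derivative cannot be evaluated within 27 nm of either end of the spectrum, so the 489 input bands yield 489 − 2 × 27 = 435 feature values.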

Architecture
A five-layer neural network architecture for the simulation dataset including an input, output, and three hidden layers, i.e., NN sim , was developed. The activation functions of the first two hidden layers were set to be ReLU [69,70], and for the last hidden and output layers, the activation function was set to be Sigmoid and Softmax [71,72], respectively. The dimension of the input layer was 435, the same length as the normalized derivative spectra, and the dimension of the output layer was 11, corresponding to the number of phytoplankton species in the simulation dataset. The dimensions for the hidden layers were 256, 64, and 32, respectively, and the simulation dataset was split into training (90%) and validation (10%) sets. When fitting the model, 90% of the training set was randomly chosen for training and the remaining 10% was used for testing; the Adam optimizer was used [73] and the loss function was set to MAE (mean absolute error); and the training procedure stopped after 800 epochs with a batch size of 512.
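To make the layer sizes and activations concrete, the following framework-free sketch runs one forward pass through a network with the stated shape (435, 256, 64, 32, 11; ReLU, ReLU, Sigmoid, Softmax). It is not the trained NN sim : the weights are random placeholders, the initialization scheme and helper names are assumptions, and no training is performed.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(z):
    return [max(0.0, v) for v in z]


def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


def softmax(z):
    shift = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random placeholder weights; a trained network would load learned values."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
sizes = [435, 256, 64, 32, 11]
activations = [relu, relu, sigmoid, softmax]
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]

x = [rng.uniform(-1.0, 1.0) for _ in range(sizes[0])]  # stand-in input spectrum
for (weights, biases), act in zip(layers, activations):
    x = act(dense(x, weights, biases))
composition = x
```

The Softmax output is non-negative and sums to 1, which is what allows the 11 outputs to be read directly as species composition fractions.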
The NN TL architecture was defined as the same as that of NN sim , except that the dimension of the output layer was the species number of the in situ dataset, and the batch size was 5. The weights of the first three layers were copied from NN sim and were frozen during the following training procedure, whereas the weights of the last two layers were set to be trainable. Among the 183 in situ samples, 163 were used for training and 20 samples were used for validation (Section 4.2), and at each epoch, NN TL was trained on 130 random choices of samples (80% of the 163 samples) and tested on the remaining 33 samples (20% of the 163 samples). The training procedure stopped after 250 epochs. The flowchart of the method is shown in Figure 3.
The parameter settings used in model training for NN sim and NN TL (the data split ratios for the simulation and in situ datasets, the dimensions of the three hidden layers, the batch size, the number of epochs, and the activation functions) were obtained by tuning. The tuning criterion was the convergence behavior of the loss function during training. The parameters found to be optimal in this study are listed in Table 3.
Remote Sens. 2019, 11, x FOR PEER REVIEW

The program was written with Keras 2.2.4 (using the TensorFlow backend) and Python 3.6.3, and was run on a personal computer with a Core i7 processor (Intel Corporation, Santa Clara, CA, USA) and 20 GB of random access memory (RAM). Training the NN sim took about 2 h and 12 min, and training the NN TL took about 3 min. The layers were fully connected. In NN sim , there were 139,723 weights in total and all weights were learnable; in NN TL , there were 147,346 weights in total, of which 10,066 were learnable and 137,280 were non-learnable. The neural networks were stored as JavaScript Object Notation (.json) files, which can be loaded conveniently from Python.
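The reported weight counts can be checked with simple arithmetic, since a fully connected layer with n_in inputs and n_out outputs has n_in × n_out + n_out parameters. The sketch below is an editorial cross-check under explicit assumptions: an input width of 471 and an NN TL output width of 242 are values inferred here because they reproduce the reported totals exactly; with the input width of 435 given in the architecture description, the NN sim total would instead be 130,507. Which input width was actually used cannot be determined from the text.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def count_params(sizes):
    """Parameter count of each weight layer for consecutive layer widths."""
    return [dense_params(a, b) for a, b in zip(sizes[:-1], sizes[1:])]


# ASSUMPTION: input width 471 and a 242-wide NN_TL output are inferred from
# the reported totals, not taken from the architecture description.
sim_layers = count_params([471, 256, 64, 32, 11])
tl_layers = count_params([471, 256, 64, 32, 242])

sim_total = sum(sim_layers)
tl_frozen = sum(tl_layers[:2])     # weight layers copied from NN_sim and frozen
tl_trainable = sum(tl_layers[2:])  # last hidden layer and output layer
tl_total = tl_frozen + tl_trainable

# For comparison: the same network with the 435-wide input stated in the text.
sim_total_435 = sum(count_params([435, 256, 64, 32, 11]))
```

Under these assumptions the frozen part corresponds to the two weight matrices feeding the first two hidden layers, and the trainable part to those feeding the last hidden layer and the output layer.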

Accuracy Evaluation
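The model performance was evaluated in terms of mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE) according to Equations (12)-(14), respectively:

$$\mathrm{MAE} = \frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m}\left|P_{i,j}^{prd} - P_{i,j}^{true}\right| \qquad (12)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m}\left(P_{i,j}^{prd} - P_{i,j}^{true}\right)^{2}} \qquad (13)$$

$$\mathrm{MAPE} = \frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m}\frac{\left|P_{i,j}^{prd} - P_{i,j}^{true}\right|}{P_{i,j}^{true}} \times 100\% \qquad (14)$$

where m is the number of species and n is the number of samples in the training and validation process, P i,j prd is the predicted composition of species j in sample i, and P i,j true is the true composition of species j in sample i.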
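The three metrics can be computed over all sample/species pairs as in the sketch below. It uses the standard definitions of MAE, RMSE, and MAPE; skipping pairs whose true composition is zero in the MAPE is an assumption made here (the relative error is undefined for them), and the function name and example values are illustrative only.

```python
import math


def evaluate(pred, true):
    """Return (MAE, RMSE, MAPE in %) over all sample/species pairs.

    ASSUMPTION: MAPE skips pairs whose true composition is zero, since the
    relative error is undefined there.
    """
    abs_err, sq_err, rel_err = [], [], []
    for p_row, t_row in zip(pred, true):
        for p, t in zip(p_row, t_row):
            abs_err.append(abs(p - t))
            sq_err.append((p - t) ** 2)
            if t > 0:
                rel_err.append(abs(p - t) / t)
    mae = sum(abs_err) / len(abs_err)
    rmse = math.sqrt(sum(sq_err) / len(sq_err))
    mape = 100.0 * sum(rel_err) / len(rel_err)
    return mae, rmse, mape


# Two samples x three species, compositions in percent (illustrative values).
true = [[50.0, 30.0, 20.0], [10.0, 0.0, 90.0]]
pred = [[45.0, 33.0, 22.0], [12.0, 1.0, 87.0]]
mae, rmse, mape = evaluate(pred, true)
```

Because the compositions are expressed in percent, MAE and RMSE come out in percentage points, while MAPE is a relative error; this is why a small MAE can coexist with a large MAPE when many true compositions are small.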

Transfer Learning and Neural Network Test
The DNN with transfer learning was tested, and a conventional DNN was included for comparison. Figure 4 covers three cases: NN sim (simulation dataset), NN TL (simulation dataset combined with the in situ dataset), and the conventional DNN (simulation dataset combined with the in situ dataset). The convergence of NN sim is shown in Figure 4a: the MAE decreased rapidly in the first 20 epochs, the rate of descent then flattened with increasing epoch number, and after 800 epochs the MAE was less than 1% and NN sim was stable. With the transfer learning method, the trained parameters of the first few layers of NN sim were preserved and the last few layers were updated with the in situ dataset to construct NN TL (Figure 4b); the convergence of NN TL was similar to that of NN sim , and after 250 epochs the MAE of the test set was stable at around 4%. The conventional DNN (Figure 4c) failed to converge: its MAE grew with increasing epoch number.

Further tests of NN sim were conducted. For the randomly selected 10% of the simulation dataset (20,000 spectra), the predicted compositions versus the true compositions are shown in Figure 5. The statistical indicators are as follows: R 2 = 0.97, MAE = 0.52%, MAPE = 43.62%, and RMSE = 0.90% on average. The prediction accuracy is acceptable.

Phytoplankton Species Composition Prediction and Validation
To compare the in situ and NN TL -predicted compositions (%) in the validation dataset, 26 species were selected (Section 2.1.3). The compositions of these species and the sum of the compositions of the remaining species ("others") at each station are presented as stacked bars (Figure 6). Differently colored columns represent different phytoplankton species, and the height of a column represents the composition of that species. At some stations (e.g., stations 7, 8, 9, 13 and 20), specific species (S. costatum, P. delicatissima, P. dentatum, T. thiebaultii) were predominant, whereas at other stations, multiple species co-existed at similar ratios. Generally, the NN TL -predicted species compositions were highly consistent with the in situ measurements.

Phytoplankton Species Composition Prediction from HICO
After HICO data preprocessing (Section 2.3), the normalized 2nd derivative spectrum of each HICO pixel was obtained, and the dimension of the spectra was kept consistent with the input layer of the NN TL . In the model output phase, we acquired the 12 most abundant species, which together accounted for more than 99% of the cell abundance of all species in each pixel on average. The predicted phytoplankton compositions of these 12 species and the sum of the remaining species ("others") are shown in Figure 8. The compositions differed greatly among species and, for a given species, varied spatially: among the 12 species, P. dentatum had the largest composition across the whole image, whereas S. costatum and P. delicatissima accounted for relatively large proportions near the Changjiang Estuary.
There were no coincident matchups between satellite and in situ observations to validate the predicted species composition. However, Song et al. [74] reported the phytoplankton species distributed in this region based on a cruise survey carried out from 22 to 28 May 2012, and the sampling range (30.5-32° N, 121.5-123.5° E) largely overlaps with the coverage of the HICO image. Five of the ten dominant species identified by light microscopy in the survey of Song et al. [74] also appear in our HICO-predicted results (P. dentatum, Paralia sulcata, P. delicatissima, S. costatum, and S. trochoidea). This survey supports our predicted results to some extent.
Phytoplankton species compositions are complex and variable, and are affected by various environmental factors (e.g., temperature, transparency, and nutrients) [74]. For example, the ratio of nitrogen to phosphorus (N/P) is considered an important factor influencing community structure; as the N/P increases, the composition of dinoflagellates increases and the composition of diatoms decreases [75]. The links between these factors and the remote sensing of phytoplankton species composition will be explored in the future.

Transfer Learning for Phytoplankton Community
In Section 4.2, the phytoplankton composition inversion at the species level by transfer learning-based DNN was validated, and the predicted results are acceptable (R 2 = 0.88, MAE = 3.38%, RMSE = 4.4%, and MAPE = 26.08%). Whether the method is applicable to inversion of phytoplankton composition at the community level is discussed below with respect to related tests and comparative analysis.

DNN Tests for Phytoplankton Community Composition
As mentioned in Section 2.2.1, four communities (11 species) were cultured in the laboratory, and the absorption spectral data were reorganized at the community level. Following the same procedure as at the species level, an NNC sim (DNN based on the simulation dataset at the community level) was constructed. Applying the NNC sim to a randomly selected 10% of the simulation dataset at the community level gave the predictions shown in Figure 9, with the following statistical indicators: R 2 = 0.99, MAE = 0.43%, RMSE = 0.84%, and MAPE = 17.03% on average.
For the corresponding DNN tests at the species level, the results are shown in Figure 5, with the following statistical indicators: R 2 = 0.97, MAE = 0.52%, MAPE = 43.62%, and RMSE = 0.90% on average. Compared with the prediction accuracy at the species level (MAPE = 43.62%), the accuracy at the community level (MAPE = 17.03%) is markedly better.
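One way to picture the move from species to community level is to sum the species compositions within each community. The sketch below does this for the 11 cultured species, using the community membership listed in the text. It is an assumption that labels would be aggregated this way; the text states only that the absorption spectral data were reorganized at the community level. The function name and sample values are hypothetical.

```python
# Community membership of the 11 cultured species, as listed in the text.
COMMUNITY_OF = {
    "Prorocentrum dentatum": "dinoflagellate",
    "zooxanthella": "dinoflagellate",
    "Karenia mikimotoi": "dinoflagellate",
    "Skeletonema costatum": "diatom",
    "Thalassiosira weissflogii": "diatom",
    "Chaetoceros debilis": "diatom",
    "Phaeodactylum tricornutum": "diatom",
    "Chaetoceros curvisetus": "diatom",
    "Cyclotella cryptica": "diatom",
    "Heterosigma akashiwo": "cryptophyta",
    "Nannochloris sp.": "chlorophyta",
}


def to_community(species_composition):
    """Sum species-level fractions into community-level fractions."""
    totals = {}
    for species, fraction in species_composition.items():
        community = COMMUNITY_OF[species]
        totals[community] = totals.get(community, 0.0) + fraction
    return totals


# Illustrative composition (fractions sum to 1), not measured data.
sample = {
    "Prorocentrum dentatum": 0.40,
    "Skeletonema costatum": 0.35,
    "Chaetoceros curvisetus": 0.15,
    "Heterosigma akashiwo": 0.10,
}
community = to_community(sample)
```

Aggregation reduces the number of output classes from 11 to 4, so errors between spectrally similar species within one community no longer count against the prediction.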

Validation for Phytoplankton Community Composition
A transfer learning-based DNN combining the simulation dataset with the in situ dataset at the community level, abbreviated as NNC TL , was also established. The in situ and NNC TL -predicted compositions (%) in the validation dataset (Section 2.1) are shown in Figure 10. The NNC TL -predicted compositions of seven communities (dinoflagellate, chrysophyta, chlorophyta, xanthophyta, cyanophyta, euglenophyta, and diatom) are highly consistent with the in situ measurements. In addition, the statistical indicators for the seven predicted communities were calculated (Figure 11): for the 63 points in total (phytoplankton community composition greater than 0), R 2 = 0.99, MAE = 0.33%, MAPE = 1.74%, and RMSE = 1.28%.

Comparing Figures 6 and 10, the predictions at the community level are more consistent with the in situ measurements than those at the species level, which is also supported by comparing Figures 7 and 11. Relative to the species level (MAPE = 26.08%), the prediction error at the community level (MAPE = 1.74%) was lower by an order of magnitude. Different species contain different types and concentrations of pigments, which produce subtle differences in absorption, and the pigment differences between communities are greater than those between species. Thus, the DNN may make smaller errors when predicting at the community level.

Sensitivity Analysis
Sensitivity analyses of the transfer learning-based DNN were performed under different conditions at the species and community levels. First, the effects of the optically active components were considered (Figure 12). Among the three optically active components considered (C spm , C ph , a g (440)), the predicted MAPE (%) increased as C spm increased (Figure 12a), and the variation range was reasonable (maximum of less than 35% at the species level and less than 17% at the community level) [34], even under relatively high SPM concentrations (180-200 g/m 3 ), which indicates that our method is robust to SPM. This is very important for phytoplankton species/community composition inversion in optically complex waters, such as the Changjiang Estuary, where SPM dominates the optically active components [76]. The MAPE (%) decreased as C ph increased (Figure 12b); because C ph can be regarded as an indicator of phytoplankton biomass [77], our results indicate that the model performed better under high-biomass conditions. Notably, the model still worked well under lower C ph conditions (MAPE less than 35% at the species level and less than 15% at the community level); therefore, the model is expected to be feasible in Case 1 waters. The CDOM (in terms of a g (440)) had a small effect on the model performance (Figure 12c), with a range of variation of less than 4%, possibly because the optical contribution of CDOM to R rs was relatively small. Taken together, these results suggest that our method gives reliable predictions in different water environments.
considered (Figure 12). It can be seen that, among the three optically active components considered (Cspm, Cph, ag(440)), the MAPE of the predictions increased as Cspm increased (Figure 12a), and the variation range was reasonable (maximum of less than 35% at the species level and less than 17% at the community level) [34], even under relatively high SPM concentrations (180-200 g/m3), which shows that our method is robust to SPM. This is very important for phytoplankton species/community composition inversion in optically complex waters, such as the Changjiang Estuary, where SPM dominates the optically active components [76]. The MAPE decreased as Cph increased (Figure 12b); because Cph can be regarded as an indicator of phytoplankton biomass [77], our results indicate that the model performs better under high-biomass conditions. Notably, the model still worked well under lower Cph conditions (MAPE less than 35% at the species level and less than 15% at the community level); therefore, the model can be expected to remain feasible in Case 1 waters. The CDOM (in terms of ag(440)) had only a small effect on the model performance (Figure 12c), with a range of variation of less than 4%, possibly because the optical contribution of the CDOM to Rrs was relatively small. Together, these results indicate that our method gives reliable predictions in different water environments.
The effects of the spectral resolution and signal-to-noise ratio (SNR) on the transfer learning-based DNN at the species and community levels were analyzed together. Random noise in different proportions was added to both the simulated and in situ spectral data (at the species and community levels), with the SNR set to 100, 200, 500, 1000, and +∞. The simulated hyperspectral Rrs and in situ Rrs with 1 nm bandwidth were also resampled to different bandwidths, from 1 to 20 nm in 1 nm steps. We then preprocessed the data as described in Section 3.2.1 and retrained the transfer learning-based DNN for each combination of spectral resolution and SNR at the species and community levels. Afterward, each retrained DNN was used to predict phytoplankton species/community compositions under the corresponding spectral resolution and SNR conditions. The results of the accuracy evaluation are shown in Figure 13 (a, species level; b, community level). The MAPE increased with increasing bandwidth and decreased with increasing SNR. For bandwidths greater than 5 nm (species level) and greater than 14 nm (community level), the MAPE increased significantly. At a bandwidth of 5 nm at the species level, the MAPE was 29.61%, 31.46%, 34.56%, and 37.69% for SNRs of 1000, 500, 200, and 100, respectively; at a bandwidth of 14 nm at the community level, it was 16.24%, 22.59%, 31.19%, and 34.34%, respectively. Therefore, for future ocean color satellite sensors [78], bandwidths of less than 5 nm for species-level retrieval and less than 14 nm for community-level retrieval, together with a higher SNR, are recommended.
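The degradation step of this sensitivity experiment can be illustrated with a minimal sketch. It is not the authors' code: the noise model (zero-mean Gaussian noise with a per-band standard deviation of signal/SNR), the boxcar averaging used for resampling, the order of the two operations, and the synthetic spectrum are all assumptions made here for illustration.

```python
import math
import random


def add_noise(spectrum, snr, rng):
    """Return a noisy copy of a spectrum.

    Assumption: SNR means signal / noise standard deviation per band, and
    the noise is zero-mean Gaussian. snr=None is the noise-free (+inf) case.
    """
    if snr is None:
        return list(spectrum)
    return [v + rng.gauss(0.0, abs(v) / snr) for v in spectrum]


def resample(spectrum, bandwidth):
    """Average consecutive 1 nm bands into bins of `bandwidth` bands.

    Assumption: a flat (boxcar) band response; a trailing partial bin is
    averaged over the bands it actually contains.
    """
    bins = [spectrum[i:i + bandwidth] for i in range(0, len(spectrum), bandwidth)]
    return [sum(b) / len(b) for b in bins]


# Hypothetical smooth 1 nm spectrum from 400 to 700 nm (301 bands).
wavelengths = list(range(400, 701))
rrs = [0.005 + 0.003 * math.exp(-((w - 560) / 40.0) ** 2) for w in wavelengths]

SNR_LEVELS = (100, 200, 500, 1000, None)  # None stands for +infinity
BANDWIDTHS = range(1, 21)                 # 1 to 20 nm in 1 nm steps

rng = random.Random(0)
# One degraded spectrum per (SNR, bandwidth) combination.
degraded_grid = {
    (snr, bw): resample(add_noise(rrs, snr, rng), bw)
    for snr in SNR_LEVELS
    for bw in BANDWIDTHS
}
```

In the experiment described above, each such combination would then be preprocessed and used to retrain and evaluate the network; that part is omitted here.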

Potentials and Limitations
Our work involves an original domain (the simulation dataset) and a target domain (the in situ dataset); the two domains are similar but not identical, which satisfies the precondition for knowledge transfer [65]. We used mixtures of the mass-specific absorption coefficients of 11 cultured phytoplankton species as the input for the simulation dataset, but the predicted results in natural waters were not limited to these 11 species. In addition, the associated composition percentages also differed; in fact, about 140,000 nonlinear parameters had to be solved during the learning process for the general simulation dataset. Therefore, the general knowledge of phytoplankton species and associated composition was learned well. The optical properties of species cultivated in the laboratory and of those in natural waters were different, which was related to the associated abundance and growth conditions [79], but they shared similar spectral shapes. This similarity is a disadvantage for traditional methods that try to distinguish between them, but it is beneficial for general knowledge transfer, as in our study.

Following the transfer learning method, we first used the in situ dataset to update the last layers of the corresponding DNN, in which the species information transitions from general to specific as the layers deepen, and then obtained predictions of phytoplankton species and associated compositions in natural waters. The output of our method is dynamic: the dominant species and their number depend on the Rrs spectra and on the total percentage we set. For example, in the neural network model for the field measurements (NNTL; Section 4.2), there were 26 species, which together accounted for more than 90% of the cell abundance of all species; when the model was applied to the HICO imagery (Section 4.3), 12 species were retrieved, which accounted for 99% of the total cell abundance.
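The idea of keeping the early layers fixed and updating only the last layers can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the NNTL architecture: the network size, the tanh hidden layer, the random stand-in weights, the synthetic "in situ" samples, and all function names are hypothetical, and plain stochastic gradient descent is used in place of whatever optimizer the study employed.

```python
import math
import random


def forward_hidden(x, w_hidden):
    """Frozen early layer (stand-in for layers kept from the simulation-trained net)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]


def forward_output(h, w_out):
    """Trainable last layer: a plain linear map."""
    return [sum(w * hi for w, hi in zip(row, h)) for row in w_out]


def mse(data, w_hidden, w_out):
    """Mean squared error over all samples and outputs."""
    total, count = 0.0, 0
    for x, y in data:
        pred = forward_output(forward_hidden(x, w_hidden), w_out)
        total += sum((p - t) ** 2 for p, t in zip(pred, y))
        count += len(y)
    return total / count


def fine_tune(data, w_hidden, w_out, lr=0.1, epochs=300):
    """Update only the last layer by gradient descent; w_hidden is never modified."""
    w_out = [row[:] for row in w_out]  # work on a copy of the pre-trained weights
    for _ in range(epochs):
        for x, y in data:
            h = forward_hidden(x, w_hidden)
            pred = forward_output(h, w_out)
            for k, row in enumerate(w_out):
                err = pred[k] - y[k]
                for j in range(len(row)):
                    row[j] -= lr * err * h[j]  # gradient of 0.5 * err**2
    return w_out


rng = random.Random(0)
# Hypothetical weights standing in for a network trained on the simulation dataset.
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
w_out_sim = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]

# Synthetic stand-in for the small in situ dataset (targets are realizable
# by some last layer on top of the frozen features).
w_out_true = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
data = []
for _ in range(20):
    x = [rng.uniform(-1, 1) for _ in range(3)]
    data.append((x, forward_output(forward_hidden(x, w_hidden), w_out_true)))

w_hidden_before = [row[:] for row in w_hidden]
loss_before = mse(data, w_hidden, w_out_sim)
w_out_tuned = fine_tune(data, w_hidden, w_out_sim)
loss_after = mse(data, w_hidden, w_out_tuned)
```

Because only the last layer is trained, the number of parameters fitted to the small dataset is far smaller than in the full network, which is the reason the approach lowers the demand for in situ samples.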
We demonstrated the feasibility of the transfer learning approach for predicting phytoplankton species composition in natural waters, but further improvements could be made. First, because laboratory and in situ measurement environments differ [77], more effort should be placed on establishing a more complete simulation dataset, through the following steps: (1) more species must be cultivated in the laboratory for optical properties research; (2) the optical properties of mixed species must be taken into consideration; and (3) light, nutrition, and related growth parameters should be connected with optical properties and finally incorporated into the simulation dataset [80]. Second, many factors affect the signal transmission from natural waters to the satellite sensor, and much work remains to be done. For example, spectral classification is an effective method for deriving useful information [81], and accurate atmospheric correction is also an important research direction [82]; if the atmospheric correction can be done well and is combined with other advantages (sensor hardware, inversion algorithms, etc.), an accurate inversion of phytoplankton information from space is feasible. In addition, the inversion accuracy still needs to be improved. Although transfer learning can make up for the shortage of in situ data to some extent, the accuracy could be further improved by increasing the amount of data, especially in situ-measured data. We used semi-analytical models [54,55] to build the simulated Rrs dataset, instead of a more accurate radiative transfer numerical model (e.g., HydroLight) [83]. However, HydroLight is time-consuming: at an average of 30 s per spectrum, building the simulation dataset would take more than two months, whereas the semi-analytical method takes only a few hours.
As a previous study by Chen [58] has shown, when compared with in situ-measured data, semi-analytical models underestimate relative to HydroLight. Thus, the inversion accuracy is possibly lower than it would be with a simulation dataset built from a numerical model.
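The runtime argument for choosing the semi-analytical model is simple arithmetic, worked through below. The size of the simulation dataset is not stated in this passage, so the sketch only derives the number of spectra implied by the "more than two months" statement; treating two months as 60 days and assuming strictly serial runs are assumptions made here.

```python
SECONDS_PER_SPECTRUM = 30        # average HydroLight time quoted in the text
SECONDS_PER_DAY = 24 * 60 * 60
TWO_MONTHS_DAYS = 60             # assumption: two 30-day months, serial runs


def runtime_days(n_spectra, seconds_per_spectrum=SECONDS_PER_SPECTRUM):
    """Serial runtime in days for a given number of simulated spectra."""
    return n_spectra * seconds_per_spectrum / SECONDS_PER_DAY


# Number of spectra that would take exactly 60 days at 30 s each.
implied_min_spectra = TWO_MONTHS_DAYS * SECONDS_PER_DAY // SECONDS_PER_SPECTRUM

print(implied_min_spectra)                # 172800
print(runtime_days(implied_min_spectra))  # 60.0
```

Under these assumptions, a runtime of more than two months corresponds to a simulation dataset of more than about 170,000 spectra.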

Conclusions
In the present study, we presented a machine learning-based phytoplankton species composition inversion method using hyperspectral optical data. First, we trained a deep neural network (NNsim) on the simulated Rrs dataset; second, we adapted the NNsim into an in situ neural network (NNTL) by introducing an in situ Rrs dataset through transfer learning. Validation of the phytoplankton species composition predicted by the NNTL gave acceptable results across different ocean regions and cruise campaigns, with R2 = 0.88, MAPE = 26.08%, MAE = 3.38%, and RMSE = 4.4%.
Assuming that the types and optical properties of phytoplankton species in one region do not vary considerably, the NNTL can be applied to other hyperspectral measurements, such as those of the Hyperspectral Imager for the Coastal Ocean (HICO). The HICO-predicted results indicated that the compositions of different phytoplankton species varied considerably between and within species, with uneven spatial distributions. For the Changjiang Estuary, P. dentatum accounted for the largest average composition.
The performance of the transfer learning-based DNN at the species and community levels was compared, and the inversion accuracy at the community level was better than that at the species level (17.03% versus 43.62% in the randomly selected 10% of the simulation dataset, 1.74% versus 26.08% in the validation dataset). Sensitivity analyses indicated that the MAPE increased as the SPM concentration increased and as the chlorophyll a concentration decreased. The signal-to-noise ratio (SNR) and bandwidth also affected the prediction accuracy. For acceptable accuracy, bandwidths smaller than 5 nm at the species level and smaller than 14 nm at the community level, together with a higher SNR, are suggested.
Because the physics underlying the three datasets used in this article (the simulation dataset, the in situ-measured dataset, and the HICO data) differs, we discussed ways to reduce the differences among them, such as building a more accurate and complete simulation dataset and improving atmospheric correction. Moreover, we used only in situ data collected from five cruises in spring and summer, and the types and optical properties of phytoplankton species may change in other seasons. To make the method more widely applicable, future work should include collecting more in situ data. Although the HICO ended operation in 2014, successive hyperspectral satellite missions (e.g., Chinese GF-5 (launched in 2018), scheduled NASA PACE and German EnMAP, etc.) may provide more opportunities for matchups between satellite and in situ data for validation in the future.