## 1. Introduction

New Caledonia is a South Pacific archipelago located between longitudes 162° and 169° and latitudes −23° and −19°. The New Caledonian lagoon, which extends over 24,000 km^{2}, contains one of the most extensive reef systems in the world. These systems exhibit exceptional diversity of coral and fish species and a continuum of habitats from mangroves to sea grasses [1]. UNESCO added the New Caledonia Barrier Reef to the World Heritage List on 7 July 2008 [2], emphasizing the importance of preserving such biodiversity sites.

However, this fragile environment is subject to both anthropogenic and environmental stresses. Nickel mining is the major sector of the economy in New Caledonia. The various islands contain about 10% of the world’s nickel reserves [3]. The massive erosion caused by the mineral extraction process produces mineral matter inputs into the coastal ecosystem, particularly during rainy events [4]. These coastal inputs sometimes lead to fish death, coral bleaching, etc. [5]. Thomas et al. quantified farm discharges and evaluated their impacts in the lagoon of New Caledonia [6], where chlorophyll-a concentration ([chl-a]) is generally lower than 1.2 µg·L^{−1} except in bays subject to anthropogenic influences, where [chl-a] may increase to 3.6 µg·L^{−1} [7].

Acanthaster planci (a coral-eating starfish) proliferation is probably due to algae proliferation, which is itself due to increased anthropogenic inputs. A recent study also highlighted a link between Acanthaster outbreaks and ocean productivity, favored by wind-driven increases in upwelling [8].

Such increases in chlorophyll (up to 2 µg·L^{−1} observed in the South Western lagoon) are either linked to rain [9,10] or to other processes such as upwelling or tides [11,12], which were recently modelled [13,14]. Climate change is also a factor of stress for reefs and lagoon ecosystems. Increases in ocean temperature and acidity, overexposure to sunlight, and decreases in salinity affect the rate at which lagoons lose or gain water from evaporation, precipitation, surface runoff, and exchange with the ocean, and therefore water quantity and quality. Disturbances and other stressors may act concomitantly, or even interact, at multiple spatial and temporal scales, with consequences already documented or expected for the physical structure, ecological properties, and social values associated with lagoons. Many coral reefs in the world already suffer from climate and anthropogenic changes. Since 1985, the Great Barrier Reef in Australia has lost more than half of its coral meadows [5]. Coral bleaching events happened in 1998 and 2002: more than 60% of the coral populations were hit, and even though the situation improved after several weeks, about 10% of the population perished [5,15,16]. To avoid or to monitor such events, it is necessary to accurately assess water properties in terms of chemical, biogeochemical and thermal characteristics, among which is chlorophyll concentration, an indicator of phytoplankton biomass.

Empirical algorithms have been developed in order to predict [chl-a] from ocean color seen by satellites [17,18]. The NASA OC3 algorithm uses a relation between [chl-a] and logarithms of R_{rs} ratios. The relation used for MODIS imagery is a polynomial function of the logarithm of the larger of two R_{rs} ratios, R_{rs}(443)/R_{rs}(555) and R_{rs}(488)/R_{rs}(555). This algorithm is valid in oceanic waters where a change in [chl-a] mainly causes a shift in the blue to green water reflectance ratio [19]. By using a color index defined as the difference between R_{rs} in the green and a reference formed linearly between R_{rs} in the blue and in the red, [20] improved OC3 assessments in the global ocean where [chl-a] is less than or equal to 0.25 µg·L^{−1}. In coastal waters, the ratios used in these algorithms can vary due to the influence of other optical components besides [chl-a] (colored dissolved organic matter, CDOM, and suspended sediments), which may introduce large errors in [chl-a] retrievals from satellite data [21,22]. Recently, a round-robin exercise for MERIS was conducted for some coastal areas of the world including the Great Barrier Reef, giving clues for discrepancies noticed in optically complex waters [23]. In coastal waters with high [chl-a] (10 µg·L^{−1}), algorithms based on regressions between MODIS reflectance ratios and [chl-a] improved the [chl-a] estimates [24]. Supervised learning was also used to develop an algorithm adapted to a coastal eutrophic region (between 5 and up to 50 µg·L^{−1} chl-a) [25,26]. Camps-Valls et al. [27] showed how to improve the use of Support Vector Regression (SVR) or Relevance Vector Machine (RVM) [28] to estimate oceanic [chl-a] from remote sensing.
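To make the band-ratio idea concrete, the following minimal Python sketch evaluates an OC3-type polynomial. The function name `oc3_chl` is hypothetical, and the default coefficients are assumed to be the values commonly published for the MODIS OC3M fit; they should be checked against the NASA OBPG documentation before any real use.

```python
import math

# Assumed OC3M (MODIS) coefficients a0..a4; verify against NASA OBPG documentation.
OC3M_COEFFS = (0.2424, -2.7423, 1.8017, 0.0015, -1.2280)


def oc3_chl(rrs443, rrs488, rrs555, coeffs=OC3M_COEFFS):
    """Return [chl-a] (µg·L^-1) from an OC3-type maximum band ratio polynomial."""
    # Use the larger of the two blue-to-green reflectance ratios.
    ratio = max(rrs443, rrs488) / rrs555
    x = math.log10(ratio)
    # A fourth-order polynomial in log10(ratio) gives log10([chl-a]).
    log_chl = sum(a * x ** k for k, a in enumerate(coeffs))
    return 10.0 ** log_chl
```

With these coefficients, a larger blue-to-green ratio (bluer water) yields a lower [chl-a] estimate, which is the behavior expected in oceanic waters.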

Past inter-comparison studies [29,30] reported negative or positive biases in [chl-a] retrievals with OC3 applied in lagoon imagery of New Caledonia, leading us to investigate a new algorithm. Average depth is 25 m in the lagoon and waters are principally oligotrophic, with weak [chl-a] and turbidity [29,31]. The changing bathymetry, very variable bottom types, and the oligotrophic and shallow nature of these waters are sources of errors made with OC3 applied to MODIS data [29]. In such shallow waters, R_{rs} not only depends on the absorption and scattering properties of dissolved and suspended materials in the water column but also on the depth and reflectivity of the bottom [32]. Recently, an inversion method was developed for the lagoon waters of New Caledonia [33]. A recent study has shown that depth strongly influences [chl-a] assessments [34]. Moreover, the influence of attenuation on seabed reflectance and exact bathymetry retrievals has been defined for the New Caledonia lagoon [35]. Thus, NASA products based on OC3 are not adapted to New Caledonia coastal waters.

Another [chl-a] algorithm, OC5 [36], was tested for New Caledonian lagoon waters. However, OC5 was especially designed for the very turbid waters of the shallow Brittany coast or for Tunisian waters [37] with high [chl-a]. This means it is not well-suited for the oligotrophic waters of New Caledonia. Moreover, OC5 sets a minimum [chl-a] value of 0.1 µg·L^{−1}, but lower values are sometimes encountered in the New Caledonia lagoon [38] (see also below).

A preliminary study [39], inspired by encouraging results described in [25,26,27,28], showed that a statistical approach could improve [chl-a] retrievals in coastal New Caledonia waters. Indeed, when using a statistical approach, we expect to take into account particularities of optical properties in the study region. Moreover, even though atmospheric correction algorithms are generally not accurate for coastal applications and could lead to large [chl-a] errors, a statistical approach could overcome such problems for our specific area. In this paper, we show how to design a semi-empirical algorithm for estimating [chl-a] in the New Caledonian lagoon from in situ data collected in the region in coincidence with MODIS data from 2002 to 2010 [38]. The resulting algorithm is compared with the NASA OC3 version 6. No comparison was done with the reflectance difference algorithm [20] because our focus is on lagoon waters. The addition of variables such as bathymetry [32,34] and coastal distance [7] is also tested in order to investigate their ability to improve [chl-a] retrievals.

## 2. Material and Methods

#### 2.1. Data

Two databases are used in this study: world data from the SeaWiFS Bio-optical Archive and Storage System (SeaBASS [40]) and data collected in the New Caledonia area (NCDataBase). Each database contains in situ and MODIS R_{rs} values in several spectral bands: those centered on 412 nm, 443 nm, 488 nm, 531 nm, 555 nm, and 667 nm for NCDataBase [29,31,38], with 547 nm instead of 555 nm for SeaBASS [40,41]. All MODIS R_{rs} over New Caledonia in the NCDataBase were extracted from 2002 to 2010 [42]. When the two databases are merged into what we call the Full DataBase (FDB), it is assumed that the 547 and 555 nm spectral bands give an equivalent signal, i.e., they are considered one category [20]. The NCDataBase contains bathymetry (in meters) and in situ [chl-a] (µg·L^{−1}) measured by fluorometry and spectrofluorometry [29,43]. Water samples were collected with a Niskin bottle at 2 m depth. SeaBASS contains in situ [chl-a] obtained by fluorometry and HPLC, but for consistency only fluorometric measurements were used, and bathymetry was extracted from each latitude-longitude of the measurements.

Figure 1a,b display the distribution of satellite and in situ [chl-a] in SeaBASS and NCDataBase, respectively. The SeaBASS data distribution is bi-modal, with separation at about 3 µg·L^{−1}. The two methods for satellite assessments (closest and weighted mean) are introduced in Section 2.2. When constructing the NCDataBase [43], all field measurements of [chl-a] by fluorometry collected from 1997 to 2010 during more than ten campaigns, mainly in the Southern lagoon [26], were selected with coincident MODIS R_{rs}. The full area extends from 165.95° to 168.65°E and from 24° to 19.99°S [38]. Figure 2 and Figure 3 display measurement stations and Table 1 gives information about dates and campaigns used for the NCDataBase. Data were collected during each seasonal period for 13 years, which ensures a large range of situations. As several years and all seasons are sampled, we expect no bias due to El Niño or La Niña events or to seasonal variations.

**Figure 1.**
(**a**) SeaBASS [chl-a] histogram (from the SeaBASS website): green line for in situ measurements and blue line for MODIS Aqua assessments; (**b**) NCDataBase [chl-a] histogram: green line for in situ measurements, cyan line for “OC3-Closest pixel” assessment and blue line for “OC3-Weight Mean” assessment.

**Table 1.**
Sea campaigns from 1997 to 2010 in New Caledonia in [43].

| Sea Campaign | Dates | Study Area |
|---|---|---|
| **Camelia and Camecal (1–9)** | 21 Oct. 1997 to 27 Jun. 2003 | South-West lagoon |
| **Diapalis (1–9)** | 13 Oct. 2001 to 15 Oct. 2003 | Loyalty Channel/Ouinné lagoon |
| **Topaze 1–29** | 26 Apr. 2001 to 26 Jan. 2004 | South-West lagoon |
| **Transects 1** | 1 Apr. 2003 to 10 Apr. 2003 | South-West lagoon |
| **Timeseries** | 12 Dec. 2001 to 22 Apr. 2003 | South-West lagoon |
| **Transects 2** | 4 May 2002 to 29 Feb. 2004 | South-West lagoon |
| **Southern and Northern 1 transect** | 21 Jun. 2003 to 7 Aug. 2003 | South-West lagoon |
| **Southern and Northern 2 transect** | 9 Nov. 2004 to 9 Dec. 2004 | South-West lagoon |
| **Bissecote** | 1 Feb. 2006 to 14 Feb. 2006 | South-West lagoon |
| **Echolag** | 14 Feb. 2007 to 5 Mar. 2007 | Great South Lagoon |
| **Zonalis** | 2 Mar. 2008 to 14 Mar. 2008 | South of New Caledonia |
| **Valhybio** | 22 Mar. 2008 to 8 Apr. 2008 | LSO and GLS, offshore stations |
| **ValhybioSM** | 27 Apr. 2008 to 21 Jul. 2010 | Lagoon and offshore OC1 station |

Table 2 describes the content of the two databases in terms of [chl-a] values. In the NCDataBase (811 coincidences), we distinguish oceanic waters (bathymetry > 70 m, 159 coincidences), for which the bottom does not affect the water color; the lagoon’s deep waters (20 m ≤ bathymetry ≤ 70 m, 352 coincidences), for which the bottom has a priori little influence on the water color; and the lagoon’s shallow waters (bathymetry < 20 m, 300 coincidences), for which the bottom may strongly affect the water color [29]. Similarly, we distinguish waters according to bathymetry for SeaBASS, even if the influence of bathymetry is probably not equivalent in world data and in the NCDataBase (see Section 4.1). However, thereafter and especially during the construction of our algorithm, no distinction is made based on the depth of the station because the bathymetry was not found to be a good explanatory variable (see Section 4.3). Furthermore, we distinguish data according to [chl-a] values, since we will treat high values (>3 µg·L^{−1}) and low values (≤3 µg·L^{−1}) separately; see Section 2.3.
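The stratification used in Table 2 can be sketched in a few lines of Python. The function names `water_class` and `chl_group` are hypothetical, and assigning a station at exactly 20 m to the deep lagoon class is an assumption made to match the "<20 m" row label of Table 2.

```python
def water_class(bathymetry_m):
    """Classify a station by bottom depth (m), following the Table 2 classes."""
    if bathymetry_m > 70:
        return "oceanic"
    if bathymetry_m >= 20:
        return "deep lagoon"
    return "shallow lagoon"


def chl_group(chl, threshold=3.0):
    """Split [chl-a] (µg·L^-1) into the low (<= 3) and high (> 3) groups."""
    return "high" if chl > threshold else "low"
```

For example, a 25 m station with 0.4 µg·L^{−1} falls in the deep lagoon, low [chl-a] group.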

**Figure 2.**
Visited stations in New Caledonia.

**Figure 3.**
Visited stations in the south-west lagoon.

**Table 2.**
Database description.

| Database | N | Min [chl-a] (µg·L^{−1}) | Max [chl-a] (µg·L^{−1}) | [chl-a] ≤ 3 µg·L^{−1} (%) | [chl-a] > 3 µg·L^{−1} (%) |
|---|---|---|---|---|---|
| **FDB (NCDataBase + SeaBASS)** | 1378 | 0.03 | 38.07 | 81.28 | 18.72 |
| **NCDataBase (<20 m)** | 300 | 0.08 | 2.71 | 100 | 0 |
| **NCDataBase (20 m ≤ bathy ≤ 70 m)** | 352 | 0.11 | 3.70 | 99.53 | 0.47 |
| **NCDataBase (>70 m)** | 159 | 0.08 | 1.05 | 100 | 0 |
| **NCDataBase (total)** | 811 | 0.08 | 3.70 | 99.75 | 0.25 |
| **SeaBASS (<20 m)** | 262 | 0.37 | 38.07 | 13.36 | 86.64 |
| **SeaBASS (20 m ≤ bathy ≤ 70 m)** | 20 | 0.22 | 6.26 | 75.00 | 25.00 |
| **SeaBASS (>70 m)** | 285 | 0.03 | 13.18 | 91.58 | 8.42 |
| **SeaBASS (total)** | 567 | 0.03 | 38.07 | 54.85 | 45.15 |

#### 2.2. Match-Up

The in situ [chl-a] and R_{rs} data were matched with MODIS Aqua standard retrievals for NCDataBase at original resolution (1-km, non-gridded data) [41], as provided by the NASA Ocean Biology Processing Group (OBPG). The atmospheric correction scheme took into account non-black pixels in the near infrared, but no adjacency effects. SeaDAS flags were applied to the satellite data to eliminate situations with sun glint, large viewing zenith angle, high water turbidity, clouds, land, high top-of-atmosphere radiance, and stray light [42]. To assign a value to a station on a day, two methods were used. The first method consists of assigning the value of the closest pixel: the closest neighbor method (CL) [38,41]. The second method consists of averaging the values from neighboring pixels, using weights depending on the distance to the station: the weighted mean method (WMM) [38,41]. This was done for the spectral bands centered on 412, 443, 488, 531, 555 and 667 nm.
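The two assignment methods can be illustrated with a minimal Python sketch. The function names are hypothetical, and two choices are assumptions made only for illustration: the weights are taken as inverse distances (the text states only that they depend on distance), and distances are computed as planar distances in degrees, which is a simplification that is reasonable only for a small neighborhood around the station.

```python
import math


def _dist(station, pixel):
    # Planar distance in degrees (assumed adequate for a small neighborhood).
    return math.hypot(station[0] - pixel[0], station[1] - pixel[1])


def closest_pixel(station, pixels):
    """Closest neighbor (CL): value of the pixel nearest to the station.

    station is (lat, lon); pixels is a list of (lat, lon, value) tuples.
    """
    return min(pixels, key=lambda p: _dist(station, p))[2]


def weighted_mean(station, pixels, eps=1e-6):
    """Weighted mean (WMM): inverse-distance weighted average of pixel values."""
    # eps avoids a division by zero when a pixel sits exactly on the station.
    weights = [1.0 / (_dist(station, p) + eps) for p in pixels]
    total = sum(weights)
    return sum(w * p[2] for w, p in zip(weights, pixels)) / total
```

The weighted mean smooths pixel-level noise at the cost of mixing neighboring waters, whereas the closest pixel keeps a single, possibly noisy, retrieval.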

The match-ups from MODIS Aqua images were created using a 0.04° square (about 4 × 4 km²) centered on the visited station as in [41] and in a 5-day temporal window. The two aforementioned methods were compared. They were applied with a temporal window ranging from 0 to 5 days. Several indices were computed, namely the variation coefficient (VC), the normalized mean bias (NMB), the mean normalized bias (MNB), and the root mean square error (RMSE).

In these indices, $n$ is the number of observations, ${x}_{i}$ is the ${i}^{th}$ observation of the in situ parameter, ${y}_{i}$ is the ${i}^{th}$ observation of the remote sensing parameter, $\overline{x}$ is the in situ parameter mean, $\overline{y}$ is the mean of the remote sensing measures and $\sigma$ is the standard deviation of the remote sensing measures.
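The equations for these four indices are not reproduced in this text. As a reading aid, the block below gives the forms these indices conventionally take, written with the symbols defined above. The exact normalizations are an assumption; only the numbering of RMSE as Equation (4) is confirmed by Section 2.4.

$$
\begin{aligned}
\mathrm{VC} &= \frac{\sigma}{\overline{y}} && (1)\\
\mathrm{NMB} &= \frac{\sum_{i=1}^{n}\left({y}_{i}-{x}_{i}\right)}{\sum_{i=1}^{n}{x}_{i}} = \frac{\overline{y}-\overline{x}}{\overline{x}} && (2)\\
\mathrm{MNB} &= \frac{1}{n}\sum_{i=1}^{n}\frac{{y}_{i}-{x}_{i}}{{x}_{i}} && (3)\\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({y}_{i}-{x}_{i}\right)}^{2}} && (4)
\end{aligned}
$$

Under these conventions, negative NMB and MNB values indicate that the satellite retrievals are on average lower than the in situ measurements.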

Table 3 displays the comparison statistics for MODIS and in situ R_{rs} matched data. It highlights that RMSE is not affected by the temporal window, with a difference lower than 0.001 sr^{−1} both between 0-day CL and 5-day CL, and between 0-day WMM and 5-day WMM. The VC values are very close too (0.358 for 0-day WMM and 0.355 for 5-day WMM). Moreover, NMB and MNB are better with a 5-day window (from −0.266 for 0-day WMM to −0.204 for 5-day WMM for NMB, and from −0.171 for 0-day WMM to −0.104 for 5-day WMM for MNB). Thus, using a 5-day temporal window does not much affect the accuracy of R_{rs}(443) retrievals, and the WMM [29,38,41,43] provides the best performance.

Figure 4 and Figure 5 display error densities computed with the different in situ measurements and remote sensing assessments of R_{rs}(443) for the two methods. “Error densities” enable detection of whether an algorithm tends to overestimate, and whether errors are balanced or distributed around 0. They highlight that errors made with a 5-day temporal window are not much larger than errors made with a narrower window. This is explained by the fact that algorithm errors are similar to those introduced by temporal variability over a few days (see also Section 4.4). Moreover, in more than 86% of the match-ups in our full dataset, the temporal window is lower than or equal to 2 days. In order to keep a maximum of coincidences, we used the 5-day temporal window with the WMM. Since the weighted mean method is more efficient, R_{rs} values were determined using this second method in our NCDataBase to investigate appropriate [chl-a] algorithms for the region.

**Table 3.**
Different methods for generating R_{rs}(443) (sr^{−1}) match-ups in New Caledonian waters. Min, Max, Mean, Median and RMSE are given in sr^{−1}.

| Methods | n | Min | Max | Mean | Median | VC | NMB | MNB | RMSE |
|---|---|---|---|---|---|---|---|---|---|
| 0-day CL | 397 | 0.0000 | 0.0281 | 0.0062 | 0.0057 | 0.4298 | −0.2700 | −0.1633 | 0.0047 |
| 0-day WMM | 397 | 0.0003 | 0.0213 | 0.0062 | 0.0058 | 0.3584 | −0.2660 | −0.1705 | 0.0044 |
| 1-day CL | 752 | 0.0000 | 0.0281 | 0.0063 | 0.0059 | 0.3990 | −0.2495 | −0.1592 | 0.0047 |
| 1-day WMM | 752 | 0.0003 | 0.0213 | 0.0065 | 0.0062 | 0.3452 | −0.2311 | −0.1454 | 0.0044 |
| 5-day CL | 986 | 0.0000 | 0.0281 | 0.0062 | 0.0058 | 0.4096 | −0.2289 | −0.1274 | 0.0045 |
| 5-day WMM | 986 | 0.0003 | 0.0213 | 0.0064 | 0.0061 | 0.3550 | −0.2044 | −0.1036 | 0.0042 |

**Figure 4.**
Error densities between in situ measurements and satellite assessments for R_{rs}(443) at the same day (D0), and from a 1-day temporal (D1) window to a 5-day temporal window (D5). Closest neighbor method.

**Figure 5.**
Error densities between in situ measurements and satellite assessments for R_{rs}(443) at the same day (D0), and from a 1-day temporal window (D1) to a 5-day temporal window (D5). Weighted mean method.

#### 2.3. Algorithm Steps

Our goal was to find an algorithm allowing good [chl-a] assessments in the lagoon of New Caledonia from the ocean color imagery acquired by MODIS. When creating the models, the explanatory variables for assessing in situ [chl-a] are satellite R_{rs}. This approach differs from the NASA OC* algorithms, which use in situ R_{rs} as explanatory variables. The statistical study was conducted without a priori knowledge, i.e., all potentially explanatory variables (R_{rs} in the various spectral bands) were taken into account.

As indicated in Table 2, there are few data with high [chl-a] (>3 µg·L^{−1}) in the NCDataBase. As a result, the algorithm built from the NCDataBase will not be adapted to cases where the [chl-a] is high. The steps to get an algorithm adapted to New Caledonia are the following: (1) using the NCDataBase, determine a model for low [chl-a] (AFLC), i.e., a well-suited model for waters having low [chl-a]; (2) using the SeaBASS database, determine a model for waters with high [chl-a] (AFHC); (3) using the two merged databases, determine a criterion to distinguish low and high [chl-a]; and (4) implement a continuous connection between the models for low and high [chl-a].

Step 1 consists of determining which variables can give a good [chl-a] estimate. As variables are generally not independent, the support vector machine (SVM) method was used to select the best set of explanatory variables [26,28,44]. This kernel method finds the best regression through optimality criteria even if it means increasing the dimension of the variable space. Note that choosing SVM parameters is easier than with a neural network, for which the architecture can be very complex and hard to interpret. A bootstrap with fifty random draws was performed to determine the best parameters. On each draw, each combination of the explanatory variables was used to create and test a model. The number of all the combinations with six variables (R_{rs}(412), R_{rs}(443), R_{rs}(488), R_{rs}(531), R_{rs}(555) and R_{rs}(667)) is 63 ($\sum_{i=1}^{6}\binom{6}{i}=63$). When a model formed with many variables gave results equivalent to a model formed with fewer explanatory variables, the model with fewer variables was chosen. For each of these 63 models, 50 RMSE values, one per sample, were computed. Results were compared by calculating averages, confidence intervals of RMSE averages, and by testing the equality of means. As the computed averages did not follow a normal distribution, the Kruskal-Wallis test was applied to compare them. For the SeaBASS and NCDataBase combined, the best results were obtained with R_{rs}(443), R_{rs}(488) and R_{rs}(531). Once the best predictors were known, relations, such as a linear or a log regression, between [chl-a] and predictors and ratios of predictors were sought, with a method similar to the previous one: using bootstrap with 50 draws. When results on test samples were statistically equivalent between the best SVM and a simpler relation, the simpler relation was selected.
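The exhaustive search over band subsets described in Step 1 can be sketched as follows. This is a simplified illustration, not the procedure actually used: `score` is a hypothetical callback standing in for training and testing an SVM on one random draw, and the statistical equivalence test is replaced by a plain comparison of mean RMSE with ties broken in favor of fewer bands.

```python
from itertools import combinations
import random

BANDS = (412, 443, 488, 531, 555, 667)  # candidate R_rs wavelengths (nm)


def band_subsets(bands=BANDS):
    """Yield every non-empty subset of the candidate R_rs bands."""
    for size in range(1, len(bands) + 1):
        yield from combinations(bands, size)


def select_subset(score, n_draws=50, seed=0):
    """Return the subset with the lowest mean test RMSE over n_draws draws.

    score(subset, rng) is a hypothetical callback that trains a model on a
    random learning sample and returns its RMSE on the matching test sample.
    Ties are broken in favor of the subset with fewer bands.
    """
    rng = random.Random(seed)
    results = []
    for subset in band_subsets():
        mean_rmse = sum(score(subset, rng) for _ in range(n_draws)) / n_draws
        results.append((mean_rmse, len(subset), subset))
    # Tuples compare by mean RMSE first, then by number of bands.
    return min(results)[2]
```

Enumerating `band_subsets()` yields the 63 combinations mentioned in the text, so a full run with 50 draws corresponds to 3150 model fits.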

In Step 2, only data with high [chl-a] were kept to build a specific model for [chl-a] greater than 3 µg·L^{−1}. An SVM model was built with a method similar to that of Step 1. The predictive variables are the R_{rs} in the five spectral bands centered on 412, 443, 488, 531, and 555 nm. This SVM model was compared to OC3, and the better of the two was chosen to complete the algorithm for high [chl-a].

Step 3 consists of determining from MODIS R_{rs} if the [chl-a] is high or low. In this step, two methods were tested to determine what MODIS color ranges are linked to a high or a low [chl-a]: SVM (as a classifier) and decision tree. As explained in more detail later (Section 4.1), the decision tree was preferred to the SVM because of its practicality. Indeed, only the ratio ${R}_{rs}\left(488\right)/{R}_{rs}\left(555\right)$ is used to determine which group of [chl-a] should be linked to a MODIS color.

For Step 4, several kinds of continuous connections, with weight functions, between the AFHC and the AFLC were tried: linear, quadratic, square root, logarithmic, exponential, and arctangent. Equations (5.1)–(5.4) describe some weight functions, with $s$ the threshold determining the limit between high and low [chl-a], $\epsilon \in \left]0;s\right[$ the tolerance used to set the transition interval width, $a=s-\epsilon $ the lower bound of the transition interval, $b=s+\epsilon $ the upper bound of the transition interval, and $x$ the variable which represents the ratio ${R}_{rs}\left(488\right)/{R}_{rs}\left(555\right)$.

Given a value of the ratio ${R}_{rs}\left(488\right)/{R}_{rs}\left(555\right)$, the weight function $f$ is applied to the value determined by the AFLC algorithm, and the weight function $1-f$ is applied to the value determined by the AFHC (SVM or OC3).
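As an illustration of this blending, the sketch below implements a linear weight function in Python. The function names are hypothetical and the linear form is only one plausible instance of the connections listed above, not a transcription of Equations (5.1)–(5.4). It assumes that high ratios (bluer water) correspond to low [chl-a], so that $f$ goes from 0 below $a$ to 1 above $b$. The default threshold of 0.76 is the value reported in the Conclusions, while the tolerance of 0.05 is purely illustrative.

```python
def linear_weight(x, s, eps):
    """Weight f(x) given to the low-[chl-a] model (AFLC).

    x is the ratio R_rs(488)/R_rs(555), s is the threshold and eps sets the
    half-width of the transition interval [a, b] = [s - eps, s + eps].
    Assumes high ratios (bluer water) correspond to low [chl-a].
    """
    a, b = s - eps, s + eps
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def blended_chl(x, chl_low, chl_high, s=0.76, eps=0.05):
    """Combine the AFLC and AFHC estimates as f * AFLC + (1 - f) * AFHC."""
    f = linear_weight(x, s, eps)
    return f * chl_low + (1.0 - f) * chl_high
```

Outside the transition interval the result reduces to a single model, and at the threshold itself both models contribute equally, which avoids a discontinuity in the retrieved [chl-a] field.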

We also tested a general SVM (SVMg) from the merged NCDataBase and SeaBASS database, built without differentiating between the two [chl-a] groups. For this SVMg construction, we used the bootstrap method described before (selection with 50 random draws). Explanatory variables belonging to the model were selected with learning and test samples. The model with the lowest RMSE on test samples was retained. In this SVM, the kernel is the radial basis kernel and the predictors are the R_{rs} in the bands centered on 443, 531 and 555 nm. The SVMg and the “AFLC + AFHC” algorithms were also compared with OC3.

#### 2.4. Statistical Tests

In order to verify the effectiveness of an algorithm without an overtraining effect, data were systematically divided into two samples: one learning sample to build the model, and one test sample on which the built model was applied and checked with indicators (specified below). The learning sample was constructed with 70% of the data and the test sample was formed with the remaining 30%. To maintain the proportions between high and low [chl-a] in each sample when the NCDataBase and SeaBASS database were merged, samples were obtained with “semi-random draws”, i.e., the dataset was partitioned into two groups (high and low [chl-a]) and then a random draw was made for each group. For model comparisons, we essentially used RMSE (Equation (4)). In order not to rely on a single indicator, we also calculated the correlation coefficient between the values given by algorithms and the measured values, which provided a measure of “the link between two random variables” [31].
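The “semi-random draw” amounts to a stratified split, which can be sketched in Python as follows. The function name `semi_random_split` and the `is_high` callback are hypothetical, and rounding the 70% cut to the nearest integer within each group is an assumption.

```python
import random


def semi_random_split(samples, is_high, train_frac=0.7, seed=0):
    """Split samples into learning and test sets, stratified by [chl-a] group.

    samples is a list of records; is_high(record) returns True for the
    high-[chl-a] group. Each group is shuffled and split separately so both
    sets keep roughly the same high/low proportions.
    """
    rng = random.Random(seed)
    learn, test = [], []
    for flag in (False, True):
        group = [s for s in samples if bool(is_high(s)) == flag]
        rng.shuffle(group)
        cut = round(train_frac * len(group))
        learn.extend(group[:cut])
        test.extend(group[cut:])
    return learn, test
```

With 80 low and 20 high values, for instance, the learning sample receives 56 low and 14 high records, so the 80/20 proportion is preserved in both samples.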

## 5. Conclusions

In this paper, we have introduced an algorithm for estimating [chl-a] from satellite-derived R_{rs} without a priori information, based solely on statistical considerations. Through this approach, we have obtained a suitable algorithm for optically complex waters of New Caledonia. The bottom influence in the lagoon is smaller than with OC3. The main improvement is obtained for waters with [chl-a] less than 3 µg·L^{−1}, with an RMSE 30% lower on average than with OC3 in New Caledonian lagoon waters. We have also shown satisfactory results for both world data and New Caledonia data.

It is notable, but not surprising, that the best explanatory variables from the SVM regression analysis are R_{rs} corresponding to wavelengths of blue and green light. For the data sets considered, the best wavelengths are 443, 488, and 531 nm. To classify a pixel in the group of high or low [chl-a], it is sufficient to use a threshold on the ratio of R_{rs} in the blue (488 nm) and green (555 nm), here 0.76, to separate waters with [chl-a] below and above 3 µg·L^{−1}. This algorithm is sensor-dependent, but it has been constructed and checked with around 1400 match-ups from two different data sources. The risks of overtraining are very low and it is therefore possible to apply this algorithm at least to MODIS data. Tests should be performed to extend this algorithm to other sensors, and coefficients should be adjusted accordingly.

A great deal of work is ongoing concerning atmospheric correction in coastal, optically complex waters. This is essential to obtain satisfactory match-ups. A major step will be made when much better agreement is obtained with in situ R_{rs} measurements. The [chl-a] algorithms will then provide more accurate results, allowing more efficient evaluation of the impact of environmental stress factors on lagoon ecosystems, especially coral reefs. Stress factors affect coral health both with intensity and time, hence the interest in having a continuous monitoring of water properties over large areas, which is only possible thanks to satellite data.