Long-Term Spatiotemporal Variability of Whitings in Lake Geneva from Multispectral Remote Sensing and Machine Learning

Abstract: Whiting events are massive calcite precipitation events that turn the waters of hardwater lakes a milky turquoise color. Herein, we use a multispectral remote sensing approach to describe the spatial and temporal occurrences of whitings in Lake Geneva from 2013 to 2021. Landsat-8, Sentinel-2, and Sentinel-3 sensors are combined to derive the AreaBGR index and identify whitings using appropriate filters. Of the detected whitings, 95% are located in the northeastern part of the lake and occur in a highly reproducible environmental setting. An extended time series of whitings over the last 60 years is reconstructed with a random forest algorithm and analyzed through a Bayesian decomposition for annual and seasonal trends. The annual number of whiting days between 1958 and 2021 does not follow any monotonic trend. The inter-annual changes of whiting occurrences correlate significantly with the Western Mediterranean Oscillation Index. Spring whitings have increased since 2000 and significantly follow the Atlantic Multidecadal Oscillation index. Future climate change in the Mediterranean Sea and the Atlantic Ocean could induce more variable and earlier whiting events in Lake Geneva.


Introduction
Calcium carbonate precipitation is an essential biogeochemical process in freshwater and marine ecosystems [1,2]. In hardwater lakes, calcite precipitation represents a major component of the inorganic carbon cycle. Calcite precipitation also interferes with lake nutrient cycles owing to its complexation with phosphates [3]. Calcite precipitation is a seasonal process that can occur discreetly at a low background level. However, under favorable conditions, it can also manifest more strikingly through massive, short-lived events, so-called whiting events. Whiting events are common phenomena in marine environments [4-7] and lakes [8-11]. Whitings are characterized by a milky turquoise coloration of the upper surface layers, generated by fine-grained calcite precipitates that increase the turbidity of the water column and its light reflectance [12].
The supersaturation of surface waters with respect to calcite is a necessary but insufficient prerequisite for mineral precipitation and thus whiting events. Calcite supersaturation can be reached through a shift in carbonate equilibria induced by an increase in pH or CO₂ removal [3], along with higher water temperatures, which decrease the (retrograde) solubility of calcite [13]. However, homogeneous nucleation requires overcoming an activation energy barrier, and hence a saturation level far above the strict supersaturation threshold. Massive events such as whitings therefore require adequate nucleation seeds for heterogeneous precipitation in the water column [14]. In hardwater lakes, whiting events have mainly been associated with phytoplankton activity. For instance, picoplankton growth can create the pH and CO₂ conditions required for supersaturation, while the cells can act as heterogeneous nuclei [10,15]. Once supersaturation is reached, river-borne detrital particles can also trigger nucleation [11,16,17]. Altogether, these observations show that warmer surface temperatures, enhanced primary production, and fine suspended sediments can all contribute to whiting events, even though their interplay may vary from one lacustrine system to another. Moreover, whiting events are likely regulated by a broader combination of climatic and trophic factors that are both dynamic in time. Therefore, determining the long-term evolution of whiting event occurrences in relation to global change impacts on environmental factors (e.g., physical conditions of lakes, changes in river inputs, lakes' primary production) appears crucial for predicting changes in the inorganic carbon cycle of inland waters.
Due to their episodic and transient nature, the dynamics of whiting events can only be captured by high-frequency monitoring. However, whiting events are also patchy in space and can be missed by moored high-frequency sensors. As the typical turquoise coloration of whiting events usually covers large areas, these phenomena are excellent candidates for remote sensing detection. Whiting events have, for instance, been monitored through remote sensing techniques in diverse marine areas such as the Arabian Gulf [6,7], the Bahamas sea [18], or Florida coastal waters [19], as well as in diverse lacustrine systems in Germany [20], Switzerland [11], or North America [21]. However, while these approaches provided detailed information on the spatial extent of whiting events, they were also limited in temporal coverage. For instance, remote sensing datasets can be discontinuous due to both the satellite revisit time and the absence or limited quality of images associated with cloud cover. Hence, because of this limitation and the restricted availability of time-resolved, multi-annual ground monitoring data, there are few continuous records of whiting occurrences long enough to evaluate how their dynamics respond to changing environmental and climatic conditions. For instance, ref. [22] investigated the annual mean whiting occurrence frequency and spatial distribution from MODIS data on a decadal timescale in the coastal waters of Florida. However, they could not provide insights on the underlying drivers. Similarly, ref. [21] provided an extensive description of water clarity-inferred whiting event dynamics in the Great Lakes on multi-decadal scales. However, they only related the observed changes to the reported long-term biogeochemical evolution of the lacustrine systems, without statistically exploring the environmental drivers that trigger whiting events in the short term or the response of these factors to long-term climatic forcing.
Herein, we aim to use machine learning techniques to combine ground-based and remote sensing data and fill the gap in knowledge of the long-term dynamics of whiting events in a large peri-alpine hardwater lake, Lake Geneva. Accordingly, (i) we build an innovative dataset of multispectral long-term remote sensing data from Landsat-8, Sentinel-2, and Sentinel-3 to determine, for the first time, the spatial and temporal occurrences of whiting events in Lake Geneva from 2013 to 2021. Then, (ii) we apply a random forest machine learning algorithm to identify, from ground-based monitoring data, the environmental setting during observed whitings in the lake and reconstruct the past "unseen" whiting days from 1958 to 2021 using an appropriate statistical approach. This is an important step forward in the understanding of the environmental conditions necessary for the onset of whiting events in Lake Geneva. Finally, (iii) we analyze the temporal dynamics of whiting occurrence over the past 60 years in relation to the relevant climate indices affecting Central Europe. This provides a better understanding of the influence of climatic activity on the phenology of whitings occurring in Lake Geneva.

Study Site
Lake Geneva is a peri-alpine lake along the Swiss-French border, at 372 m above sea level (46°26′ N, 6°33′ E, see Figure 1). The lake's surface area is about 580 km², and its maximum depth (309 m) makes it the largest freshwater body in Western Europe.

Figure 1. Map of the study area. RGB image from Landsat-8 of Lake Geneva on 29 June 2019. The whiting areas (i.e., turquoise 'milky' color of surface waters) are specified. The SHL2 monitoring point is shown in grey in the middle of the lake. The Rhone River is shown in blue. The lake's location between France and Switzerland is shown in the top-left corner. The 20 m isobath is shown in yellow and the Rhone estuary area in red.

The main tributary to Lake Geneva is the Rhone, representing approximately 70% of the total water input. The Rhone River is also the primary supplier of sediment and phosphate to the lake [26,27] and plays a major role in lake ecosystem dynamics in terms of biogeochemical processes (primary production, fine sediments delivery, transport, and settling [17,28]). On the interannual scale, rainfall and summer temperature changes are expected to play a role in discharge variability. The Atlantic (AMO, NAO), Mediterranean (such as Western Mediterranean Oscillation Index; WeMOi), and even global (Oceanic Nino Index; ONI) climate indices appear to be crucial in describing this variability.
The inflowing water from the Rhône generally takes the form of an interflow when the lake is thermally stratified, i.e., a turbid layer that propagates along the thermocline where the Rhône water finds its neutral buoyancy [29]. However, these particulate inputs can also flow along the bottom of the lake when extreme densities are reached (cold water and high concentration of suspended particles). During these events, the Rhone inflow is not observable by satellite. However, extreme discharge events when the lake is not stratified can cause overflows detraining suspended particles toward surface waters. These events, episodically visible by remote sensing, are poorly described in the literature. It is therefore important to discriminate these events from whiting events in Lake Geneva, which will be addressed in this study.
Recent studies on whiting events in Lake Geneva have been carried out by in situ measurements, remote sensing, and hydrological modeling. So far, whiting events have been observed in late spring/early summer when (1) the Rhône discharge is high due to catchment snowmelt, and (2) the lake's waters are stratified and surface temperatures are warm. Ref. [11] demonstrated that whiting events are triggered along the Rhône interflow into the lake and that their spatial extent, influenced by local hydrodynamics, corresponds to the northeastern dispersion of riverine particles. Moreover, ref. [17] filled in the gap of in situ monitoring of whiting dynamics. They showed that in situ CaCO₃ particles have two distinct origins: a detrital part, eroded from the Rhône catchment and brought into the interflow, and an authigenic part (i.e., newly formed CaCO₃ particles), probably precipitated on the surface of fine fluvial particles transported into the lake. This authigenic calcite component tends to increase with distance from the mouth of the Rhône, highlighting the role of the physical stability of the water column and the spread of the interflow in the dynamics of whitings in Lake Geneva.

Workflow
The workflow consists of multiple processing steps. Remote sensing image selection, data filtering (region of interest, 30% cloud cover filtering), whiting index estimation, and data export are performed on the cloud computing platform Google Earth Engine (GEE) [30,31] for Landsat-8 and Sentinel-2 data, and through Datalakes (https://www.datalakes-eawag.ch/, accessed on 20 October 2022) for Sentinel-3 data. The next processing steps are computed in Matlab. They comprise the inter-calibration of the sensor responses and the identification and characterization of whiting events. The final, aggregated metrics include the spatial extent and temporal occurrence of whiting events. Factors controlling whiting events from 2013 to 2021 are then studied through decision tree and random forest algorithms, computed in Python. Next, whiting events are classified using environmental indicators, such as meteorological data, Rhone River discharge, and the lake physical conditions. Finally, the optimized random forest is used to reconstruct 'unseen' whiting days from 1958 to 2021.

Satellite Data
Landsat-8, Sentinel-2, and Sentinel-3 satellites are considered in this work. Landsat-8 satellite has a 16-day temporal resolution (under cloud-free conditions; see Table 1 for details). Landsat-8 carries the Operational Land Imager (OLI), which collects image data in nine visible to shortwave infrared bands with a spatial resolution of 30 m. We use the Landsat-8 Collection 1 Tier 1 Raw Scenes (L1TP) provided by USGS on GEE platform to produce the reflectance factors in the RGB bands [32].
The Copernicus Sentinel-2 mission comprises two satellites. The satellites' Multispectral Imager (MSI) acquires data in high temporal resolution (5 days with two satellites at the equator under cloud-free conditions), high spatial resolution (10-60 m pixels, swath width of 290 km) and 13 spectral bands ranging from visible to shortwave infrared wavelengths. Sentinel-2 Level-2A data are available on GEE platform. Data are downloaded from the Copernicus datahub and are processed using sen2cor to produce the reflectance factors in the RGB bands [33]. Finally, images are exported from GEE using a spatial resolution of 30 m to correspond to the Landsat-8 dataset.
Sentinel-3 satellites (3A and 3B) have a daily temporal resolution. They carry the Ocean and Land Colour Instrument (OLCI), which acquires data along 21 spectral bands ranging from visible to shortwave infrared wavelengths. Medium-resolution (300 m) images are processed using the Python package SenCast (https://gitlab.com/eawag-rs/sencast, accessed on 23 October 2022). Normalized water-leaving reflectance in the RGB bands is calculated using the Polymer algorithm v4.13 [34], which is tried and tested for lake water quality retrieval in the Copernicus Global Land Service [35] and ESA's Climate Change Initiative [36]. All Sentinel-3 data used in this study are available in the Datalakes webportal (www.datalakes-eawag.ch, accessed on 20 October 2022).

[...] [37]. Data are interpolated onto a grid with 1 m vertical and 1-day temporal resolution. In this work, surface water temperature (0-10 m) is used as a filter to discard false-positive whiting days (see Section 4.1). The thermocline depth is computed over the entire period (i.e., 1958-2021). Historical discharge data of the Rhone River (1958-2021) are downloaded from the FOEN website [38]. Discharge data are monitored at the Porte du Scex station with a daily resolution.
The climatic indexes tested encompass the AMO (https://www.psl.noaa.gov/data/timeseries/AMO/, accessed on 23 October 2022), which is referenced as a good indicator of the summer climate in central Europe [24], and the NAO (https://www.ncei.noaa.gov/access/monitoring/nao/, accessed on 23 October 2022), which has been described as the main winter climate forcing [25]. Besides, we also test the WeMOi (https://crudata.uea.ac.uk/cru/data/moi/, accessed on 23 October 2022), estimated from the difference between atmospheric pressure from Northern Italy to Southwestern Spain [39]. It is representative of rainfall variability in both areas. Positive phases typically show an anticyclone in the Gulf of Cadiz and a low-pressure area over the Ligurian Sea, leading to increased precipitation in Northern Italy, and probably in our study area [40]. Finally, the Oceanic Nino Index (ONI, https://origin.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ONI_v5.php, accessed on 23 October 2022) is also tested, as this index is referenced as the primary index for tracking the El Nino Southern Oscillation phenomenon, which is a major contributor to worldwide climate variability [41], and potentially a predictable signal in European rainfall [42].

Whiting Detection Using Remote Sensing
The AreaBGR index (see detail in [20]), i.e., the triangular area between the blue, green, and red reflectance values, determines the whiting spatial and temporal occurrences. We use this index as it is the best indicator available to study whiting events in inland waters. The AreaBGR index is computed for all pixels in the abovementioned satellite data of Lake Geneva, using the expression given in [20].

An inter-calibration of the different satellite sensors is performed. We compare the AreaBGR estimates for the whiting day on 29 June 2019, for which we have simultaneous images from Landsat-8, Sentinel-2, and Sentinel-3 satellites (see Figure 2) and ground data [17]. The range of the index measured by Sentinel-2 and Sentinel-3 is slightly lower than that of Landsat-8, as a likely result of different product types and sources, and atmospheric corrections [20]. The obtained equations, AreaBGR_S2 = 0.28 × AreaBGR_L8 + 11066.43 (R² = 0.97) and AreaBGR_S3 = 0.23 × AreaBGR_L8 + 10650.23 (R² = 0.97), allow expression of the Sentinel-2 and Sentinel-3 derived AreaBGR indexes in the same range as the one determined by the Landsat-8 satellite (see Figure 2a). The residuals from the inter-calibration equations can be explained by differences in the sensors' spectral response functions and by the time difference between the acquisitions. Nevertheless, this complementarity allows us to use the Landsat-8 (n = 140), Sentinel-2 (n = 101), and Sentinel-3 (n = 766) databases to describe the spatial and temporal occurrences of whiting days between 2013 and 2021.
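As an illustration of how the two regressions could be used, the sketch below maps a Sentinel-derived AreaBGR value onto the Landsat-8 range. It assumes that the published relations are applied by simple inversion, which the text does not state explicitly; the function name `to_landsat8_scale` and the example value are hypothetical.

```python
# Regression coefficients quoted in the text, written as
# AreaBGR_sensor = slope * AreaBGR_L8 + intercept.
CALIBRATION = {
    "S2": (0.28, 11066.43),  # Sentinel-2 vs. Landsat-8
    "S3": (0.23, 10650.23),  # Sentinel-3 vs. Landsat-8
}


def to_landsat8_scale(area_bgr: float, sensor: str) -> float:
    """Express an AreaBGR value in the Landsat-8 range.

    Assumption: the regression is inverted to go from the Sentinel scale
    to the Landsat-8 scale. Landsat-8 values are returned unchanged.
    """
    if sensor == "L8":
        return area_bgr
    slope, intercept = CALIBRATION[sensor]
    return (area_bgr - intercept) / slope


if __name__ == "__main__":
    # Hypothetical Sentinel-2 pixel value, for illustration only.
    print(to_landsat8_scale(14706.43, "S2"))
```

Once all three sensors are on a common scale, a single detection threshold can be applied to the merged time series.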
Positive whiting is attributed to any pixel whose AreaBGR value is >13,000, according to [20] (see magenta contours in Figure 2b). The surface area of whitings for each image is then estimated by summing the areas of the flagged pixels (30 m resolution). This database is completed with the daily Sentinel-3 database, from which the AreaBGR is derived following a similar processing; there, the area of whiting events is obtained by summing the areas of the flagged pixels at 300 m resolution.
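The detection step can be sketched as follows. The index is written here as the area of the triangle whose vertices are the (wavelength, reflectance) points of the blue, green, and red bands, which matches the verbal definition above; the exact expression and reflectance scaling of [20] are not reproduced in this text, so the band-centre wavelengths below (approximate Landsat-8 OLI values) and the function names are assumptions.

```python
# Assumed band-centre wavelengths in nm (approximate Landsat-8 OLI blue, green, red).
DEFAULT_WAVELENGTHS = (482.0, 561.0, 655.0)
# AreaBGR threshold quoted in the text from [20]; it presumes the reflectance
# scaling used there, which is not specified here.
WHITING_THRESHOLD = 13000.0


def area_bgr(r_blue, r_green, r_red, wavelengths=DEFAULT_WAVELENGTHS):
    """Triangle area between the three (wavelength, reflectance) points (shoelace formula)."""
    lb, lg, lr = wavelengths
    return 0.5 * abs(
        lb * (r_green - r_red) + lg * (r_red - r_blue) + lr * (r_blue - r_green)
    )


def whiting_area_km2(index_grid, pixel_size_m, threshold=WHITING_THRESHOLD):
    """Sum the area of all pixels whose index exceeds the threshold.

    index_grid is a list of rows of AreaBGR values; pixel_size_m is the
    pixel edge length (30 m for Landsat-8/Sentinel-2, 300 m for Sentinel-3).
    """
    flagged = sum(1 for row in index_grid for value in row if value > threshold)
    return flagged * pixel_size_m ** 2 / 1e6


if __name__ == "__main__":
    # Tiny hypothetical 2 x 2 scene, for illustration only.
    scene = [[14000.0, 12000.0], [13001.0, 13000.0]]
    print(whiting_area_km2(scene, pixel_size_m=30.0))
```

Note that a pixel with a 30 m edge covers 900 m², so the pixel size enters the area estimate squared.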
The AreaBGR index can be sensitive to the presence of other suspended particles [6]. In Lake Geneva, in the case of wave-induced resuspension of fine sediments near the coast, AreaBGR may respond to an increase in the near-infrared wavelengths. Events when sediments brought by the Rhone reach the surface (i.e., unstratified lake and cold surface waters, see an example on 25 April 2013 in Figure 3) generate similar signals. Due to these processes, we apply several filters to discard satellite images showing false-positive whiting days.
First, we only select images with whitings larger than 15 km² to avoid minor contaminations due to remaining clouds. Then, we exclude the shallowest depths of the lake (i.e., <20 m depth) and the region of the Rhone mouth from our calculations (see the yellow isobath and red area in Figure 1). Another filter is applied to discard false-positive AreaBGR images due to Rhone inflow at the surface. We base this latter filter on the surface water temperature of the lake (SHL2 monitoring point). Ref. [17] showed that whiting events only happen when the lake's surface temperature reaches a minimum of 15 °C. Below 15 °C, calcite supersaturation is unlikely, while the lake stratification is not strong enough to allow for a Rhone interflow. Therefore, all images with a positive AreaBGR index but surface temperature below 15 °C (averaged over 0-10 m depth) are discarded.
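The filter chain above can be summarized in a few lines. This is a minimal sketch under the thresholds stated in the text; the function names, the per-pixel depth and Rhone-mouth inputs, and the example values are hypothetical.

```python
MIN_WHITING_AREA_KM2 = 15.0  # smaller detections are treated as cloud contamination
MIN_DEPTH_M = 20.0           # pixels shallower than the 20 m isobath are excluded
MIN_SURFACE_TEMP_C = 15.0    # 0-10 m mean temperature at SHL2


def keep_pixel(depth_m: float, in_rhone_mouth: bool) -> bool:
    """Spatial mask: drop shallow pixels and the Rhone mouth region."""
    return depth_m >= MIN_DEPTH_M and not in_rhone_mouth


def is_whiting_day(whiting_area_km2: float, surface_temp_c: float) -> bool:
    """Image-level filter applied after the spatial mask.

    An image counts as a whiting day only if the flagged area exceeds
    15 km2 and the surface layer is at least 15 degrees C.
    """
    return (
        whiting_area_km2 > MIN_WHITING_AREA_KM2
        and surface_temp_c >= MIN_SURFACE_TEMP_C
    )


if __name__ == "__main__":
    # Hypothetical cold-water turbid plume: large flagged area, cold surface.
    print(is_whiting_day(whiting_area_km2=40.0, surface_temp_c=9.0))
    # Hypothetical summer event.
    print(is_whiting_day(whiting_area_km2=40.0, surface_temp_c=18.0))
```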

Reconstruction of Past Whitings
We use available environmental indicators from 2013 to 2021, i.e., water discharge of the Rhone River, meteorological conditions over Lake Geneva, and the lake physical conditions (surface water temperature, thermocline depth) as input features of a machine learning classification algorithm for whiting occurrence (i.e., whitings or non-whitings, two classes with values of 1 and 0, respectively). The machine learning approach consists of a Decision Tree (DT) and a Random Forest (RF) to find the best classification method based on classical metrics [43,44]. The detail of the model development carried out in this work is specified in the Supplementary Material.
First, we split our database into three sub-datasets: (1) the training set (60% of the whole database), (2) the validation set (20%), and (3) the test set (20%). The training set is used to train the different models, i.e., to set the model parameters. The validation set is used to compare the model performances between different models and to choose the most accurate one. The test set is finally used to test the performance of the best model on the remaining 'unused' data.

Remote Sens. 2022, 14, x FOR PEER REVIEW 7 of 22
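A 60/20/20 split such as the one described above could look like the following sketch. It assumes a simple random shuffle with a fixed seed; whether the original split was random, stratified, or chronological is not stated, and the function name `split_dataset` is hypothetical.

```python
import random


def split_dataset(records, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle records and cut them into training, validation, and test sets.

    Assumption: plain random shuffling. The test set receives whatever
    remains after the training and validation sets are filled.
    """
    items = list(records)
    random.Random(seed).shuffle(items)
    n_train = int(round(fractions[0] * len(items)))
    n_val = int(round(fractions[1] * len(items)))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Hypothetical database of 1007 satellite observation days
    # (140 + 101 + 766 images are mentioned in the text).
    train, val, test = split_dataset(range(1007))
    print(len(train), len(val), len(test))
```

Fixing the seed keeps the three subsets reproducible between runs, which matters when several candidate models are compared on the same validation set.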

To evaluate the performance of the models, we use classical metrics such as the confusion matrix (i.e., a table including true negatives, false positives, false negatives, and true positives), the accuracy rate (i.e., the percentage of correct predictions for a given dataset), which is a summary of the confusion matrix, and the AUC (i.e., the Area Under the receiver operating characteristic Curve), which measures how well the whiting and non-whiting events can be separated or distinguished by the model. This machine learning approach is expected to provide the main driving factors (among the input features) of the whiting events in Lake Geneva. The best model is then used to reconstruct the past unseen whiting days from 1958 to 2021, relying on the same input features used to train and validate the model for the 2013-2021 period.
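For reference, the three metrics can be computed with a few lines of standard-library Python. This is a generic sketch, not the implementation used in the study; the AUC is obtained from its rank interpretation (the probability that a randomly chosen whiting day receives a higher score than a randomly chosen non-whiting day), and the example labels and scores are hypothetical.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded 0 (non-whiting) and 1 (whiting)."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Fraction of correct predictions."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute the AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels and model scores, for illustration only.
    y_true = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    y_pred = [int(s >= 0.5) for s in scores]
    print(confusion_matrix(y_true, y_pred), accuracy(y_true, y_pred), roc_auc(y_true, scores))
```

Because whiting days are rare compared with non-whiting days, accuracy alone can look favorable for a model that rarely predicts a whiting; reporting the confusion matrix and the AUC alongside it guards against that.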
Changes in the annual whiting occurrence reconstructed between 1958 and 2021 are tested using Mann-Kendall tests on the time series [45,46] and a BEAST decomposition (Bayesian Estimator of Abrupt change, Seasonality, and Trend). BEAST is a generic Bayesian model averaging algorithm that decomposes time series or 1D sequential data into individual components, such as abrupt changes, trends, and periodic/seasonal variations [47]. The relations between the annual whiting frequency and large synoptic climatic indexes are tested using the Pearson correlation coefficient r and the related p-value.

Overall, the long-term dataset built in this work offers a unique view of the occurrence of whitings in Lake Geneva with high temporal and spatial resolutions. The description of the spatial occurrence of the whiting days, i.e., the number of pixels flagged as whitings between 2013 and 2021, can be challenging as it depends on the available images, i.e., on the temporal resolution and cloud coverage. Note that this result is relative: it gives a good description of the spatial variability rather than a good estimate of the absolute number of whiting days detected over the study period.

Spatial and Temporal
The distribution of whitings by areal coverage is bimodal (Figure 4a). In 96% of the days, the whiting covers < 40% of the lake area, and exceptional whitings occupy almost the whole lake surface (50-80%). Therefore, we consider them separately (class 1 for partial whitings and 2 for total whitings).

Temporal Occurrences of Observed Whitings in Lake Geneva
The days of whiting and their spatial extent over 2013-2021, as detected from Landsat-8, Sentinel-2, and Sentinel-3 satellite images, are presented in Figure 5. Whitings are more frequently observed in 2018-2019 and 2021 (i.e., >25 days) and reach greater maximal areas. Whiting days are less frequent in 2016 and 2017, and only three are detected in 2020. From 2013 to 2015, only the Landsat-8 dataset is available, so the number of observations represents only a fraction of that in later years; hence, there is a much larger chance that whitings remain unseen (Figure 5a). The use of this long-term dataset together with the monitoring of environmental parameters allows a precise description of the conditions in which whitings (class 1 and 2) occur, as described below.
Whitings of class 1 occur at high Rhone discharge (Figure 5b, average discharge of about 320 m³ s⁻¹, Table 2) when air and water surface temperatures are high (i.e., approx. 22 °C for air and 18 °C between 0 and 10 m for water, averaged over the observed whitings), and the thermocline is ca. 10 m deep (Figure 5c-e). Wind speed is more variable during the whiting days of class 1, with a mean value of 2.3 m s⁻¹ and a standard deviation of 1.2 m s⁻¹ (Figure 5f). Whitings of class 2 occur in similar conditions, except for a lower Rhone discharge (i.e., approximately 250 m³ s⁻¹, Table 2). However, the limited number of class 2 events (i.e., only seven days) does not allow for further analysis.

Table 2. Averaged environmental conditions during observed whiting days from 2013 to 2021 (>15 km²) in Lake Geneva. The standard deviations for each condition are also specified. The number of whiting days for each class is specified.

Drivers of Whitings Using Machine Learning
The detailed optimization results of the machine learning models are shown in Figure 6. The detailed method is described in the Supplementary Material. Note that only class 1 whiting days are considered, class 2 whitings being too few to be significantly related to the corresponding ground data.

As seen in Section 4.2, the objective is to relate the occurrences of class 1 whitings to the corresponding ground data through the best model by comparing a DT and an RF algorithm. We first built a simple DT to determine the most important environmental factors to classify whiting events. The results show that water temperature and Rhone discharge are the two most discriminating factors for the occurrence of whitings between 2013 and 2021 (see Figure 6a). Indeed, the two thresholds necessary to classify whitings are a minimum Rhone discharge of 207 m³ s⁻¹ and a minimum water temperature of 15 °C. Using these thresholds allows for classifying the majority of the whitings (see the blue points in Figure 6b).
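Read as a rule, the simple DT amounts to two comparisons. The sketch below encodes the two thresholds reported above; it is a simplification of the fitted tree, and the function name and example inputs are hypothetical.

```python
MIN_DISCHARGE_M3_S = 207.0  # Rhone discharge threshold from the simple DT
MIN_WATER_TEMP_C = 15.0     # surface water temperature threshold from the simple DT


def predict_whiting(discharge_m3_s: float, water_temp_c: float) -> int:
    """Return 1 (whiting) when both thresholds are met, otherwise 0 (non-whiting)."""
    return int(
        discharge_m3_s >= MIN_DISCHARGE_M3_S and water_temp_c >= MIN_WATER_TEMP_C
    )


if __name__ == "__main__":
    # Mean class 1 conditions reported in the text: about 320 m3/s and 18 degrees C.
    print(predict_whiting(320.0, 18.0))
    # Hypothetical low-discharge summer day.
    print(predict_whiting(150.0, 20.0))
```

A rule this coarse inevitably labels some warm, high-discharge days without whitings as positives, which is consistent with the false positives reported for the DT.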
This DT performs well (validation AUC = 0.86; validation accuracy = 74%), but can be improved by using the cost complexity pruning method. The best DT (see the Supplementary Material) has a similar performance (validation AUC = 0.83; validation accuracy = 81%), but still makes some classification errors by creating false positives (n = 55 in the training dataset; n = 28 in the validation dataset).
To go further, we compare the results obtained from the DT with those of the RF. The construction and optimization of the RF (see Supplementary Material) lead to the best RF, composed of approximately twenty trees, with a training accuracy of ~1 (i.e., approx. 100% of whiting and non-whiting events in the training data have been correctly classified) and a validation AUC of 0.90. Besides, the model provides the most important indicators for the classification of whitings, namely Rhone discharge and water temperature (Figure 6c). Using these two predictors and the decision boundaries, the classification results are shown in Figure 6d. The main advantage of this model is the substantial reduction of the number of false positives (n = 0 in the training dataset; n = 4 in the validation dataset) obtained with a finer classification. This final RF is able to classify the whiting occurrences as a function of environmental conditions and to identify the most important factors controlling whiting triggering. This optimized RF is then used to reconstruct the past 'unseen' whiting days, based on the ground data monitored between 1958 and 2021 (see below).

Reconstruction of Past Unseen Whitings
Daily class 1 whiting presence-absence is reconstructed from the RF algorithm over the 1958-2021 time period (Figure 7a). This reconstruction provides a first assessment of the past evolution of whiting days based on the use of available ground data. The total number of whitings (class 1, expressed as days per year) is highly variable over the years (annual average of n = 18 days of whiting per year). Values range from years with very few or no whiting days (n < 3; 1964, 1974, 1976, 1997) to years with frequent whiting days (n > 35; 1958, 1963, 1966, 1982, 1994, 2001) (Figure 7a). Neither the Mann-Kendall test (p_M-K = 0.117) nor the BEAST decomposition (low probability of changing points) detects any clear temporal trend in the annual whiting occurrence between 1958 and 2021, reconstructed by the RF algorithm (Figure 7b). There is, however, a shift in the whiting phenology. The number of spring whitings increases from 1958 to 2021 (p_M-K = 0.011; Figure 7c). The BEAST decomposition detected a changing point in 2000 (maximum probability in changing points). It corresponds to an increase in spring whiting occurrence (+1 day on average since 2000).
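For readers who want to reproduce the trend test on their own series, a standard two-sided Mann-Kendall test with tie correction and normal approximation can be written as below. This is a generic textbook version, not necessarily the variant used in [45,46]; the BEAST decomposition is not reproduced here, and the example series is hypothetical.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (S, Z, p) for the two-sided Mann-Kendall monotonic trend test."""
    x = list(series)
    n = len(x)
    # S counts concordant minus discordant pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, corrected for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return s, z, p


if __name__ == "__main__":
    # Hypothetical annual counts of whiting days, for illustration only.
    counts = [12, 30, 5, 18, 22, 9, 35, 14, 20, 17]
    print(mann_kendall(counts))
```

Whiting-day counts are integers and repeat often, so the tie correction in the variance is worth keeping even for short series.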

Factors Controlling Occurrences of Whitings from 1958 to 2021
Here we attempt to determine the relationship between the temporal variability of class 1 whiting occurrences in Lake Geneva and climatic indices to assess the influence of climate activity on whitings' phenology. The interannual and seasonal variabilities of whiting days reconstructed from the RF algorithm are tested against the climate indices that affect central Europe and Switzerland.
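The comparison between whiting anomalies and a climate index relies on the Pearson correlation coefficient r named in the methods. The sketch below computes anomalies as departures from the series mean, which is an assumption about how the anomalies are defined, and then r; the associated p-value (from a Student t distribution) is omitted, and all example numbers are hypothetical.

```python
import math


def anomalies(values):
    """Departures from the mean of the series (assumed anomaly definition)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    ax, ay = anomalies(x), anomalies(y)
    covariance = sum(a * b for a, b in zip(ax, ay))
    norm = math.sqrt(sum(a * a for a in ax) * sum(b * b for b in ay))
    return covariance / norm


if __name__ == "__main__":
    # Hypothetical yearly whiting days and climate index values.
    whiting_days = [12, 30, 5, 18, 22, 9]
    climate_index = [0.1, 0.8, -0.6, 0.2, 0.5, -0.3]
    print(pearson_r(anomalies(whiting_days), climate_index))
```

Since r is invariant to shifting either series by a constant, correlating the raw yearly counts or their anomalies gives the same coefficient.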
The inter-annual variability of the total and spring numbers of whitings (expressed as anomalies in days per year) is shown in Figure 8. A comparison is made between the whiting anomalies per year, using the RF algorithm predictions, and the climatic indices most known to influence the Swiss and European climates. The anomalies in the total number of whiting days per year can be partly explained by the climatic index WeMOI (Figure 8a

Discussion
The objective of this study is to measure the spatial extent and temporal occurrences of whiting days (i.e., massive clouds of suspended CaCO3 particles induced by intense calcite precipitation) in Lake Geneva using Landsat-8, Sentinel-2, and Sentinel-3 satellite data between 2013 and 2021. An RF algorithm then demonstrates the link between these occurrences and the meteorological, lake physical, and riverine conditions. The RF is finally used to reconstruct the past occurrences between 1958 and 2021 based on the main identified controlling factors of whitings. Below, we first discuss the complementarity of the satellites and the robustness of the index used. Then, we detail the results obtained regarding spatial and temporal observations and discuss the reconstruction of past whiting days in light of the climatic indices influencing the central part of Europe.

Remote Sensing of Whitings in Lake Geneva
Satellite observations are increasingly used to characterize biogeochemical processes in inland waters [48][49][50]. We chose to combine Sentinel-2 and Landsat-8 datasets with Sentinel-3 to describe whitings in Lake Geneva. The different spatial (i.e., 30 m or 300 m) and temporal (i.e., 1 day or approx. 15 days) resolutions enable reasonably good monitoring of the appearance of Lake Geneva over the period 2013-2021. We observe different responses in the Landsat-8, Sentinel-2, and Sentinel-3 data due to the various product sources and processing chains. The inter-calibration carried out in this work expresses the satellite responses in terms of AreaBGR in the same range, which is needed for the coherence of the time series (Figure 2).
We use the AreaBGR index to detect whiting days in Lake Geneva. Indeed, intense events of CaCO3 precipitation lead to an increase in the water reflectance, mainly in the green band, resulting in a turquoise water color. This contrasts sharply with the lake's color without precipitation, which appears dark in the visible spectrum [20]. The index responds positively to various suspended particles (sediments and phytoplankton species) that influence the visible spectrum by backscattering sunlight (see Section 4.1). Among these suspended particles, distinguishing the sedimentary contributions from the Rhône (i.e., inputs that reach the surface when the lake is unstratified) and the resuspension by near-shore waves from the precipitation of CaCO3 particles during whitings can be challenging. The use of specific filters, determined from geochemical knowledge about the whiting process, enables building a conservative database retaining only whiting days. Although empirical, these filters could be further tested on different peri-alpine lakes to build a process chain for validating the AreaBGR index as a proxy of whitings.
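The behavior of the index can be illustrated with a short sketch. The definition of AreaBGR is not restated in this section; the code below assumes the triangle-area formulation, i.e., the area of the triangle spanned by the blue, green, and red reflectances plotted against band-center wavelength. The wavelengths (approximate Landsat-8 band centers), the function name `area_bgr`, and the reflectance values are illustrative assumptions, and no detection threshold is implied.

```python
def area_bgr(r_blue, r_green, r_red, wavelengths=(482.0, 561.0, 655.0)):
    """Triangle area spanned by (wavelength, reflectance) points of B, G, R.

    ASSUMPTION: triangle-area definition of AreaBGR; default wavelengths
    are approximate Landsat-8 OLI band centers in nm.
    """
    wb, wg, wr = wavelengths
    # Shoelace formula for the area of a triangle from its three vertices
    return 0.5 * abs(
        wb * (r_green - r_red)
        + wg * (r_red - r_blue)
        + wr * (r_blue - r_green)
    )


# Hypothetical reflectances: dark clear water vs. a green-peaked turbid signal
clear = area_bgr(0.020, 0.022, 0.018)
turbid = area_bgr(0.040, 0.090, 0.050)
print(clear < turbid)  # True: the green peak enlarges the triangle
```

Under this assumption, a flat spectrum gives an area of zero, while the green reflectance peak typical of whitings increases the index, which is consistent with the positive response to backscattering particles described above.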
In addition, we do not use specific filters related to the presence of phytoplankton in the lake. Some biological blooms can potentially influence the reflectance used to calculate AreaBGR without inducing whiting events. However, their abundance in Lake Geneva is never high enough to reach the AreaBGR threshold, and we did not find an example of this contamination in our database, in line with the study of [11]. The ongoing development of remote sensing monitoring of primary production and phytoplankton species is crucial to better characterize the possible contamination of the AreaBGR index from organic sources.

Spatial and Temporal Occurrences of Whitings in Lake Geneva
The majority of whitings in Lake Geneva tend to occur during early summer, while fewer events occur later in the season (Figure 4). These two types are associated with different spatial patterns. The determinism of these two classes can thus be related to diverse environmental drivers, notably identified through machine learning techniques for the majority of them (class 1 whitings), and the two classes are probably triggered by different mechanisms of nucleation. Indeed, the spatial extent of the majority of whiting days tends to be related to the Rhône inflow (>95%, see Figure 4b). The turbidity inputs of the Rhône can trigger the nucleation of CaCO3 particles during high discharge, when the lake is stratified and the surface water temperature is high. This result is in line with the previous works of [11,17]. These authors highlighted the role of the interflow in triggering whiting events when the spread of fine sediments along the whole lake is driven by local hydrodynamics during the high physical stability of the water column [29,51]. Detrital CaCO3 particles eroded from the watershed could also contribute to whiting detection close to the river mouth [17], increasing the reflectance of surface waters and the AreaBGR mean and extreme values (see Figure 4e).
However, fewer class 2 whiting events are detected in the central part of the lake (i.e., approx. 5% in the period 2013-2021), later in the season. The lack of in situ measurements during those whitings and the few events observed do not allow a more refined characterization. They can probably be related to episodes of high primary production, i.e., a phytoplankton bloom in early August 2017 [52] and a massive, transient Uroglena sp. bloom in September 2021 [53]. The influence of primary production in triggering whiting events is still under debate and can be considered in several ways. Primary production tends to increase pH and favor calcite supersaturation and potential precipitation. In addition, the nucleation of calcite particles during precipitation can occur on small picoplankton cells [54] but also on algal-derived exopolymeric substances (EPS) or other suitable heteronuclei (bacteria). Moreover, as discussed before, high levels of chlorophyll a during phytoplankton blooms can also influence the AreaBGR index and potentially bias the corresponding whiting detection. Coupling in situ measurements of primary production and characterization of phytoplankton species with CaCO3 measurements could provide crucial information on the biologically induced precipitation of calcite. A future study should also compare a lake under the influence of a glacial river, i.e., subject to turbid inputs (such as Lake Geneva), to a lake without glacial inputs but where whiting events are observed (Lake Neuchâtel). The study of the difference in spatial and temporal occurrences could reveal different roles of organic and inorganic processes in the triggering of whiting events.

The Long-Term Evolution of Whitings in Lake Geneva
We reconstruct the class 1 whiting occurrences, as days per year, between 1958 and 2021, based on the RF algorithm (Figure 7). The number of reconstructed whiting days per year is very variable, with no noticeable trend in its long-term evolution. However, the interannual variability can be partly related to the WeMOi (Figure 8a). This index is causally related to precipitation in northern Italy and could therefore reflect environmental conditions in Switzerland, especially year-to-year precipitation changes that could affect the Rhône River discharge and the related turbid inputs to Lake Geneva. Mediterranean climatic activity thus seems to play a role in changes in the total number of whiting events per year. When the WeMOi is high, whiting days related to Rhône River inputs (i.e., 95% of the total events in our case) are more frequent.
In addition, we observe a seasonal trend with the increase of early whitings since 2000 (Figure 7c). This change coincides with a change in climate regime due to the AMO (Figure 8b). Indeed, the positive values of the index since 2000 and the observed upward trend reflect the general increase in temperatures measured in Europe [55]. This warming changes the Swiss climate and the physical conditions of the lake, especially the temperature and stratification of the surface water, which warms and stratifies earlier in the year. The conditions necessary for the onset of whitings in Lake Geneva are therefore met earlier in the year, in terms of Rhône River inputs, water temperature, and water column stratification.
Although our study found significant relationships for both the inter-annual variability in the total number of whiting events and the trend in their phenology (p-values < 0.01), correlation coefficients of only 0.36 and 0.33, respectively, have been obtained (Figure 8). Other environmental, region-specific factors probably actively participate in the inter-annual change in whiting occurrences. Among them, the increase in alkalinity and Ca2+ concentration of the Rhône over the last decades [56], as well as changes in discharge and sediment load related to human activities [57], could be the origin of an additional variability that cannot be quantified from climatic indices.
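The combination of a significant p-value with a modest correlation coefficient can be made concrete with a short worked example. The sketch below computes a Pearson correlation and the associated t statistic; it assumes that each correlation is based on n = 64 annual values (1958-2021 inclusive), which is our reading of the time span rather than a number stated in the text, and the function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0 with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


n_years = 64  # ASSUMPTION: one value per year over 1958-2021
for r in (0.36, 0.33):
    # r**2 is the share of variance explained by the climate index
    print(r, round(r * r, 2), round(t_statistic(r, n_years), 2))
# 0.36 0.13 3.04
# 0.33 0.11 2.75
```

Under this assumption, both t values exceed the two-sided 1% critical value for 62 degrees of freedom (roughly 2.66), yet the indices explain only about 11-13% of the variance, which leaves ample room for the region-specific factors mentioned above.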
To go further, future changes in Mediterranean and Atlantic activities related to global warming could influence environmental conditions in Switzerland. The trend in the number of whiting days per year depends on the Rhône discharge, impacted mainly by precipitation, snow, and ice melt. Based on the work of [58], the annual Rhône discharge could remain stable in the future (2020-2100), leading to a total number of whitings that does not follow a specific trend but whose annual changes are in line with the WeMOi. However, the contributions to the Rhône discharge could change considerably, with an increase in rainfall and a decrease in snow and ice melt induced by earlier warmer temperatures. This could shift the peak discharge of the Rhône, with maximal discharges reached earlier in the year. On the other hand, higher water temperatures may favor calcite supersaturation (due to its retrograde solubility). The periods of calcite supersaturation and lake stratification may start earlier and last longer. All this may change the relative influence of the environmental drivers identified in this work, with a change in whiting phenology and in the abundances of class 1 vs. class 2 whitings in Lake Geneva, in line with changes in the AMO.
This shift in whiting phenology could have several consequences on the functioning of the lake ecosystem. First, as whitings increase lake surface turbidity, light-dependent processes such as spring phytoplankton blooms could be altered. Earlier whitings could decrease the intensity of light received during these crucial bloom periods [19,22]. In addition, the carbon transfer to the benthic layer in the form of calcite actively participates in nutrient cycling. It appears crucial to estimate the impact that climate change may have on the future evolution of the frequency of whitings. The role of these events in the annual CaCO3 precipitation, its transfer to the benthic ecosystem, and the burial of carbon remains to be determined.
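The retrograde solubility invoked above can be illustrated numerically. The sketch below uses the widely cited empirical fit of Plummer and Busenberg (1982) for the calcite solubility product, which is not part of this study, and hypothetical ion activities. It holds the activities fixed, whereas in a real lake temperature also modifies carbonate speciation, so it should be read as a simplified illustration only.

```python
import math


def log10_ksp_calcite(temp_c):
    """log10 of the calcite solubility product at temperature temp_c (deg C).

    Empirical fit of Plummer and Busenberg (1982), with T in kelvin;
    an external relation used here for illustration only.
    """
    t = temp_c + 273.15
    return (-171.9065 - 0.077993 * t + 2839.319 / t
            + 71.595 * math.log10(t))


def saturation_index(ca_activity, co3_activity, temp_c):
    """SI = log10(ion activity product / Ksp); SI > 0 means supersaturation."""
    return math.log10(ca_activity * co3_activity) - log10_ksp_calcite(temp_c)


# Hypothetical activities, held constant to isolate the temperature effect
ca, co3 = 1.0e-3, 1.0e-5
for temp in (5.0, 15.0, 25.0):
    print(temp, round(log10_ksp_calcite(temp), 2),
          round(saturation_index(ca, co3, temp), 2))
```

With these assumptions, the solubility product decreases as temperature rises, so the same water composition becomes more supersaturated in warmer surface waters.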

Conclusions
Building upon machine learning techniques and time series analyses, this work leverages the temporal extent of existing remote sensing datasets to provide a first assessment of the long-term spatiotemporal variability of whiting occurrences in Lake Geneva. We show that the by far dominant Rhône-driven whiting events in the northeastern part of the lake occur under repeatable environmental conditions of both the inflowing river and the lake, so that a random forest algorithm can retrospectively predict the occurrence and timing of whiting events from the long-term monitoring data of the lake and the Rhône. The analysis of the reconstructed daily time series of whiting days over 1958-2021 reveals no specific trend in the number of whiting days per year, but rather a large inter-annual variability that is partially linked to the Mediterranean activity (WeMOi). The phenology of whitings has nevertheless shifted, especially since the year 2000, with more frequent early spring events correlated with an increase in the AMO index. These results show the influence of the Mediterranean and Atlantic activities on the occurrences of whitings in Lake Geneva.

Funding: This study was supported by the CARBOGEN project (SNF 200021_175530) and a student-assistant fellowship from the IDYST to M. Ferrari.

Data Availability Statement:
All data used in the framework of this research are available throughout the paper. All authors confirm their availability to provide any technical assistance in the framework of similar investigations, wherever they may be undertaken.