Exposing the Fundamental Role of Spectral Scattering in the PFT Signal

There is increasing interdisciplinary interest in phytoplankton community dynamics as the growing environmental problems of water quality (particularly eutrophication) and climate change demand attention. This has led to a pressing need for improved biophysical and causal understanding of Phytoplankton Functional Type (PFT) optical signals, in order that satellite radiometry may be used to detect ecologically relevant phytoplankton assemblage changes. This understanding can best be achieved with biophysically and biogeochemically consistent phytoplankton Inherent Optical Property (IOP) models, as it is only via modelling that phytoplankton assemblage characteristics can be examined systematically in relation to the bulk optical water-leaving signal. The Equivalent Algal Populations (EAP) model is used here to investigate the source and magnitude of size- and pigment-driven PFT signals in the water-leaving reflectance, as well as the potential to detect these using satellite radiometry. This model places emphasis on explicit biophysical modelling of the phytoplankton population as a holistic determinant of IOPs, and a distinctive attribute is its comprehensive handling of the spectral and angular character of phytoplankton scattering. Selected case studies and sensitivity analyses reveal that phytoplankton spectral scattering is the primary driver of the PFT-related signal. Key findings are that the backscattering-driven signal in the 520 to 600 nm region is the critical PFT identifier at marginal biomass, and that while PFT information does appear at blue and red wavelengths, it is compromised by biomass/gelbstoff ambiguity in the blue and by low signal in the red, due primarily to absorption by water. It is hoped that these findings will provide considerable insight for the next generation of PFT algorithms.


Introduction
Phytoplankton across the world's oceans account for about half of all primary production on our planet [1,2]. Their growth and function are fundamental to sustaining life: they constitute the foundation of the aquatic food web, and serve critical roles in the recycling of essential elements such as carbon and nitrogen, as well as in remineralisation [3][4][5]. Being so responsive to nutrient availability and water temperature, these tiny organisms are key indicators of ecosystem change, and understanding their community dynamics is key to answering some of the most challenging earth science questions of our time. It has long been appreciated that phytoplankton have a direct effect on the observable colour of the ocean, and broad scale biomass estimates based on Chl a concentrations derived from satellite radiometry are widely relied upon despite persistent uncertainty in the accuracy of information derived from satellite imagery [6,7]. Recently there has been considerable interest in more detailed information on phytoplankton assemblage characteristics [8][9][10][11], but it has not been widely ascertained to what degree Phytoplankton Functional Type (PFT) information can be gleaned from satellite data, and at what level of confidence. Furthermore, descriptions of PFTs differ with context, and the potential for distinguishing their ecological roles from their optical signatures must also be considered. The causal effect of biophysical phytoplankton characteristics on the optical water-leaving signal is at the heart of all of these requirements, and this is undoubtedly an outstanding topic in ocean optics.
Any useable radiometric PFT signal results directly from the interaction of phytoplankton with their light environment, but the physical basis of this interaction is not well understood in terms of observed variability across the wide diversity of aquatic environments and phytoplankton assemblages [12,13]. At low biomass it is the strong absorption by phytoplankton which dominates the phytoplankton contribution to the ocean colour signature, and this has therefore been identified as a promising signal in terms of PFT identification (e.g. [14,15] and others). But distinguishing the effects of variable phytoplankton absorption due to biomass changes from the effects of differential phytoplankton absorption due to functional type changes is not straightforward. This ambiguity in the phytoplankton signal is at the core of the PFT problem. It is then overlaid with further complexity, given that a potential PFT signal from the phytoplankton component of a water body's optical constituents must be considered in the context of the bulk optical signal, recognising the contributions from the non-algal sources of optical variability: absorption due to CDOM (Coloured Dissolved Organic Matter) and detrital particles, i.e. a_gd(λ), and non-algal backscatter, i.e. b_bnap(λ) [13]. The blue spectral region of maximum phytoplankton absorption is also the region most affected by CDOM absorption.¹ Where biomass and CDOM co-vary, empirical PFT algorithms may be successful. But Brewin et al. [6] acknowledge that as biomass increases, both the abundance-based approaches and the approaches relying on differential absorption break down.
The relative contributions of phytoplankton absorption and scatter to the bulk optical signal change with biomass, size and other functional type traits, and as the a_gd(λ) and b_bnap(λ) components vary. The total water-leaving signal is a delicate balance of the frequently opposing optical effects of biomass and the second order assemblage variability such as size, pigments and ultrastructure, together with the non-algal in-water constituents. It was observed by Brown et al. [13] that backscatter anomaly maps (i.e. backscatter independent of variability due to biomass) correlate approximately with PFT distribution maps calculated from optical anomalies which are attributed to differences in phytoplankton accessory pigments [14] rather than backscattering characteristics. This leads to their suggestion that the Alvain criteria used to distinguish PFTs, identified as representing absorption signatures [14], are in fact primarily due to backscattering characteristics [13], indicating that phytoplankton groups either directly determine, or perhaps are simply associated with, backscattering variability around the mean.
Brown et al. [13] conclude that these relationships can only be fully explored if a method is applied where the phytoplankton groups are causally linked to the optical conditions. The Equivalent Algal Populations (EAP) model provides exactly such a method, and is used here to investigate the impact of size- and pigment-based PFT variability on the optical signal, and to confirm the assertion [13,17] that biomass drives the largest part of observed variability in the water-leaving signal, and that the radiometric signal in the blue is ambiguous due to the effects of a_gd(λ), and the additional effects of b_bnap(λ).

The requirement for a biophysically consistent PFT optical model
The widespread distribution and integral role of phytoplankton in global marine ecosystems mean that these fields of study depend heavily on modelling together with satellite data for any large scale analysis. In situ data collection is indispensable for local scale investigations and for ground truthing of satellite and model data, but simultaneous large scale direct measurements are logistically impossible. Optical measurements in natural waters are challenging: they are expensive and logistically difficult, technically complex due to large dynamic ranges of the signal, and overall require delicate, rigorously calibrated instrumentation with precise knowledge of sources of error. Remote sensing and moored in situ instrumentation are the only feasible ways to acquire continuous data series, but these largely involve measurements of the bulk optics. Isolating the respective optical components for laboratory assessment is a significant further undertaking. In situ and laboratory measurements are consequently extremely valuable, and models such as the Equivalent Algal Populations (EAP) model provide essential tools for the analysis and understanding of these bulk measurements, whether above- or sub-surface.

¹ It should also be noted that in the context of satellite radiometry, blue spectral bands display the largest measurement uncertainties [7,16]. This contributes to satellite-derived CDOM products generally underperforming, and implies that any inversion relying on a phytoplankton signal in the blue will likewise perform poorly.
The EAP model was developed to understand the causality-driven impact of different phytoplankton assemblages on the water-leaving optical signal. Optical variability in phytoplankton is known to be driven by particle size (effective diameter D_eff) [10,18], pigment quantity and type, cellular material, shape, and aggregation [19]. The model focuses primarily on the D_eff parameter, which is of fundamental importance both optically and ecologically [10,20].
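As a minimal sketch of how an assemblage effective diameter might be computed from binned particle counts, the following assumes the commonly used area-weighted definition (the ratio of the third to the second moment of the size distribution). The text does not state the exact formula used for D_eff, and the function name and example bins are hypothetical.

```python
def effective_diameter(diameters_um, counts):
    """Area-weighted effective diameter: sum(n * d^3) / sum(n * d^2).

    Assumption: this moment-ratio definition is used for illustration only;
    the text does not specify the formula behind D_eff.
    """
    if len(diameters_um) != len(counts):
        raise ValueError("diameters and counts must have the same length")
    third_moment = sum(n * d ** 3 for d, n in zip(diameters_um, counts))
    second_moment = sum(n * d ** 2 for d, n in zip(diameters_um, counts))
    if second_moment == 0:
        raise ValueError("size distribution is empty")
    return third_moment / second_moment


# Hypothetical size bins (bin-centre diameters in µm) and particle counts.
bins_um = [2.0, 4.0, 8.0, 16.0]
counts = [1000, 200, 20, 1]
d_eff = effective_diameter(bins_um, counts)
```

Because each cell is weighted by its cross-sectional area, a small number of large particles shifts D_eff well above the numerically dominant size class in this example.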
Due to immense species diversity and variability in distribution, the Phytoplankton Functional Type (PFT) approach (e.g. Sathyendranath et al. [8], Alvain et al. [14], Ciotti and Bricaud [21], Nair et al. [22]) groups phytoplankton species according to their biogeochemical function and attempts to relate this to their biophysical characteristics, with size as a major consideration [10,20,23]. This approach is important for oceanic waters, characterised by widespread but low biomass, which contribute the largest proportion of global oceanic primary production [1]. Cell size governs many biological traits [24]; smaller phytoplankton are widespread and play an important role in nutrient recycling, while larger phytoplankton often display the highest growth rates [24]. The dynamics of phytoplankton ecology have profound and intricate influence not only on oceanic biogeochemistry (e.g. acidification, and its effects on both CO₂ uptake and on marine life) but also on higher trophic levels, e.g. fisheries, as certain phytoplankton environments promote the development of different fish populations [23]. A size-based PFT approach is particularly meaningful in the context of carbon sequestration [20], as particle size in large part determines sinking rates.
But phytoplankton ecology is complex, and modelling PFTs with adequate parameterisation in a biogeochemical context is consequently extremely challenging [12]. Following the EAP's conceptual intent to understand the impact of D_eff as the primary second-order optical determinant, other sources of bio-optical variability are intentionally constrained. PFTs can therefore to first order be approached from a size-based perspective, and the EAP model consequently lends itself extremely well to PFT sensitivity studies in terms of its ability to isolate small differences in reflectance resulting only from variability in assemblage size distribution [25]. The model does additionally provide scope for varying other biophysical attributes within a population (such as the pigment-determined spectral refractive indices, the shape of the size distribution itself, the ratio of core to shell sphere volumes, and the cellular Chlorophyll a density of the cells in the distribution), as required. It should be noted, however, that the model is not intended as a full representation of phytoplankton optical complexities, and there is certainly ecologically significant natural variability in phytoplankton IOPs, e.g. dependent on their growth state [26].
When applying empirically based models to satellite data there is additional uncertainty in relating biogeochemical parameters to optical ones. Abundance-type approaches, following observed relationships between phytoplankton assemblage taxonomic information (e.g. pigments) and biomass, show good results in low biomass conditions where the covariability of the phytoplankton optical contribution with that of other in-water constituents generally holds [27], but do not address the sources of second order variability or optical causality [13], or the likelihood that these empirical relationships will not withstand the ecological shifts resulting from changing climatic conditions [6]. A biophysical approach to PFTs not only allows improved analysis of sensitivity and causality, but is likely to have greater validity in a future ocean. The optical impact of a phytoplankton assemblage interacting with its aquatic environment is by no means straightforward, and a rigorous IOP model such as the EAP can systematically vary phytoplankton biogeophysical attributes in the context of likely additional non-algal absorption and scatter, and can examine the resulting effects on the light field when used in combination with a Radiative Transfer (RT) model. There is a bulk effect attributable simply to biomass, for which Chlorophyll a (Chl a) is used as a proxy,² and which for the most part dominates the phytoplankton-related signal in Case 1 waters [28]. PFT characteristics generally result in second-order optical effects: accessory pigments dominate assemblage absorption characteristics [29], and particle size is usually the primary determinant of phytoplankton scattering characteristics [30] (excepting the influence of ultrastructure in certain species, e.g. highly scattering liths or vacuoles). Natural waters are also subject to non-algal absorption, frequently referred to as absorption by Coloured Dissolved Organic Matter (CDOM) or gelbstoff, as well as non-algal scatter which can include scatter by detrital matter, sediment, bacteria, and/or bubbles. These quantities absorb and scatter incident light in different spectral regions from phytoplankton, and their subsequent optical interactions and resulting effect on the bulk signal are highly complex. Understanding the interaction between cells' biophysical characteristics and the light field in the presence of these additional optically active constituents is central to determining which parts of the optical signal are useable for PFT diagnostics, and likewise, where signal ambiguity is prohibitive.

Equivalent Algal Populations model: principal attributes
The initial development of the EAP model [31] was driven by the requirement for a model capable of accurately handling very high phytoplankton biomass blooms in the Southern Benguela upwelling system, with the eventual aim of Harmful Algal Bloom detection, identification and monitoring with satellite data. The productive Benguela waters are considered 'extreme Case 1' where the optics are dominated by phytoplankton, with a strong biomass-related in-water signal distinct from the lesser contributions of CDOM (or gelbstoff) and non-algal constituents, e.g. detritus, sediment, bacteria, and bubbles. Additionally, elevated biomass requires close attention to the spectral backscattering characteristics of phytoplankton [32][33][34], and so for the Benguela and other highly scattering high biomass environments, a model is required that addresses this explicitly.
Phytoplankton-dominated waters with significant biomass are arguably the simplest environments in which to develop a phytoplankton IOP model, due to the overwhelming contribution of phytoplankton to the bulk optical signal; it can be assumed that a successful model addresses the phytoplankton component accurately. Additionally, because of the dominance of phytoplankton IOPs, the additional in-water constituents can be adequately modelled using generalised approximations [35][36][37]. When it comes to much lower biomass waters where phytoplankton do not dominate the optics, uncertainty in the optical contribution of any of the constituents is magnified as they are combined in more comparable proportions. Models designed for lower biomass tend to underperform in higher biomass conditions when phytoplankton IOPs dominate [38]. It follows, therefore, that the phytoplankton component of bulk water properties is not generally well represented in these models. Good information on the phytoplankton component is a prerequisite for any quantitative comment on the optical contribution of respective PFTs, or for identifying changes in the bulk optical properties of seawater as dominant PFTs change. Only when representing the detailed nature of phytoplankton optics, with absorption and scattering biophysically consistent (as they are in nature), is a causal understanding of their interactive effect on the optical signal possible.

² It is acknowledged that Chl a concentration and biomass are not equivalent, as biomass includes non-pigmented biological matter in quantities which may not be proportional to pigmented matter. However, for the purposes of this study, biomass and Chl a concentration are used interchangeably, as this work is approached from a purely optical perspective and ignores non-pigmented biological matter.

From these needs arose the EAP model with its two-layered sphere particle and equivalent size-based community structure [31], which enables the calculation of phytoplankton IOPs from first principles, presenting a valuable opportunity for furthering the understanding of causal relationships between phytoplankton physiology and their optical characteristics based on quantified community structure.
At the core of the EAP model are the phytoplankton particle refractive indices, with the imaginary part of the refractive index approximately representing that portion of light which is absorbed by the cell, and the real part of the refractive index representing that portion of light which is scattered. The imaginary and real parts of the refractive index spectra are numerically linked through the Kramers-Kronig relations, and numerically linked to the specified intra-cellular Chl a concentration [31]. For eukaryotic particles, a core sphere represents the cytoplasm (which contains approximately 80% water, and is almost colourless), while an outer sphere represents the more refractive chloroplast, where the pigmented material (generally Chl a in the largest part) is also strongly absorbing. Refractive indices for the chloroplast spheres were derived from samples taken from actual Benguela blooms of dinoflagellates and diatoms, as well as for a phycoerythrin-associated cryptophyte group (based on a Mesodinium rubrum/Myrionecta rubra-dominated assemblage) [31]. A cyanobacterial group, with substantially altered geometry to represent vacuolated cells, was added later [39]. A critical feature of the model is that a*_φ is constrained at 675 nm to reflect the theoretical maximum absorption by unpackaged phytoplankton of 0.027 m² mg⁻¹ as per Johnsen et al. [40]. This is incorporated into the calculation of the imaginary refractive index of the chloroplast layer n_chlor (outer sphere), based on the assumption that the cytoplasm layer (inner sphere) has no significant absorption at 675 nm. In this calculation n_media = 1.334, V_v is the relative chloroplast volume, c_i is the intracellular Chl a, and a*_sol(675) is the Chl a-specific absorption at 675 nm of that pigment in solution, i.e. unpackaged [31].
The effect of constraining the unpackaged absorption in this way is to establish a quantitative relationship between the intra-cellular Chl a and the cell volume; a relationship which is biophysically consistent as the cell size varies [31]. This results in an effectively decreasing Chl a-specific absorption with increasing size, observable in the resulting optics as the "package effect" [35].
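The constraint described above can be illustrated with a short sketch. It assumes the standard relation between the absorption coefficient of a homogeneous material and the imaginary part of its refractive index relative to the medium, n' = λ a / (4π n_media), and assumes that the absorption of the chloroplast material is c_i a*_sol(675) / V_v. This form, the function name, and the example values of c_i and V_v are assumptions made for illustration, not the model's actual implementation.

```python
import math

N_MEDIA = 1.334    # refractive index of the medium, as given in the text
A_SOL_675 = 0.027  # unpackaged Chl a-specific absorption at 675 nm, m^2 mg^-1


def imaginary_index_675(c_i_mg_m3, v_v, a_sol=A_SOL_675, n_media=N_MEDIA):
    """Imaginary refractive index of the chloroplast layer at 675 nm.

    Assumptions (not taken from the source equation):
      * all pigment sits in the chloroplast layer, so the material
        absorption is a_cm = c_i * a_sol / v_v (m^-1);
      * n' = wavelength * a_cm / (4 * pi * n_media), relative to the medium.
    """
    if not 0.0 < v_v <= 1.0:
        raise ValueError("relative chloroplast volume must be in (0, 1]")
    wavelength_m = 675e-9
    a_cm = c_i_mg_m3 * a_sol / v_v
    return wavelength_m * a_cm / (4.0 * math.pi * n_media)


# Hypothetical inputs: 2.5 kg m^-3 intracellular Chl a, 20% chloroplast volume.
n_imag = imaginary_index_675(c_i_mg_m3=2.5e6, v_v=0.2)
```

Under these assumptions the imaginary index scales linearly with intracellular Chl a and inversely with the relative chloroplast volume, which is the coupling between pigment content and cell geometry that the text describes.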
When coupled with a radiative transfer model (here Hydrolight-Ecolight, Numerical Optics, Ltd.), the interactions of phytoplankton IOPs (in combination with those of other in-water constituents) with the surrounding light field can be examined systematically. A full physics-based model such as this has the additional advantage of providing not only biophysically interrelated particle absorption, scattering and backscattering, but also IOPs that are integrated over the entire assemblage size distribution and are fully angularly resolved. This presents the unique opportunity of closely examining simulated phytoplankton phase functions, which are notoriously difficult to measure, and whose behaviour in terms of variability in particle size and wavelength is poorly understood. With no decoupling of absorption and backscattering, and IOPs integrated over the entire size distribution, the model provides an unprecedented opportunity to examine the drivers of variability in phytoplankton optical signals systematically.
The EAP IOPs used here are from two groups, calculated using a generalised set of Chl a-carotenoid diatom/dinoflagellate refractive indices and a set of phycoerythrin-based refractive indices (originally derived from measurements of Mesodinium rubrum [31]). The IOPs are presented, with explanatory notes, in the Appendix. They are combined, in various proportions as indicated, with appropriate non-algal optical constituents as detailed for each experiment. Water types are considered homogeneous with depth, generic atmospheric and geographic conditions are assumed, and the full radiative transfer solution is calculated by Hydrolight at a spectral resolution of 5 nm. Given the technical challenges with using EAP phase functions for modelling high resolution spectra [41], a Fournier-Forand phase function, chosen according to the backscatter fraction of the combined particulate IOPs, is used at each wavelength throughout these experiments. A basic fluorescence efficiency model is included for completeness (detailed in the Appendix), but modelling this spectral region accurately is challenging and outside the scope of this work, so the features of this spectral region are not discussed in terms of PFT sensitivity.

Results and Discussion
3.1. Quantifying the contribution of phytoplankton to the R_rs signal

Remembering that the Remote Sensing Reflectance (R_rs) is grossly proportional to b_b/a, it should be noted that for a given D_eff and phytoplankton group, b_bφ/a_φ will be constant for any concentration of Chl a, but the contribution of the phytoplankton IOPs to the total, i.e. b_bφ/a_φ as a percentage of total b_b/a, will vary. The EAP model, used together with Hydrolight, allows the inspection of any component optical quantity of interest, and a study of the proportional contribution of each component to the bulk IOPs is included in the supplementary material. Here, EAP phytoplankton IOPs are used with Hydrolight to calculate a full radiative transfer solution resulting in a new theoretical quantity, R_rsφ. This quantity is introduced as an approximate quantification of the phytoplankton contribution to the bulk R_rs, in order to more intuitively understand the relative optical contributions in terms of remote sensing. R_rsφ is the calculation of reflectance with only water and phytoplankton IOPs. It does not account for any optical interaction between the phytoplankton and other in-water constituents likely to be present in natural waters, such as CDOM or detrital and mineral particles. These interactions are assumed to be secondary to the contribution of phytoplankton, but have not been quantified. It is anticipated that trans-spectral effects are most likely to suffer from this type of subtractive approach, but a full photon tracing model (such as a Monte Carlo model) would be needed to ascertain this. By modelling the phytoplankton contribution to the water-leaving signal we can assess the availability of signal for PFT retrieval.
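The distinction drawn above, between a constant b_bφ/a_φ and a varying phytoplankton share of the bulk IOPs, can be sketched at a single wavelength. All coefficients below are hypothetical placeholders chosen for illustration, not EAP or Hydrolight output, and only water and phytoplankton are included, in the spirit of R_rsφ.

```python
# Hypothetical single-wavelength coefficients (placeholders, not model output).
A_WATER = 0.06        # absorption by water, m^-1
BB_WATER = 0.001      # backscattering by water, m^-1
A_STAR_PHY = 0.01     # Chl a-specific phytoplankton absorption, m^2 mg^-1
BB_STAR_PHY = 0.0005  # Chl a-specific phytoplankton backscattering, m^2 mg^-1


def phytoplankton_share(chl_mg_m3):
    """Return (b_bphi / a_phi, phytoplankton fraction of a, phytoplankton fraction of b_b)."""
    a_phy = A_STAR_PHY * chl_mg_m3
    bb_phy = BB_STAR_PHY * chl_mg_m3
    ratio_phy = bb_phy / a_phy
    frac_a = a_phy / (A_WATER + a_phy)
    frac_bb = bb_phy / (BB_WATER + bb_phy)
    return ratio_phy, frac_a, frac_bb


# Evaluate over three illustrative Chl a concentrations (mg m^-3).
results = {chl: phytoplankton_share(chl) for chl in (0.1, 1.0, 10.0)}
```

In this sketch the phytoplankton ratio is the same at every concentration, while the phytoplankton fractions of total absorption and total backscattering both rise with Chl a.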
Being able to identify the spectral regions sensitive to changes in phytoplankton assemblage (focusing on those due to change in assemblage D_eff) is valuable, especially to identify spectral regions which might be sufficiently independent from the ambiguity introduced by other in-water constituents. This allows the quantification of the phytoplankton signal with confidence, even where these other constituents are not well characterised. The spectral regions of maximum proportional phytoplankton signal are the ones which hold potential for detecting PFT changes from an in-water perspective, as these represent the regions of the largest phytoplankton-related signal variability as the assemblage changes.
The resulting contribution of phytoplankton to the total R_rs is shown in Fig. 1, for typical Case 1 waters as a simple illustrative example. As the phytoplankton contribution to the IOPs increases (i.e. generally, as biomass increases), the impact of the other constituents is proportionally less in the R_rs. This is observable in Fig. 1 to a greater degree in the R_rs with the smaller D_eff of 2 µm as compared with the larger of 12 µm: the higher level of phytoplankton backscatter contributes to a brighter R_rs which is less sensitive to the addition of scattering from other sources.
For each D_eff, it is evident that the phytoplankton percentage contribution to the bulk R_rs increases with biomass. But it can be seen that there is a dependency on D_eff which, when considered in the context of transitioning assemblages, is not straightforward. This observation indicates a requirement to go beyond the Case 1/Case 2 water type distinction for PFT signal analysis and applications. When it comes to retrieving information about the phytoplankton IOPs, their proportional contribution to the bulk water-leaving signal (or IOPs) should be considered. An additional figure representing the proportional IOP contributions is included in Appendix C.

Separating the effects of biomass from the effects of PFT (D_eff) change
The complex optical interactions of D_eff and biomass, and the question of whether they can be separated into a useable PFT signal from a background environment of further non-algal optical complexity, are best addressed by investigating specific ecological events of interest to the remote sensing community.
As shown in Lain et al. [38] and Lain et al. [41], where the water-leaving signal is phytoplankton-dominated (e.g. in the Benguela system), it is quite reasonable to expect that some PFT information may be derived from the bulk signal. But the challenge for the ocean colour community is determining the PFT signal in low biomass oceanic conditions, for example in the Southern Ocean.
Phytoplankton dynamics in the Southern Ocean are particularly important for their role in uptake of anthropogenic CO₂ (around half of all oceanic uptake), and hence carbon sequestration [3,4]. Variability in phytoplankton ecology is directly linked to mineral and nutrient cycles: assemblages of large diatoms drive primary productivity and carbon export, while assemblages of small phytoplankton play a significant role in nutrient recycling although the net productivity is very low [42].
The third Southern Ocean Seasonal Cycle Experiment (SOSCEx III), undertaken on the SANAE 55 cruise (austral winter 2015), provides the phytoplankton size distribution and Chl a data for this experiment [45]. Assemblage D_eff values were calculated from Coulter Counter measurements, and Chl a was determined by fluorometric analysis [5]. The additional a_gd(λ) and b_bnap(λ) components for the R_rs in Fig. 6 were estimated guided by observations in [43] and [44] respectively, noting that these are simply used to approximate the bulk R_rs in Fig. 2. Phytoplankton IOPs were modelled from the measured D_eff and Chl a concentrations, and were combined with these estimates and run through Hydrolight to produce the modelled R_rs.³

Fig. 2 presents two distinct events which illustrate the interdependency of the size and biomass signals. Modelled R_rs are shown for selected adjacent stations (20 to 21 is marked A; 12 to 13 is marked B) where a nominal threshold of change detectable by satellite is reached in the blue and green spectral regions, in other words, where a change in R_rs would be evident on a satellite image.⁴ Both examples display large changes in R_rs, but these are causally distinct: (A) represents a large change in Chl a concentration and in D_eff, while (B) represents a large change in Chl a concentration but a negligible change in D_eff. Stations 20 to 21 therefore represent a significant phytoplankton community shift, as large changes in both D_eff (from 6 to 16 µm) and Chl a concentration (from 1 to 11 mg m⁻³) were recorded. To isolate this change in phytoplankton signal, the differences in R_rsφ for an assemblage D_eff of 6 µm and an assemblage D_eff of 16 µm are presented in Fig. 3 (A) for the measured range of Chl a concentration. The spectral location of the most promising size-related signal for PFT retrieval is evidently dependent on biomass: at low biomass it is positioned near 435 nm, while at higher biomass it is around 570 nm. As this is the phytoplankton-only signal, the question remains to what extent this signal is expressed in the bulk R_rs, when the optical impact of the non-algal constituents is also considered.

³ Given that the refractive indices used to model the EAP IOPs for this example are from the generalised Chl a-carotenoid group suitable for diatom and dinoflagellate species, the likelihood of encountering Phaeocystis sp. in the Southern Ocean must be addressed. Given the oceanographic context, as the D_eff of 16 µm is reached, it can reasonably be assumed that the assemblage comprises both diatoms and Phaeocystis. The main accessory pigment in Phaeocystis is 19'-hexanoyloxyfucoxanthin, a derivative of fucoxanthin, a dominant light harvesting pigment in diatoms, and so it may be reasonable to model the intracellular absorption properties of individual cells with the generalised eukaryote refractive indices. However, this species forms large floating colonies which result in quite different optical effects, and this cannot currently be addressed with the model. So while the likely presence of Phaeocystis is acknowledged, it is not explicitly catered for in the modelling. This does not affect the observations on identifying changes in D_eff in the discussion below.

⁴ Given the ambiguity in the causality of the phytoplankton signal, assessing the magnitude of changes to the water-leaving signal as the in-water constituents vary will give an indication of whether there may be enough radiometric signal at top of atmosphere (TOA) to even detect the change. A threshold in situ measurement resolution of 1 × 10⁻⁴ sr⁻¹ [46] is taken as an indication of sensitivity to detecting change in R_rs by direct measurement. Given an average estimated uncertainty in satellite R_rs of ± 0.6 × 10⁻³ sr⁻¹ across the spectrum [47], here a conservative 1 × 10⁻³ sr⁻¹ is used to indicate a potentially detectable change in water-leaving signal from satellite. These thresholds are not definitive and are used purely for the purpose of contextualising the discussion.
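The two nominal thresholds quoted in footnote 4 can be applied to a modelled change in R_rs as in the following sketch. The function name and the example R_rs values are hypothetical; only the two threshold values come from the text.

```python
IN_SITU_THRESHOLD = 1e-4    # sr^-1, in situ measurement resolution quoted in the text
SATELLITE_THRESHOLD = 1e-3  # sr^-1, conservative satellite threshold quoted in the text


def classify_change(rrs_before, rrs_after):
    """Label the absolute change in R_rs (sr^-1) at each band against the nominal thresholds."""
    labels = {}
    for band_nm, before in rrs_before.items():
        delta = abs(rrs_after[band_nm] - before)
        if delta >= SATELLITE_THRESHOLD:
            labels[band_nm] = "satellite"
        elif delta >= IN_SITU_THRESHOLD:
            labels[band_nm] = "in situ only"
        else:
            labels[band_nm] = "not detectable"
    return labels


# Hypothetical R_rs values (sr^-1) at three bands for two adjacent stations.
station_a = {435: 0.0040, 570: 0.0020, 665: 0.00030}
station_b = {435: 0.0025, 570: 0.0024, 665: 0.00032}
labels = classify_change(station_a, station_b)
```

As the text stresses, these thresholds are not definitive, so the labels are only a way of contextualising the size of a modelled change.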
Working with the change in phytoplankton size signal identified at 435 nm (bottom left corner, Fig. 3 A), a_gd(λ) is added at increasing concentrations to simulate a range of bulk R_rs at 435 nm in Fig. 3 (B), and b_bnap(λ) is likewise added incrementally at 570 nm in Fig. 3 (C). In these plots, horizontal gradients indicate R_rs sensitivity primarily to the constituent on the y axis, while vertical gradients indicate that the change in R_rs is driven by the biomass, and is not sensitive to variability on the y axis.
Fig. 3 (B) shows that the difference in bulk R_rs for the given δD_eff is only detectable at the satellite threshold level (shown in yellow) at low biomass under very low a_gd(λ) conditions. As biomass increases, increasing absorption by phytoplankton, as well as by additional a_gd(λ), reduces the magnitude of the water-leaving signal and renders any δD_eff information ambiguous. When additionally considering the brightening effect of b_bnap(λ) in the blue (not quantified here), it can readily be perceived that the water-leaving signal is too complex at 435 nm to retrieve useful size information.
In Fig. 3 (C), the relationship with b_bnap(λ) at 570 nm is more straightforward. Change in R_rs due to δD_eff is detectable in the bulk R_rs at the satellite threshold (in red) from about 2.5 mg m⁻³ upwards regardless of the b_bnap(λ) contribution, at least for oceanic Case 1 type conditions. The magnitude of this signal is almost entirely biomass driven. (This is in line with the observation made by Brown et al. [13] that the MODIS wavebands at 531 and 551 nm are good indicators of backscatter anomalies because their magnitude is proportional to the addition or removal of particulate backscattering, and the longer wavelength band at 551 nm is less affected by variability in both a_gd(λ) and phytoplankton absorption [10].) It should be appreciated, though, that R_rsφ in these figures represents the change in R_rs due to size at a particular biomass (i.e. biomass is constant while assemblage characteristics vary). While biomass and size effects combine to form large changes in R_rsφ in the blue, it is the smaller signal around 570 nm that contains the most size-driven change, as it is not affected by biomass to the same degree. Fig. 3 (B) and (C) show that the signal at 435 nm is sensitive to the effects of variable a_gd(λ), while the phytoplankton signal at 570 nm remains robust against variability in the non-algal optical contributions.
By contrast, stations 12 to 13 exhibit a large change in R_rs (seen first in Fig. 2 (B) and shown again in Fig. 5 (A)), with an increase in Chl a from 0.9 to 7.1 mg.m⁻³ but only a very small change in D_eff from 7 to 8 µm. Given the location in the lee of the South Sandwich Islands, this is likely to reflect a diatom bloom associated with island wake effects, due to fertilisation by terrestrial iron [48]. Tracing the signal due to this change in D_eff across all Chl a concentrations in this range in Fig. 5 (B) shows that there is a size-related signal between 550 and 600 nm, but it is an order of magnitude smaller than in the previous example, and so does not show potential for detection by satellite radiometry. This is illustrated further in the lower panel (C), which shows the location of this signal but also that it is almost all attributable to biomass, as shown by the R_rsφ representing D_eff of 7 µm at 7.1 mg.m⁻³, i.e. what the higher biomass R_rs would look like without the increase in effective diameter as the assemblage changes. It can be seen quite clearly from these spectra that a difference in the blue due only to this δD_eff, with any variability in a_gd(λ), would not be detectable by any means.
It should be noted that the spectral locations of maximum δD_eff features are a direct consequence of the spectral nature of the IOPs used in the modelling, and that both of these examples use the same Chl a-carotenoid refractive indices to generate the phytoplankton IOPs. The spectral character of the optical effects of assemblage changes will differ as phytoplankton IOPs are varied to accommodate pigment differences, for example. A slight migration in the exact location of the maximum available δD_eff signal is observable with different ranges of D_eff, although within the Chl a-carotenoid group it remains between 550 and 600 nm for any difference in D_eff between 1 and 40 µm.

Addressing pigment variability
The assemblages modelled in the above examples address optical changes due only to biomass (i.e. concentration of Chl a pigment) and size (assemblage D_eff), as the same set of generalised Chl a-carotenoid refractive indices is used for all phytoplankton particles represented. This approach addresses only a small subset of important changes in phytoplankton assemblage type. Where the dominant accessory pigments vary, the EAP model can be set to incorporate different refractive indices as appropriate for phytoplankton displaying accessory pigments other than carotenoids. In the Southern Ocean, the spring bloom is one of the most important ecological events, as it signals the change from Synechococcus sp.-dominated waters of very low biomass and low productivity to diatom-dominated waters with high productivity and implications for carbon export and sequestration [42]. It is also an interesting case study in terms of the EAP model, as it presents the opportunity to identify optical changes that are driven by pigment as well as by D_eff. Synechococcus sp. IOPs, detailed in the Appendix, were modelled using phycoerythrin-containing refractive indices with an assemblage effective diameter of 1 µm, and the resulting phytoplankton specific absorption spectrum was compared with those measured by Morel et al. [49] to ensure consistency. The different effects of size and pigment content on the IOPs can be seen in Appendix figures A5 and A6.
As this example features pigment changes as well as size changes, the changing assemblage is modelled as an incremental change in the proportional contribution of Synechococcus sp. to diatoms. The initial population comprises 90% Synechococcus sp. (with D_eff of 1 µm) and 10% diatoms (with D_eff of 8 µm), and δR_rs is calculated at 50% Synechococcus sp., 50% diatoms (Fig. 7, top), and then at 100% diatoms (Fig. 7, bottom).
The influence of differential pigmentation on the location of the δR_rs signal can be readily observed when compared with the previous case study. The assemblage change from 90% to 50% Synechococcus sp. reveals a satellite-detectable δR_rs at a biomass of about 2 mg.m⁻³. This is similar to the observation in Morel [50], which concludes that detecting the phycoerythrin pigment differential is difficult even in substantial biomass (Chl a ∼ 1.25 mg.m⁻³).
By the time the assemblage comprises only diatoms, the threshold of detection by satellite is reached at about 1 mg.m⁻³. Note that this signal is not entirely due to phycoerythrin, as this is an important spectral region for size changes too. Importantly, the spectral regions of maximum δR_rs for this biomass range are not affected by the proportion of Synechococcus sp. to eukaryotes, but the signal is strengthened by increasing the change in both forms of assemblage variability: biomass, and PFT composition with respect to both size and pigment.

Radiometric sensitivity of EAP size-based PFT detection: magnitude of δR_rsφ
Having established that the PFT signal in the blue is easily overwhelmed by the effects of a_gd(λ) and b_bnap(λ), the PFT signal due to phytoplankton scattering in the 500 to 600 nm region can be evaluated for its sensitivity to changes in D_eff and biomass. To this end, the EAP model is again coupled with Hydrolight to simulate expected variability in R_rs due to changes in D_eff, with the aim of evaluating the sensitivity of the model.
A general allometric approximation of D_eff changing from 2 to 8 µm over a biomass range of 0.1 to 10 mg.m⁻³ was chosen for this example. It is recognised that this scenario does not represent all possible ecological changes, but it is a reasonable approximation for a mid-range biomass diatom and dinoflagellate-dominated environment where there may be a detectable PFT signal. Note also that this experiment addresses only the phytoplankton-related signal, indicating only the minimum optically detectable changes, and that it is necessary to evaluate these in the context of ambiguity with varying non-algal constituents when attempting to identify these signals in the bulk R_rs.
Figure 8 (A) demonstrates how the combined effects of biomass and D_eff interact to form the maximum available δR_rs signal at low biomass and small size ranges. The figure shows the maximum δR_rsφ signal between 520 and 600 nm; the exact wavelength varies with both size difference and biomass. The shifting position of maximum δR_rsφ is shown in Fig. 8 (B). Increasing biomass improves the ability to trace the size-related effects.
Using 1 × 10⁻³ sr⁻¹ as a threshold for detection by satellite, it can be seen that an ecologically significant shift in D_eff from 2 or 3 to 6 µm, such as at the onset of an oceanic bloom, looks potentially detectable from about 2 mg.m⁻³. By 10 mg.m⁻³ even a small change in D_eff results in a detectable change in R_rs, but as biomass falls below this, the change in D_eff must be increasingly large to be detected. This is consistent with inversion studies of EAP sensitivity [25]. The spectrally shifting nature of the δR_rsφ signal for oceanic PFT applications provides a strong case for hyperspectral sensors in the 520 to 570 nm wavelength region. The extent to which the δR_rsφ signal persists in fixed waveband ratios is investigated in the next section on shape sensitivity.
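The detection test described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the EAP/Hydrolight workflow: the function names are hypothetical, the use of the absolute difference is an assumption, and the spectra in the example are synthetic placeholders rather than model output. Only the 520 to 600 nm search window and the 1 × 10⁻³ sr⁻¹ threshold come from the text.

```python
import numpy as np

SATELLITE_THRESHOLD = 1e-3  # sr^-1, detection threshold quoted in the text


def max_delta_rrs(wavelengths, rrs_ref, rrs_new, band=(520.0, 600.0)):
    """Return (largest |delta R_rs| inside `band`, wavelength where it occurs).

    Assumption: the size of the change matters, not its sign, so the
    absolute difference between the two spectra is used.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    delta = np.abs(np.asarray(rrs_new, dtype=float) - np.asarray(rrs_ref, dtype=float))
    in_band = (wavelengths >= band[0]) & (wavelengths <= band[1])
    if not in_band.any():
        raise ValueError("no wavelengths fall inside the requested band")
    # Map the position of the in-band maximum back to an index in the full arrays.
    idx = np.flatnonzero(in_band)[np.argmax(delta[in_band])]
    return float(delta[idx]), float(wavelengths[idx])


def is_detectable(max_delta, threshold=SATELLITE_THRESHOLD):
    """True when the change reaches the assumed satellite detection threshold."""
    return max_delta >= threshold


# Synthetic placeholder spectra (NOT EAP output): a flat reference spectrum and
# a second spectrum with a Gaussian bump centred at 560 nm.
wl = np.arange(400.0, 701.0, 5.0)
rrs_small = np.full_like(wl, 0.002)
rrs_large = rrs_small + 0.0015 * np.exp(-((wl - 560.0) / 30.0) ** 2)

peak, peak_wl = max_delta_rrs(wl, rrs_small, rrs_large)
print(peak, peak_wl, is_detectable(peak))
```

Returning the wavelength of the maximum alongside its magnitude mirrors the two panels of Fig. 8, where the size of the signal and its spectral position are reported separately.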

Spectral shape sensitivity of EAP size-based PFT detection
To further test the sensitivity of the EAP model and the causal IOP variability in terms of identifiable changes in spectral shape from a multi-spectral perspective, R_rsφ ratios for 440:560 nm (blue:green), 560:665 nm (green:red) and 665:710 nm (red:NIR) wavelengths were calculated for a range of D_eff and biomass. These are shown in Figure 9, which presents corresponding changes in both R_rsφ and the underlying (causal) phytoplankton backscattering and absorption for these wavelength pairs. The B:G R_rsφ ratio shows a strong biomass dependency and a small sensitivity to size at large sizes, for 0.5 ≤ Chl a ≤ 4.5 mg.m⁻³. The R:NIR ratio shows some sensitivity to larger sizes from about 3 mg.m⁻³, but this decreases as biomass increases. The G:R ratio shows a significant size-related feature for small sizes (≤ 6 µm) from a biomass of about 2 mg.m⁻³ upwards (encircled in Fig. 9). This is where a peak in the corresponding b_bφ ratio appears, suggesting that the large change in magnitude of b_bφ between small values of D_eff (Fig. 10) is directly responsible for the sensitivity in the R_rsφ G:R ratio seen in Fig. 9. This is an important finding. There is a marked size dependency in all of the b*_bφ ratios, with the greatest rate of change somewhere between D_eff of 2 and 8 µm, but it is only in the case of the G:R ratio that the magnitude of the backscatter is sufficient for this signal to be identifiable in R_rsφ. Given that the radiometric signal in the blue is greatly reduced by large phytoplankton absorption and a_gd, and the red and NIR wavelengths are similarly affected by the absorption of water, it can be concluded that the main driver of the useable PFT signal in the green and red is phytoplankton backscatter.
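The three waveband ratios used above can be computed from any spectrum with a short helper such as the one below. It is a minimal sketch: the function name, the dictionary layout and the use of linear interpolation to reach the band centres are assumptions made for illustration, and the example values are placeholders rather than EAP results. Only the wavelength pairs are taken from the text.

```python
import numpy as np

# Wavelength pairs (numerator, denominator) in nm, as listed in the text.
BAND_PAIRS = {
    "B:G": (440.0, 560.0),
    "G:R": (560.0, 665.0),
    "R:NIR": (665.0, 710.0),
}


def band_ratios(wavelengths, spectrum, pairs=BAND_PAIRS):
    """Return {ratio name: spectrum(numerator) / spectrum(denominator)}.

    `wavelengths` must be increasing. Values at the band centres are obtained
    by linear interpolation (an assumption; a sensor-specific band response
    would be needed for real satellite data).
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    ratios = {}
    for name, (numerator_nm, denominator_nm) in pairs.items():
        lo, hi = min(numerator_nm, denominator_nm), max(numerator_nm, denominator_nm)
        if wavelengths.min() > lo or wavelengths.max() < hi:
            raise ValueError(f"spectrum does not cover the {name} bands")
        top = np.interp(numerator_nm, wavelengths, spectrum)
        bottom = np.interp(denominator_nm, wavelengths, spectrum)
        ratios[name] = float(top / bottom)
    return ratios


# Placeholder spectrum sampled exactly at the four band centres (not model output).
example = band_ratios([440.0, 560.0, 665.0, 710.0], [0.004, 0.002, 0.0005, 0.00025])
print(example)
```

The same function can be applied to R_rsφ, b_bφ or a_φ spectra, which is how the causal IOP ratios and the reflectance ratios in Fig. 9 can be compared on the same footing.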
Figure 11 shows the rapid increase in the proportional contribution of phytoplankton to total backscatter at 560 and 665 nm. It is known that for typical diatom/dinoflagellate assemblages the 560 nm region is more influenced by backscatter than by absorption. The fact that the magnitude of the total backscatter is much lower at 665 than at 560 nm, together with the strong absorption by water in this region, results in a small useable R_rs signal. A contribution of approximately 40% of phytoplankton to total b_b at 560 nm corresponds with the limits of detectable δR_rsφ (see Appendix C), indicating that this is the proportion at which phytoplankton backscatter starts driving the bulk water-leaving signal around 560 nm. Consequently, this is the minimum contribution at which some δD_eff information may be retrieved. For an example oceanic bloom with δD_eff from 2 to 6 µm, this threshold contribution is reached at about 2 mg.m⁻³, while to detect an example δD_eff of 10 to 20 µm in a eukaryotic succession, extremely high biomass is required.
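The percentage contribution discussed above can be illustrated with a simple budget calculation. The sketch below assumes that total backscatter is the sum of water, non-algal and phytoplankton terms, and that phytoplankton backscatter scales as Chl a multiplied by a Chl a-specific coefficient b*_bφ; both are simplifying assumptions for illustration, as are the function names. In the EAP model b*_bφ varies with wavelength and D_eff, whereas here it is a single number, and the values in the example are arbitrary placeholders, not model output.

```python
def phyto_backscatter_percent(chl, bb_star_phi, bb_water, bb_nap):
    """Percentage of total backscatter contributed by phytoplankton.

    chl          Chl a concentration (mg m^-3)
    bb_star_phi  Chl a-specific phytoplankton backscatter (m^2 mg^-1), assumed constant
    bb_water     backscatter by water (m^-1)
    bb_nap       non-algal particulate backscatter (m^-1)
    """
    if min(chl, bb_star_phi, bb_water, bb_nap) < 0:
        raise ValueError("all inputs must be non-negative")
    bb_phi = chl * bb_star_phi  # assumed linear scaling with biomass
    total = bb_phi + bb_water + bb_nap
    if total == 0:
        raise ValueError("total backscatter is zero")
    return 100.0 * bb_phi / total


def min_chl_for_percent(target_percent, bb_star_phi, bb_water, bb_nap):
    """Chl a at which phytoplankton first reach `target_percent` of total b_b.

    Obtained by solving chl*b / (chl*b + background) = f for chl.
    """
    if not 0 < target_percent < 100:
        raise ValueError("target_percent must lie strictly between 0 and 100")
    if bb_star_phi <= 0:
        raise ValueError("bb_star_phi must be positive")
    f = target_percent / 100.0
    background = bb_water + bb_nap
    return f * background / (bb_star_phi * (1.0 - f))


# Arbitrary placeholder coefficients, chosen only to exercise the functions.
share = phyto_backscatter_percent(chl=1.0, bb_star_phi=0.003, bb_water=0.001, bb_nap=0.005)
chl_40 = min_chl_for_percent(40.0, bb_star_phi=0.003, bb_water=0.001, bb_nap=0.005)
print(share, chl_40)
```

Because the background term sits in the denominator, the biomass needed to reach a given share rises in direct proportion to the non-algal backscatter, which is one way of seeing why the threshold biomass depends on the water type.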

Considering Uncertainties
Particularly when considering δR_rs retrievals from satellite, it is necessary to contextualise the magnitude of the PFT signal with respect to uncertainties in the satellite radiometry. A detailed study of model and associated radiometric uncertainty is available in the supplementary material. An important observation is that while the 500 to 600 nm region of promising PFT signal may be mostly insensitive to the effects of non-algal constituents, it is also where variability in R_rs due to the different approaches to phytoplankton phase functions is important, emphasising the critical role of phytoplankton scatter in this signal.

Conclusions
Understanding the proportional phytoplankton contribution to the total IOP budget and the resulting water-leaving signal is central to determining whether there is sufficient phytoplankton-driven signal containing PFT information. The proportional 'net' contribution of phytoplankton, i.e. b_bφ/a_φ as a percentage of total b_b/a, has been identified as the driver of PFT sensitivity in R_rs. Given the detectable differences in R_rs as size and biomass change, a proportional phytoplankton contribution of approximately 40% appears to be a reasonable minimum threshold in terms of yielding a detectable optical change. The proportional contribution always varies with the non-algal optical constituents a_gd(λ) and b_bnap(λ).
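Read literally, the 'net' contribution described above can be written as follows. This is a sketch under the usual assumption that the bulk IOPs are additive sums of water, phytoplankton and non-algal terms; the symbol P_φ and the water terms a_w and b_bw are introduced here for illustration only.

```latex
P_\phi(\lambda) = 100 \times
  \frac{b_{b\phi}(\lambda) \,/\, a_\phi(\lambda)}{b_b(\lambda) \,/\, a(\lambda)},
\qquad
a = a_w + a_\phi + a_{gd},
\qquad
b_b = b_{bw} + b_{b\phi} + b_{bnap}
```

Since a_gd and b_bnap enter only through the bulk terms a and b_b, P_φ changes with the non-algal constituents even when the phytoplankton IOPs are held fixed.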
Most of the R_rs signal that is due to phytoplankton is driven by biomass, and consequently at any significant biomass it is phytoplankton backscatter that dominates the water-leaving signal between 500 and 600 nm, where PFT effects are largest. The EAP model shows that the size-related PFT signal is driven by phytoplankton scattering, and that spectral regions where scattering is most sensitive to D_eff show the most potential for PFT detection from the bulk water-leaving signal.
Overall, spectral scattering properties of natural waters are not well characterised [51,52], and phytoplankton spectral backscattering characteristics are underexploited in terms of their impact on the water-leaving signal. The importance of better representing the angular and spectrally variable nature of phytoplankton scattering has been established [41], and it is clear that phytoplankton backscatter is at the heart of the PFT question. Some progress has been made [10,11], but due to the assumption of a Junge-type size distribution (and the reliance on Mie modelling, which does not adequately represent phytoplankton angular scattering [31]), there are high uncertainties in PSD retrieval where the slope is low, i.e. in highly productive and coastal areas [10]. This method also assumes that all non-algal scattering is by particles with D_eff less than 0.5 µm, and conversely that there is no non-algal scattering by larger particles, so the scope of application of such an approach is limited. The water-leaving signal in the blue spectral region is highly complex and causally ambiguous, with varied and contrasting effects of the absorbing and scattering characteristics of both the algal and non-algal in-water constituents. Size-related signal in the blue is too ambiguous to be useful except in very low biomass waters (< 1 mg.m⁻³) where a_gd(λ) and b_bnap(λ) are exactly known. Given satellite measurement uncertainty, achieving sufficiently accurate satellite estimates of these quantities is unlikely for this purpose.
This finding exposes a vulnerability in historical approaches to phytoplankton identification and quantification based on absorption characteristics in the blue. PFT approaches based on the features of phytoplankton absorption (where the largest signal is in the blue) all suffer from this shortcoming where phytoplankton relationships with a_gd(λ) and b_bnap(λ) are not precisely known. Given the ambiguity in the blue, it can be concluded that even where R_rs is absorption-dominated (i.e. at low biomass), it is the (back)scattering properties of phytoplankton that show potential for PFT identification, as the b_bφ signal is most pronounced in less ambiguous spectral regions. (Phytoplankton whose prominent absorption features are at longer wavelengths, such as phycocyanin-containing cyanobacteria, present a different case.) Isolating variability in R_rsφ as D_eff and biomass vary shows that an example oceanic bloom δD_eff from 2 to 6 µm is only detectable at the satellite measurement threshold of 1 × 10⁻³ sr⁻¹ when the biomass reaches about 2 mg.m⁻³ (Fig. 8 A). The mid-range biomass sensitivity demonstrated in this paper presents opportunities for identifying higher resolution size classes than the 2 to 20 µm and >20 µm categories currently frequently employed [6,8,10,15]. The ability to achieve better resolution within the 2 to 20 µm size class is particularly desirable for marine ecosystem modelling [6].
The location of the maximum δR_rsφ size feature shifts between 520 and 570 nm (Fig. 8 B), strongly suggesting that hyperspectral data in this region would add capability. Further analysis is needed to quantify the potential advantages of hyperspectral over multispectral data with respect to this shifting maximum signal, and also with respect to the reduced SNR implicit in narrow waveband measurements.
To conclude, the size-related signals in the 500 to 600 nm region are not always the largest features in the bulk R_rs, but they are the most useful for PFT identification as they are sufficiently insensitive to both a_gd(λ) and b_bnap(λ) (Fig. 3). Sensitivity in b_bφ is greatest in the 1 to 6 µm size range. At low biomass, where the blue signal dominates, PFT changes of sufficient magnitude can only be detected when the non-algal IOPs are known by direct measurement, as satellite a_gd retrievals are not sufficiently accurate to resolve the signal ambiguity inherent in the blue region. These results indicate the necessity of approaching PFTs from a strongly biophysical perspective, with better characterisation of phytoplankton community structure and improved handling of the complex spectral and angular nature of phytoplankton scattering.

Figure 2. Modelled R_rs for stations 20, 21, 12 and 13 of SOSCEX III. The modelled bulk R_rs are calculated using EAP generalised Chl a-carotenoid refractive indices and measured Chl a concentrations for the phytoplankton component, and include estimated a_gd(λ) and b_bnap(λ) contributions appropriate for this region [43,44]. Stations 20 to 21 (A) represent a large change in both Chl a concentration and in D_eff. Stations 12 to 13 (B) represent a large change in Chl a concentration only. The centre panel shows the measured D_eff for the cruise track (starting at the ice shelf on the bottom right and continuing in an anticlockwise direction). Effective diameter image courtesy of SANAE 55 Report [45].

Figure 3. Southern Ocean stations 20 to 21: δR_rsφ is shown for δD_eff of 6 to 16 µm (A). The effect of a_gd(λ) at 435 nm is shown in (B), and b_bnap(λ) at 570 nm in (C). The units of the colour bars are sr⁻¹.

Figure 4. A simulated transition in D_eff from 6 to 16 µm with biomass from 1 to 11 mg.m⁻³. Intermediate values of D_eff and Chl a are linearly interpolated. The lines highlight 435 nm and 570 nm, regions of maximum size signal that are (435 nm) and are not (570 nm) sensitive to the effects of additional optical constituents.

Figure 5. Modelled R_rs for Stations 12 and 13 (A), with EAP eukaryote phytoplankton IOPs, and a_gd(λ) and b_bnap(λ) components estimated from observations in [43] and [44] respectively. (B) shows δR_rsφ for this large change in Chl a concentration (1 to 7 mg.m⁻³) but a small δD_eff of 7 to 8 µm. The unit of the colour bar is sr⁻¹. Note that the results are one order of magnitude smaller than in the previous example. (C) shows the negligible effect on R_rsφ of a change in D_eff from 7 to 8 µm at the measured Chl a concentrations.

Figure 6. Low biomass R_rsφ spectra for Chl a-carotenoid-containing (thick line) and phycoerythrin-containing (thin line) assemblages at varying biomass as indicated. The phycoerythrin-containing assemblage is modelled with D_eff = 1 µm, representing Synechococcus sp. The eukaryote assemblage is modelled with D_eff = 8 µm, representing diatoms. The resulting size and pigment changes approximate those at the onset of the Southern Ocean spring bloom.

Figure 7. δR_rsφ shown for a decreasing proportion of Synechococcus sp. to diatoms, over a range of Chl a concentrations (1 to 5 mg.m⁻³). The δD_eff is from 1 µm to 8 µm at 100% diatoms.

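Figure 8. Maximum δR_rsφ for δD_eff from a starting assemblage with D_eff of 2 µm, as Chl a varies. Note that the δR_rsφ occurs at different wavelengths from 500 to 600 nm, and this shows the maximum signal, so there is no exact wavelength information here. Using a difference of 1 × 10⁻³ sr⁻¹ as a threshold for detection by satellite, it can be seen that by 10 mg.m⁻³ even a small change in D_eff results in a detectable change in R_rs.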

Figure 9. R_rsφ ratios for blue:green, green:red and red:NIR wavelengths as shown, for Chl a concentrations of 0.1 to 20 mg.m⁻³ and D_eff of 1 to 40 µm. The B:G ratio shows a strong biomass dependency and a small sensitivity to size at large sizes, for 0.5 ≤ Chl a ≤ 4.5 mg.m⁻³. The b_bφ ratios all display a strong size signal at 2 to 4 µm, and the G:R ratio shows a corresponding size-related feature.

Figure 10. b*_bφ shown for D_eff of 1 to 10 µm. The largest differences in backscatter across the spectrum occur between 1 and 4 µm, with the exception of the overlapping of b*_bφ in the red and NIR.

Figure 11. Percentage contribution of phytoplankton to total backscatter (including water, and with nominal b_bnap(550) = 0.005), shown for D_eff of 1 to 40 µm and Chl a from 0.1 to 20 mg.m⁻³, at 440, 560 and 665 nm.