Emulation of Leaf, Canopy and Atmosphere Radiative Transfer Models for Fast Global Sensitivity Analysis

Physically-based radiative transfer models (RTMs) help understand the interactions of radiation with vegetation and atmosphere. However, advanced RTMs can be computationally burdensome, which makes them impractical in many real applications, especially when many state conditions and model couplings need to be studied. To overcome this problem, it is proposed to substitute RTMs through surrogate meta-models also named emulators. Emulators approximate the functioning of RTMs through statistical learning regression methods, and can open many new applications because of their computational efficiency and outstanding accuracy. Emulators allow fast global sensitivity analysis (GSA) studies on advanced, computationally expensive RTMs. As a proof-of-concept, three machine learning regression algorithms (MLRAs) were tested to function as emulators for the leaf RTM PROSPECT-4, the canopy RTM PROSAIL, and the computationally expensive atmospheric RTM MODTRAN5. Selected MLRAs were: kernel ridge regression (KRR), neural networks (NN) and Gaussian processes regression (GPR). For each RTM, 500 simulations were generated for training and validation. The majority of MLRAs were excellently validated to function as emulators with relative errors well below 0.2%. The emulators were then put into a GSA scheme and compared against GSA results as generated by original PROSPECT-4 and PROSAIL runs. NN and GPR emulators delivered identical GSA results, while processing speed compared to the original RTMs doubled for PROSPECT-4 and tripled for PROSAIL. Having the emulator-GSA concept successfully tested, for six MODTRAN5 atmospheric transfer functions (outputs), i.e., direct and diffuse at-surface solar irradiance (E di f , E dir), direct and diffuse upward transmittance (T dir , T di f), spherical albedo (S) and path radiance (L 0), the most accurate MLRA's were subsequently applied as emulator into the GSA scheme. The sensitivity analysis along the 400–2500 nm spectral range took no more than a few minutes on a contemporary computer—in comparison, the same analysis in the original MODTRAN5 would have taken over a month. Key atmospheric drivers were identified, which are on the one hand aerosol optical properties, i.e., aerosol optical thickness (AOT), Angstrom coefficient (AMS) and scattering asymmetry variable (G), mostly driving diffuse atmospheric components, E di f and T di f ; and those affected by atmospheric scattering, L 0 and S. On the other hand, as expected, AOT, AMS and columnar water vapor (CWV) in the absorption regions mostly drive E dir and T dir atmospheric functions. The presented emulation schemes showed very promising results in replacing costly RTMs, and we think they can …


Introduction
Since the advent of optical remote sensing, physically-based radiative transfer models (RTMs) have deeply helped in understanding the radiation processes occurring on the Earth's surface and their interactions with vegetation and atmosphere (e.g., [1,2]).RTMs are deterministic models that describe absorption and scattering, and some of them even describe the microwave region, thermal emission or sun-induced chlorophyll fluorescence emitted by vegetation (e.g., [3,4]).They are useful in a wide range of applications including (i) sensitivity analysis; (ii) developing inversion models to accurately retrieve atmospheric and vegetation properties from remotely sensed data (see [5] for a review); and (iii) to generate artificial scenes as would be observed by an optical sensor (e.g., [6,7]).Plant and atmospheric RTMs are currently used in an end-to-end simulator that functions as a virtual laboratory in the development of next-generation optical missions [8,9].
When it comes to vegetation analysis, RTMs have found a wide range of applications to model, study and understand light interception by plant canopies and the interpretation of vegetation reflectance in terms of biophysical characteristics [2,10,11].A diversity of canopy RTMs have been developed over the last three decades with varying degrees of complexity.Gradual improvement in RTMs accuracy, yet in complexity too, have diversified RTMs from simple turbid medium RTMs towards advanced Monte Carlo RTMs that allow for explicit 3D representations of complex canopy architectures (e.g., see the RAMI exercises [12,13] for a meticulous comparison).This evolution has resulted in an increase in the computational requirements to run the model, which bears implications towards practical applications.In general, RTMs can be categorized as either (1) 'economically invertible' (or computationally cheap); or as (2) 'non-economically invertible' models (or computationally expensive).These terms refer to the model complexity and associated processing speed constraining the inversion of such models: Economically invertible models are models with relatively few input parameters and fast processing that enables fast calculations and so applications such as model inversion or rendering of simulated scenes.Examples of this category include the widely used leaf RTM PROSPECT [14], canopy RTM SAIL [15] and atmospheric RTM 6S [16] and OSS [17].
Non-economically invertible RTMs refer to advanced RTMs, often with a large number of input variables and sophisticated computational and mathematical modelling.These type of RTMs enable the generation of complex or detailed scenes, but at the expense of a tedious computational load.In short, the following families of RTMs can be considered as non-economically invertible: (1) Monte Carlo ray tracing models (e.g., Raytran [18], FLIGHT [19] and Drat [20]); (2) voxel-based models (e.g., DART [21]) and (3) advanced integrated vegetation and atmospheric transfer models (e.g., SimSphere [22], SCOPE [23] and MODTRAN [24]).Although these advanced models serve perfectly as virtual laboratories for fundamental research on light-vegetation and atmosphere interactions (see e.g., [18,25]), their high computational cost make them impractical for applications such as inversion.Consequently, when it comes to selecting an RTM for applications that demand many simulations, the current pragmatic approach is to search for a good balance between acceptable accuracy and computational complexity [5].
Regardless whether an RTM is computationally expensive or not, an important requirement is the identification of the key input variables driving the spectral output and variables of lesser influence.By identifying variables of lesser influence, models can be greatly simplified, which facilitates practical applications such as inversion [26].To achieve this, a sensitivity analysis is required.A sensitivity analysis can be simply defined as the process of determining the effect of changing the value of one or more input variables, and observing the effect that this has on the considered model's output [22,27].
In general, sensitivity analysis methods can be categorized as either 'local' or 'global'.Local sensitivity analysis (LSA) methods are often referred to as "one-factor-at-a-time", because they involve changing one input variable at a time whilst holding all others at their default values, then measuring variation in the outputs.A drawback of LSA methods is that they are informative only at the default point where the calculation is executed and do not encompass the entire input variable space.Thus, LSA methods are inadequate for analysis of complex models having many variables, and that may be highly dimensional and/or non-linear [27][28][29].Unlike LSA, global sensitivity analysis (GSA) explores the full input variable space [27].Variance-based GSA methods aim to quantify the amount of variance that each input variable contributes to the unconditional variance (variance across all simulations) of the model's output [29].The approach is able to quantify the sensitivity to each of the model variables and their interactions, which cannot be done using LSA studies.GSA is thus required to identify the driving variables of an RTM.However, a GSA requires many simulations (depending on the number of variables, typically in the orders of 10 thousands), which makes it computationally demanding.It bears the consequence that so far GSA studies have been restricted to merely abstract, computationally cheap models, typically coming from the PROSPECT and SAIL families (e.g., [30][31][32][33]).
Identification and quantification of key variables in realistic vegetation an atmospheric radiative transfer interactions would enable a more sound understanding of atmospheric mechanisms with practical applications towards improved retrieval schemes.In order to bypass the computational burden of costly RTMs, in this paper a computationally effective technique is proposed by only using a limited number of computer code runs.The technique consists of approximating the RTM by a surrogate model with low computation time, also referred to as meta-model or emulator [34].Once having a successful emulator developed, the idea is that it can replace the original RTM and be readily applied to perform sensitivity analysis.Accordingly, we posit here that emulators can provide an efficient mean for doing a GSA on large and expensive models.
Although successfully applied into GSA schemes for quantifying drivers of climate and environmental process-based models (e.g., [22,35,36]), so far this technique is largely unknown in remote sensing science.Only recently Rivera et al. (2015) [37] and Gómez-Dans et al. (2016) [38] started exploring the possibility of emulating RTMs and applying them for retrieval applications.The authors constructed RTM-like emulators with capabilities to deliver accurately and quasi-instantly spectral outputs such as reflectance, fluorescence and radiance.Advancing along this line, when emulators appear to approximate RTMs with high accuracy, consequently, it would become possible to apply GSA to any RTM, including computationally expensive RTMs such as MODTRAN.
In light of the above, the main objective of this study is to analyze the possibility of approximating the functioning of RTMs by emulators, in order implement these emulators into a GSA scheme for enabling analyzing computationally intensive RTMs into its driving input variables.This objective can be broken down into the following sub-objectives: (1) to analyze three machine learning regression algorithms on their accuracy to function as emulators for three RTMs with increasing complexity at the leaf, canopy and atmosphere scale, i.e., PROSPECT-4, PROSAIL and eventually the computationally expensive RTM MODTRAN5; (2) to apply the PROSPECT-4 and PROSAIL-like emulators into a GSA scheme and compare its results against GSA reference results as obtained from original PROSPECT-4 and PROSAIL simulations; and finally (3) to apply MODTRAN5-generated emulators for various MODTRAN5 outputs into the GSA scheme to enable identifying driving atmospheric variables along the 400-2500 nm spectral range.

Emulator Theory
Emulation is a technique used to estimate model simulations when the computer model under investigation is too computationally costly to be run enough times [34].Various emulators based on statistical learning have already been developed in the climate and environmental modeling community for the last few years (e.g., [22,35,36,[39][40][41][42][43]).These emulators have in common that they were developed from adaptive, flexible nonparametric regression algorithms, typically within the family of machine learning regression algorithms (MLRAs) [44].Examples of MLRAs include Gaussian processes regression and neural networks [38,45].The emulator uses a limited number of simulator runs, i.e., input-output pairs (corresponding to training samples), to infer the values of the complex simulator output given a yet-unseen input configuration.With respect to applying emulators into sensitivity analysis, specific advantages of the emulator are [22]: 1.The emulator is derived from a relatively small number of model runs covering a multidimensional input space, and is used to derive a computationally inexpensive and efficient analysis of all the sensitivity analysis computations regarding the original deterministic model code.2. Once the emulator is built, it is not necessary to perform any additional runs with the model, regardless of how many analyses are required to assess the simulator's behaviour.This is a very important advantage in the use of this method compared to other conventional GSA methods (e.g., those reviewed in [27] that typically require a fresh set of simulator runs for each analysis).Therefore, compared for instance to Monte Carlo-type methods, the approach requires far fewer model runs since the original code is only run to build the dataset to train the emulator.
Consequently, a two-step approach is pursued.The first step involves building a statistically-based representation (i.e., an emulator) of the simulator (i.e., an RTM) from a set of training data points derived from runs of the actual model under study.These training data pairs should ideally cover the multidimensional input space using a space-filling algorithm.Only one set of runs of the simulator is used to build the emulator.After the emulator is built, no more simulator runs are needed, no matter how many analyses are required of the simulator's behavior.The second step uses the emulator built in the first step to compute the output that is otherwise generated by the simulator [34].Note that building an emulator is essentially the same as building nonparametric regression models used for retrieval (e.g., [46,47]), but then in reversed order: whereas a retrieval model converts reflectance input into one or more output biophysical variables, an emulator converts input biophysical variables into output spectral data.
An important criterion regarding emulating an RTM is the ability to deliver a full spectrum, i.e., spectral output at multiple bands.This bears the consequence that the MLRA (i.e., emulator) should be able to generate multiple outputs in order to reconstruct a full spectral profile.However, the full, contiguous spectral profile typically consists above 2000 bands when binned to 1 nm resolution.This can take considerable training time, especially when using neural networks [37].To enable the models to cope with large hyperspectral datasets (e.g., reflectance, transmittance, fluorescence), applying a principal component analysis (PCA) dimensionality reduction step [48] prior to training a model has been implemented.This step consists on projecting the RTM output spectra onto the top eigenvectors computed with PCA, p B, where p is the number of selected components (usually one typically takes enough components to ensure that at least 99.9% of variance is retained), and B is the number of original bands of the spectra.The regression methods are then trained to predict the PCA reduced spectra, and the inverse of the PCA transformation is subsequently applied to obtain the reconstructed output spectra.This dimensionality reduction strategy not only greatly speeds up the training phase but also makes the problem better conditioned [36][37][38].An additional advantage of applying the PCA transformation is that it allows converting single-output MLRAs into multiple output methods, i.e., meaning that in principle any multivariate regression algorithm can function as emulator.By first applying a PCA to spectral data the hyperspectral data is reduced to a given number of components.By subsequently looping over the single component multiple models can be trained by a single-output MLRA.Afterwards the signal is again reconstructed.Although the iterative training takes some computational time, it is considerably faster than training a hyperspectral dataset band-by-band.
Based on the literature review above and an earlier comparison study conducted [37], the following three MLRAs potentially serve as powerful methods to function as emulators, being kernel ridge regression (KRR), neural networks (NN) and Gaussian processes regression (GPR).These methods have been amply described in earlier publications (e.g., [5,38,44,46,49,50]).The interested reader can find more details therein.The source codes of each of these methods, along with many other alternatives, are freely available in the simpleR toolbox [51].The Automated Radiative Transfer Models Operator (ARTMO) Graphic User Interface (GUI) is a modular software package that includes these MLRA methods as well [47], and as described later, provides some easy-to-use tools for running RTMs and further processing.

Global Sensitivity Analysis Theory
Various variance-based GSA methods have been presented in the literature, including the Sobol' method [52], the Fourier Amplitude Sensitivity Test (FAST) [53] and a modified version of the Sobol' method proposed by [54].This modification has been demonstrated to be effective in identifying the so-called Sobol's sensitivity indices.These indices quantify both the main sensitivity effects (first-order effects: S i , i.e., the contribution to the variance of the model output by each input variables) and total sensitivity effects (S Ti , i.e., the first-order effect plus interactions with other input variables) of input variables.This method has been applied here.A description according to Song et al. (2012) [55] is given below.
Formally, we have a model y = f (x), where y is the model output, and x = [x 1 , x 2 , ..., x k ] is the input feature vector.A variance decomposition of f (•) as suggested by Sobol [52] is: where is the total unconditional variance; V i is the partial variance or 'main effect' of x i on y and given by the variance of the conditional expectation V i = V[E(y|x i )]; V ij is the joint impact of x i and x j on the total variance minus their first-order effects.Here, the first-order sensitivity index S i and total effect sensitivity index S Ti are given as [27]: where x ∼i denotes variation in all input variables and x i , S ij is the contribution to the total variance by the interactions between variables.Following Saltelli et al. (2010) [54], to compute S i and S Ti two independent input variable sampling matrices P and Q of dimensions N × k are created, where N is the sample size and k is the number of input variables.Each row in matrices P and Q represents a possible value of x.The variable ranges in the matrices are scaled between 0 and 1.The Monte Carlo approximations for V(y), S i and S Ti are defined as follows [29,54]: and where . . . is the estimate; f0 is the estimated value of the model's output; we abused notation by defining f (P) as all outputs for row vectors in P; P Q represents all columns from P except the i-th column which is from Q, using a radial sampling scheme [56].Matrices are generated with a Latin hypercube sampling of size N × 2k where P and Q are the left and right half of this matrix, respectively [54].In order to compute S i and S Ti simultaneously, a scheme suggested by [57] was used which reduced the model runs to N(k + 2).

Applied RTMs
As a proof-of-concept, the utility of emulators into the GSA framework will be demonstrated for three different RTMs at three scales with increasing degree of complexity, i.e., PROSPECT-4 at the leaf, PROSAIL at the canopy, and MODTRAN5 at the atmospheric scale.A brief description of the selected RTMs is given below.

PROSPECT-4
At the leaf scale, the PROSPECT-4 model [14] is currently one of the most widely used leaf optical models and is based on earlier PROSPECT versions.The model calculates leaf reflectance and transmittance as a function of its biochemistry and anatomical structure.It consists of four parameters, those being leaf structure (N), equivalent water thickness (Cw), chlorophyll content (Cab) and dry matter content (Cd).PROSPECT-4 simulates directional reflectance and transmittance over the solar spectrum from 400 to 2500 nm at the fine spectral resolution of 1 nm.

PROSAIL
At the canopy scale, SAIL is the most popular canopy RTM; also likely due to its availability and rather low number of input variables [15].SAIL is based on a four-stream approximation of the RT equation, in which case one distinguishes two direct fluxes (incident solar flux and radiance in the viewing direction) and two diffuse fluxes (upward and downward hemispherical flux) [58].SAIL inputs consist of leaf area index (LAI), leaf angle distribution (LAD), ratio of diffuse and direct radiation (skyl), soil coefficient (soil coeff.),hot spot and sun-target-sensor geometry, i.e., solar/view zenith angle and relative azimuth angle (SZA, VZA and RAA, respectively).Spectral input consists of leaf reflectance and transmittance spectra and a soil reflectance spectrum.The leaf optical properties can come from a leaf RTM such as PROSPECT.By coupling PROSPECT with SAIL (PROSAIL), a leaf-canopy model is generated that allows analyzing the impact of leaf biochemical variables at the canopy scale [2].PROSAIL generates hemispherical and bidirectional top-of-canopy (TOC) reflectance as output.

MODTRAN5
At the atmosphere scale, MODTRAN5 [24], the moderate resolution transmittance code, is one of the most widely used radiative transfer codes in the atmospheric community [59].MODTRAN solves the radiative transfer equation by including the effects of molecular and particulate absorption/emission and scattering, surface reflections and emission, solar/lunar illumination, and spherical refraction.The absorption and scattering processes, considering the atmospheric extinction both for aerosols and molecules, are rigorously calculated using DISORT [60] and Correlated-k [61] algorithms.In addition, MODTRAN uses a spherically symmetric atmosphere, consisting of homogeneous layers, each one characterized by its temperature, pressure and concentration of atmospheric compounds.Here, the Snell's law is used to calculate the curvature of light beams due to refraction between each atmospheric layer.
The large number of MODTRAN inputs (see [62] for a detailed description) allows the user to define the spectral range, illumination/viewing geometry and atmospheric state.A smaller subset of key input variables for general-purpose applications [63] (see Table 1) describe the main interactions between atmosphere and solar radiation in all spectral range.With respect to the geometric conditions, the key variables are SZA, VZA and RAA.Regarding the atmospheric state, the key input variables are reduced to two subsets: (1) Aerosol optical properties, described by means of the Aerosol Optical Thickness (AOT), the Angstrom parameter (AMS), and the asymmetry parameter of the Henyey-Greenstein scattering phase function (G); and (2) total columnar water vapor (CWV).In addition, the surface pressure, determined by the topographic elevation (ELEV), completes the set of key inputs variables.Depending on the application, other variables (e.g., vertical temperature profile, vertical aerosol distribution, concentration of CO 2 ) might be added.In our study, these variables are set to their default values.MODTRAN spectral outputs include direct and diffuse transmittance, top of atmosphere (TOA) radiance fluxes, solar/lunar irradiance, horizontal fluxes, cooling rates, etc.These outputs extend from the UV into the far-infrared (0.2-10,000 µm) spectral range and are provided at a resolution as fine as 0.1 cm −1 (0.01-0.1 nm in the VIS-SWIR spectral range).However, in most MODTRAN remote sensing application (e.g., [9,[64][65][66]), these outputs are decomposed in a set of key atmospheric transfer functions [63,67] that allows to invert the atmospheric TOA radiance equation.The standard formulation under the Lambertian assumption is given by: where L 0 is the path radiance (i.e., the radiance scattered by the atmosphere); E dir and E di f are respectively the direct and diffuse at-surface solar irradiance; T dir and T di f are respectively the surface-to-sensor direct and diffuse atmospheric transmittance; S is the spherical albedo, which accounts for the fraction of radiance that is back-scattered to the surface from the atmosphere; µ s is the cosine of SZA; and ρ is the Lambertian surface reflectance.Hence, these six atmospheric transfer functions (L 0 , E dir , E di f , T dir , T di f and S) are critical to understand the radiative transfer through the atmosphere and, thus, were selected as MODTRAN outputs to be decomposed into its driving variables through a GSA study.

Experimental Setup
In this section, we summarize the experimental setup for running the emulator-GSA experiments in the next section.We first discuss on the training and validation strategies for developing the emulators, and then on how we apply the GSA methodology.The large majority of the work was undertaken within the in-house developed ARTMO framework [68].ARTMO consists of a suite of leaf and canopy RTMs, retrieval toolboxes, and recently a GSA toolbox [69] and an Emulator toolbox was added [37].The Emulator toolbox is based on an earlier developed MLRA retrieval toolbox, which is equipped with a diversity of nonparametric regression models, mostly from the family of MLRAs [47].In the Emulator toolbox several of those MLRAs can be trained by RTMs, but instead of delivering biophysical variables as outputs they are used as input in the regression model, and spectral data is generated as output.For this study, the Emulator toolbox (v.1.04) has been expanded with various new utilities, including GPR, dimensionality reduction techniques and various goodness-of-fit indicators.The GSA toolbox (v.1.04) has been made compatible with emulated RTMs.The toolboxes are downloadable at http://ipl.uv.es/artmo/.

Emulator Training and Validation
For each of the three RTMs, a look-up table (LUT) was first generated by means of Latin hypercupe sampling [70] within the RTM variable space with minimum and maximum boundaries as given in Table 1.A sampling size was sought for that is balancing between striving for high accuracies while keeping computational burden to the minimum.Earlier emulation studies suggested that using just 400 samples would suffice for GSA studies [22,71].On the other hand, [37] demonstrated that a larger LUT may lead to the development of a more accurate emulator.Nevertheless, increasing input simulations goes at a computational cost, especially for computationally expensive RTMs such as MODTRAN.As an acceptable compromise a LUT size of 500 samples was built.
PROSPECT-4 and PROSAIL LUTs were generated with ARTMO.Their processing time took 12 and 33 s on a contemporary computer (Windows-64 OS, i7-4550 CPU 1.50 GHz, 8 GB RAM).The MODTRAN5 LUT was generated on another computer (Windows7-64 OS, i3-2100 CPU 3.10 GHz, 4 GB RAM) based on [63], and it took about 57 h to generate 500 samples ranging from 400 nm to 2500 nm at a spectral resolution of 5 cm −1 (0.08-3 nm).In fact, in some cases MODTRAN5 was malfunctioning, i.e., delivering empty profiles, therefore somewhat more random simulations were generated to fill up these gaps.Also several simulations produced zero or no values in some bands specifically in the 1400 and 1900 nm water vapor absorption regions.To enable further computing, zero values were set close-to-zero, i.e., 0.1 × 10 −6 , while the no values were filled up through interpolation.The LUTs were subsequently passed to the Emulator toolbox.To enable training the three selected MLRAs (KRR, NN and GPR), and in order to speed up the model development, a PCA was first applied, and the first 30 principal components were used for MLRA training and data projection.This way we better pose the problem and feed all regression models with the same amount of information.The number 30 was chosen after some initial tests with the PROSPECT-4 and PROSAIL datasets.Although for each of the 3 MLRAs the first 5 components explained already 99.96% of the variance, developing the regression models with less than 10 components led to inaccurate reconstructions in regions of strong absorption.In general, more robust reconstructions with more precise profiles in the strong absorption regions were achieved with having more components added in the regression algorithms.However, looping over more components slows down processing time and gain in accuracy is not always clear.Therefore, 30 components (i.e., 100% explained variance) were evaluated as an acceptable trade-off between accuracy and processing time.
Since emulators only produce an approximation of the original model, such an approximation introduces a source of uncertainty referred to as "code uncertainty" associated with the emulator [34], in such a way the sensitivity measures computed using the emulator are "uncertain".Therefore, validation of the generated model is an important step in order to quantify the emulator's degree of accuracy.To validate the accuracy of the emulator, part of the original data is kept aside as reference dataset.Various training/validation sampling design strategies are possible with the Emulator toolbox.In an attempt to keep processing fast, for this study only a single split was applied using 70% samples for training (350#) and the remaining 30% for validation (150#).
The next step involved validating the developed MLRA emulators.The toolbox analyzes one-by-one the validation accuracy of the emulators by calculating the root-mean-square-error (RMSE) difference between emulated spectra and validation RTM spectra.The processing time of model training and validation is also tracked.The toolbox further offers a few visualization tools to inspect the emulation accuracy along the spectral range, such as plotting the RTM vs. emulated output spectra and plotting the residuals, i.e., the difference between RTM-generated and emulated-generated spectra in absolute or relative terms.These plottings will be summarized into some general statistics, i.e., mean, standard deviation (SD), min-max boundaries and percentiles.

Applied GSA Strategy
The GSA experiments were conducted in the GSA toolbox where the Saltelli's GSA method [54] has been implemented.For each GSA experiment, (N(k + 2)) model simulations were run, where N is the sample size and equals to 1000, and k is the number of input variables and equals to 4 variables for PROSPECT-4, 11 for PROSAIL and 8 for MODTRAN5 (cf.Table 1).This led to 6000, 13,000 and 10,000 model runs, respectively.A Latin hypercube sampling to fill up the variable space according to the min-max boundaries given in Table 1 was used to sample the P and Q matrices.In order to gain insight into the actual influence of a given variable, both the main effect (S i ) and the interaction effects have to be considered.Therefore, only total order index results (S Ti ) are shown.However, the sum of all S Ti can be greater than 1, due to the interactions, and its magnitude varies as a function of wavelength.Therefore, the S Ti results were normalized as a percentage.
Given that PROSPECT-4 and PROSAIL are relatively fast RTMs, the sensitivity analysis was first applied with the original models and then repeated with the 3 evaluated MLRA emulators (KRR, NN, GPR).The original GSA results serve as reference to validate the GSA results obtained by the MLRA emulators.For brevity, only PROSPECT-4 leaf reflectance and PROSAIL TOC bidirectional reflectance results are shown, but in principle the analysis can be likewise applied to leaf transmittance or canopy hemispherical reflectance.Also for brevity, for each of the six MODTRAN5 outputs only GSA results obtained by the most accurate emulator are shown.For each analysis processing time is given.

Results
Results are organized as follows: first the ability of the MLRA models to function as a accurate emulator are evaluated, then their performances into the GSA scheme are given.

Validation of Emulators Accuracy
To answer the question whether the statistical regression models can function as accurate emulators, first general goodness-of-fit statistics of emulated spectra against the RTM validation spectra are provided.Table 2 displays the RMSE for the LUTs coming from the three RTMs that were used to train the MLRAs.The normalized RMSE (NRMSE) indicates that relative errors fell well below 0.5%, but significant differences across the three MLRAs and RTMs can be observed.KRR performed generally poorest in accuracy with exception of E dir .NN was validated as most accurate for the majority of LUTs with exceptions for a few MODTRAN5 outputs.However, optimizing its architecture goes at a computational cost, because the number of hidden neurons from 3 to 40 are internally tuned, as well as the learning rate, momentum term, regularization constants and initialization of the weights.NN training took several times longer than the two other MLRAs, usually between 3 and 14 min.Note that the PCA transformation considerably improved the computational efficiency of the training phase; without PCA it took up to a few hours to develop the NN model (results not shown).GPR seems to be more promising: the model consistently delivers high accuracies and is trained relatively fast (it typically took between 1 and 2 min time to train).The cost of GPR is higher than for KRR because the kernel function used includes a length-scale per input feature to be optimized and we developed a GP model per output.KRR provides extremely fast training for this low-size problems, but this goes at the expense of a somewhat poorer accuracy.For all MLRAs, training the models is the only computational cost; once trained, outputs were generated and validated quasi-instantly (less than 0.01 s).When comparing the relative errors across the three RTMs, from the NRMSE results it can be observed that PROSPECT-4 was most accurately emulated, closely followed by most of the MODTRAN5 outputs.The reason for their excellent performances is due to the low number of input variables, 4 and 8, respectively.Emulating PROSAIL was achieved with relative errors of about 10 times poorer than PROSPECT-4, which can be attributed to the higher number of input variables considered.

Validation of Emulated Spectral Profiles and Residuals
Because the RMSE validation statistics only give general information, a closer look across the emulated spectra is required.Summarizing profiles (mean, SD, min-max) of the RTM validation spectra and corresponding emulated spectra have been plotted.From these two spectral datasets, the absolute and relative residuals were derived and plotted.These figures will help to interpret the GSA results.

PROSPECT-4
To start with PROSPECT-4 (Figure 1 Top), for all three MLRA emulators the mean of the emulated spectra (blue line) is precisely on top of the mean of the original PROSPECT-4 spectra (red line underneath), which suggests a close match.It can be observed that for each of the three MLRAs the SD of the emulated spectra are seamlessly overlapping with that of PROSPECT-4 validation spectra.Only at the lower boundaries close to zero, i.e., in the visible and around 2000-2500 nm some small anomalies took place.Also the min-max boundaries are mostly similar.Only the GPR emulator does not fully reach the high RTM dynamic range in the 2000-2500 nm region.The specific regions where anomalies occur may be better visualized in the residuals (middle panels) and relative residuals (bottom panels).Excellent residuals can be noted for NN and GPR; the mean and SD keep stable around zero.The percentiles show only fluctuations in the visible and SWIR regions related to regions with strong absorbances.KRR also keeps the mean residuals close to zero, but is characterized with considerably larger fluctuations in the visible and at the atmospheric water bands around 1900 nm, and to a lesser extent at 1400 nm.These anomalies are magnified when looking at the relative residuals, where the SD can be above 10%.Note that the larger residuals take place in the strong absorption regions where true or emulated reflectance values can approach close to zero, thereby causing large relative residuals.

PROSAIL
The same RTM vs emulator and residual graphs are produced for PROSAIL (Figure 2).The top panels indicate the accuracy of the reconstructed PROSAIL reflectance relative to the PROSAIL validation dataset.Mean signatures are seamlessly overlapping, although in some regions the mean RTM signature becomes partly visible next to the emulated one.Also the overlapping SD suggests that the large majority of emulated spectra closely match the original RTM spectra.Further, it can be observed at the min-max boundaries that they are not perfectly overlapping.At the max boundary, PROSAIL simulations reaches generally further than the emulators for KRR and GPR, but reverse for NN, while the emulator delivers slightly lower values at the min boundary for KRR and NN.Regarding the residuals, although the mean residual is around zero, the broader SD and percentiles indicate that emulating PROSAIL bidirectional reflectance is not always perfectly matching the original RTM.Similar to the observations for PROSPECT-4, NN and GPR keep consistent performances, with absolute inaccuracies lower than 0.1.For KRR, anomalies can be larger, as expressed by a broader SD and some pronounced extreme cases shown by the percentile values.The relative residuals reflect larger differences, with peaks again especially in the visible and also at the atmospheric water bands centered around 1400 and 1900 nm and towards the SWIR.This is especially evident for KRR.Comparison of the residuals against the original spectral shapes (top panel) reveals that reflectance in these absorption regions is close to zero, making relative differences expanding rapidly.

MODTRAN5
For each of the six MODTRAN5 outputs only the most accurate emulator (see Table 2) is used.The same type of graphs are shown in Figure 3, with the overview signatures in the top panel.Contrarily to leaf and canopy simulations, atmospheric profiles are not smooth but spiky due to some atmospheric absorption features.In all cases, the mean emulated data overlap the mean RTM data suggesting that emulators are generally capable of reconstructing the MODTRAN5 profiles with great precision.The SD and min-max boundaries indicate again that the emulators are able to reconstruct the same output information as MODTRAN5 for most of the simulated spectra.When inspecting the residuals, however, larger errors can be observed.Overall, absolute residuals are close to zero, as suggested by the mean and SD, though not for all MODTRAN5 outputs emulators perform equally stable.Particularly E dir and T dir generated somewhat more fluctuating mean residuals.In absolute terms, there are generally larger anomalies in the visible than in the SWIR spectral region (with exception of absorption regions) due to the generally higher values.
Also noteworthy are the outliers, which are especially apparent in the percentiles.These peaks are most pronounced at the deep atmospheric water vapor absorption regions centered at 1400 and 1900 nm.In these regions, as observed in the top panels, profiles are reducing to zero.Other spiky outliers are also observed thanks to the fine spectral sampling.They are related to narrow atmospheric absorption features from different molecular compounds such as O 2 .
The strong atmospheric water vapor absorption regions are most noteworthy in the relative residuals.In these regions the relative residuals exceed out-of-range.This can be attributed to the very close-to-zero (0.1 × 10 −6 ) values for various bands in these regions, as well as the noisy behavior, which were mot followed by the emulator.In remote sensing applications these regions are typically discarded because of very noisy, but we decided to show them for sake of completeness.
Apart from these extreme regions, there are also remarkable general difference among the six MODTRAN5 outputs.Whereas E di f , T di f , S and L 0 emulators generated relative residuals that can exceed 1000% for E dir and T dir these relative residuals kept 10% (Note scale of the Y-axis on the relative residuals plots).On the one hand, the first outputs, E di f , T di f , S and L 0 , are atmospheric functions highly influenced by aerosols and molecular scattering and multiple scattering.On the other hand, direct fluxes E dir and transmittance T dir are not particularly affected by atmospheric scattering.Therefore, it seems that multiple scattering atmospheric mechanisms at high spectral resolution are more difficult to emulate.

PROSPECT-4
For PROSPECT-4, total sensitivity (S Ti ) results along the 400-2500 nm spectral range are displayed in Figure 4.The reference S Ti result is displayed in the top left panel.Because the relative contributions sum up to 100%, the role of each input variable along the spectral range can be easily inspected.For instance, S Ti results clarify that chlorophyll content (Cab) drive leaf reflectance in the visible part, and the structural variables N and dry matter content (Cm) play a role across the whole spectral range.Leaf water content (Cw) only governs leaf reflectance from around 1200 nm onwards.When comparing the reference PROSPECT-4 S Ti results against the S Ti results of the emulators (KRR, NN & GPR) their similarity is remarkable.Although all three emulators deliver very similar patterns, some small differences can be noted.Particularly for KRR some anomalies can be observed.While in the residuals small imperfections were observed in the visible and around 1900 nm, this is also reflected in the S Ti results, with tiny anomalies of Cw in the visible and Cab around 1900 nm.Further differences between PROSPECT-4 and the NN and GPR emulators are more subtle.For instance, it can be noted that the structural variable N of the emulators has a somewhat more pronounced small peak in the blue than PROSPECT-4.Overall, the similarity of the figures affirm using emulators into GSA.Moreover, the processing time indicates that the emulated models generated the S Ti results about twice as fast as the original PROSPECT-4 model (which is already running fast).

PROSAIL
By coupling the leaf PROSPECT-4 model with the SAIL model, i.e., PROSAIL, then both leaf and canopy input variables govern the variability of TOC bidirectional reflectance.S Ti results are displayed in Figure 5 with original PROSAIL results in the top left panel.It can be noted that the prime driving variable is the canopy structural variables LAI and LAD.LAI quantifies leaf density and is a key variable in many surface and climate studies (e.g., [2]).The reference PROSAIL S Ti results indicate that LAI can explain up to 50% of the total variability (i.e., with interactions), especially in the SWIR (1400-3000 nm).Its dominance can be explained by two factors.Firstly, LAI controls the amount of leaf elements and consequently controls the amount of spectrally-distinct soil reflectance propagating through the canopy (in case of low LAI).And secondly, the introduction of leaf elements enables the interactions with leaf optical properties, i.e., a higher LAI implies more light absorption due to the properties of leaf elements and scattering effects.LAD controls the orientation of the leaves [15].LAD, like LAI, regulates to an extent whether bidirectional TOC reflectance originates from vegetation only (e.g., in case of erectophile leaves; LAD: 90 • ) or is also mixed with soil reflectance (e.g., in case of planophile leaves; LAD: 0 • ).Other drivers are soil coefficient (regulates the contribution of wet and dry soil portions) and to a lesser extent geometry (solar and view zenith angle).Also the leaf optical properties play a prominent role.The patterns of the leaf biochemical variables are similar as those for PROSPECT-4 alone, but their relative importance is suppressed by the dominance of the aforementioned key canopy variables.When subsequently comparing reference PROSAIL S Ti results against those of the emulated S Ti results, their similarity is encouraging again.The patterns originating from the emulators match closely the results of the original PROSAIL, and are generated about three times faster.Nevertheless, likewise for PROSPECT-4, the KRR emulator delivers various small anomalies compared to the reference PROSAIL model.Similar as in case of PROSPECT-4, small contributions of Cw and Cab emerge at meaningless spectral regions, e.g., some tiny Cab portion in the SWIR and tiny Cw portion in the visible.Regarding the SAIL variables, the variables SZA, RAA and also skyl are given a bit too much weight.Conversely, NN and GPR emulators performed very much alike and similar to the reference PROSAIL GSA results, although GPR slightly underestimates the role of the geometry variables.A closer inspection suggests that NN preserves a slightly higher precision than GPR, e.g., no Cw residual in the blue.Overall, these results are encouraging for using emulators in GSA experiments.

MODTRAN5
Evidence with PROSPECT-4 and PROSAIL suggest that emulators can perfectly replace RTMs into a GSA scheme.Likewise, the most accurate emulators for the six MODTRAN5 output functions (see Table 2) were implemented into GSA.Its processing time did not take more than a few minutes.The total sensitivity index (S Ti ) allowed to decompose the MODTRAN5 outputs into its driving input variables, as shown in Figure 6.Although a careful interpretation is required in the deep atmospheric water absorption regions, the overall trends of the atmospheric drivers across the 400-2500 nm spectral range emerged continuously in general.In all these figures, the relative importance of the atmospheric variables became apparent.Aerosol optical variables, i.e., AOT, AMS and G seems to be key TOA radiance drivers, but some points must be outlined.
Firstly, direct components, E dir and T dir , are not influenced by variations in the G variable.Since G modules the anisotropic degree of the scattering phase function, it does not affect those atmospheric components that are not affected by the scattering process.Instead, the direct flux and transmittance are influenced by the light absorption caused by aerosols (driven by the AOT and AMS) and molecules, e.g., in the water absorption regions through CWV.In fact, the amount of gasses passed through the atmospheric optical path is directly linked with SZA and VZA.This can clearly be seen in the continuum across the wavelength range and within molecular absorption (e.g., O 3 at 550 nm and O 2 A at 760 nm).
Thus, direct components present a stronger dependence on the illumination and viewing geometry as well as a stronger spectral variation in AOT and AMS as compared to the analogous diffuse components, E di f and T di f .Additionally, direct components are also stronger affected by the ELEV variable.However, due to the large wavelength range covered in this study, from 400 to 2500 nm, this cannot be appreciated in Figure 6.The ELEV variable especially affects molecular absorption bands, since changes in surface elevation causes variations in the atmospheric optical path length, therefore modifying absorption bands depth.Nevertheless, in the special case of water vapor, simulations were originally done by disentangling the CWV variable from the ELEV variable to avoid imposed correlations between these two parameters.
Secondly, the diffuse components, E di f and T di f , which account for the multiple scattering produced in the atmosphere, are basically driven by the aerosol load.In a physical sense, the AMS exponent is inversely related to the average size in the aerosols, being the smaller the particles the larger the exponent.In a mathematical sense, the Angström law describes the spectral dependence of AOT, relating the AOT at a fixed wavelength (in this study at 550 nm) to the AOT at the rest of wavelengths.Thus, in TOA radiance the effect of varying the AMS exponent is similar to vary the slope of the TOA radiance spectrum around a fixed wavelength (550 nm).This trend can be observed in each of the atmospheric output functions, where at 550 nm AMS has no influence, and its impact increases as it moves away from this wavelength.Therefore, while the impact of AOT and G in S Ti decreases with wavelength, AMS increases as it moves away from the reference wavelength.
Thirdly, the spherical albedo term, S, accounts for the back-scattering from the atmosphere to the surface.Driving variables in S are those regarding the scattering effect, i.e., the aerosols optical variables (AOT, AMS and G), as it was expected.Finally, as the S is diffusely transmitted, it has no dependence on the illumination nor acquisition geometry.
Finally, the path radiance L 0 , describes the scattering produced by the atmosphere before light arrives to the surface.Contrarily to the absorption, scattering is the dominant effect in L 0 .Consequently, the G variable has a strong influence.However, the importance of the observation and illumination geometry, which determines the optical path length, seems to be equal or even more important.Not only on SZA and VZA, but also on the relative azimuth angle (RAA) between the sensor and the illumination source is absolutely crucial to describe L 0 .
Apart from the general trend, the figures are characterized by local tiny spikes.This especially holds in the atmospheric water vapor absorption regions.The emulator showed unacceptably poor relative residuals in these areas so these spikes is treated as noise and should be ignored.

Discussion
Let us now discuss on the most important messages conveyed in this work, and the implications on further studies for remote sensing data analysis, sensor characterization, scene generation, and model inversion.

Interpreting Emulator Results
Emulation is a technique used to estimate model simulations when the model under investigation is too computationally burdensome to be run many times [34].Following on initial successful experiments that MLRAs can function as emulators to approximate the behavior of physically-based RTMs [37,38], here the concept of emulating RTMs has been further explored.Specifically, emulators approximating popular RTMs at the leaf, canopy and atmosphere scale were used as input into a GSA scheme.Variance-based GSA methods are excellent tools to identify driving variables in process-based models such as RTMs, but they require many simulations.When models are computationally costly, this method can become cumbersome, and so far only fast RTMs were subject to GSA studies.The excellent emulation-GSA results as compared to the original RTM-GSA results for PROSPECT-4 and PROSAIL suggest that emulators can open many practical applications.It demands for a closer inspection of their properties.
When it comes to emulating RTM spectral ouptuts, an important aspect involves dealing with a large amount of multi-outputs, e.g., up to 2101 wavebands to construct the 400-2500 nm profiles at 1 nm.It made that existing emulation packages (e.g., BACCO (Bayesian Analysis of Computer Code Outputs) or GEM-SA (Gaussian Emulation Machine for Sensitivity Analysis) [34,72,73]), as they process only one output, are impractical for RTM sensitivity analysis.To cope with such a high degree of multiple-output, we had implemented a PCA dimensionality reduction procedure into ARTMO's Emulator toolbox and linked it with the GSA toolbox.The Emulator toolbox is essentially nothing more than a collection of multivariate regression algorithms that builds a relationship between RTM input variables and spectral output.Any multivariate regression algorithm can potentially function as emulator.However, the regression methods should be able to implement nonlinear and smooth (regularized) functions, in order to deal with complex, possibly ill-posed input-output relations.Earlier experiments on emulation revealed that nonparametric regression methods limited to linear transformations, such as partial least square regression (PLSR), are unable to reconstruct spectral signals with sufficient accuracy [37], while nonlinear regression methods in the families of NNs and kernel-based MLRAs have proven to be powerful adaptive estimators (e.g., [5,37,46]).
In this study three potentially powerful MLRAs (KRR, NN and GPR) were evaluated to act as emulators on their accuracy accuracy, computational burden and correlation of the residuals.Goodness-of-fit validation results indicated that relative errors (i.e., NRMSE) were below 0.5%.Nevertheless, although trained extremely fast (about 2 s), KRR performed mostly worse, partly due to the low number of hyperparemeters used in defining the kernel function.NN performed generally superior but this method requires a considerably longer training phase, here up to 14 min, which may be unacceptably long when dealing with computationally cheap RTMs.Consequently, for applications based on PROSPECT and SAIL, most likely it does not pay off to emulate them with NN, e.g., here the original PROSPECT and PROSAIL GSA procedures were completed faster than the NN training plus NN-GSA running.In turn, a training phase in the order of minutes may not be trivial when it comes to costly RTMs.For instance, one single MODTRAN5 simulation at finest spectral resolution of 0.1 cm −1 may take over 10 min on a contemporary computer.GPR may function as an sensible compromise; it is relatively fast in training (here less than 2 min), fast in testing, and with accuracies generally close to those of NN.
Moreover, GPR possesses additional properties which were not explored in this paper and that explain its first choice for emulation (e.g., [34,74]).GPR can give some information about the relative relevance of the inputs, it can provide associated uncertainty intervals (referred to as code uncertainty of the emulator) for the predictions, and the Jacobian and Hessians of the prediction function can be explicitly derived [50,75].The uncertainty estimations can be of interest as they provide a reliable indication of the trustworthiness of the generated outputs.For instance, in regions where the RTM is not evaluated, we are uncertain about what the true simulator would introduce, and thus uncertainties are larger than within the training variable space.Accordingly, code uncertainty can be reduced by increasing our knowledge of the RTM, i.e., by increasing the training sample size [76].
In fact, the uncertainty intervals can function as a quality check, e.g., accepting only emulations within a confidence threshold.Nevertheless, it should be noted that preparatory experiments with PROSPECT and SAIL revealed that these uncertainties are generally so small that they were hardly apparent (results not shown), as has also been recently observed by [38].
However, this does not mean that emulators are free from limitations.Emulators are meta-models, i.e., model of a model, implying that they only approximate the functioning of the simulator, i.e., the RTM.It bears the following consequences: (1) the emulator can only be as good as the simulator, i.e., imperfections in the simulator are mirrored to the emulator; (2) the accuracy of the emulator depend on the type of machine learning regression and the number and representativity of the training samples; and (3) emulators uncertainty increases when moving beyond the boundaries of the training dataset so accurate extrapolation is compromised.All in all, a careful training and validation of emulators is strictly necessary, and associated uncertainty intervals may help deducing whether the generated output is within reasonable and physically meaningful levels.

Interpreting Sensitivity Analysis Results
Machine learning emulators provided accurate estimations and reproduced the underlying mechanistic principles encoded in the RTMs, and allowed for intensive simulations and remote sensing applications.By applying Saltelli's GSA scheme, we studied the total sensitivity provided by the emulators compared against RTM reference results for PROSPECT-4 and PROSAIL.Results demonstrated that the emulator-GSA setup delivers identical sensitivity patterns as the original RTM GSA scheme.Nevertheless, given that these two RTMs are computationally cheap, the gain in speed to deliver the sensitivity results is not very large.If computationally cheap, obviously, applying GSA to the original RTMs is always preferred.Conversely, the use of emulators can become a powerful tool to bypass the computational burden of expensive RTMs.For instance, the emulator-GSA setup allows for the first time decomposing MODTRAN5 outputs into its driving input variables across the spectral range.Here, for multiple MODTRAN5 atmospheric transfer functions (i.e., MODTRAN5 outputs) the emulator-GSA setup quantified the key variables within reasonable time, i.e., on the order of minutes.To put in perspective, conducting the same analysis with original MODTRAN5 simulations would have taken about 47 days on the same computer.
The MODTRAN5 emulation-GSA S Ti results showed that the relative importance of each input variable in the atmospheric transfer functions were physically meaningful: e.g., effect of water vapor in these absorption bands, effect of scattering variables in diffuse irradiance and transmittance.However, unlike leaf and canopy surface reflectance, the spectrum of the atmospheric transfer functions is affected by many narrow atmospheric and solar absorption bands on top of a smooth curve.The analysis of the residuals indicates that the emulators performed well in the smoother regions of the spectrum, while they showed large relative errors within the narrow atmospheric absorption regions with values close to zero.Most noticeable are the dominant water vapor absorption regions around 1400 and 1900 nm, which are typically discarded in further processing.In fact, we had repeated the emulation-GSA analysis but then without those absorption regions (results not shown).It led to the same sensitivity patterns but then with gaps where the data was removed.However, the relevant information that these regions are largely driven by CWV (especially for E dir and T dir ) would then be omitted.Despite high uncertainties, we therefore preferred to include the whole spectrum for sake of completeness.To analyze atmospheric drivers further, we consider as further work to dedicate different emulators (with dedicated principal components) for distinctive absorption regions to better capture high frequency spectral details.Another future sensitivity analysis would include more MODTRAN input variables.While here only the most relevant atmospheric variables are studied, it is of interest to disentangle the role of other variables such as single scattering albedo or even the variables that allow modelling the aerosol vertical distribution as a bi-modal, log-normal or gamma function [77].
For practical remote sensing applications, the PROSAIL and MODTRAN S Ti results suggest that not all input variables are equally important in shaping spectral output.A few variables possess hardly any sensitivity, which implies they can be safely kept to default values e.g., skyl and RAA for PROSAIL.In the particular case of MODTRAN5, the driving input variables must be separately studied for each atmospheric function.On the whole, ELEV and also RAA (apart from L 0 ) have hardly impact.However, more importantly, some variables play only a role in specific absorption regions, such as CWV.This suggests that CWV can be kept as constant e.g., when interested in analyzing scattering effects in the blue region.In fact, ELEV and sun-target-sensor geometry (VZA, SZA and RAA) are typically known and thus kept fixed in applications.Similarly, by discarding insensitive variables from the sampling scheme it is possible to simplify the computational load and inversion problem for mapping applications (e.g., as in [26,78]).
Finally, it must be noted that the RTM input variables are independent, i.e., no physical links between them have been introduced, although in reality they can be correlated.For instance, in case of MODTRAN5 the impact of the ELEV variable appears negligible for all atmospheric functions.However, in reality surface elevation drives the atmospheric optical path, and ELEV is therefore to some extent correlated with the amount of the atmospheric absorbers (gaseous components) such as water vapor.Similarly, in reality leaf and canopy variables are to some extent correlated, e.g., Cab occurs with Cw, and the existence leaf optical properties automatically imply the existence of leaves thus LAI (e.g., [79]).Accordingly, when aiming for practical applications from sensitivity studies it is recommendable to filter out input combinations that are unlikely to occur in reality.

New Processing Opportunities with Emulators
Probably the most significant advantage of emulators is the tremendous gain in processing speed.Each of the tested emulators delivers outputs quasi-instantly, which implies boosting in processing speed in the orders of tens to ten thousands depending on the speed of the original RTM.For instance, for the computationally intensive atmospheric RTM MODTRAN, the gain is about 130,000.At the same time, they hardly take memory space, only a few model coefficients are stored for prediction.Consequently, emulators can become an attractive technique for a diversity of remote sensing applications beyond GSA studies, which are briefly outlined in what follows.
Numerical inversion.Iterative optimization is a classical technique to invert RTMs in RS [80,81].The optimization consists in minimizing a cost function, which estimates the difference between measured and estimated variables by successive input variable iteration.The iterative optimization algorithms are computationally demanding and hence time-consuming when large remotely-sensed datasets are inverted.This method has never been a viable option for computationally intensive RTMs, and even for computationally cheap RTMs the method is slow and not appealing.Making use of pre-generated LUTs is preferred in inversion schemes (see [5] for a review).For instance, canopy RTM LUTs are used in the MODIS LAI retrieval products [82] and also atmospheric correction algorithms make use of LUT-based inversion methods (e.g., [83][84][85][86][87]).Accordingly, when replacing the original RTMs by their emulated counterparts then numerical inversion can again become an attractive alternative; numerical inversion using a RTM-like emulator can go very fast.It would bypass the need to invest in large and heavy LUTs.To explore this research line further, a so-called Numerical Inversion Toolbox is currently under construction that will make use of emulators.
Scene generation.Forthcoming optical satellite missions increasingly make use of end-to-end mission performance simulators (E2ES) to generate simulated scenes as would be detected by the optical sensor (e.g., [9,88]).A drawback is that scene rendering can take a long processing time, especially when based on computationally expensive models such as SCOPE or MODTRAN [89].Accordingly, when replacing the original RTMs by their emulated counterparts then scenes can be generated rapidly.Within the framework of ESA's forthcoming 8th Earth Explorer FLEX, currently it is being investigated whether some tedious parts of the FLEX-E2ES scene generation can be replaced by emulators.Data assimilation schemes.Data assimilation schemes where multiple data sources and models are coupled are more commonly applied in climate and Earth system studies.Often they use advanced data fusion and processing techniques, e.g., Ensemble Kalman Filter and Markov Chain Monte Carlo methods (e.g., [90][91][92]) that can take considerable computational time and so the limiting factor in the processing chain.To mitigate the processing bottlenecks in assimilation schemes, it has been proposed to replace parts of deterministic input-output processing chain by emulators.Typically the most computationally burdensome parts, although in does not have be one emulator attempting to emulate the complete system, but rather replacing tedious parts by multiple smaller but highly accurate emulators.Efforts in this direction are undertaken by [38,93].
Improved emulators.From a pure machine learning point of view, emulators boil down to developing regression algorithms that should be accurate enough when training with relatively few-to-moderate available observations.The scenario calls for the concept of regularization, which is tightly related to invariance encoding and incorporation of prior knowledge and the definition of sensible cost functions.Many opportunities appear here to improve the performance of emulators: one could think of including multiple pieces of information in the regression algorithm with multimodal/multiresolution regression, e.g., by combining RTMs for the same problem, to accommodate spatial or temporal relations in the emulation [44,94], and to implement better dimensionality reduction techniques beyond linear PCA to deal with the multi-output problem [95].Apart from these improvements in the regression algorithm, we raise here the important issue of assessment of the emulator function, e.g., by looking at the Jacobian and Hessian of the transformation [38,96], Bayesian sensitivity analysis [34,97], as well as developing emulators that may deal with coupled RTMs and transformations of coefficients [50].

Conclusions
Emulators are statistical constructs that approximate the functioning of process-based models such as RTMs.They provide great savings in memory and tremendous gains in processing speed, while yielding competitive accuracies in reconstructing RTM outputs.This emulation technique opens many new research and practical opportunities, not in the least enabling to convert computationally expensive RTMs into computationally fast surrogate models.
As a proof-of-concept, we demonstrated the use of emulated RTMs into a global sensitivity analysis (GSA) study.We analyzed three MLRAs (NN, KRR and GPR) on their ability to function as an emulator to approximate spectral outputs of the leaf optical RTM PROSPECT-4, the canopy RTM PROSAIL and the computational expense atmospheric RTM MODTRAN5.A PCA preprocessing step was introduced to speed up processing and enable deconstructing the full spectral profile.Particularly NN and GPR proved to act as accurate emulators, with normalized root-mean-squared errors below 0.2% for the majority of RTM validation datasets.The PROSPECT-4 and PROSAIL-like emulators were subsequently implemented into a GSA scheme.Because of delivering accurate RTM reflectance outputs, NN and GPR emulators delivered GSA total order sensitivity (S Ti ) results (including interactions) identical as the reference S Ti results coming from original PROSPECT-4 and PROSAIL simulations.
These two studies successfully demonstrated the benefit and reliability of using emulators into GSA studies.As a next step, the most accurate MODTRAN5-like emulators for six MODTRAN5 atmospheric transfer outputs (E di f , E dir , T di f , T dir , S and L 0 ) were implemented into the GSA scheme.The analysis along the 400-2500 nm spectral range did not take more than a few minutes.Key driving atmospheric variables were identified.Particularly the aerosol optical variables are mostly driving the entire spectral range.While AOT and AMS govern all atmospheric outputs, the asymmetry coefficient of scattering G, only impacts those outputs diffusely transmitted.On the contrary, molecular compounds such as CWV, which only affects specific absorption regions, present a stronger effect on those outputs that are directly transmitted.
Altogether, these GSA studies at the leaf, canopy and atmosphere scale demonstrated that the emulator technique, in which both consistent physical assumptions and statistical learning live together, can open many new opportunities in the use of advanced RTMs for practical applications.

Figure 1 .
Figure1.Top panels: general statistics (mean, standard deviation (SD), min-max) for PROSPECT-4 validation and corresponding emulated spectra (Top).The SD and min-max areas are presented as semi transparent so that their shapes for both datasets can be distinguished (red: RTM, blue: emulator).Consequently, the blended purple (SD) and pink (min-max) colors indicate the degree of RTM vs. emulated data overlapping.Middle and bottom panels: general statistics (mean, SD, 97.5 percentile) of derived residuals (Middle) and relative residuals (Bottom).

Figure 2 .
Figure 2. Top panels: general statistics (mean, standard deviation (SD), min-max) for PROSAIL validation and corresponding emulated spectra (Top).The SD and min-max boundaries areas are presented as semi transparent so that their shapes for both datasets can be distinguished (red: RTM, blue: emulator).Consequently, the blended purple (SD) and pink (min-max) colors indicate the degree of RTM vs emulated data overlapping.Middle and bottom panels: general statistics (mean, SD, 97.5 percentile) of derived residuals (Middle) and relative residuals (Bottom).

Figure 3 .
Figure 3. Top panels: general statistics (mean, standard deviation (SD), min-max) for MODTRAN5 validation and corresponding emulated spectra of best performing emulator (Top).The SD and min-max boundaries areas are presented as semi transparent so that their shapes for both dataset can be distinguished (red: RTM, blue: emulator).Consequently, the blended purple (SD) and pink (min-max) colors indicate the degree of RTM vs emulated data overlapping.Middle and bottom panels: general statistics (mean, SD, 97.5 percentile) of derived residuals (Middle) and relative residuals (Bottom).

Figure 5 .
Figure 5. PROSAIL GSA S Ti reference results (a) and PROSAIL (279 s) GSA S Ti results as generated by the emulators KRR (73 s) (b), NN (80 s) (c) and GPR (79 s) (d).On top: the used RTM or emulator and processing time.s: seconds.

Table 1 .
Range and distribution of input variables used to establish the PROSPECT-4, PROSAIL (PROSPECT4 + SAIL) and MODTRAN5 look-up tables.SAIL hot spot variable was fixed to 0.01.