Satellite and In Situ Observations for Advancing Global Earth Surface Modelling: A Review

In this paper, we review the use of satellite-based remote sensing in combination with in situ data to inform Earth surface modelling. This involves verification and optimization methods that can handle both random and systematic errors and result in effective model improvement for both surface monitoring and prediction applications. The reasons for diverse remote sensing data and products include (i) their complementary areal and temporal coverage, (ii) their diverse and covariant information content, and (iii) their ability to complement in situ observations, which are often sparse and only locally representative. To improve our understanding of the complex behavior of the Earth system at the surface and sub-surface, we need large volumes of data from high-resolution modelling and remote sensing, since the Earth surface exhibits a high degree of heterogeneity and discontinuities in space and time. The spatial and temporal variability of the biosphere, hydrosphere, cryosphere and anthroposphere calls for an increased use of Earth observation (EO) data attaining volumes previously considered prohibitive. We review data availability and discuss recent examples where satellite remote sensing is used to infer observable surface quantities directly or indirectly, with particular emphasis on key parameters necessary for weather and climate prediction. Coordinated high-resolution remote-sensing and modelling/assimilation capabilities for the Earth surface are required to support an international application-focused effort.


Introduction
The era of information technology and advances in high-performance computing are opening up possibilities to realistically simulate the evolution of the Earth system at scales not thought possible before. In hydrology, this has been labeled hyper-resolution, referring to a finer spatial discretisation than what was previously attempted at global scale [1][2][3] and it has triggered a fruitful debate on the importance of information-driven advances [4,5]. The debate is partly about the resolution and model complexity needed to resolve and represent regional-scale water-cycle processes [6][7][8]. In weather and climate studies, calls for higher resolution have also been made [9,10], as currently unresolved cloud processes are one of the obstacles for improved climate predictions. Attaining these finely resolved scales with the entire Earth System including the land, atmosphere, oceans and ice surfaces involves major technological advances in scalability [11] due to computing energy barriers, time-to-solution requirements, and hardware/software limitations. In this review, we emphasize the essential role of Earth Observations data from polar orbiting and geostationary satellites as a key driver for advancing models, and we present a number of examples of how satellite data combined with in situ observations have successfully driven model improvement. A global kilometric representation of the Earth surface that can feed into the next generation of weather and climate models will have to be driven by satellite observations [12,13]. The expected increased use of subseasonal-to-seasonal (S2S) forecasts [14] makes it even more important to improve the representation of the land and ocean surface in current models, given the essential role of land-ocean-atmosphere interactions in S2S predictability. Such an ambitious goal can be achieved only via the synergistic development of models and model-data fusion techniques that make use of the wealth of satellite observations, ingested into the modelling chain, to genuinely characterize the details of land physiography, coastal and inland water and all surface heterogeneity that impacts water and energy fluxes. The drive to achieve physical realism in Earth surface modelling is common to hydrological and environmental applications [15] and resolution and model complexity are key ingredients. Not all surface variables can be observed routinely and constraining most sub-surface properties remains challenging. Therefore, the representation of uncertainties is essential in forecasting applications [16].
Information on the parameters necessary to describe the thermal, radiative, and hydrological properties of the surface must also be extracted, and uncertainties evaluated, so that the models can predict the main reservoirs and fluxes for water, energy and mass exchanges. The availability of stored energy and water within the Earth surface is further modulated by land use (such as irrigation, agricultural practice, forest management) and sub-surface properties which drive the high spatial heterogeneity. That is why there is a growing realization that human activities, impacts and feedback need to be represented in models [17]. The projected increase in the occurrence of extreme weather events (heatwaves, but also droughts and floods in some regions [18]) in connection with global warming in the 21st century, combined with the proven fact of observed warming trends since pre-industrial time [19], are both of global-scale and local relevance. The latest IPCC (Intergovernmental Panel on Climate Change) assessment [18] indicates that further warming, even for only 0.5 • C of global warming, would further increase the occurrence or intensity of warm spells on global scale, of heavy precipitation in several regions, and of droughts in some regions; heavy precipitation associated with tropical cyclones is also projected to increase with increased global warming. Similarly, the fast-paced change observed for the cryosphere including the significant trends in both snow and sea-ice cover [20][21][22] means that we need an improved account of surface thermodynamic processes at the relevant scales in interaction with the diurnal cycle and synoptic variability. Observing and simulating the response of land biophysical variables and the cryosphere to extreme events is a major scientific challenge in Numerical Weather Prediction (NWP) and environmental applications, which is relevant also in the context of climate change adaptation.
The modeling of terrestrial variables can be improved through the integration of observations. Integrating observations into models covers several aspects: (1) the dynamic integration of observations into models through data assimilation techniques, (2) the use of observations for model validation and evolution and (3) the mapping of the model parameters used to characterize the representation of land properties within the model (e.g., soil properties, land cover). Although the assimilation from satellite data is becoming widely adopted in Earth system modelling also at the surface [23], limited use tends to be made of the wealth of Earth observations (EO) data available for model validation, evolution and parameterisation. There is a thus a great opportunity to improve upon current development strategies. For example, it is well known that the use of indirect observations such as the near surface temperatures to constrain model soil moisture [24] can lead to compensating errors. Remote sensing can also greatly support model development for the cold processes associated with seasonal snow cover. Snow is a key component of the Earth system due to the unique surface "shielding" properties and its presence over large parts of the Earth's surface. The radiative (albedo), thermal (insulation) and mechanic (roughness) characteristics of a snow covered surface differ greatly from those of a snow-free surface. While the particular microphysical characteristics of snow are challenging to represent in models, they provide unique radiative signatures that can be explored by EO data. Remote sensing observations are particularly useful in this context because they are now available for all parts of the globe. However, most snow remote sensing products greatly benefit from in situ observations [25]. Many satellite-derived products relevant to the hydrological and vegetation cycles are already available and the use of combined observing platforms is increasingly adopted in order to enable the move towards the kilometer-scale and support global Earth surface modelling.
The focus of this review is to cover relevant successful examples, inspired by the National Aeronautics and Space Administration (NASA) decadal survey [26] and by the World Meteorological Organisation (WMO) Global Climate Observing System (GCOS) essential climate variable observing capabilities [27,28], without aiming to be exhaustive. We present recent use of Earth Observations data pertinent to constraining the surface energy, water and carbon cycles in models. We discuss their relevance for the next generation of surface models for Earth system monitoring and forecasting applications, and we present options for closer collaboration and more rapid information uptake between data providers and model developers. In Section 2, satellite and in situ observations of the earth surface are listed with the relevant instruments information. Section 3 presents the Earth surface model development examples over land and ocean. Section 4 presents some of the pathways in which EO data guide enhanced global-scale local relevant monitoring and forecasting applications. Section 5 discusses how to increase EO data uptake in earth system modelling moving towards global km-scale, including international coordination activities. Section 6 presented a summary and the outlook.

EO Satellite and In Situ Observations for Earth Surface
This section reviews the satellites and in situ platforms that have dedicated sensors for Earth surface monitoring and that have been used to inform model development. Both surface dedicated and surface sensitive instruments are considered.

SMOS Soil Moisture Ocean Salinity Mission
The Soil Moisture and Ocean Salinity SMOS mission [29][30][31] was launched on 2 November 2009. This is an interferometric synthetic aperture radiometer mission that can measure the Earth's natural emission at a relatively low microwave frequency of 1.4 GHz in full-polarization and covering multiple incidence angles from 0 to 60 degrees with angles in the 40-45 range accessible all across the swath. SMOS has 69 small static antennae that, thanks to interferometry, are equivalent to a filled antenna of 8 m, achieving a spatial resolution on the ground ranging from 27 to 55 km. The satellite follows a sun-synchronous polar orbit with 6:00 a.m. (ascending half-orbit) and 6:00 p.m. (descending half-orbit) Equator overpass times. The mission provides global brightness temperature and retrieves soil moisture fields in near real time [32,33]. The land products are also generated globally at 15 km resolution with a latency of 24 h. These products are surface soil moisture, vegetation opacity, and surface dielectric constant [30]. Level-3 and Level-4 are also produced [34] and cover a large range of products [35,36], e.g., root zone soil moisture and drought index, L-Band Vegetation Optical Depth (VOD, [37][38][39]), thin sea ice [40], global rainfall estimates [41], to name a few.

SMAP Soil Moisture Active Passive Mission
The Soil Moisture Active Passive SMAP mission [42], launched on 31 January 2015 carries both an L-band (1.4 GHz) passive microwave radiometer and a radar (although the latter is no longer functioning). Using a 6-m diameter spinning antenna, the observatory provides near-global coverage every two to three days. The primary focus of the SMAP mission is on soil moisture retrieval and freeze/thaw detection, but SMAP observations have also been used for ocean salinity [43] and wind retrievals [44]. To provide estimates of root zone soil moisture and complete spatio-temporal coverage, SMAP radiometer observations are routinely assimilated into the NASA Catchment land surface model to generate the SMAP Level-4 Soil Moisture (L4SM) data product [45]. The L4SM product is generated globally at 9 km resolution every three hours and distributed to the public with a latency of 2.5 days from the time of observation.

TERRA/AQUA-MODIS
Terra and Aqua (Latin for land and water) are NASA Earth Science satellite missions equipped with multi-sensors aimed at studying the Earth's water cycle including evaporation from the oceans, water vapor in the atmosphere, clouds, precipitation, soil moisture, sea ice, land ice, and snow cover on the land and ice. Terra was launched in December 1999 and initiated data transmission in February 2000, with its five instruments, ASTER, CERES, MISR, MOPITT, and MODIS (see acronyms list for details, page 55). Aqua was launched on 4 May 2002, and had six Earth-observing instruments on board, AIRS, AMSU, AMSR-E, CERES, HSB, and MODIS, which are collecting a variety of global data sets. Aqua was originally developed for a six-year design life but has now far exceeded that original goal. About half of the sensors continue transmitting high-quality, in particular from AIRS, CERES, and MODIS, with reduced functionality from AMSU-A channels. AMSR-E became inactive since 2011, with HSB, collected approximately nine months of high quality data but failed in February 2003. For both satellite platforms MODIS (Moderate-resolution Imaging Spectroradiometer) is a key instrument with its 36 spectral bands ranging in wavelength from 0.4 µm to 14.4 µm and at varying spatial resolution (two bands at 250 m, five bands at 500 m and 29 bands at 1 km). Together, the instruments image the entire Earth every one to two days. A large amount of MODIS-based Earth's surface products are currently used in weather and climate modelling, as detailed in Section 2.

LANDSAT and Its Legacy
The Landsat program initiated with the launch of the first satellite in July 1972 and the latest Landsat-8 launched in February 2013 is the longest-serving EO platform of high resolution mapping satellites. Landsat-9 foreseen to launch in 2020 will continue this legacy that has allowed to monitor key surface changes over time, including water-bodies changes, glaciers retreat, large fires, urbanization and agricultural land expansions. Landsat 8 ensures the continued acquisition and availability of Landsat data utilizing a two-sensor payload, Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS). These provide imagery from 15 m to 100 m resolution and up to 700 scenes per day. Landsat data volume is prohibitive for most research applications at global scale and often used in mapping applications to produce more accurate medium resolution (e.g., at 1 km) as mentioned in Section 4.

SEASAT and Its Legacy
The Seasat was the first ocean dedicated satellite launched in June 1978 although it operated only for 106 days. Earth-orbiting satellites such as TOPEX/Poseidon, and scatterometers on NASA Scatterometer (NSCAT), QuickSCAT, all launched in the 1990s and Jason-1, launched in 2001, Jason-2, launched in June 2008, and Jason-3 launched 17 January 2016 followed the initial mission and continued its legacy. The MetOp and its successor MetOp-SG programmes are the current and future legacy to enhance and expand the European EO capability.

Copernicus Sentinels
The Copernicus Sentinels program developed by the European Space Agency consists of a family of missions designed for the operational monitoring of the Earth system with continuity up to 2030 and beyond ( Figure 1 and Table 1). The Sentinels' concept is based on two satellites per mission necessary to guarantee a good revisit and global coverage and to provide more robust datasets in support of the Copernicus Services.
On-board sensors include both radar and multi-spectral imagers for land, ocean and atmospheric monitoring: Sentinel-1 is a polar-orbiting, all-weather, day-and-night radar imaging mission for land and ocean services. Sentinel-1A was launched on 3 April 2014 and Sentinel-1B on 25 April 2016.
Sentinel-2 is a polar-orbiting, multi-spectral high-resolution imaging mission for land monitoring to provide, for example, imagery of vegetation, soil and water cover, inland waterways and coastal areas. Sentinel-2 can also deliver information for emergency services. Sentinel-2A was launched on 23 June 2015 and Sentinel-2B followed on 7 March 2017.
Sentinel-3 is a polar-orbiting multi-instrument mission to measure sea-surface topography, sea-and land-surface temperature, ocean colour and land colour with high-end accuracy and reliability. The mission will support ocean forecasting systems, as well as environmental and climate monitoring. Sentinel-3A was launched on 16 February 2016 and Sentinel-3B has been launched on 25 April 2018.
Sentinel-5 Precursor-Sentinel-5P-polar-orbiting mission is dedicated to trace gases and aerosols with a focus on air quality and climate. It has been developed to reduce the data gaps between the Envisat satellite-in particular the Sciamachy instrument-and the launch of Sentinel-5. Sentinel-5P is orbiting since 13 October 2017.
Sentinel-4 is devoted to atmospheric monitoring that will be embarked on a Meteosat Third Generation-Sounder (MTG-S) satellite in geostationary orbit and it will provide European and North African coverage.
Sentinel-5 will monitor the atmosphere from polar orbit aboard a MetOp Second Generation satellite.
Sentinel-6 will be a polar-orbiting mission carrying a radar altimeter to measure global sea-surface height, primarily for operational oceanography and for climate studies.  Future Sentinels missions include polar-orbiting satellites dedicated to support an anthropogenic CO 2 monitoring capacity. In the longer term, thermal Infrared imagers with focus drought monitoring and on polar regions are being planned, and a hyper-spectral instrument will enable more precise agricultural monitoring.
Special focus here is given to the Sentinel-3 mission [46], jointly operated by the European Space Agency (ESA) and the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), and which tackles precisely the scales of interest of our study and has operational requirements. The objectives of Sentinel-3 are to measure sea-surface topography, sea-and land-surface temperature and ocean-and land-surface color in support of ocean forecasting systems, and for environmental and climate monitoring.
The Sentinel-3 satellites series ensures global, frequent and near-real time ocean, ice and land monitoring, with the provision of observation data in routine, long term (up to 20 years of operations) and continuous fashion, with a consistent quality and a high level of reliability and availability. The Sentinel-3 mission addresses these requirements by implementing and operating the following instruments, building on experience and heritage from the ERS and ENVISAT missions: A dual frequency, delay-Doppler Synthetic Aperture Radar Altimeter (SRAL) instrument supported by a dual frequency passive microwave radiometer (MWR) for wet-tropospheric correction, and a Precise Orbit Determination package. This combined package provides measurements of sea-surface height and topography measurements over sea ice, ice sheets, rivers and lakes.
A highly sensitive Ocean and Land Colour Imager (OLCI) delivering multi-channel wide-swath optical measurements for ocean and land surfaces. With 21 bands (compared to the 15 on Envisat's MERIS) and a design optimised to minimise sun-glint, OLCI marks a new generation of measurements over the ocean and land attaining a resolution of 300 m over all surfaces.
A dual-view Sea and Land Surface Temperature Radiometer (SLSTR) delivering accurate surface ocean, land, and ice temperature, with an accuracy better than 0.3 K. SLSTR measures in nine spectral channels and two additional bands optimised for fire monitoring. SLSTR has a spatial resolution in the visible and shortwave infrared channels of 500 m and 1 km in the thermal infrared channels. The swath of OCLI and nadir SLSTR fully overlap.
Sentinel-3A and 3B were launched in February 2016 and April 2018, respectively. Full performance will be achieved once both Sentinel-3A and 3B will be in routine operations at the end of 2018, with a revisit time of less than two days for OLCI and less than one day for SLSTR at the equator. The satellite orbit provides a 27-day repeat for the topography package, with a 4-day sub-cycle (defined as the minimum number of days after which the ground track of the satellite nearly repeats itself within a small offset).
The Sentinel-3 ground segment systematically acquires, processes and distributes a set of pre-defined core data products to the users (see https://earth.esa.int/web/sentinel/missions/sentinel-3/data-products).
An example of the sensor capability is given in Figure 2 for land surface temperature, vegetation state (chlorophyll index) and snow cover.

GOES, METEOSAT and Other Geostationary Satellites
Geosynchronous satellites have the great advantage of orbiting with the Earth's rotation and therefore of providing observations of the same location at high temporal frequency. Such capability is particularly relevant for the estimation of variables rapidly changing with time over land and ocean surfaces. EUMETSAT currently operates four satellites of the Meteosat Second Generation (MSG) series over Europe and Africa (Meteosat-9, Meteosat-10 and Meteosat-11) and over the Indian Ocean (Meteosat-8). The redundancy over the former (nominal 0 • E) disk allows a full disk service together with a rapid scan service over Europe, keeping one satellite as backup. MSG series observations are available for more than 14 years, since 2004. If combined with Meteosat First Generation satellites, these can be extended back in time to the early 1980s, despite the poorer spatial and temporal samplings, as well as lower spectral resolutions of that series.
The National Oceanic and Atmospheric Administration, NOAA's operating strategy calls for two GOES satellites to be active at all times, one satellite to observe the Atlantic Ocean and the eastern half of the USA, and the other to observe the Pacific Ocean and the western part of the country. GOES-16 (or GOES-East) positioned at 75 • W longitude and GOES-15 (or GOES-West) positioned at 135 • W longitude are currently, the two operational meteorological satellites in geostationary orbit over the equator operated by NOAA. GOES-16 replaced GOES-13 on 8 January 2018 and GOES-15 replaced GOES-11 on 6 December 2011. Additionally, GOES-13 located at 60 • W supports Central and South America to prevent data outages during the GOES-16 rapid scan operations, while GOES-14 is maintained as on-orbit spare to replace either, GOES-15 or GOES-16, in the event of failure.
Russia's new-generation weather satellite Elektro-L No.1 (GOMS-2) operates at 76 • E over the Indian Ocean. India also operates geostationary satellites called INSAT-3D which carry instruments for meteorology. China currently maintains three Fengyun geostationary satellites (FY-2E at 86.5 • E, FY-2F at 112 • E, and FY-2G at 105 • E) in operations, while Japan maintains MTSAT-2 (very similar to GOES satellites prior to GOES-16) located over the mid Pacific at 145 • E (although inactive since 10 March 2017) and the Himawari-8 (with instruments similar to GOES-16/17) at 140 • E. Table 2 provides an overview of geostationary satellite series launched until this day and their main coverage area. Accurate derivations of land surface temperature and land surface emissivity from satellite measurements are difficult because the two variables are closely coupled. At the very surface, the land surface temperature is characterised by a strong diurnal cycle that is driven by solar radiation and is dependent on the exact sensing depth and thermal properties of soil and vegetation. Geostationary observations with temporal samplings ranging between 30-min (earlier satellites) and 10-min are particularly suitable for monitoring variables with pronounced diurnal cycles, with the exception of the polar regions where revisit times of polar orbiters are generally short. The Spinning Enhanced Visible and Infrared Imager (SEVIRI) onboard MSG takes top-of-atmosphere measurements within 12 bands, every 15-min and with spatial sampling of at least 3 km at nadir. SEVIRI-based retrievals of Land Surface Temperature (LST) can therefore capture the high spatial and temporal variability (see Figure 3, as example), which in turn are closely linked to surface conditions, such as vegetation cover and state, or soil moisture availability. LSTs (or equivalently Sea Surface Temperatures) are usually derived from top-of-atmosphere thermal infrared measurements in the atmospheric window, by correcting for the atmospheric absorption and emission and surface emissivity. Such surface temperature estimates are fairly close to actual observations [47], though they are restricted to clear-sky conditions and to the disk field of view. Multiple geostationary satellites can be combined to extend coverage, maintaining a good temporal sampling.
The use of satellite LSTs to understand and improve the modelling of land-surface processes is still largely under-utilised [13,48]. The LST spatial and temporal variability reflects changes in the surface energy balance. The amplitude and phase of the LST diurnal cycle are strongly linked to net radiation and turbulent energy flux partitioning at the surface [49], and to surface properties such as surface wind, the surface type and vegetation state, soil moisture availability, amongst others.

Ground-Based Networks
A great wealth of Earth system observations are available to constrain land surface state. These observations continue to grow in volume and diversity but gaps still exist for certain key parameters (e.g., soil moisture, snow depth and river discharge). Despite their limited coverage and limited expansion, in situ observations continue to be the backbone of the observing system used by NWP land-surface applications. Some of the key in situ parameters used in NWP are: surface pressure, 2 m temperature, 2 m humidity, snow depth, soil temperature and wind (see Figure 4). They are mainly provided by the Synoptic Operations (SYNOP) and METeorological Aerodrome Report (METAR) networks and by radiosondes. These observations are used to constrain the atmospheric model, land-surface and snow analyses together. Large spatial gaps and representativity issues [50] continue to affect soil moisture and snow observations. In addition, most variables are retrieved in the atmosphere so that the constraint on the land surface is only indirect. Indeed, for instance, the soil moisture updates reflect changes in the surface energy partitioning, which affects the atmosphere (e.g., air temperature and humidity). Any incorrect physical representation between the surface (soil moisture) and the atmosphere (fluxes) would lead to incorrect soil moisture inversion. In the past five years, the number of observations available from the SYNOP (Synoptic observations) and METAR (Meteorological Report at the Airports) has increased ( Figure 5) thanks to the inclusion of automatic data reports not previously considered.
Long-term, large scales networks of ground measurements are also of paramount importance. For instance, detailed ground based soil moisture information is provided by in situ soil moisture databases such as the International Soil Moisture Network (ISMN) [51]. Ground-based measurements of soil moisture and temperature from ISMN [52][53][54][55] have been used to evaluate the European Centre for Medium-Range Weather Forecasts (ECMWF) operational analyses, forecasts and reanalyses ability to represent soil moisture and temperature. Verification studies using in situ networks contribute to a better understanding of the deficiencies in models and point to processes that need a better representation.   Particular attention must also be devoted to rainfall measurements, for which numerous datasets exist combining in situ rain gauge observations from the SYNOP and SYNOP-like national networks with satellite and ground-based remote sensing to obtained gridded products, as recently compared in a study [56]. The combination of multi-sources observations with models to ensure meteorological consistency is advocated as a way to increase accuracy. However, precipitation estimation for hydrological applications remains a challenge and motivates the continuous effort of the International Precipitation Working Group (IPWG), a permanent Working Group of the Coordination Group for Meteorological Satellites, active since 2001 [57]. Despite the effort of data collection of long term precipitation observations e.g., [58], a precipitation dataset that can support climate change studies and can be readily considered in reanalysis, remains largely unavailable. Earth system models also rely increasingly on non-atmospheric ground-based observation networks including information on river discharge from river gauging stations such as the databases held by the WMO Global Runoff Data Centre (GRDC) and groundwater levels from wells. Such networks are notably at risk from decline in network coverage and restricted national-scale data access [59]. More comprehensive coverage of the in situ networks availability is included in the status report of the global observing system for climate by GCOS in 2015 [60].

Ocean-Based Networks
The ocean in situ observations coordinated under the Global Ocean Observing System (GOOS, http://www.goosocean.org) is a sustained observing system, unified by the Framework for Ocean Observing [61], designed to provide access to timely ocean observations. In many cases, data are telemetered to data centers in near real-time for assimilation into NWP. Higher quality data are also available in delay mode and can be used for validation and evaluation of processes. With its unique status within the United Nations system, GOOS has been able to gather a network of independently-managed and independently-funded observing elements. The network includes Argo floats [62][63][64], surface drifters [65], OceanSITES moorings (http://www.oceansites.org), and ships (http://www.jcommops.org/sot). GOOS surface Essential Ocean Variables include sea state, ocean surface stress, sea ice, sea surface height, sea surface temperature, surface currents, sea surface salinity, and ocean surface heat flux, with the turbulent heat flux typically measured as a bulk flux from state variables.

Earth Surface Modelling Advances and Links with EO Datasets
In this section, we describe some of the progress that has been made over the last few years in modelling that can be directly linked to EO data availability. For instance, advances in different land-surfaces components introduced in the ECMWF operational forecasts can be linked to availability of key observational datasets. It is acknowledged that surface related developments need to be discussed in the wider context of Earth system modelling. Common areas of development in surface parameterisation going towards environmental applications are presented by clustering the following groups: land-surface reservoirs, land-atmosphere fluxes, land-surface properties, inland and open waters.

Land-Surface Reservoirs
The land surface is an important component of the Earth system and numerical weather prediction and climate models are evolving continuously to include more of its natural complexity. The relevance of an accurate description of the land surface for atmospheric modelling has been widely established in the scientific community for over thirty years [66][67][68][69]. The land controls the partitioning of available energy at the surface between sensible and latent heat fluxes, which in turn has a strong impact on the atmospheric heat and moisture budgets, especially on diurnal time scales. The land also determines the partitioning of available water between evaporation, drainage and runoff. The water fluxes and storage terms interact with the land morphology forming lakes and rivers. The land surface influences weather and climate on all time and space scales [70,71], and responds actively to modifications of weather patterns and climate change [72]. The role of the land surface in Earth system models is to provide a consistent description of the water, energy and carbon exchanges (between atmosphere, biosphere, hydrosphere, and cryosphere) at various time scales ranging from hours to decades. Many sensitivity studies have shown that the description of physical processes of continental surfaces can significantly affect the prediction of meteorological variables such as precipitation, wind or temperature in the lower troposphere [73][74][75][76][77]. Evapotranspiration directly affects weather parameters such as temperature, humidity, boundary layer development and clouds [78][79][80][81][82]. Furthermore, a strong feedback between evaporation and precipitation exists which appears to be negative at the convective scale [83][84][85][86] and positive at the continental scale [84]. This feedback involves soil water in the root-zone layer, which is one of the most important variables controlling large-scale continental summer temperature extremes [87,88].
The way land surfaces store and regulate water and energy fluxes is firstly controlled by soil moisture in the unsaturated zone, snow, ground water, lakes and open water. Such water bodies are de facto reservoirs of energy and water and have a "memory" much longer than atmospheric components (on the order of a few days). Secondly, the energy and water fluxes are also controlled by land use/management and biosphere [72], which are complex, heterogeneous and difficult to characterize in practise. Modelling upgrades can be clustered in two areas of development: • Enhanced realism of the representation of water and energy stocks in soil, snow and inland water bodies, via parameterisations and physiography revisions. • Improved fluxes for land-atmosphere energy and water exchanges, inclusion of natural and anthropogenic carbon emissions, and improved river discharges.
These developments are summarised in the following subsections: soil, snow, vegetation and river hydrology development are illustrated in light of the EO dataset that have guided the model improvements.

Soil
Soil is a porous medium that can store water, energy and carbon and these can be exchanged with the atmosphere and the oceans via transport mechanisms. The observations within the soil layer are typically in situ measurements, while remote sensing data can only penetrate the top few centimeters; therefore, a combination of satellite based and in situ based information is essential in constraining soil models. The amount of water in the soil and its vertical distribution in the column are important for the regulation of heat and water vapor fluxes towards the atmosphere and involves a range of time scales from minutes to years in the coupled land-atmosphere system.
For example, at ECMWF, the Hydrology-Tiled ECMWF Surface Scheme for Exchanges over Land (H-TESSEL, [89])-has included a revised soil hydrology developed and tested for the global scale. These model developments were a response to known weaknesses of the TESSEL hydrology as used in the ERA-Interim reanalysis: specifically the choice of a single global soil texture, which does not characterize different soil moisture regimes, and a Hortonian runoff scheme which produces hardly any surface runoff. Therefore, a revised formulation of the soil hydrological conductivity and diffusivity (spatially variable according to a global soil texture map) and surface runoff (based on the variable infiltration capacity approach) were introduced in the Integrated Forecasting System (IFS) in November 2007, and is used in weather and sub-seasonal-to-seasonal forecasting applications and climate reanalysis products at ECMWF.
The soil hydrology revisions involved the use of selected field sites as well as global atmospheric coupled simulations and data assimilation experiments [89]. Figure 6 illustrates the impact of these hydrology changes on the water budget of a number of European river catchments. H-TESSEL increases the seasonal amplitude of the Terrestrial Water Storage (TWS, Figure 6) change due the increased water holding capacity of the soil resulting from the new hydrological parameters and soil texture. The H-TESSEL scheme compared better than TESSEL with the terrestrial water storage dataset [90], which was derived as the residual of reanalysed atmospheric moisture convergence and observed river catchment runoff and referred to as the Basin Scale Water Budget (BSWB) dataset. Bare ground evaporation was improved by adopting a lower soil water content threshold for evaporation than the one used for vegetation. The most right panel of Figure 7 shows the difference in Root Mean Square Error (RMSE) of soil moisture between the revised model and the old model. RMSE is computed with respect to in situ measurements of soil moisture over the United States of America [54]. The new bare ground H-TESSEL evaporation formulation resulted in more realistic soil moisture values when compare to in situ data, particularly over dry areas. Considering ground measurements sites where fraction of bare ground was greater than 0.2, The Root Mean Squared Difference (RMSD) of IFS's soil moisture with the observations decreased from 0.110 m 3 m −3 to 0.088 m 3 m −3 . Figure 7 illustrates the impact of the new parameterisation on soil moisture over the USA. The old and new schemes (first and second panel from the left) shows the August soil moisture in the western part of the USA is much lower than with the old scheme. These changes correlate with the bare ground fraction (third panel). This is clearly beneficial, as can be demonstrated by verification based on the Soil Climate Analysis Network (SCAN) over the USA (see Figure 7 for the location of the stations). The positive differences in the right panel of Figure 7 indicate a reduction of the root-mean-square error (RSME) of soil moisture particularly at high bare ground fractions. A better match with Soil Moisture and Ocean salinity (SMOS) satellite observations [53] reinforced the generality of the results obtained from the in situ comparison, and confirmed once again the synergy of satellite and ground-based observations for the model development.
At NASA, the Catchment land surface model [91] has been continuously developed for use in the Goddard Earth Observing System NWP and reanalysis products [92] and the SMAP L4SM product. The latter product in particular depends critically on the skill of the land surface model. Some of the model developments prior to the launch of SMAP were focused on a revised set of soil texture and soil hydraulic parameters [93], as well as on the calibration of the microwave radiative transfer model that converts the modeled soil moisture and temperature values into L-band passive microwave temperatures [94]. Within the L4SM assimilation system, these modeled radiances were then confronted with the SMAP observations, with deviations between the modeled and observed radiances resulting in adjustments to the modeled soil moisture [95].
Since the first release of the L4SM product in 2015, new input parameter datasets for land cover, topography, and vegetation height were derived based on recent, high-quality satellite observations [96,97]. Land cover inputs were updated using the GlobCover2009 product, which is based on satellite observations from the Medium Resolution Imaging Spectrometer. Topographic statistics now rely on observations from the Shuttle Radar Topography Mission. Finally, vegetation height inputs are derived from space-borne Lidar measurements. Additionally, SMAP Level-2 soil moisture retrievals [98] were used to calibrate a particular Catchment model parameter that governs the recharge of soil moisture from the model's root-zone excess reservoir into the surface excess reservoir [99]. Specifically, the replenishment of soil moisture near the surface from below under non-equilibrium conditions was substantially reduced, which brought the model's surface soil moisture better in line with the SMAP Level-2 soil moisture retrievals (not shown).
The benefits of this change are illustrated in Figure 8, which shows that the surface soil moisture dynamics from the new model better agree than the old model with independent in situ measurements at SMAP core validation sites [45]. Figure 8a shows an example of surface soil moisture time series for one such site at Little River, Georgia, USA. Figure  , bias, and unbiased root mean squared error (RMSE) (or standard-deviation of the error). Skill is measured vs. in situ measurements from 18 SMAP core validation sites [45] and then averaged across the sites. The improved model version (NRv7.2) uses updated land cover, topography, and vegetation parameters based on remote sensing observations and was further calibrated using SMAP soil moisture retrievals.

Seasonal Snow Cover
Snow acts as a fast climate switch [100] with implications ranging from weather forecasts to climate change projections. Among the main evidence of the importance of snow in the Earth System, we can enumerate: (i) the snow albedo feedback, (ii) driver of sub-seasonal to seasonal atmospheric predictability, (iii) important role in the recent Arctic amplification and (iv) its impact on future water resources [101]. The snowpack lying on top of the soil affects the evolution of atmospheric temperatures via its high albedo and its thermal insulation capacity that can create a decoupling between the soil and the atmosphere [102][103][104]. This snow insulating effect causes strong temperature inversions near the surface in winter, which represents a challenge for daily minimum temperature forecasting. Snow also affects the freezing of water in the soil, with an impact on the hydrology in spring, on near-surface temperatures and on the stable boundary layer development [105][106][107]. Snow cover also acts as a water reservoir, which is released by snowmelt in the spring, influencing runoff, soil moisture, evapotranspiration and thus precipitation and the entire hydrological cycle [108]. Until 2009, the snow pack was parameterised in H-TESSEL following Douville et al. [109]. The snow pack was represented with a single layer of dry snow (i.e., neglecting liquid water) with four snow prognostic variables: mass, albedo, density and temperature. Snow albedo decreased in time at an exponential or linear rate, for melting and normal conditions respectively, and snow density increased with time according to an exponential relaxation.
With the participation of H-TESSEL in the snow model inter-comparison project 2 (SnowMIP2, [110]), and after the soil hydrology revision presented in the previous subsection, several shortcomings of the Douville snow pack representation were identified such as the lack of liquid water representation with freeze/thaw cycles during the melting season, unrealistic evolution of snow density, and unrealistic albedo of shaded snow (i.e., snow under high vegetation). These shortcomings were partially addressed with a full revision of the snow pack parameterisation in 2009 [111] including: (i) a new parameterisation of snow density, (ii) a liquid water reservoir and (iii) revised formulations for the sub-grid snow cover fraction and snow albedo. In offline mode, forced with near-surface observations, the revised scheme reduced the end of season ablation biases from ten to two days in open areas, and from 21 to 13 days in forest areas. The evolution of snow mass, depth and density during one winter season at the Fraser forest and open stations is shown in Figure 9, a research location in the Rocky Mountains, CO, USA. The results show the improvements of the snow scheme revision (NEW) with respect to the version used in ERA-Interim, labelled as control (CTR). The new snow density parameterisation increased the snow thermal insulation, reduced soil freezing, improved the hydrological cycle, and substantially reduced a warm winter bias compared to Siberian screen-level (2m) observations (not shown).
The dramatic change in surface reflectivity when snow is present has been widely explored using EO data, in particular in the visible range, and is mainly limited by cloud presence and contamination. The detection of snow on the surface was one of the first applications of EO data associated with surface characteristics and it is currently one of the surface variables with longer observed datasets using remote sensing. Furthermore, it is the main direct data source in land-surface assimilation in several operational weather forecasting centers [112]. In addition to snow presence, snow mass is an important water reservoir and there have been many studies focusing on the retrieval of snow mass from satellite [113].
Monitoring of snow cover area is well established based on optical data, whereas current Snow Water Equivalent (SWE, i.e., the total water mass in liquid equivalent) products mostly rely on passive microwave measurements [114]. Active microwave measurements at high frequency (Ku-band) are sensitive to volume scattering of the dry snow as well as the wet or dry status of the snow cover [115]. They are therefore also relevant for future mission concepts dedicated to accurate snow water equivalent retrieval from space. Dry snow is, however, largely transparent at lower frequencies (C or L-bands), making it a challenge to observe snow from space [116,117].
Physically-based forward models that rely on a good understanding of the electromagnetic interactions with the snowpack are best suited to obtain consistent retrievals of snow macrophysical and microphysical properties or for data assimilation in NWP systems. The new generation of snow forward models, such as the Snow Microwave Radiative Transfer (SMRT), account for multi-layer snow emission and backscatter [118]. Using satellite data from microwave sensors with appropriate multi-layer forward modeling and retrieval approaches for the snow microwave properties characterisation is of great interest to support physically based multi-layer snow model developments. Recent studies showed that SMOS data [119] and ground-based L-band measurements [120] could give insight on snow density and liquid water in snow using a two stream approach. This is promising for constraining model errors in snow depth that are not associated with snow mass but with its state of compaction.

Permanent Snow and Ice
In glaciers and over ice-sheets such as Antarctica and Greenland, the representation of snow for weather and seasonal forecasting has slightly different challenges. The spatial variability is somewhat reduced and open-snow is the dominant cover. There is also less emphasis on the mass balance as this is often not directly relevant. It is, however, essential to correctly represent the albedo to resolve the snow energy balance with a particular focus on melting. EO snow albedo, along with snow cover, plays a crucial role in model evaluation and development [121] with some important examples of data assimilation [122].
The surface skin temperature provides information particularly over Antarctica where polar orbiting satellites have multiple passages per day. The presence of snow largely affects the diurnal cycle of LST, for which current state-of-the-art models show some limitations. Recent results have also shown the potential for monitoring internal temperature of the ice sheet in Antarctica [123,124] using SMOS data. EOs can provide reliable information of LST over snow and ice in clear-sky conditions [125]. However, its application for model evaluation has been limited so far because of the poor coverage of geostationary satellites in high latitudes and because of the low temporal resolution of polar satellites far from the poles. LST data have been used to evaluate the diurnal cycle of skin temperature in the ECMWF ERA-Interim reanalysis over Antarctica, showing a homogeneous and persistent (throughout the year) warm bias over the Antarctic plateau [126]. Numerical experiments by [127] indicated that this could be associated to the high thermal inertia of deep snowpacks in the bulk (single-layer) snow scheme used in the reanalysis model (see Figure 10), indicating the potential of more sophisticated multi-layer snowpack schemes in future operational weather forecasting models. snow scheme (CTR) and numerical experiment with the thermal depth of the snow reduced by a factor of 10 (DSN), the latter being a proxy for the potential impact of a multi-layer snow scheme. It is clear that DSN has a larger amplitude LST diurnal cycle than CTR, and a better representation of strong cooling events [127].

Vegetation and Carbon Cycle
The biosphere plays a prominent role in regulating the flux of gases, energy and momentum into the atmosphere, so it is important to properly represent it in models. Parameterisations of the biosphere are simplified representations of the natural processes in which the spatial scales of interest such as the plant, field or watershed remain largely sub-grid in the foreseeable future and can only describe the main feedback mechanisms sometimes marred by sizeable systematic errors. A key characteristic for water vapour and carbon fluxes modeling is the so-called canopy resistance, which is a bulk representation of stomatal resistance, vegetation type and leaf area. The stomata are leaf pores through which the plants absorb carbon dioxide and transpire water vapour.
There are continuing efforts to calculate large-scale evapotranspiration using EO data. Globally available products include MODIS [128], GLEAM [129,130], SEBS [131] and PT-JPL [132], FLUXCOM [133] or WECANN [134]. Since evapotranspiration cannot be observed directly from radiative, observable properties of the surface and atmosphere, typically the models use EO data such as radiation, rainfall and soil moisture to train or derive simple models.
Refs. [135,136] focused on evaluating these products using flux tower and river streamflow under the LandFlux-EVAL initiative. They found significant differences between the products, which limited their use for model evaluation. More recently, Refs. [137,138] have assessed the latest products. They conclude that (i) the models tend to overestimate soil evaporation, (ii) the way the models treat soil moisture stress is important and that (iii) finding ways to evaluate the split of evapotranspiration into its three components (transpiration, soil surface and interception evaporation) is currently not possible but a critical task for the future. For the latter, Ref. [139] showed that the flux attribution is a complex issue.
Despite these caveats, it is imperative that developers use observed data to evaluate their models. To that end, iLAMB (integrated Land Atmosphere Model Benchmarking, [140]) was developed as a shared framework for large-scale model evaluation. It was used by the EartH2Observe project [141] to evaluate a suite of ten models including the evaluation of evaporation against the GLEAM EO product. As other researchers, they found that the comparison had limited value due to limited accuracy of the EO product.
In the ECMWF land surface scheme, the Leaf Area Index (LAI) expresses the phenological phase of vegetation (growing, mature, senescent, dormant). Initially it was kept constant [142] and assigned using a look-up table depending on the vegetation type. Thus, vegetation appeared to be fully developed throughout the year. In November 2010, Ref. [143] modified this and introduced a seasonality of vegetation via a LAI monthly climatology based on the MODIS (collection 5) satellite product by [144]. The new monthly LAI climatology was shown to affect particularly the spring season when the radiative forcing is already strong, but when the vegetation not yet fully developed. The sensitivity generally indicated a warming of the spring as a consequence of the lower LAI and reduced evaporation (and consequently more sensible heat flux). This resulted in a reduction of the systematic 2-m temperature errors in the spring [143].
Recently, the Copernicus Global Land Service (CGLS) products of surface albedo and LAI based on observations from the SPOT-VEGETATION and PROBA-V sensors [145] have become available in near real-time (NRT) with an operationally-maintained production chain. However, the direct use of these products within a NWP system is not possible without quality checks given the spatial and temporal discontinuities they may contain. A direct assimilation of LAI and surface albedo products using optimal interpolation analysis with the ECMWF NWP system was explored and evaluated for anomalous years [146]. It was shown that: (i) the assimilation of these products enables detecting/monitoring extreme climate conditions where the LAI anomaly could reach more than 50% and the albedo anomaly could reach 10% ( Figure 11), (ii) extreme LAI anomalies have a strong impact on the surface fluxes, while for the albedo, the impact on surface fluxes is small, (iii) neutral to slightly better agreement with in situ surface soil moisture observations and surface energy and CO 2 fluxes from eddy-covariance towers is obtained, and (iv) in forecast mode, the assimilation of LAI reduces the near-surface air temperature and humidity errors both in wet and dry cases, while the albedo has a small impact, mainly in wet cases, when albedo anomalies are more noticeable.

CO 2 Natural Ecosystem Exchange
More recently, the ECMWF land surface scheme has been extended with a carbon dioxide module based on the A-gs model [147]. The reason for adopting simple vegetation and carbon dioxide schemes, is that these are suitable for the NWP setup where environmental factors are controlled by meteorological forcing and constrained by data assimilation. The model relates photosynthesis to radiation, atmospheric carbon dioxide (CO 2 ) concentration, soil moisture and temperature. Ecosystem respiration is based on empirical relations dependent on temperature, soil moisture, snow depth and land use [148]. The CO 2 module parameters are optimised by vegetation type considering the Gross Primary Production (GPP) and the Ecosystem Respiration (Reco). Together, they compose the Net Ecosystem Exchange (NEE) of CO 2 between biosphere and atmosphere. The FLUXNET-based in situ observations from different climate regimes (http://www.fluxdata.org/), combined with a benchmarking systems similar to iLAMB [140], was used for parameter optimization by minimizing flux errors. Subsequently, a different year of the FLUXNET data was used for verification. The seasonal cycle of NEE is illustrated in Figure 12 for six sites with different biomes. Two model configurations are shown: the first uses a stomatal resistance formulation for evaporation that is controlled by the photosynthesis module (C-TESSEL), and the second uses the Jarvis-based stomatal resistance for evapotranspiration (CH-TESSEL). In addition, the CASA climatology (Carnegie-Ames-Stanford Approach, [149] ) is shown because it is extensively used in the community and it was previously used as a boundary condition for atmospheric CO 2 in the MACC (Monitoring Atmospheric Composition and Climate) project. Although it is difficult to draw firm conclusions, it is clear that the errors in NEE are large and vary dramatically from site to site, and differences between C-TESSEL and CH-TESSEL are small compared to the errors. The correlation between model NEE and observations averaged over 34 flux tower sites is 0.37 for CASA, 0.68 for C-TESSEL and 0.65 for CH-TESSEL. Both TESSEL versions have a correlation of about 0.80 for sensible and latent heat fluxes [148]. The substantial improvement of C-TESSEL/CH-TESSEL with respect to the CASA climatology is significant because it suggests that the real-time meteorological variability is a key driver of the NEE variability. Correlations coefficients of the two components of NEE, GPP and Reco, compared with tower observations (0.8 for both fluxes, on average over 34 sites) indicate much higher skill and confirm the robustness of these results. NEE is a small residual of GPP and Reco; therefore, a correlation coefficient above 0.6 is highly relevant. Again, C-TESSEL and CH-TESSEL show similar performance, with site to site variability attributed to representativity of the model grid-box [148]. CH-TESSEL has also been evaluated in coupled integration mode for the 2003 to 2008 period. The operational CAMS (Copernicus Atmosphere Monitoring Service) atmospheric CO 2 analysis and forecast system has been using the online NEE fluxes from CH-TESSEL ever since its introduction in the IFS, replacing the offline CASA fluxes from the GFED dataset [151]. Extensive testing of this online configuration shows that the global CO 2 atmospheric inter-annual variability is well simulated [152]. The correlation of global CO 2 with observationally based estimates is 0.70. The global CO 2 forecast has high skill in simulating day-to-day synoptic variability, which is crucial in order to be able to assimilate atmospheric CO 2 observations. Figure 13 illustrates the spatial and temporal CO 2 synoptic anomalies associated with the passage of synoptic weather systems over North America. The CO 2 forecast can represent the peaks of CO 2 observed at Park Falls (Wisconsin, USA) originating mainly from the advection of high CO 2 anomalies generated at the surface within the warm conveyor belt of synoptic low-pressure systems. Modelling day-to-day variability of the CO 2 fluxes from vegetation compared to using equivalent monthly mean fluxes with a diurnal cycle enhances significantly the atmospheric CO 2 variability and skill. Again, this illustrates the advantage of modelling the CO 2 fluxes inside the IFS with real-time meteorology. Despite the synoptic skill provided by the NEE synoptic variability, the model suffers from substantial biases in the biogenic CO 2 fluxes (GPP and Reco). These biases are diagnosed and corrected by comparing the model budget with a reference budget from a flux inversion system [153] based on in situ observations of atmospheric CO 2 at the surface [154]. The biogenic flux adjustment scheme of [153] addresses the important task of reconciling the bottom-up and top-down estimates of CO 2 fluxes. The benefits are a substantial reduction in the atmospheric CO 2 biases as well as a useful diagnostic to guide model developement. New data sets based on FLUXNET data [155] or satellite observations of Solar Induced Chlorophyll Fluorescence [156] can also be used to improve the attribution of the errors associated with GPP and Reco.

CH 4 Natural Methane Fluxes
Recent trends in atmospheric CH 4 are not well understood and attribution using surface networks is underdetermined [157]. The relatively short CH 4 atmospheric lifetime of 9.8 years [158] and the combination of anthropogenic and natural sources make CH 4 a suitable species for spatial and temporal analysis using EO. Near-surface CH 4 sensitivity provided by the GOSAT satellite has been available since 2009 and provides retrievals using the light-path proxy approach from shortwave infrared radiances. To improve attribution and aid model development, retrievals of methane isotopologues, such as 13CH 4 , could be used in the future. These retrievals are currently limited to stratospheric retrievals using the Atmospheric Chemistry Experiment (ACE) satellite [159] and remain a challenge for tropospheric instruments.
Wetlands are the most dominant source of CH 4 , contributing about 30% to the total flux [160]. Large uncertainties have been identified in both the spatial and temporal distribution of wetland CH 4 fluxes provided by land surface models [161]. Major impacts can be seen when simulating long-term CH 4 emissions taking into account different vegetation classes [162]. Ref. [163] used GOSAT column CH 4 (XCH 4 ) to evaluate wetland emissions from the Joint UK Land Environment Simulator (JULES, [164]). The detailed spatial and temporal resolution of GOSAT data permitted evaluation of the model performance over specific wetlands, which provided insight into some of the causes of model uncertainty. Bias in the modelled fluxes highlighted a failure to capture large scale, river fed, wetlands in the Pantanal region during high rainfall years. This is most likely caused by a lack of river routing in the model or by a misrepresentation of methanogenesis in high flood waters. In 2009/2010, the Paraná River region experienced a flood event that caused a spike in CH 4 concentration retrieved from GOSAT; this was not reproduced in JULES, highlighting the need to include representation of over-bank inundation in the model. Figure 14 shows the model failure to transport surface moisture, via river runoff, from the rainfall region in Southern Brazil down the Paraná River towards Paraguay and Argentina. The lack of river routing within the model causes a spatial bias in wetland area in JULES when compared to the GRACE satellite water storage anomaly, resulting in misrepresentation of modelled CH 4 fluxes.

Vegetation Water Fluxes
Transpiration dominates the continental water cycle [165][166][167]. It is therefore crucial to better represent and constrain vegetation activity and its contribution to the continental energy and water cycles. Historically, continental microwave remote sensing has been focusing on soil moisture retrieval [168,169]. Vegetation was mostly considered during the retrieval of surface soil moisture as a by-product as it affects the signal penetration to the surface [170,171]. The attenuation of the microwave signal when passing through the vegetation layer depends on the so-called Vegetation Optical Depth (VOD). The VOD depends on the frequency of the sensor, on the water content of the plant (trunk, branches, leaves) as well as on the biomass [172][173][174][175][176][177][178]). As such, there has been recent interest in using VOD to assess biomass changes [37,39,[178][179][180][181] or plant water stress strategies [177]. The VOD might prove especially useful in regions where soil moisture retrieval is especially complicated by dense vegetation coverage (e.g., forested landscape). C or X-band do not penetrate much into the canopy and are more sensitive to the top part of the canopy. Figure 15 shows that L band is sensitive up to very dense canopies (300 Mg/hectare) enabling the probing of almost all vegetation canopies [182]. A further advantage of the VOD is that data can be acquired even in the presence of clouds. C-band senses a thicker layer of the canopy and branches so that the signal somewhat differs from the X-band signal especially in humid regions with dense canopy coverage such as the Amazon rainforest ( Figure 16). The lower frequency signal, in L-band, senses a thicker vegetation layer corresponding to the above ground biomass such as the ones derived in [183] (Figure 15). Most of our observations of the land surface relate to surface variables (soil moisture, biomass, vegetation coverage among others) but ultimately one of the key objectives is to better constrain the vegetation water, energy and carbon fluxes. Recently, direct observations of a proxy for photosynthesis have been made possible, at the global scale using Solar-Induced Fluorescence (SIF) [184][185][186][187][188][189][190][191][192]). SIF is a by-product of photosynthesis, which is directly related and even nearly-proportional to photosynthesis in most conditions and is a direct indicator of vegetation stress [156]. One of the advantages of SIF is that it provides information that is not included in the other datasets (e.g., vegetation indices or VOD). Indeed, SIF is able to correctly capture the seasonal cycle across diverse climates dominated by phenology and temperature (Figure 16 top panel the northeastern US), by temperature and water stress ( Figure 16 middle panel Mediterranean Spain) or in rainforests, where vegetation indices struggle to correctly define the seasonality which is imposed by leaf aging [193][194][195]) and radiation structure [196] ( Figure 16 lower panel, the Tropical Amazon). SIF thus provides unique information to constrain land-surface fluxes and its response to droughts (e.g., [197]). It is, however, particularly noisy [186] so that averaging in both time and space is needed to extract a meaningful signal. This noise is also a major road block for direct use in either a data assimilation context or to directly assess surface fluxes. As a result, recent approaches have tried to use SIF not directly but as a target of machine learning strategies using MODIS broadband reflectances (Reconstructed SIF, [198]). The advantages of such approaches is that it gets rid of most of the large amplitude noise in SIF, while conserving to a large extent the photosynthetic activity dependence across biomes and climates. An alternative to SIF is the Photochemical Reflectance Index (PRI) [199] or combined products that can improve the signal-to-noise ratio. Mediterranean climate in Spain, bottom: rainforest in the Amazon) as assessed using different sources of data: Vegetation Optical Depth (VOD) at high (X-band), medium (C-band) or low frequency (L-band), Enhanced Vegetation Index (EVI) and Solar-Induced Fluorescence (SIF) using the GOME-2 satellite data.

CO 2 Anthropogenic Fluxes and Co-Emitters
In the atmospheric source inversion community, the use of satellite data to constrain trace gas fluxes was made possible through the adoption and adaptation of data assimilation techniques already in use at operational NWP centers. As a result, two methods have emerged: Four-Dimensional Variational (4D-Var) and Ensemble Kalman Filter (EnKF) [200] The computational efficiency of 4D-Var to solve high-dimensional inverse problems and its ability to accommodate a large amount of remote-sensing measurements over long periods of time, made 4D-Var a method of choice in the source inversion community. However, the development and maintenance of adjoint models in variational approaches can be cumbersome. On the other hand, EnKF-based methods use a small ensemble of perturbed model simulations to represent and propagate the error statistics explicitly, While similar methodological difficulties characterize DA in NWP and source inversion problems, it is worth noting some specific aspects of the latter. First, while the prior (or background) information in NWP DA systems is associated to prognostic variables of the model that are often directly observed, the source inversion problem involves model parameters that are usually measured only indirectly, making the quality of the optimised sources strongly dependent upon the accuracy of the transport model itself. Additionally, the prior error statistics prescribed for the bottom-up emission sources are often very uncertain owing to the sparsity of the available observational network [201]. These uncertainties in the transport model and the prior error emissions can translate into large errors in the inferred posterior estimates. Although current methodologies used for top-down atmospheric source inversions have reached a high level of sophistication enabling efficient and scalable computation of the posterior estimates along with their information content (i.e., posterior errors, observational constraints), the lack of observations to provide source estimates at accuracies and spatio-temporal resolutions compatible with environmental policy requirements is a key remaining limitation. The advent of satellite missions dedicated to air quality and carbon cycle observations (Sentinel-5P, GeoCarb) will be a milestone toward building such policy-relevant monitoring systems and will provide an unprecedented opportunity to improve the synergy between Earth surface models and numerical simulations of atmospheric dynamics across time scales, from NWP to climate modeling. In turn, such inverse modeling tools can be exploited to conduct instrumental design experiments that can inform mission strategies and requirements [202], and optimize scientific outcomes.

Land Surface Properties
Land surface processes and parameters strongly depend on geomorphology, land use, water body distribution, vegetation cover and soil type, and therefore the climatological fields describing these characteristics are a key part of any land surface scheme.

Orography
For every grid point, models have associated values for surface elevation (orography), sub-grid orography statistics, land cover (used as land/sea mask), lake cover and depth, glacier cover, low and high vegetation type, low and high vegetation cover, albedo, LAI, and soil texture. Within the ECMWF system, these fields are derived from different external sources (e.g., albedo and LAI from MODIS, vegetation type and cover from GLCC/AVHRR, water bodies from Globcover).
Global datasets come in different resolutions, data formats and projections and need to be interpolated, or merged in the case of non-global coverage. In light of the changing climate signal also affecting surface physiography, multi-year climatological data at 1 km resolution for all surface fields is desirable. One-km resolution represents a challenge for current global NWP and Earth system models, but it is relevant to allow direct comparison with present and future Sentinel missions monitoring the Earth system. Figure 17 shows the orography, ocean and lake bathymetry used operationally at ECMWF. The lake depth is based on [203] and the ocean bathymetry is based on ETOPO1 by [204].
The land-sea mask and orography are based on the following raw data information: ESA's Globcover V2.2 based on Envisat MERIS (300 m resolution) mapping 2005/2006 (ESA, 2010), the Shuttle Radar Topography Mission (SRTM [205] at about 90 m resolution, the Global Land One-kilometer Base Elevation (GLOBE [206], only north of 60N and south of 60S), and specialised DEMs of Greenland [207], Iceland (Icelandic Meteorological Office, 2013) and Antarctica [208], replacing the corresponding data points on the 1 km latitude/longitude grid. The lake mask has been created from the land sea mask and quality controlled by consistency algorithms. All spatial resolutions that are used by the IFS run in fully coupled forecasts [209]. Figure 17. Global map of land orography and ocean+lakes bathymetry in m (above and below sea level or lake shore respectively) at T1279 resolution ( 9km) as used in the ECMWF-IFS operational system.

Soil Depth
While earth-observing satellite data have been widely used in modeling land processes (e.g., global land cover [210]), in situ measurements as constrained by satellite observations are needed to estimate other variables. One such variable is the Depth To Bedrock (DTB), which is essential for accurate land surface modeling of the energy, water, carbon cycle, and dynamic vegetation. In general, a lower boundary condition is needed to solve the governing equation for vadose zone soil moisture. The presence of bedrock is a player in soil hydrology as can change the water flow and storage [211]. Earlier land models assume a free drainage bottom condition (i.e., without considering groundwater), while more recent models include an unconfined aquifer which implicitly assumes a globally constant bedrock depth. However, no conditions are satisfactory without a DTB estimate [212].
Ref. [213] has recently developed the world's first global 1 km soil and alluvial thickness dataset ( Figure 18) by combining geomorphologic theory with the best available data for topography (using satellite elevation measurements), climate, ecosystem [210], and geology as input. This dataset utilised different approaches to separately estimate soil and alluvial thickness for upland hillslopes, upland valley bottoms, and lowlands, as the character of soil depth is fundamentally different for these units: upland hillslopes have relatively shallow soils (∼1 m), while valleys and lowlands have relatively deeper soils/alluvium (10 s of meters or more) (Figure 18). These three units are distinguished at the 90 m pixel scale using a valley network extraction algorithm as well as criteria related to geologic age. On hillslopes, the data set is calibrated and validated using independent data sets of measured soil thicknesses from the U.S. and Europe and on lowlands using DTB observations from groundwater wells in the U.S. This dataset is publicly available at: https://daac.ornl.gov/SOILS/guides/Global_ Soil_Regolith_Sediment.html. A similar product based on global spatial ensemble prediction models with various input datasets is also available [214]. The impact of this DTB dataset on land surface modeling has been demonstrated [215] implementing variable DTB in the Community Land Model (CLM4.5) with 0.9 • latitude x 1.25 • longitude grid boxes. The greatest changes in the simulation with variable DTB are to baseflow, with the annual minimum generally occurring earlier in the year comparing to fixed soil depth simulations. Smaller changes are seen in latent heat flux and surface runoff primarily as a result of an increase in the annual cycle amplitude. These changes are related to soil moisture changes that are most substantial in locations with shallow bedrock. Total water storage anomalies are not strongly affected over most river basins because they tend to contain deep soils. These anomalies are substantially different for river basins in more mountainous terrain. Additionally, the annual cycle in soil temperature is partially affected by including realistic soil thicknesses resulting from changes in the vertical profile of heat capacity and thermal conductivity. However, the largest changes to soil temperature are introduced by the soil moisture changes in the variable soil thickness simulation.

Soil Texture
Soil properties, particularly the texture of the soil in the near-surface horizon, obviously affect the evolution of soil moisture by controlling the ability of water to move in the soil column. Water retention and conductivity, as governed by soil texture, have secondary effects such as controlling the persistence of soil moisture anomalies and helping determine what type of vegetation can flourish in a location. Less evident is how subsurface properties affect surface soil moisture evolution. This is not well appreciated because it is very difficult to discern the structure of the terrain below the top meter or two of the soil. Weather and climate models typically treat the vadose zone below vegetation roots as terra incognita, assuming no spatial variability except that related to the slope of terrain. If a model simulates the water table, it is often connected directly to the shallow soils without regard to the spatial variability in between. Lumped hydrologic models infer the effect of the vadose zone in their calibration, but this is not possible in ungauged or arid basins. In reality, there can be preferential pathways of drainage that quickly remove water from the reach of direct evaporation or plant transpiration [216]. These correspond to areas of fractured or unconsolidated bedrock, called karst that underlie a significant fraction of the land surface, about a quarter of the continental US [217] and western part of Kazakhstan, where landscape can change quite dramatically, including emergence/disappearance of lakes, in a very short period [218]. Observational and modeling evidence suggests regions overlying shallow karst formations experience very different soil water evolution, affecting surface moisture and heat fluxes, vegetation distribution, and even impacting the evolution of convective precipitation [219]. A direct way to discern the effect of karst is to examine time series of in situ soil moisture measurements from networks using consistent instrumentation spanning both karst formations and less permeable bedrock. In the southern Appalachian region around the Middle Tennessee River Basin, there are extensive shallow carbonate karst formations and a fairly dense network of US Department of Agriculture (USDA) soil moisture stations spanning karst and non-karst substrates. A strong correspondence between weak soil moisture memory (lagged autocorrelation of anomalies) and the presence of karst has been found across a dozen stations [220]. Comparison among closely-located stations helps control for the confounding effects of other climatic, vegetative and hydrologic factors. The methodology developed for in situ data has been applied to remotely sensed soil moisture georeferenced against high-resolution karst information from the US Geological Survey [217] across the southeastern and south-central US. Small-scale (within 2700-3000 sq.km) spatial variability in karst coverage correlates significantly with small-scale variability in soil moisture memory [221]. Figure 19 shows that, where there is strong spatial variation in karst coverage, there is also greater variation in soil moisture memory as derived from NASA/SMAP satellite soil moisture. Correspondingly, where there is little or no variation in bedrock permeability, soil moisture memory is also more uniform.
The fact that the signal of karst appears in satellite soil moisture data yields promise that remote sensing could contribute important hydrologic information in poorly-observed regions of the globe. Radiances in near-infrared and other parts of the electromagnetic spectrum have been shown to correlate to surface soil properties in unvegetated areas [222,223]. Remote sensing of soil moisture is achieving levels of quality and information content that can allow inference of surface and subsurface properties otherwise unobserved or unavailable. Careful assays of remote sensing data could enhance global datasets used for Earth system modeling, whose quality is currently skewed towards developed wealthy nations.

Inland-Waters
Globally, there are approximately 117 million lakes (>0.002 sq.km) with a combined surface area of 5 × 10 6 sq.km, which is approximately 3.7% of the Earth's non-glaciated land surface [224]. Despite their relatively small spatial extent, lakes can have a disproportionately large influence on the climate, in particular within their near-vicinity [225,226]. This occurs as a result of lakes influencing surface energy exchange with the atmosphere, which differs to those from soil or vegetated surfaces [227]. Some lakes also freeze seasonally, resulting in considerable changes in surface albedo and thermal capacity and, as a result, alterations in lake-climate interactions [228].
Although traditionally neglected in land surface models within NWP, primarily because of the computational expense, community efforts in recent years have improved our understanding of the importance of lakes and have highlighted their added-value in simulating accurately regional climatic variations [225,229,230]. Research aimed at introducing inland water bodies (lakes, rivers and coastal waters) into the operational model at ECMWF has started by considering a medium-complexity scheme that satisfies the operational constraint of having low computational cost. FLake [231], a shallow-water scheme originally applied to lakes, was introduced into the IFS in progressive steps. This model is a particularly appropriate choice for NWP as it predicts the vertical temperature structure and mixing conditions in lakes of various depths and on time scales from a few hours to multiple years, while maintaining a relatively low number of prognostics variables (seven in total). The model is intended for use as a lake parameterisation scheme in NWP, climate modelling and other prediction systems for environmental applications. FLake has been implemented in the operational regional weather forecast models of Deutscher Wetterdienst (the German weather service), Finnish Meteorological Institute (FMI), Norwegian Meteorological Institute (Met.no) and Swedish Meteorological and Hydrological Institute (SMHI), and is used for research at several meteorological services across Europe including Météo-France [232] and UK Met Office [233], and is included in several climate models. The most important external variable for lake model FLake is depth (bathymetry or mean depth at least). There were several attempts to derive lake depths from satellite observations. Already in [234], an approach for determining lake depths by using spaceborne Synthetic Aperture Radar (SAR) images was developed. For those lakes that freeze completely to the lake depth some time in the winter, the simulated ice growth curve providing the ice thickness allows to estimate the lake's maximum water depth [234]. In [235], the same idea was used and lake depth of shallow sub-Arctic lakes and ponds was determined by using Landsat Thematic Mapper (TM) and European Remote Sensing (ERS)-1 Synthetic Aperture Radar (SAR) data. Landsat TM image is used to map lake bathymetry and multi-date ERS-1 images acquired during winter are utilised to determine when and which lakes freeze to the bottom during winter. Lake depth estimates computed by this approach correspond well with in situ measurements, especially in the tundra zone; however, the approach needs to be further tested and improved for automatic use [235]. While at the moment still no complete global lake depth dataset exist, local bathymetry dataset can be found at least for large lakes (see Section 6.3.4 of the GCOS report [60]). Beyond a few large lakes, the collection of local datasets is a difficult task; therefore, the most common practise for lake modeling in global NWP is to use in situ measurements and indirect estimates of lake mean depths based for instance on the geological origin of lakes [236].
Prior to its implementation in the IFS, the influence of FLake was evaluated in a series of preparatory studies: first, in an offline experimental framework [227,237], and then extended to fully-coupled lake-atmosphere simulations [230]. More recently, the possibility of treating sub-grid water bodies (lakes and coastal waters) using the land surface tiling methodology has been included.
An important step in the inclusion of lakes in NWP is that the simulated lake surface temperature and lake ice conditions are validated by comparing with observations. The use of remote sensing data for validating lake model outputs in NWP is essential, in particular as in situ observations at a global scale are currently lacking.
With this approach, each grid box is divided into fractions of different land use, each with their own tile. Merits and limitations of the tiling methodology when accommodating lakes and forest within the same model grid-box have been assessed by [238]. They conclude that the tiling method captures well the influence on surface fluxes of the contrast between the "lake" and "forest" surface boundary conditions, on seasonal and diurnal time scales (see Figure 20, upper left and upper right panels, respectively). The behavior of simulated lake surface temperatures ( Figure 20, bottom panel) and the period of ice cover in large inland water bodies worldwide was verified using satellite-based products (see [230]). Particularly relevant for validating lake simulations (temperature and ice cover) in NWP is, for example, data provided by the ATSR Reprocessing for Climate, ARC-Lake lake surface water temperature and ice cover dataset [239]. Satellite-derived observations from ARC-Lake have been used in recent years to validate simulations from the Canadian small lake model within the Canadian land surface scheme [240], global lake temperature simulations performed within the CNRM-CM5 climate model [232], and in validating the coupling of FLake to the UK Met Office Unified Model [233], among others.
The impact of introducing interactive inland water bodies in ECMWF's IFS has been examined by a set of dedicated analysis experiments. However, the lake model (FLake, [231]) does not currently consider a mass balance of water (only for the ice components), whereby a static dataset is used to represent the extent and bathymetry of the world's lakes. Monitoring lake depth globally is not currently achievable although inversion methods (e.g., Ref. [237] demonstrates that an effective lake depth can be derived on the basis of the remotely-sensed lake water-surface temperature).

River Discharge and Hydrological Forecasting
Continental and global scale information on river discharge, and upcoming floods and droughts can be used in many applications including disaster risk reduction initiatives, particularly in preparing for severe events and providing early awareness where local flood models and warning services may not exist [241]. The European and Global Flood Awareness Systems (EFAS and GloFAS) are part of the European Commission's Copernicus Emergency Management Service (CEMS-flood) and provide complementary flood forecast information to relevant stakeholders supporting flood risk management at national, regional and global level at several different timescales ( [242][243][244][245]; http://www.globalfloods.eu). CEMS-flood is based on an Earth system modelling approach, and the forecasts are derived using in situ and satellite data as well as hydro-meteorological and Earth system models. It is only very recently that improvements in the resolution, precipitation processes, and the improvements in land surface representations discussed in the above subsections have meant that the forecasts of hydrological variables such as precipitation, soil moisture and more recently also runoff, can be considered good enough to make effective forecasts of river flow using Earth System models [241,246]. GloFAS uses the ECMWF Ensemble Prediction System, the operational ensemble forecasting product of ECMWF which consists of a 51 member ensemble of global forecasts. The ECMWF land surface scheme, H-TESSEL, then calculates land surface response to atmospheric forcing, and estimates the surface water and energy fluxes and the evolution of snowpack, soil temperature and soil moisture. Operational ensemble forecasts of surface and sub-surface runoff are resampled to 0.1 degree resolution to be used as input by the river routing model (for more information see [242,245]). River flow climatologies, against which forecasts are compared, have been created using corrected reanalysis products such as ERA-Interim Land [247] which includes precipitation adjustments based on monthly GPCP v2.1 (Global Precipitation Climatology Project). Similar corrected climatologies are planned for the next generation reanalysis product ERA5 (i.e., ERA5-Land). Future needs for hydrological forecasting in terms of data assimilation, model validation/evolution and parameterisation are better information on river flow rates and flood extent (particularly in real time), groundwater contributions, and high resolution river catchment and river channel topography from EO data [248][249][250].

Land-Atmosphere Coupling
A promising method of model evaluation is to study the dynamics of the system rather than its states. The response of the land surface to atmospheric drivers repeats after each rainfall event. The repeating wetting-drying cycle can be studied using different observable properties to identify the time response of the surface to the driver which is dependent on evapotranspiration. Refs. [251,252] developed a systematic way of analysing the time-response of evaporation after a rainfall event by analysing the land surface temperature increase. The theory is that a dry soil with low evapotranspiration will warm up more quickly than a wet surface with high-evapotranspiration. Since the dynamical time-response of the system is an inherent property of the land surface, rather than a result of the driving data, the method can be applied to coupled model simulations. The method was applied to evaluate a model in coupled mode by [253]. Two model configurations were tested and Figure 21 shows how the data favours one of the model configurations, at least for the spring months. In a similar vein, but at a much slower time-scale, EO observations of Total Water Storage (TWS) have been used [254] at the seasonal time-scale to evaluate evapotranspiration in a suite of land-surface models. Key properties of the models can be obtained. Similarly, a dynamic soil moisture global dataset based on EO data [255] has been obtained by characterizing the dry-down trends. Since EO soil moisture only captures near surface water, it can only be used for two of the three evaporation components (soil surface and interception) and not the larger transpiration. Ref. [256] quantified the power spectra of rainfall, EO soil moisture and modelled soil moisture allowing for inferring the low-pass filter effect of soil hydrology. Neither of these analyses have yet been used to evaluate land-surface model systematically, but the principle shows great promise. These new methods can further be backed up with in situ observations such as drying rates from Flux Tower or soil moisture sensors.
Producing accurate forecasts of near-surface parameters becomes increasingly important as near-surface temperature, humidity and wind are used in a widening range of applications and are themselves highly relevant during extreme events (e.g., wind-storms, droughts and heat-waves). The representation of land-atmosphere interactions depend on a large number of processes. The exchange of energy or moisture between land surface and the atmosphere plays an important role for near-surface temperatures and humidity [70] while the roughness of the surface exerts a major control on near-surface winds. However, the parameterisation of these interactions is hindered by difficulties in estimating the land-atmosphere coupling strength from theory or observations. Thus, the parameterisation relies on a number of parameters, set to poorly constrained values, that depend at most on the vegetation type or sub-grid information (tile fractions). The skin layer conductivity, the roughness length for heat or momentum, the minimum canopy resistance, the dependence of the canopy resistance on the water vapour pressure deficit and the root distribution over the soil layers are just a few examples. In the past years, efforts have been made to (i) evaluate the sensitivity of near-surface variables to the values chosen for such parameters, and (ii) find ways in which to better constrain them [106]. The first obvious candidate was the roughness length for momentum, which has a strong influence on the representation of 10-m wind speeds. For many years, ECMWF forecasts overestimated the near-surface (10 m) wind speed over land. The mean forecast errors, with respect to routine SYNOP observations, range roughly between 0.5 and 1 m/s for various regions/times of the day. A stratification of the 10-m wind speed forecast errors by vegetation type showed that these errors depend on the vegetation type, or more precisely on the momentum roughness length of each vegetation type.
The overestimation of the near-surface wind speed for most of the vegetation types suggests that the values used for the momentum roughness length were too low. This motivated the revision of these values based on theoretical considerations and SYNOP observations of wind speed at 10 m. The basic idea was to search, for each vegetation type, for a new value of the momentum roughness length for which the mean 10-m wind speed forecast error with respect to SYNOP observations drops to zero in near-neutral conditions. This calibration showed that the momentum roughness length values should be increased for nine and decreased for one of the 18 vegetation types characterizing land areas. The newly derived values were introduced in IFS cycle 37r3 (November 2011). As the roughness length for momentum was on average increased, the roughness length for heat was decreased in order to account for terrain heterogeneity. The use of the new momentum roughness length values reduced significantly the 10-m wind speed forecast errors, especially for the predominant vegetation types, but also on average over continental areas. The improvement for Europe was clear both in the pre-operational testing and in the operational verification afterwards. The reduction of the heat roughness length had also a positive impact on the 2-m temperatures, leading to a small warming during night-time and cooling during daytime over the continental regions for both winter and summer, which reduced the forecast biases over Europe [106]. Near surface temperature and humidity biases in weather are handled by assimilating SYNOP temperature and relative humidity and therefore optimising the 2 m forecast performance on given weather conditions [112].

Ocean-Atmosphere Coupling
Both satellite and in situ observations show evidence of a diurnal cycle in Sea Surface Temperature (SST), with a amplitude in excess of 1K over large regions of the ocean extending from the Tropics to high latitudes (see review by [257]). There is a growing appreciation for the need to resolve coupled physics at time scales shorter than a day. In terms of representing the diurnal cycle in models, a key factor is the difference between the temperature at a depth of 10m (foundation SST), the temperature at shallower depths which is generally warmer (bulk SST) and the temperature in the ocean's thin skin layer (<1 mm thick) which can be cooler at night and under certain wind conditions.
Observations of the diurnal cycle have motivated the development of ocean models to represent the diurnal cycle. This can be achieved in one of two ways: the first approach is to provide sufficient vertical resolution (O(1m)) in the near surface to explicitly resolve the diurnal cycle [258,259] which models the bulk SST. The second approach is to use coarser vertical resolution but provide a sub-model within the mixed layer scheme which parameterises the bulk and skin temperature [260]. Both approaches agree well with the observed diurnal amplitude of the bulk SST (equivalent to 1 m). There is a subtle difference in the approaches which is taken into account when calculating surface fluxes from bulk formulae; [260] were able to parameterise the skin temperature of the ocean while [259] resolve the bulk SST with a 1 m vertical resolution and three hourly coupling. Ref. [260] show differences in the air-sea fluxes of up to 10 W/m 2 in the Tropics when the diurnal cycle of SST is included.
The diurnal cycle of solar radiation can lead to stabilization of the surface ocean during daytime and to nighttime convective mixing. As a consequence, when winds are weak, daytime solar warming can create near-surface temperature stratification, referred to as a "warm layer", with skin temperatures up to 2C (or more) warmer than the foundation SST below the warm layer. This stratification and the peak skin temperature, however, cannot be simulated in an ocean model forced by daily-averaged air-sea fluxes [261]. Critically, the very large daytime surface heat fluxes associated with solar radiation, acting on the thin daytime surface mixed layer, result in a rectified SST that is warmer than the SST produced by the averaged fluxes. Without the diurnal cycle coupling, the skin temperature would be equivalent to the bulk foundation SST. The rectified SST can then feedback to affect the air-sea fluxes. Ref. [262] show that, when foundation SST is used instead of the diurnally varying skin temperature, the computed net surface heat flux can differ by more than 5-10 W/m 2 in parts of the tropics when averaged over the 10-year period. The afternoon near-surface stratification can also cause trapping of wind-momentum resulting in a thin diurnal jet [263] and enhanced surface freshening associated with daytime rainfall under light winds [264]. The resulting rectified SST, enhanced near-surface stratification, shear, and advection can lead to a cascade of errors if not properly represented.
The benefits of coupling a dynamical ocean to the ECMWF medium-range forecasts have been evaluated in a set of dedicated forecast sensitivity experiments and from June 2018 all forecasting systems running operationally now include the ocean model (NEMO v3.4) in the ORCA025Z75 configuration with 75 levels in the vertical and a quarter of degree horizontal resolution to represent the 3D ocean and includes a dynamical sea-ice model (LIM2) and a wave-model (EC-WAM). These coupled forecasts, in which the ocean interactively responds to the meteorology, are better performing particularly for the prediction of tropical cyclones and their intensity [265]. The use of all-weather SST from AMSR-2 microwave sensor is shown to validate the cold wake 5-day forecast of the Pacific's tropical cyclone Neoguri in July 2014 with the ECMWF high resolution model coupled and uncoupled with ocean ( Figure 22).
Moreover, the use of in situ SSTs observations has allowed to assess the diurnal variability in the ECMWF coupled forecasts and the advantages of ocean-atmosphere coupling [266].
Indeed, GCMs that resolve the air-sea flux diurnal variability tend to perform significantly better than ones which do not [259,[267][268][269]. In the tropics, the mean SST cool bias [270] is reduced, with warmer SST leading to increased precipitation that compares better to climatology [259]. Further motivations for the inclusion of the diurnal cycle in SST is the potential improvement it might provide in predictions of the Madden-Julian Oscillation (MJO, [271]). The results of [259] actually show a reduction in the performance of the MJO index although there is indication that the diurnal coupling produces a stronger and more coherent MJO. Increasing the coupling frequency to resolve the diurnal cycle also enhanced the intensity of MJO convection [272] and changed the magnitude of the El Niño-Southern Oscillation (ENSO) amplitude [268,273]. In the Southern Ocean, resolving diurnal and intra-diurnal variations leads to a colder mean SST, and stronger westerlies [269], probably due to the resolution of fast-moving storms. It is therefore likely that resolving the diurnal cycle in SST and coupling on these timescales may provide at least part of the picture in improved extended-range predictions.
Air-sea exchanges of properties are highly sensitive to wind speed. However, if the wind is gusty or associated with a fast-moving storm, so that the wind direction is variable over the course of the day, then the magnitude of the mean wind, computed from the average zonal and meridional wind components, will be less than the average "scalar" wind speed computed from the variable wind speed magnitude. In the state-of-the-art Coupled Ocean Atmosphere Response Experiment (COARE) bulk flux algorithm [274], averaged hourly wind speeds are increased by a gustiness parameter to account for this additional wind speed variance at hourly time scales [275]. Using high resolution Tropical Atmosphere and Ocean (TAO) mooring data [276] showed that the diurnal 'mesoscale' gustiness was more than twice as large as the parameterised 'convective' gustiness and that neglecting this diurnal wind variance could cause a mean bias of up to 8 W/m 2 in the latent heat flux component alone. For scaling purposes, a 20 m layer subject to 10 W/m 2 net surface heat flux error over a 1-year period would produce a temperature error in that layer of 4C. In an operational context [277], the wave forecasting and the coupling with atmosphere and ocean models is progressing with detectable benefits both on near surface winds and sea-surface temperatures. The impact of waves on the coupled climate system is also highlighted in [278] for the continuous exchange of heat, energy, water vapor, and carbon dioxide that are relevant from diurnal cycles to extended prediction ranges.

Ocean Cryosphere: Snow and Ice
The presence of sea ice on the ocean drastically alters air-sea fluxes of heat and momentum. Hence, sea ice is an important component of Earth system models at high latitudes. As operational centres incorporate the ocean cryosphere within seamless prediction systems (i.e., in forecast systems spanning days to decades), there is an increasing demand for high-latitude coverage of observations to compare models for assessment of forecast performance and support ongoing development of the model, its coupling and data assimilation systems. As the polar regions have relatively sparse in situ observations, model development relies heavily on the earth observation data. It is important to note that this also requires the EO products to have a good estimate of the uncertainty as there are limited independent data sets to validate against. Recent advances in EO products are allowing model developments to be compared across the entire Arctic domain rather than just using limited in situ data at a single point. For the Arctic cryosphere, this is also important as the region is changing so rapidly. Field campaigns that have provided large datasets for model development such as SHEBA [279] may well be less representative of the ice conditions in recent years as much of the multi-year ice has been lost.
There are several products that are available operationally to describe the sea ice field. They are derived from brightness temperatures from various instruments and scatterometer data. These describe: the cover in terms of concentration, type, and edge; physical properties such as emissivity, skin temperature and thickness; and the dynamic properties in the form of motion (drift).
Of these EO measurements, the most established for use in NWP is sea ice concentration. It is one of the few fields routinely assimilated by operational centres (e.g., [280][281][282]) and also the main type of observation that the models are verified against (e.g., [283][284][285][286][287]). An example case study is shown in Figure 23 of an unusual event occurring to the north of Greenland that was captured in the model and also observed. Here, the performance of the coupled ECMWF HRES forecast is compared to the sea ice concentration as shown in the SAR image.
In recent years, it has become increasingly clear that thickness, i.e., the vertical distance from the air-ice interface at the top to the ice-water interface at the bottom, is essential for sea-ice modelling and prediction [288][289][290]. This is because thin ice allows much stronger conductive heat fluxes than thick ice, and it evolves much more quickly than thick ice because it is more susceptible to dispersion or compression by winds, and it can melt or grow more rapidly.
One way to observe the thickness of sea ice from space is to measure the surface emissivity in the L-band, which is sensitive to sea-ice thickness if the surface is dry, the ice temperature is well below freezing, and the ice is thinner than about 1 m [291]. This allows detection of areas which are covered by thin ice even though the sea-ice area fraction might be close to 100%. Typically, this occurs close to the ice edge, where new ice forms during seasonal ice advance. However, thin ice is also present in polynyas and fracture zones within the ice pack. These are frequent in winter where the old ice breaks up under the force of winds, and new ice forms rapidly on the exposed water surface (see Figure 24). This thin ice is difficult to model because it requires a good representation of the complex mechanical properties of sea ice combined with a good representation of its sub-grid-scale thickness distribution. Commonly used sea-ice models, like the LIM2 sea ice model currently operational at ECMWF, feature a simple visco-plastic rheology and a single sea-ice thickness category. This can be improved upon e.g., by introducing multiple sea-ice categories that allow treatment of sub-grid-scale variability in the ice thickness, but large-scale observations of sea-ice thickness are urgently needed to validate these new model developments. At present, there are large discrepancies in the representation of thickness of thin sea ice between observational products derived from L-band and ocean analyses with a prognostic ice model, assimilating sea-ice concentration but not sea-ice thickness [292]. This is not entirely due to limitations in the sea-ice model or data assimilation methods, but systematic biases in the observational data set also play a role. Further improvements in the retrieval methods are clearly needed [293], but the current sea-ice thickness data sets from L-band radiance have already proven highly valuable for exposing weaknesses in currently used sea-ice models. Hence, sea-ice thickness observations from L-band are essential to further improve prognostic sea-ice models to the benefit of advancing characterization and prediction of polar regions. Another sea ice model development has been the parametrization of melt ponds, a feature essential for correct representation of the surface albedo of the ice and therefore capturing the correct evolution of the sea ice melt during the summer. In the past melt pond fractions have been estimated from field campaigns (e.g., [294]), but coverage of the whole Arctic basin are now possible, such as the data set produced by [295], which enables assessment of model parametrization schemes (e.g., [296]) For processes and parameters that are particularly important for Earth system model development, such as near surface temperature over polar oceans, there is a reliance on reanalysis data which can only be assessed in terms of performance with relatively limited in situ observational data within the ice [297] or using land based in situ data [298]. Sea ice and snow surface temperature data sets derived from different satellite products are starting to be used in data assimilation and are allowing the ongoing development of the sea ice model themselves [299].
Another example from the coupled system development is that data from CALIPSO has helped to identify model weakness in representing marine stratus cloud. It also helped to identify areas of model physics that would benefit from further development [300].

Towards Enhanced Global-Scale Local-Relevant Monitoring and Forecasting
Global monitoring and forecasting for environmental, weather and climate applications needs an accurate specification of the Earth surface as boundary condition to the atmosphere. This need can be satisfied at various levels of precision and complexity. However, some properties do not receive enough attention or they are simply neglected and therefore introduce errors in the surface representation. Others are approximated, due lack of information, by drastically simplified biogeophysical processes, typically when satellite datasets are not used. To enhance global-scale forecasting that maintains high relevance at local scale, it is necessary to pay more attention to the mapping of surface properties and parameters, in association with the parameterisation schemes used. In this section, we highlight some of the important mapping efforts and their relevance for modelling capabilities.

Anthropogenic Surface Modifications
NWP and climate models need lower boundary conditions to calculate the evolution of dynamic processes in the atmosphere and to produce a usable weather forecast, such as skin temperature, surface fluxes of heat, moisture and momentum. To compute them accurately, detailed plant functional maps are required. Previously, ecosystem maps were produced by collecting regional geodesic information at the national level and then merging the data in global maps. The main issue was that those maps were rapidly outdated because of changes in land use. Nowadays, human activity influences the Earth surface very rapidly by changing the landscape from forest to crop fields, breeding cattle in more profitable locations, providing water for arid places, or building homes for an expanding population.

Urban Areas
Although urban areas occupy 2.7% of the world's land excluding Antarctica (according to the Columbia University Socioeconomic Data and Applications Center Gridded Population of the World and the Global Rural-Urban Mapping Project (GRUMP)), they hosted 54% of the world's population in 2014 (according to United Nations report World Urbanization Prospects, 2014 revision) and forecasts for these areas are particularly crucial. The surface energy balance in urban areas is quite different from the one over vegetated surfaces, due to the complex three-dimensional building structure and construction materials, and therefore a special treatment for heating processes related to urban canyons is needed [301]. An urban model can be implemented as an additional tile or as a geo-located high-resolution subsurface. Inter-comparison studies involving several urban parameterisations [302,303] with simplified approaches to urban modelling were shown to be well suited for NWP and permit to represent the heat-island effect that characterize the modified diurnal cycle in large cities [304]. This is another good example where continuous satellite observations are extremely useful to detect anthropogenic influence in an urban or human settlement ecosystem type. In places of their habitat, humans develop built-up areas. The main issue with built-up areas is that they do not release their heat as fast as the surrounding country side, which leads to increased daytime temperatures, reduced night-time cooling and higher air pollution levels-Urban Heat Island effect (UHI) [304]. For instance, over Berlin the mean annual difference in UHI between simulations with without the city is 0.5 • K, and in winter daily differences are as large as several degrees [305]. Significant advances in monitoring urban areas temperature with remote sensing have recently allowed to characterise the different fluxes including the urban anthropogenic heat flux [306].

Changing Water Bodies and Irrigated Areas
Inland waters have been dramatically modified by humans in the last decades. Recent satellite observations demonstrate that the world's surface water bodies are far from static [307,308]. In particular, by analysing more than 3 million satellite images between 1984 and 2015 by the USGS/NASA Landsat satellite programme, new global maps of surface water occurrence and change have been produced with 30-m resolution [307]. These global maps demonstrate that, over the past 30 years, 90,000 square kilometres of what was thought to be permanent surface water has disappeared from the Earth's surface, linked with climatic variability and human influence (e.g., river diversions or damming and unregulated withdrawal). Elsewhere, 184,000 square kilometres of new permanent bodies of surface water have appeared, some man-made (e.g., reservoirs) and others as a result of climate change (e.g., melting of glaciers). Satellite observations provide a means for defining spatially varying lakes in NWP, but this concept has not yet been realised fully. Future studies should investigate how NWP can benefit from the real-time monitoring of inland surface water bodies. A more dynamic representation of lakes should be a future ambition of forecast centres. One of the recent possibilities to observe changes in ecosystems and take them into account in NWP is to use the 30 m (1 ) horizontal resolution Global Water Surface Dataset (GWSD) created by Joint Research Centre (JRC). JRC used Landsat 5, 7 and 8 individual images over the past 32 years to map the spatial and temporal variability of global surface water and its long-term changes. Figure 25 shows different places on the globe and their evolution in recent years. Upper plots show Aral Sea area. After massive diversion of water for cotton and rice cultivation in the 1960s shrinking of water surface area accelerated and in 1998 became 28,990 sq.km (less than a half of its initial size) [309]. Due to major recovery program launched in 2001 by Kazakhstan President Nursultan Nazarbayev and supported by the World Bank, Aral Sea water surface area started to stabilise and in 2014 became 7660 sq.km (almost 10 times less of its initial size) [309].
Lower plots show emergence of the new, largest in Western Europe, man-made Alqueva reservoir ( Figure 25). It was built in 2002, and reached its present water surface area of 210 sq.km in 2006. Newly built reservoirs can be found all over the globe. Influence of inland water bodies on local climate is over 1 • K in temperature [225,230]. Local weather influence can be more pronounced-correct lake surface state (ice/no ice) in winter conditions can lead up to 10 • K [310]    Irrigated areas have also been considered in preliminary tests, following the approach by [311,312]. Ref. [313] have demonstrated the relevance of including irrigation in a 20th century reanalysis. Figure 26 illustrates the fractional cover of irrigation and urban areas as dominant surface anthropogenic land use modifications, that are yet neglected in most global NWP models.

Relevance for Atmospheric Composition
Satellite retrievals of gases and aerosols in the global atmosphere carry the imprint of the surface fluxes of these compounds. The latter can therefore be inferred from the former by inversion of the atmospheric transport and of the atmospheric chemistry within a Bayesian approach. In recent years, Earth observations from space-based instruments have played a key role in improving our understanding of surface processes driving air composition, like for air quality [314].
The need to complement or evaluate the diverse bottom-up estimations of these fluxes by a top-down approach that would be inclusive of all source and sink processes has dramatically increased the interest in inverse modelling over the last decade (see, e.g., item 59 in SBSTA 2017, https://unfccc. int/sites/default/files/resource/docs/2017/sbsta/eng/07.pdf). In some cases, like for greenhouse gases, surface flux estimation has become a primary motivation for some satellite programmes, like NASA's Orbiting Carbon Observatory. A step further consists in optimizing some key parameters of Earth surface models based on inferred surface fluxes. For carbon monoxide, such an optimization could target fire emission factors to resolve the long-lasting mismatch in the peak fire month in Africa between models and satellite observations [315]. For ammonia, it could improve the representation of fertilisers spreading practices in inventories [316]. For methane, it could allow wetland models to better represent the response to extreme events [317]. For carbon dioxide, it could facilitate the sophistication of carbon processes representation outside the peak growing season, such as leaf flushing [318]. For aerosols, it could improve binding energy or other parameters in mineral dust emission models [319].
However, in practice, detecting enhanced or depleted plumes from space does not imply that the associated surface sources and sinks can be quantified to sufficiently useful accuracy and localization. The modesty of many plume signals when averaged vertically (particularly in the case of long-lived tracers), their fuzzy boundary within their background and with other plumes, joined to systematic errors (for given atmospheric or surface conditions) in satellite retrievals, the coarse resolution of EO systems, and in atmospheric transport and chemistry models, have slowed down progress and keep inverse modelling for Earth observation as a major scientific challenge. Some of the solutions found so far are trivial: an empirical bias correction of the retrievals before or within the inversion systems; a rise in computer resources allocated to chemistry-transport models in inversion systems in order to increase horizontal and vertical numerical resolutions. More robust solutions require dedicated spectroscopic or transport-process studies. For chemistry, measurements of new tracers that could serve as proxies may be necessary. In the next decade, satellite imagers for greenhouse gases and pollutants are expected and will bring more emphasis on flux estimation at the local scale for usage outside the scientific community. They will likely require optimizing local meteorological variables together with surface fluxes. Similarly, there is a growing need to merge information from various sensors for a given species, or for species emitted by common processes, like fire or fossil fuel burning. Properly quantifying the uncertainty of each data-stream, of the correlations between these, and of underlying models is prerequisite for these developments.

Improved Diurnal Cycle for Assimilation Purposes
The availability of new satellite data informative about superficial water over continental surfaces, particularly at low microwave frequencies (L-band and C-band), requires improvements in the description of land surfaces in atmospheric models for the simulation of satellite products, as a preliminary step towards land data assimilation. This is illustrated in the following by few examples.
Satellite pixels in the microwave frequency range have footprints of several kilometres typically encompass many surface types. It requires horizontal heterogeneities to be accounted for in the forward modelling since most NWP models have horizontal resolutions below 10 km. Indeed, at low microwave frequencies, the emission of a vegetated surface is very different from that of an open water surface (e.g., lake, river) or of a snow covered area. Radiometers dedicated to probe water in the soil are also sensitive to liquid water present in other media (vegetation, lakes, oceans, snow).
An improved description of inland water bodies has been proposed by [230] for the ECMWF model by using the prognostic lake scheme «FLake». Another important constraint on land surface schemes with this type of observation comes from the fact that remote-sensing instruments probe at most the first 5 cm of soil. However, the water contained in the top soil layer is not the main driver of surface evapotranspiration that is of primary interest in NWP models. What dominates the strength of water losses in the atmosphere or in rivers is the water content over a deeper layer including the root-zone (tens of centimetres to few metres). The usefulness of these satellite data relies on the capacity of land surface schemes to transfer background departures located in the superficial soil layer into relevant increments in the root-zone layer. The simplicity of force-restore equations from the ISBA scheme [320] allows the estimation of analytical Jacobians that relate sensitivity of the superficial soil moisture at a given time T to changes in deep soil moisture at the beginning of an assimilation window [321].
However, a contradiction appears between the "analytical expression soil moisture Jacobians" and the physical understanding of how information can be transferred from the superficial to the deep layer. This unexpected behavior is explained by the use of a single energy balance for bare soil and vegetation, together with the nonlinear behavior of vegetation transpiration near the wilting point. Indeed, an increase in root-zone soil moisture, when above wilting point, enhances vegetation transpiration that cools the surface. A decrease in surface temperature reduces bare soil evaporation and in turn reduces water losses of the superficial layer. Therefore, in a data assimilation system, satellite observations sensitive to the surface soil moisture could appear more informative than they actually are. Studies described in [322] and in [323] reveal that two-layer schemes artificially enhance deep layer corrections because the actual physical transfers (through vertical gradients of water potential) are not explicitly described.
In conclusion, multi-layer surface schemes with separate surface energy balances are expected to lead to a more realistic extraction of information from remotely sensed superficial soil moisture in land surface models. This work is particularly relevant to illustrate that approaching data assimilation with simplified models may impair the transfer of information from and to the deeper soil layers that control predictive skills in medium and extended range forecasting.

Towards Enhanced Use of High Resolution EO Data in Earth System Modelling
In this section, an analysis of data availability with the most common challenges in utilising the data is presented with the aim of identifying virtuous pathways to data inclusion in global modelling for forecasting and monitoring applications. The volume of Earth Observations of the surface creates an issue which calls for data reduction and data streaming with a clear set of applications in mind. A too passive role of EO data users (awaiting new observations without listing modelling needs), or assuming that a web portal will multiply by itself the use of satellite products, has the risk of limited user-uptake of EO data. There is the potential of creating a gap between availability and exploitation of data. Closer connection between EO surface data and Earth surface modelling, as part of weather and environmental prediction applications, can multiply users of the data by orders of magnitude.
Earth Observations ultimately drive model development, via data assimilation and via exposing systematic model errors. However, the balance between model resolution, the degree of sophistication of represented processes and the size of ensemble configurations ( Figure 27) is a trade-off considering that computing resources are limited. The Earth surface cannot be easily observed nor modelled if coarse resolution sensors or models are utilised because the subgrid variability becomes too large. The use of ensembles with a large number of members enables representing the uncertainty of the forecasts [16] attributed to the modelling steps as well as to the analyses used as initial conditions that encompass EO observability and representativity issued.

Towards More Comprehensive Model Improvement through Joint Use of Multivariate EO Data
As illustrated throughout various examples in this paper, satellite-based EO data products are essential to improving and informing model development. However, the potential of the growing suite of EO datasets could be better exploited through the joint use of multiple EO datasets. So far, this is only done in classical data assimilation even though the approach holds promising potential also for improving model structures, and for deriving more accurate, more unique, parameter value sets.
For example, Ref. [13] showed that robust improvements in the parameter calibration of the H-TESSEL model are only possible with multiple observation-based reference datasets. For this purpose, they used several EO datasets in conjunction with in situ measurements. This allows overcoming their individual shortcomings; e.g., the satellite-based soil moisture offers unprecedented spatial coverage but is available only during relatively short time periods and is constrained to the surface layer, whereas in situ soil moisture measurements cover longer time periods and multiple soil depths. Such a comprehensive approach helps to advance model development as it can capture and adjust various couplings within a model. In contrast, calibration against individual datasets such as land surface temperature results in degraded model performance with respect to other variables such as soil moisture. This is due to an imperfect representation of the energy-water coupling in the model. In this way, the use of different combinations of reference datasets for training and validation enables a targeted identification of flawed couplings within a model.
While the suite of EO datasets is growing, the complexity of state-of-the-art (land surface and hydrological) models is also increasing. The related increasing number of parameters renders it difficult to sufficiently constrain such models. The potential of a comprehensive model calibration through an adequate ratio between the numbers of model parameters and employed reference datasets has been illustrated in [8]. They used a conceptual water balance model with few parameters and calibrated it thoroughly against a range of EO and in situ datasets. As a result, this simplistic model reached similar (and partly better) performance in simulating key hydrological variables such as soil moisture and runoff.
Despite the success of a simplistic model in this particular case, the increased complexity of state-of-the-art models is required in the context of coupled weather forecasting systems, or Earth system models [324]. This indicates that the use of emerging new EO datasets is absolutely required to better constrain those complex operational models. As a result, the improved calibration of individual modules, such as H-TESSEL, within a weather forecasting system, can contribute to advance temperature and precipitation forecasting skills, as exemplified in [325].
Following this avenue, it is critical to better link model development efforts-for example to enable to explicit representation of (to be) observed EO variables-with the planning and design of upcoming EO missions. This way, the growing suite of EO data products can contribute to an improved physical understanding of land-atmosphere interactions, and furthermore to more accurate weather predictions as well as climate projections.

Delivering EO-Driven Research to Services
Buizza and colleagues at ECMWF [326,327] have presented a typical Research to Operation (R2O) cycle for new model and assimilation versions. It includes internal updates, but it does not address yet the inter-agency planning. The latter is achieved via a variety of collaborations, externally funded projects, and internal projects. We present a sketch of the R2O in a given organization specialised in global atmospheric data assimilation and discuss ways in which cooperative work can extend to multiple organizations particularly to tackle the evolution towards a kilometer-scale representation of the Earth surface.
Earth Observations have a dual role in Earth-system monitoring and prediction applications. Firstly, they are required to define the initial state of the Earth-system components, i.e., the state from where forecasts are generated by integrating numerically the Earth-system equations. Secondly, they provide an estimate of the 'truth' (the true state of the Earth), that is required to evaluate model accuracy and forecast performance. Good quality observations are essential in the testing and evaluation phases. Diagnostics that allow objective evaluation of improvements in the development and testing phases, and routine monitoring of forecast performance are essential. The latter can trigger investigations that feed into subsequent model upgrades.
The establishment of focused multi-institute projects and working-groups e.g., around international field-campaigns, is a common practice to link remote-sensing and Earth system modelling.

Satellite-Focused Field Campaign: The Concordiasi Example
Concordiasi was a field campaign, part of the Thorpex experiment [328] dedicated to the study of the atmosphere above Antarctica as well as the land surface of the Antarctic continent and the surrounding sea-ice. In 2010, an innovative constellation of balloons, including 13 driftsondes, provided a unique set of measurements covering both a large volume and time. The balloons drifted for several months on isopycnic surfaces in the lowermost stratosphere around 18 km, circling over Antarctica in the winter vortex, performing, on command, hundreds of soundings of the troposphere. Many dropsondes were released to coincide with overpasses of the Meteorological Operation (MetOp) satellite, allowing comparison with data from the Infrared Atmospheric Sounding Interferometer (IASI). IASI is an advanced infrared sounder that has a large impact on NWP systems in general. However, there are some difficulties in its use over polar areas because the extremely cold polar environment makes it more difficult to extract temperature information from infrared spectra and makes it difficult to detect cloud properties.
A number of results have been described in the Concordiasi workshop report [329,330]. Although the performance of NWP analyses and forecasts have dramatically improved over the last decade, large systematic differences remain in analyses from various models for temperature over Antarctica and for winds on the surrounding oceans. A comparison between short-range forecasts and the Concordiasi data was investigated for various NWP centres. Results show that models suffer from deficiencies in representing near-surface temperature over the Antarctic high terrain. The very strong thermal inversion observed in the data is a challenge in numerical modeling because models need both a very good representation of turbulent exchanges in the atmosphere and of snow processes to be able to simulate this extreme atmospheric behavior. It has been shown that satellite retrievals also have problems representing the sharp inversions over the area.
An example of comparison of such profiles is provided in the Figure 28 (courtesy of T August, EUMETSAT). At the surface, particular attention has been paid to observing and modeling the interaction between snow and the atmosphere. This research has led to an improvement of snow representation over Antarctica in the Integrated Forecasting System (IFS) model at ECMWF [331]. Coupled snow-atmosphere simulations performed at Météo-France with the Crocus and Applications of Research to Operations at Mesoscale (AROME) models have been shown to realistically reproduce the internal and surface temperatures of the snow as well as boundary layer characteristics [332]. In this example of the Concordiasi campaign, datasets obtained from the field experiment over snow and sea ice have proven invaluable to document the quality and deficiencies of both satellite retrievals and NWP model fields. This has helped to provide directions in which to explore the improvement of models, which will be beneficial for weather prediction and climate monitoring. The validation and diagnostics of numerical weather modelling over snow and sea-ice with Earth observations pose challenges because of the lack of regular in situ data and because satellite observations can be difficult to interpret in those areas. Field experiments can help to provide precious information to assist scientists in this validation work.

Links with the World Weather / Climate Research WWRP/WCRP Programmes
In the context of a more seamless prediction from day to decades, the World Weather Research Programme (WWRP) and the World Climate Research Programme (WCRP) have both recognised the importance of EO for the evaluation of climate models and development of climate research. For instance, the Observations for Climate Model Intercomparisons (Obs4MIP) project under the governance of WCRP (https://esgf-node.llnl.gov/projects/obs4mips/) is aiming at developing a database of existing observational products that can be directly used for model validation. GEWEX, the core project of WCRP focusing on land-climate interactions (http://www.gewex.org), has a particularly strong emphasis in this area, and oversees the GEWEX Data and Assessment Panel (GDAP), which provides regular assessments and evaluation of existing EO products. The use of EO products to improve the understanding and simulations of global water and energy processes is substantially highlighted in the GEWEX Science Questions [333].
The GEWEX Strategy 2019-2028 reviews the scope of research in the context of increasing greenhouse gases and changes in climate extremes. Examples are heat waves, heavy precipitation, droughts, or floods, which all are of particular concern [87].
Several recent studies have shown that land-surface processes play an essential role for climate extremes, and thus EO of land surface variables are likely to help constrain the representation of these extremes in current climate models. This is particularly the case for heatwaves and changes in temperature variability, which have been shown to be strongly affected by soil moisture availability (e.g., [334][335][336][337][338][339][340]).
In particular, recent results show that much of the changes in temperature extremes in mid-latitude regions can be understood as the combination of the global climate sensitivity-i.e., the global temperature response to greenhouse gas forcing-and regional effects due to soil moisture [341]. Climate simulations also suggest that typical changes in land albedo (on the order of 0.1) can substantially affect temperature extremes on local to regional scales [342,343].
Soil moisture-precipitation feedback have been shown to be relevant as well [76], although climate models possibly present biases in the representation of these relationships [344]. There are only a few studies of the possible impact of soil moisture on precipitation extremes, but there is evidence that land evapotranspiration was a strong contributor to the 2010 Pakistan floods [345] and climate model simulations also suggest that soil moisture-precipitation feedback impact projected changes in precipitation extremes [339,346]. Besides the effects of land-climate interactions for temperature and precipitation extremes, soil moisture states also play an essential role in the development of droughts (e.g., [347]) and floods, and combined with high resolution imagers can reach km-scale resolution (e.g., SMOS/MODIS combination [348]). Soil moisture products at km-scale resolution will in turn benefit from dedicated field campaigns, such as for instance the AirMOSS (https://airmoss.ornl.gov [349]) as well as in situ ground based network for calibration/validation activities.

Links with EO Satellite Data Providers
Many important land surface geophysical parameters can be inferred from satellites including: soil moisture, snow cover, surface temperature and LAI. Due to their indirect nature and given the currently used assimilation systems, remotely sensed observations are still significantly under exploited. Used products are generated by complex external processing levels (L2 to L4) which can potentially introduce inconsistencies with other atmospheric observations and processes. Coupled data assimilation systems would address these issues by allowing a consistent handling of these observations across systems, provided systematic differences (e.g., due to calibration) are removed by bias correction schemes, that can be integral part of the data assimilation system [350]. Moreover, a stronger link between observations and data assimilation, will by design enable a quality control system with reduced latency.

An International Surface Working Group
The International Surface Working Group (ISWG) has been active since 2016 to gather requirements specific to surface observations, and to enhance both our understanding and ability to monitor the components of the Earth system. Land, vegetation, snow, ice, coastal waters and open waters are included. An international science working group focused on land surface would integrate the other active scientific groups, gathering requirements that are specific to land and surface-water monitoring. A coordinated effort to bring concerns and advances from the scientific community could bring a clear focus by reviewing what technology has reached a maturity level to be directly used in land surface monitoring and modeling applications and feed into weather and climate. Usage of readily available observing capacity that has not been exploited (such as several geostationary platforms) could be fostered by dedicated workshops and a gap analysis for observing needs from this community could provide a link to future missions. The recognition of ISWG with its focus on existing international scientific coordination can be critical to support operational applications in forecasting and monitoring of weather and climate. In particular, the satellite based estimation of soil moisture, snow, land surface temperature and surface water body extents are particularly poorly considered for continuity in current satellite missions (with few exceptions). Unfortunately, when satellite data is available, users' uptake is often confined to local applications not encompassing global weather and environmental forecasting or climate monitoring, which would naturally scale the number of users to a broader community. The primary task of ISWG should be to bring forward a continuity plan using existing systems and ones which would fill application gaps in future. ISWG would provide the community a point of focus for applications of EO data that would be particularly relevant to bridge gaps between monitoring and mapping of terrestrial surfaces that have been historically developed in different communities.

Conclusions
This review presents some examples of how Earth Observations can successfully drive model development for Earth monitoring and prediction systems, such as those used for weather and climate. Often the link between observing a process, reaching a physical understanding and designing parameterisations that can be used in models is not explicitly made but is nonetheless fundamental.
Recent land surface modelling developments, aimed at increased hydrological consistency, have been substantially guided by water-dedicated missions such as SMOS and SMAP. Surface temperatures have begun to be used in models despite the resolution barriers and large systematic model errors, which had previously limited the observation uptake.
This list is non-exhaustive and aims to exemplify the great opportunities offered by remote sensing of the Earth surface combined with in situ observations. Emphasizing the link between observations and model development is important as global Earth system modelling moves to the kilometer scale. Existing models need to evolve to reduce model-data discrepancies at the surface, and data assimilation needs to evolve to coupled surface-atmosphere data assimilation.
Scientists working in the field can benefit from more than a decade of research in regional weather and climate models. The data volume required for a high resolution representation of the Earth surface (close to 1 billion points if 1/120 degree resolution is considered) requires smart data reductions and prioritisation of benchmarks. Surface temperature-sensitive and water-sensitive remote sensing products are key data to drive model improvements.
Products that combine unique remote sensing capabilities at coarser resolution (e.g., SMOS/SMAP for soil moisture, GOME-2 for vegetation) with high resolution imagers are promising for the support and further development of the next generation of weather and climate models which will be approaching kilometer-scale resolution at the surface. To ensure continuity of the existing observation capabilities to preserve and extend complementarity and diverse and covariant information, i.e., for low frequency passive microwave measurements a follow-on mission for SMOS or SMAP is needed.
While the examples given here clearly demonstrate the benefit of investing further in EO-driven model development, the gaps are substantial. In addition, the lack of coordinated efforts to attain such a goal using current and future observations can be a major obstacle. Forming application-focused model development working groups such as the ISWG will help to channel efforts and gather modelling requirements for a more effective data uptake. Three recommendations include:

•
The improvement of parameterisation schemes over land and water surfaces to reduce systematic model errors that are clearly associated to missing processes (e.g., lack or spatial resolution in horizontal and vertical dimension, inadequate representation of local physiography) that impair the realism of surface fluxes partitioning.

•
The development of observation operators that map the observed satellite radiance into physical quantities that are represented in models, particularly for thermal infrared imagers (e.g., for LST) and low-frequency microwave radiometer/radars (e.g., for L-band brightness temperature).

•
The improvement of assimilation schemes capable of handling the models and observations uncertainties related to surface heterogeneities and processes and associated to the diverse remote sensing resolutions (e.g., LST and L-band). Data assimilation shall make use of observations for constraining both surface state-variables and fluxes.
To ensure progress in those areas, the synergetic use of in situ observations and satellite data is of paramount importance and should be developed in parallel so that ground-based observations can provide a solid anchor for the physical interpretation of satellite radiances in diverse biomes, ecosystems and climates.
Advantages for an increase accuracy in monitoring and prediction at the surface can be anticipated to be sizeable. An Earth system model which is capable of reproducing the observed range of surface temperatures in accordance with the amplitude of the diurnal and the seasonal cycle, and which reproduces synoptic disturbances or persistent weather, will have greater skill in predicting temperature extremes and therefore increase the relevance for climate studies. Similarly, the water cycle can benefit from more realistic precipitation and evaporation rates when approaching scales that are relevant to convection.
An accurate and up-to-date land-use description is necessary for obtaining modelled soil moisture that is more directly comparable to kilometer scale remote-sensing products. Attention to the energy and water cycles realism is a crucial ingredient to support simulations of the natural and anthropogenic carbon exchanges at the surface.
Representing the anthropogenic influence on the Earth surface and its effects within Earth system models is a challenge for the next generation of models. In these models, there will be an increased role for Earth observations to map and monitor the natural and human-driven surface changes in the Earth system. Their aim will be to integrate weather, climate and environmental applications.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: