Forecasting Spatio ‐ Temporal Dynamics on the Land Surface Using Earth Observation Data—A Review

: Reliable forecasts on the impacts of global change on the land surface are vital to inform the actions of policy and decision makers to mitigate consequences and secure livelihoods. Geospatial Earth Observation (EO) data from remote sensing satellites has been collected continuously for 40 years and has the potential to facilitate the spatio ‐ temporal forecasting of land surface dynamics. In this review we compiled 143 papers on EO ‐ based forecasting of all aspects of the land surface published in 16 high ‐ ranking remote sensing journals within the past decade. We analyzed the literature regarding research focus, the spatial scope of the study, the forecasting method applied, as well as the temporal and technical properties of the input data. We categorized the identified forecasting methods according to their temporal forecasting mechanism and the type of input data. Time ‐ lagged regressions which are predominantly used for crop yield forecasting and approaches based on Markov Chains for future land use and land cover simulation are the most established methods. The use of external climate projections allows the forecasting of numerical land surface parameters up to one hundred years into the future, while auto ‐ regressive time series modeling can account for intra ‐ annual variances. Machine learning methods have been increasingly used in all categories and multivariate modeling that integrates multiple data sources appears to be more popular than univariate auto ‐ regressive modeling despite the availability of continuously expanding time series data. Regardless of the method, reliable EO ‐ based forecasting requires high ‐ level remote sensing data products and the resulting computational demand appears to be the main reason that most forecasts are conducted only on a local scale. In the upcoming years, however, we expect this to change with further advances in the field of machine learning, the publication of new global datasets, and the further establishment of cloud computing for data processing.


The Need for Forecasts
Humanity is currently facing global challenges that require immediate and informed action.As indicated in the fifth assessment report by the Intergovernmental Panel on Climate Change (IPCC), climate change manifests in a global increase of air and sea temperature and results in the loss of polar snow and ice, sea level rise, as well as the increased occurrence of extreme weather events [1].This is accompanied by a directly anthropogenically induced environmental change expressed, for example, by extensive deforestation [2,3], the land take of cities and settlements [4,5], and the increasing pollution of the environment [6][7][8].The ramifications of these changes affect the livelihoods of humanity directly: Decreasing snow cover in the mountains threatens the tourismbased economies of regions and entire countries [9,10].Water shortage in rivers, lakes, and reservoirs, a result of both changed water availability and increased water demand, results in freshwater scarcity and hampers electricity generation [11][12][13].And the uncontrolled increasing urbanization destroys adjacent ecosystems, impairs biodiversity, and often affects agricultural lands that are needed to support a growing population [14,15].In the 2030 Agenda for Sustainable Development, the United Nations (UN) have therefore formulated the Sustainable Development Goals (SDGs) to foster social and economic development in the future while at the same time using natural resources sustainably and mitigate climate change [16].
How these goals can be achieved and challenges best faced is a matter of concern for policy and decision makers, planners, and scientists, but also for each individual member of society.Informed decisions about how to act require reliable estimates on what will happen in the near future and how the future may be affected directly and indirectly by each respective action.Making meaningful forecasts is not a trivial task as it requires understanding of and dependable data about the processes to be modeled [17].Fortunately, the availability of extensive and high-quality datasets on nearly every aspect of the global environment has never been better and is accompanied by new and innovative means of data distribution and analysis.From these observations it is possible to understand environmental processes and, in the next step, make forecasts that can inform decisions on how to face the challenges of global change and mitigate its consequences (Figure 1).Global change impacts the Earth's surface in many ways.These dynamics directly affect the livelihoods of people and can be monitored with remote sensing satellites which generate time series of geospatial EO datasets.These can yield valuable insight into processes on the Earth's surface and enable forecasting.Reliable forecasts can inform policy and decision makers, planners, as well as society as a whole to take action to mitigate global change itself and its impact on the land surface.Several symbols modified courtesy of the Integration and Application Network, University of Maryland Center for Environmental Science (ian.umces.edu/symbols/).

Potential of Earth Observation for Forecasts
One of the most relevant sources of environmental data is remotely sensed Earth Observation (EO) data.Especially data from satellite systems can offer a global coverage of continuous measurements over long periods of time in contrast to small scale in-situ measurements.With the first civil EO satellites having launched in the 1970s, remote sensing data archives have steadily grown since -not only in data versatility through the development of new and better sensor systems, but also in temporal coverage and overall sensing continuity.Particularly some optical sensors have been imaging the land surface for multiple decades and producing long-term datasets: The Landsat sensor family has been in operation since 1972, delivering continuous high-resolution imagery now for over 40 years [18].The program is constantly extended through the launch of new sensors and complemented by the data of the Sentinel-2 satellites of the European Copernicus EO program [19].Data of the Advanced Very-High Resolution Radiometer (AVHRR) has been collected since 1981 at a coarse spatial resolution of 1 km but at a twice daily repeat cycle [20].Furthermore, the Moderate Resolution Imaging Spectroradiometer (MODIS) has been imaging the Earth's surface since 1999 at a medium 250 m spatial and daily temporal resolution in a wide variety of spectral bands [21].Both AVHRR and MODIS are scheduled to be complemented by the Visible Infrared Imaging Radiometer Suite (VIIRS) [22].The analysis of multi-decadal time series of these datasets can potentially detect long-term trends of land surface dynamics and extrapolate them into the near-future, while the continuity of observation will eventually enable a validation of the forecasting results.Additionally, the increasing availability of analysis-ready high-level data products of the aforementioned and other sensors systems can foster EO-based forecasting of many aspects of the land surface that are subject to global change.EO data offers a wide variety of long-term observations in the form of geophysical parameters, spectral indices, or thematic maps that are in many cases freely available.For example, global time series data on remotely sensed geophysical variables is available from MODIS for, among others, evapotranspiration [23], Fraction of Photosynthetically Active Radiation (FPAR) and Leaf Area Index (LAI) [24], as well as gross (GPP) and net primary productivity (NPP) [25].ESA's (European Space Agency) CCI soil moisture dataset is derived from 40 years of different active and passive microwave sensors' data [26].Time series of spectral indices on vegetation are also available from MODIS [27], as well as from AVHRR in the GIMMS dataset [28].Thematic long-term land use and land cover (LULC) products have been generated for snow [29], water [30], or general LULC [31].Finally, EO significantly profited from advances in computer and data science in the last years, since the analysis of multi-temporal and multi-dimensional raster data sets has always posed a computational challenge.For example, cloud computing platforms like Google Earth Engine [32] can now process large datasets without the necessity to download and store terabytes of imagery to local machines.Additionally, innovative machine learning methods give new data-driven insights into the continuously growing amount of geoscientific data, particularly with Deep Learning showing the potential to harness the spatio-temporal information stored in EO datasets for accurate forecasting of processes of the land surface [33].In the geosciences, short-term and long-term forecasts are already established for weather and climate parameters, respectively, and are enabled by the use of complex models and multiple and extensive data sources [33].For land surface applications, in comparison, research on the derivation of trends and reliable projections from remote sensing time series appears to be lacking and has been identified by Kuenzer et al. [34] as a future challenge for remote sensing scientists.
For this reason, this paper reviews the existing literature of the past ten years on the topic of EO data-based forecasting.We focus on terrestrial spatio-temporal processes that can be monitored using remotely sensed data, hereafter called land surface dynamics.These include EO-derived indices and geophysical parameters as well as thematic products such as land use and land cover.To the best of our knowledge there is no review paper encompassing all available EO-based forecasting methods on all applications pertaining to the land surface.However, we identified four reviews specialized on certain applications.Aburas et al. [35] and Musa et al. [36] both reviewed the future modeling of urban growth, the former focusing particularly on Cellular Automata (CA) models.Rembold et al. [37], in contrast, reviewed the usefulness of low-resolution remote sensing data for crop yield prediction and forecasting, while Li et al. [38] studied the application of EO data in flood forecasting.Hence, the aim of this review paper is for the first time to give a comprehensive overview of the application of EO data for the spatial and temporal forecasting of land surface dynamics and to identify possible research gaps.In doing so, we will tackle the following research questions:


What types of land surface dynamics have been forecast using EO data? Where and on which spatial scales have the forecasts been conducted?Our review methodology is outlined in Section 2. We then present the results of the literature analysis in Section 3: First we give an overview over the research topics in EO-based forecasting, and on which spatial scope and by whom they have been conducted.We then proceed to categorize and briefly introduce the identified forecasting methods and analyze the usage of methods in the past decade and for certain applications.Finally, we present the temporal scope of the reviewed studies and technical aspects such as EO-data importance and sensor usage.Our findings are discussed in Section 4 and finally summarized in Section 5.

Review Methodology
In this study we reviewed scientific studies that attempted to forecast spatio-temporal dynamics on the land surface using EO data.In order to identify relevant literature we searched the Web of Science database [39] for Science Citation Index (SCI) papers containing variations of the keywords forecast, prediction, simulation, modeling, trend, projection, prognosis or future in the publication title.We selected this wide range of search tags because we found that the term "forecast" lacks consistent terminology in many EO-based studies.We limited the search on papers published in the last ten years (January 2010 to January 2020) because we expect the recent opening of many EO archives as well as the rapidly growing number of EO data sources may have fostered research on EO-based forecasts especially in the last decade.Additionally, we limited our review to the 16 international scientific journals with a thematic focus on remote sensing and EO listed in Table 1 to ensure that the studies are based at least in parts on EO data.Including a wide range of searching keywords ensured that no relevant literature was omitted but also resulted in a large number of possibly relevant papers.We proceeded to manually screen and filter this list of 5580 publications according to the following criteria:


The forecast must be based at least in parts on EO data, hence the focus on EO journals.To count as EO data, input datasets must be derived directly and exclusively from remote sensing data sources.


The forecast must pertain to parameters, indices or thematic classes of the terrestrial Earth surface including inland waters.Studies with marine or atmospheric applications were rejected.


Studies should attempt a temporally explicit forecast, i.e., model values at a specified point in time in the future as seen from the latest dataset used.If, e.g., for validation reasons, future values are modeled by leaving one year out, then the proposed method at least should be explicitly designed for forecasting.
In doing so, 143 relevant studies including four review articles were identified and further analyzed with regards to their publication year, the field of application, the spatial scope of the study and author nationality, the forecasting methods employed, the temporal scope of the study and its input data, as well as technical aspects.Refer to the supplementary Table S1 for the full list of references and analyzed parameters.

Research Topics
Circa 20% of all reviewed studies were published in the year 2019 (Figure 2a).In general, after a drop between 2010 and 2012, there is a growing trend of scientific publications on EO-based forecasting.This may reflect the increasing availability of and the easier access to EO data in the last decade.It is, however, also worth mentioning that the publication output of the Chinese scientific community increased significantly in recent years.For example, in 2010 only one of 12 (~8%) reviewed studies were published by authors with an affiliation to a Chinese research institution, whereas in 2019 it was six of 28 (~21%).In comparison, the share of US publications slightly decreased from ~33% to ~29% between 2010 and 2019.
To enable a better overview over the research topics of the reviewed studies, we subdivided the multitude of forecast applications into the six spheres displayed in Figure 2b.Table 2 shows examples of applications in each sphere.We are aware that there is much overlap especially between anthroposphere and the other spheres.Hence, we consider all land surface dynamics that are actively managed through human actions as anthroposphere.For example, agricultural applications pertain to vegetation and could be interpreted as part of the biosphere.Since there is considerable human influence on the vegetation dynamics which thus differ from natural vegetation cycles, we consider agricultural applications as part of the anthroposphere.Additionally, we introduced the energy flux class for studies that focus on energy fluxes that are not the result of a specific feature on the land surface (such as a LULC class) or pertain to properties of said feature.Of the 139 studies quantatively analyzed in this review, 81 (~57%) put a focus on problems of the anthroposphere.This may be the consequence of anthropogenic processes being among the most dynamic on the Earth's surface.Processes of the biosphere and the hydrosphere are forecast in ~19% and ~12% of the cases, respectively.The remainder of the applications pertain to the lithosphere (~6%), cryosphere (~3%), and energy flux (~3%).In the following, we will discuss the most important applications and studies within each sphere.
The second major application in forecasting anthropogenic land surface dynamics is crop yield.In 2013, Rembold et al. [37] reviewed the use of low-resolution remote sensing data such as AVHRR, MODIS and SPOT VEGETATION for crop yield forecasting.They found that this data was popularly used for forecasts on the national and regional scale but that difficulties arose in regions with fragmented agricultural areas.Nevertheless, they stress that the long time series data and continuity of low-resolution sensors is mandatory for building robust crop yield models.In our reviewed literature, except for Kouadio et al. [84] and Jia et al. [85], time-lagged regressions are used for crop yield estimation and forecasting at the end of the season, sometimes as a part of specialized crop growth models.As pointed out in Section 3.4., time-lagged regression forecasts profit from a high number of observations.As a consequence, sensor systems with a low spatial resolution but a high temporal resolution such as AVHRR, MODIS, and SPOT are used predominantly.The most popular crop to be forecast is wheat [84][85][86][87][88][89][90][91][92][93][94][95][96][97][98], followed by corn [93,94,[99][100][101][102][103][104][105][106], sugarcane [107][108][109][110], and rice [111,112].Furthermore, forecasts have been made for wine [113] and potato [114] yields, as well as multiple crop types within a single study [91,93,94,99,100,115].Crop yield at the end of the growing season is in most cases regressed against spectral indices measured at a specific point in mid-season or temporal metrics of these indices computed over parts of the growing season.The most frequently used index by far is the Normalized Difference Vegetation Index (NDVI) [84,[86][87][88][89][90][91][92][93][95][96][97][98]100,101,[106][107][108][109]113,114], followed by the Enhanced Vegetation Index (EVI) [84,95,105].Furthermore, derived parameters like LAI [85,91,104] and FPAR [108,110] are used as explanatory variables.The use of these indices and parameters reflects the predominant use of optical sensors in crop yield forecasting, and only a few forecasts have been conducted using SAR data in recent years [88,102,111].These studies show, however, that microwave data can yield even better results than optical data [88], or at least be a useful complement [102].
Besides LULC and crop yield forecasting we identified few other applications, among them the modeling of temporal NDVI dynamics over sugarcane fields [116], forecasting of the displacement of a dam using InSAR data [117], future evapotranspiration as a result of land use change [118], trends of fire occurrence on different tenure systems in Zimbabwe [119], and the response of the NDVI over agricultural areas to climate change [120].

Biosphere
Applications of EO-based forecasting of the biosphere are thematically and methodologically more diverse than those of the anthroposphere even though fewer studies (26) have been conducted in that regard.Spectral indices such as the NDVI and EVI have been modeled in ten cases, e.g., for purposes of vegetation condition [121][122][123] and dynamics [124][125][126] forecasting.Others have attempted to generate future NDVI data with the help of external variables like precipitation [127] or climate indices [128], or to synthesize future NDVI images from past scenes using a deep learning approach [129].In a different study, Miao et al. [130] forecast desertification based on NDVI data and climate projections almost 90 years into the future.
Vegetation cover as a thematic land cover class has been forecast for mangroves [131,132] and various types of riparian vegetation abundance [133].Coops et al. [134] modeled climate induced tree species migration in the American North-West, while Khoi and Murayama [135] simulated forest conversion vulnerability in Vietnam.In more general approaches, Tiné et al. [136] and Anand and Oinam [137] forecast the dynamics of various LULC classes in wetlands.In a near real-time approach, Sun et al. [138] modeled the spread of a wildfire using a cellular automaton, LULC maps and wind data.
Short-term vegetation phenology forecasts have been conducted by Carrao et al. [139], who employed and compared three different self-learning time series approaches, while Liu et al. [140] applied a time-lagged regression for a similar task.Biophysical parameters have been forecast by Jiang et al. [141], who compared time-series based forecasting methods for LAI, by Donmez et al. [142] forecasting ecosystem productivity via NPP in a Mediterranean pine forest, and by Shrestha et al. [131] who forecast LAI, GPP, and leaf chlorophyll content in a mangrove forest.Based on different urban growth scenarios Sheikh Goodarzi et al. [143] simulated future conservation suitability.Arantes et al. [144] forecast evapotranspiration based on different EVI and LULC scenarios in the Brazilian Cerrado biome, while Park et al. [145] developed a short-term drought forecast system utilizing satellite-based NDVI, land surface temperature and rainfall products, as well as oscillation indices.Finally, Huesca et al. [146] forecast fire risk in an auto-regressive time-series based approach.

Hydrosphere
Li et al. [38] reviewed remote sensing-based operational fluvial-flood forecasting.They found that the implementation of EO data in operational forecasting systems is not yet established.This is due to the inaccuracy of some remotely sensed hydrological products and scaling problems, among others, and they stress that more research on the topic is needed.Nevertheless, short-term river discharge [147,148], river water level [149][150][151], and streamflow [152,153] forecasts have been performed with time-lagged regressions against up-stream precipitation, inundation, or snow cover extent.Haile et al. [154] used short-term weather projections for flood forecasting in Nigeria, while long-term forecasts of hydrological parameters have been performed based on climate change scenarios [155,156], as well as simulated LULC scenarios [157,158].Liao et al. [159] used altimetry data and auto-regressive time series modeling to forecast the water level of Qinghai Lake in China, whereas Chipman [160] used multiple optical and SAR sensors to quantify water level trends in Egyptian lakes.Large-scale regional studies have been conducted by Sutanudjaja et al. [161], who forecast groundwater height from ERS scatterometer time series data over Western Europe, and Ahmed et al. [162] simulating terrestrial water storage with a Nonlinear Auto-Regressive with Exogenous Input (NARX) neural network for major African watersheds with a half-year lead time.

Lithosphere
We identified five forecasts of geomorphological dynamics.Cenci et al. [163] and San et al. [164] have forecast shoreline dynamics in the Mediterranean by calculating trends of shoreline advancement and retreat on transects perpendicular to the coast using high resolution optical satellite imagery.Deng et al. [165] and Ma et al. [166] employed SAR data to simulate future land subsidence in Beijing and Shanghai, respectively.Assuming stable LULC change conditions, Kavian et al. [167] forecast soil erosion in an Iranian catchment 40 years into the future.
3.1.5.Cryosphere Pastick et al. [168] used a recent LULC mosaic and projected climate parameters in a regressiontree approach to map future permafrost distribution in Alaska.Yin et al. [169] and Luo et al. [170], in contrast, used a specialized model driven by MODIS land surface temperature time series data, insitu, and projected temperature measurements as well as meteorological data to model future permafrost distribution and parameters such as active layer thickness and mean annual ground temperature in the Qinghai-Tibet Plateau.Giles [171] modeled the dynamics of the Mertz Glacier Tongue in Antarctica using historic photographs and satellite data and estimates a future break-off event based on the observed cyclic behavior of the glacier.

Energy Flux
Mathew et al. [172] developed a linear time series model to forecast land surface temperature over the city of Jaipur, India, from time series of MODIS and ASTER temperature data.Furthermore, surface irradiance has been forecast for very short lead times from Meteosat data [173,174].While Licciardi et al. [173] used the spatial as well as the temporal information of each pixel and its neighbors to train an artificial neural network (ANN) for this purpose, Urbich et al. [174] calculated irradiance based on the movement vectors of clouds between two images.Patil et al. [175], in contrast, forecast synthetic spectral band values in a random forest approach to generate future synthetic Landsat images.

Spatial Scope and Author Affiliation
Next, we identified geographical hotspots in EO-based forecasting with regards to study area and author nationality.The blue hues in Figure 3 indicate the number of study areas per country, while the dots represent the approximate geographical center point of each study area.The dots are color-coded according to the spatial scale on which the respective study was conducted.For continental and three large-scale regional studies we did not count each country but listed the studies separately.Their study areas, however, are indicated as dots on the map.Note that studies with multiple distinct and distant study areas appear as multiple dots on the map, e.g., the studies of Becker-Reshef et al. [86] and Santamaria-Artigas et al. [92] in the Ukraine and U.S. Most studies were conducted in China (24), the U.S. ( 23), followed by India (16), Brazil (7), Iran (7), and France (7, including French overseas territories).African countries, in contrast, are underrepresented in EObased forecasting, with only Nigeria being the study area of more than one study (4).Apart from France and the U.S., the reviewed studies are clustered in countries with a high development rate that is accompanied by dynamic land surface processes such as urban growth and LULC change.Moreover, the types of recent land surface dynamics in these countries are also reflected in the applications of the studies conducted there.For example, in countries with a dynamic urban growth, such as China, India, and Iran, forecasts of LULC dynamics dominate (~30%, ~69%, and ~71% of all applications, respectively).In contrast, countries with large scale agriculture such as the USA and Brazil focus more on crop yield modeling and forecasting (~48% and 29% of all applications, respectively).The vast majority of the reviewed studies (~87%) are conducted on a local scale (Figure 4).Even in the times of open archives and increasing processing power, working with global remote sensing data sets seems to remain a challenge.Additionally, in many cases accurate forecasting requires higher level datasets such as LULC classifications and time series, which means that the EO data must first pass through the entire remote sensing processing chain.This, too, results in increased processing demand.As a consequence, we identified only ten (~ 7%) studies conducted on a national level, six (4%) on a regional level, and two (1%) on a continental scale.Liu et al. [140] made shortterm forecasts of spring phenology in North America using VIIRS and MODIS data, while Petersen [99] developed a framework for MODIS-based crop yield estimation and forecasting in every African country.Further regional studies were conducted in Europe [161,174], Africa [162], Central Asia [130], East Asia [145], and South Asia [147].We identified no study that conducts EO-based forecasting on a global scale.The nationality of the studies' first authors generally reflects the spatial distribution of study areas (Figure 5a).We determined the nationality by the location of the research institution the first author is affiliated with.In some cases, authors may have multiple affiliations and thus multiple nationalities.The similarities of the spatial patterns of study areas and author nationalities indicate that that local studies are usually conducted by researchers employed in the same region.While a similar number of studies have been published for study areas in the U.S. and China, the U.S. researchers have still conducted the majority of studies (33), with China taking the second place (22, Figure 5b).Most of the research on EO-based forecasting has been done in Asian countries (63), followed by North and Central America (35) and Europe (34).We attribute this to the high land surface dynamics in Asian countries in combination with the technological and scientific know-how of the local research facilities.In contrast, only nine authors were affiliated to research institutions located in African countries.Given the high development rates in many African countries, we see great potential in fostering research on EO-based forecasting in these regions.While in North America and Asia research is clearly dominated by the U.S. and China, in Europe the author nationalities are more evenly distributed with France (7), Italy (6), the U.K. (4), Germany (3), and Spain (3) being the main contributors.In sum, the European research contribution to EO-based forecasting is quantitatively comparable to that of the North and Central American researchers.

Categorization of Forecasting Methods
The literature review shows that numerous diverse methods are used to forecast land surface dynamics based on EO data.Since we aim to analyze these methods with regards to application types, spatio-temporal forecasting capabilities as well as technical aspects, we decided to categorize the identified forecasting methods.We found an approach that takes into account the variety of input data as well as the temporal forecast mechanism best suited to the task This also reflects a user's perspective from which the data availability and the desired forecast horizon (also called lead time) often determine which forecast model to employ.
The first criterion of our categorization approach therefore examines whether the data used is primarily historical data of the parameter to be forecast or whether it also includes external explanatory variables from which the future value of the parameter in question is modeled.We therefore discriminate whether a future value is self-learned based on past values of itself or explained by a regressive relationship to a different variable.In a second step we examine how the method models the temporal forecasting component.In case of the regressive methods, this is achieved either by a fixed time lag (RT) between the explanatory and modeled variables or by integrating values of the explanatory variables that are already projected (RP) into the future (e.g., climate projections, weather forecasts, or LULC change projections) into the forecasting model.The self-learning methods can be sub-divided into whether they employ time series (ST) analysis methods to extrapolate future values or are modeled through iterations (SI) of transitions between a small number of observed past states.We identified only three methods that do not fit into this four-class categorization scheme and therefore classified those as either a hybrid (H) between the regressive and self-learning methods or, in one case, as an other (O) application-specialized numeric model.
Figure 6 shows the number of applications of each method according to our categorization.Note that a scientific publication can employ more than one forecasting method; hence there are 181 forecast applications in the 139 reviewed studies.The self-learning iterative methods (37% of all applications) and the regressive time-lagged methods (33%) are by far the most frequently used, followed by self-learning time series (17%) and projection-based regressive (12%) methods.As already mentioned, other models comprise a small minority of three cases in this categorization.The following section briefly introduces each method category regarding its forecast mechanisms and the mathematical models applied.We also highlight studies, which are exemplary in their methodology or compare different forecasting approaches.Regressive time-lagged (RT): In the simplest and most common form this category comprises ordinary least-squares regressions (OLS) between an explanatory variable and a forecast variable.
The temporal component of the forecast is achieved by a time lag in the regression and is thus a fixed time interval.For example, the observed mid-season NDVI value can be correlated with the end-ofseason crop yield, e.g., [86,87,99], or river discharge can be regressed against up-stream precipitation, inundation, or snow cover, e.g., [147,148,155].In order to model future parameters based on multiple explanatory variables with a fixed time lag, machine learning techniques such as decision and regression trees (DRT) or ANN are often used.This enables the use of multiple metrics [107,114], the integration of different remote sensing systems [88,100,121,145], or the additional use of non-EO data sources [89,90,100,101,120,122,128,145] to improve forecast accuracy.For example, Gómez et al. [114] compared ten different machine learning methods at different hyperparameter settings for potato yield prediction using several Sentinel-2-derived temporal and spectral metrics retrieved during the growing season as input for the forecasting models.A random forest model showed the most promising results for the forecast with a lead time of one month.Zambrano et al. [120] compared an OLS with an ANN model to forecast NDVI values for several lead times up to half a year in Chile.Although the ANN was trained with multiple data sources such as MODIS-derived NDVI, satellitebased precipitation indices, and oscillation indices while the OLS used only the single most important predictor variable, there were no significant differences in forecasting accuracy between those two methods.This indicates that the use of more data sources and more sophisticated models does not necessarily result in more accurate forecasts.While these machine learning methods are applicable to a variety of problems, there are also more complex regression-based forecasting models that specialize in a particular application and need a very specific set of input data, which can only partially be obtained from remote sensing data sources.Crop yield, for example, is forecast using crop growth models like MOSICAS [108], SAFY [91,102], and ORYZA [111], integrating detailed information on weather and climate, soil properties and cropping practices, among others.These models, however, are not necessarily more accurate than simple regressions between NDVI and endof-season crop yield [108].Similarly, specialized hydrological models such as an adaptive streamflow model (developed by Bindlish et al. [176], applied by Vittucci et al. [149]), VIC [152], HEC-RAS [150,151], and SnowCloudHydro [153] are employed for the short-term forecasting of river water level and streamflow.
Regressive projections (RP): The integration of additional data sources generally requires a more sophisticated model design.Hence, when projected data is used, specialized complex models are more frequently employed than in time-lagged regression forecasts.By forcing these models with projected data, the forecast lead time can be increased considerably and is only limited by the temporal scale of the input data.On the downside, validation of the modeled future data is naturally impossible before the study is published and therefore subject to an amount of uncertainty that is difficult to quantify.In general, any of the more complex models introduced above can be forced with projected data.For example, models such as HEC [154] and VIC [177] are used for hydrological modeling with projected climate data, as are J2000 [156] and SRM [155].By combining the WOFOST crop yield model and the WEP-L hydrological model, Jia et al. [85] were able to estimate wheat yield 30 years into the future.To model future permafrost distribution and characteristics, Yin et al. [169] and Luo et al. [170] employed the specialized GIPL2 model, whereas Pastick et al. [168] trained a DRT model with future climate data for a similar task.In general, however, next to OLS, machine learning methods are used much less frequently in this forecast model category.Of the 22 projection-based forecasting studies identified in this review, 15 were based on climate projections [78,85,130,131,134,142,[154][155][156][168][169][170]175,177,178], six on LULC projections [65,143,144,157,158,167], and one study on both [118].
Self-learning iteration (SI): In 67 cases in the reviewed literature, future land surface dynamics have been forecast by modeling the transition probabilities between past states of the land surface and, by iterating those in a stochastic process, extrapolating them into the future.Most of these forecasting methods are based on the concept of Markov chains (MC) that allow the modeling of multiple discrete states based on previous states and the transition probabilities between those states [40].Each iteration in that process updates the states according to these probabilities.In an EO context, MC-based methods are therefore primarily used to forecast the future dynamics of thematic land-use and land-cover classes (c.f.Section 3.3.3.).Since MC modeling as such is not spatially explicit [40,41], these methods are in most cases integrated with CA models that also consider the state of the neighboring pixels when calculating the transition probabilities and allocate state changes according to a local suitability map [40][41][42][43][44][56][57][58][66][67][68][69][70][71]158].To improve the accuracy of these models, the transition probabilities and suitability (i.e., probability whether and where LULC change will occur) are frequently expressed as functions of multiple explanatory variables such as topography, socioeconomic metrics and distance functions.The relative influence of these variables can then be weighted in a Multi Criteria Evaluation (MCE) [45,46,59,79,80].In a slightly different approach, ANNs like the Multilayer Perceptron (MLP) are often combined with MC to calculate the transition potentials as functions of multiple change drivers [47][48][49][50][51]60,[72][73][74]81,82,131,135,137].Here, a multiobjective land allocation algorithm (MOLA) is usually employed to allocate the changes and produce future LULC maps.In a direct comparison by Ozturk [60] the MC-MLP approach resulted in a slightly higher simulation accuracy than MC-CA.A specialized method, which is established in urban LULC forecasting, is the CA-based SLEUTH model [61][62][63]143].It has been calibrated and used for many cities around the world with good results but has a fixed set of input parameters and cannot be integrated with other models [35].For a more comprehensive overview of studies that model urban growth with CA-based approaches, see Aburas et al. [35].Obviously, many MC-based models can appear as a hybrid between regression and self-learning models in our categorization.Since the basic mechanism of MC is the updating of a state based on transition rules learned between previous states, however, we classify these models as self-learning.
Self-learning time series (ST): While models in the SI category usually learn from the transitions between the two last time steps and earlier observations are not included [75], time-series based methods rely on a denser and longer history of observations of the forecast parameter (c.f.Section 3.4.).They seldom include external variables and are therefore, in a stricter sense than SI methods, auto-regressive.This type of forecasting method has been employed 30 times in the reviewed literature.The simplest method of a forecast in this category is to fit a linear trend to a time series and calculate future values from the resulting regression equation [119,160,163,164,178,179].More sophisticated methods of time-series forecasting consider auto-correlation, global trends, and seasonal cycles [17].Box and Jenkins [180] introduced the auto-regressive moving average (ARMA) time series modeling, whose many variations are sometimes used in EO-based forecasting [116,[123][124][125]141,146,159,161].Jiang et al. [141] compared three ST type models (Seasonal Auto-Regressive Integrated Moving Average (SARIMA), Seasonal-Trend Decomposition based on Loess, and Dynamic Harmonics Regression) for LAI forecasting and found that SARIMA yielded the most accurate results when provided with an input time series of good quality.Usually, auto-regressive models are univariate and handle only one input time series data set.Vector Auto-Regression (VAR), however, allows for multivariate forecasts based on multiple input time series, but is only effective when each time series used can be modeled accurately [116].While in the RT category ANNs are used to predict land surface dynamics based on multiple explanatory variables, they can also be trained solely on the past values from time series data of the forecast parameter [129,166,173].For example, Das and Ghosh [129] developed an ANN based on a Deep Stacking Network (DSN) to predict future NDVI images.Compared to several other ANN algorithms (NARNET, MLP and DSN) the proposed Deep-STEP approach performs significantly faster and is more accurate.Licciardi et al. [173] demonstrate the ability of ANNs in handling spatio-temporal data by using both past and neighbor values as the input in their pixel-based forecasting approach.
Hybrid (H) and Other (O) models: In only three papers the forecasting methods did not fit in our categorization scheme.The Nonlinear Auto-Regressive with Exogenous Input (NARX) neural network approach combines auto-regressive and causal forecasting in an ANN model and is therefore categorized as a hybrid method [127,162].Zhou et al. [117] applied a specialized numerical model based on interferometric synthetic aperture radar (InSAR) time series, mechanical parameters, and in-situ displacement measurements to model and forecast the displacement of a dam in China.We categorize this approach as O, and since there is limited transferability to other applications it will not be introduced in more detail.
We use this categorization system in the following sections to analyze recent developments in application, spatio-temporal capabilities, and technical demands of different forecast model types.

The Use of Forecasting Methods in the Past Decade
In order to detect possible trends in the use of certain forecasting methods in the past decade, we analyzed the application of the six identified forecasting method categories in each publication year (Figure 7).Note that we show the count of different methods used and that one scientific publication can employ multiple forecasting methods.We identify no clear trend towards the increased or decreased usage of a specific forecasting method over the entire study interval.Both of the most frequently used forecasting method categories, SI and RT, have been constantly applied in the last ten years, with slightly varying shares in each year.Very high numbers of applications in 2014 (RT) and 2019 (RT and SI) are in part the result of studies that compare multiple forecasting approaches [57,74,91,108,114].Forcing forecasting models with projected climate data (RP) apparently gained traction after the publication of the 5 th IPCC report in 2014/2015 [181] where multiple climate scenarios along different greenhouse gas concentration pathways had been presented.Although remote sensing time series have extended continuously, forecasting based on time series (ST) has not increased in the past decade and only constitutes a small fraction every year.While the classical Box-Jenkins forecasting approaches have been primarily applied during the first half of the past decade [116,[123][124][125]141,146,159,161], the past few years might indicate a growing trend towards the use of artificial neural networks [127,129,162,166,173].This reflects the general trend in remote sensing towards the increased use of machine learning methods for the handling of large and/or multiple datasets [33].

The Use of Forecasting Methods for Particular Applications
The choice of which forecasting method to apply is strongly dependent on the type of the feature that is to be forecast.Remotely sensed data can be categorized into whether it is an index directly calculated from spectral bands (e.g., the NDVI), a geophysical parameter derived from satellite measurements (e.g., soil moisture, river discharge), or a thematic class, e.g., LULC in general or a specific LULC class (Table 3).Figure 8 shows the number of forecasting methods applied for each feature type.In ~48% of the reviewed cases a geophysical parameter was forecast, in ~42% thematic classes have been simulated in the future, and only in ~9% of the cases a spectral index was modeled.While indices and parameters are generally continuous numeric values, thematic classes are always discrete.This is reflected in the use of forecasting approaches: While thematic problems are forecast with SI methods in ~84% of the cases, these methods are almost never used for index or parameter forecasting.While in parameter forecasting time-lagged regression dominates (~63%), indices are mostly (~59%) forecast with self-learning time series approaches.

Type of Forecast
Feature Examples index NDVI, EVI, reflectance 1 , NDWI 2 geophysical parameter crop yield, primary production, phenology, evapotranspiration, discharge, streamflow, water level, erosion rate, irradiance thematic land use, land cover, binary land use or cover masks, fire, permafrost occurrence, shoreline dynamics 1 Reflectance is not an index as such but a direct spectral response in a numeric format and not a derived parameter. 2Normalized Difference Water Index.Examples for feature types are given in Table 3.

Temporal Scope
Naturally, most forecasting methods use multi-temporal (~62%) or time series EO data (~30%) as input (Figure 9a).We consider multi-year EO datasets with a high frequency of evenly timed observations a time series.In general, regression-based forecasting profits from as many observations as possible to establish a robust relationship between the explanatory variables and the forecast parameter.In consequence, time-lagged regressions and self-learning time series forecasting always require at least multi-temporal input data.Additionally, long and dense time series enable the modeling of inter-and intra-annual variances.SI forecasting requires at least a bi-temporal input dataset to model the transition probabilities between observations.In most cases, at least one additional time step is used to validate the simulated image.Since many SI forecasts use high resolution images with a low temporal resolution and require computationally demanding classifications, this kind of forecast method is not frequently applied to time series data.In their CA-based LULC simulation model however, Wang et al. [74] analyze the temporal dynamics of each pixel using a Breaks for Additive Season and Trend (BFAST) algorithm and thus increase the modeling accuracy significantly.Generating a consistent time series from the low-temporal resolution Landsat data nevertheless proved a considerable challenge and was achieved with the (Enhanced) Spatial and Temporal Adaptive Reflectance Fusion Model ((E)STARFM) algorithm, which requires the additional use of MODIS data to simulate Landsat-like images.
Time-series based auto-regressive forecasting methods (ST) inherently need a high number of observations and therefore use at least multi-temporal (~40%) and in most cases time series data (60%) as input.Especially when using high resolution optical data, creating a consistent time series with highly frequent and evenly timed observations can be challenging, though, mainly due to cloud coverage and long revisiting times.There are different approaches to tackle this problem in EO-based forecasting with the most popular being the utilization of already existing high-level remote sensing products such as MODIS or NOAA AVHRR time series data sets [119,[123][124][125][126]141,146,173].Preproduced datasets such as vegetation index time series, however, can only be used for a limited variety of applications.Alternatively, multi-sensor approaches can be employed to close time-gaps or extend time series [159,160].If there is no use-ready time series product available, the required datasets can be produced given enough computational power to process a large amount of images [116,129,161].
In most of the applications which apply time series data, short to medium time series up to a length of 15 years were used (~74% of the cases, Figure 9b).The longest time series in all reviewed studies was used by Miao et al. [130] who established a relationship between 30 years of AVHRR NDVI data and climate data to model and forecast dryland desertification based on climate projections.Other studies using AVHRR time series longer than 20 years have been conducted by Kogan et al. [93], Mangiarotti et al. [126], and Shrestha et al. [131].Even though AVHRR offers one of the longest time series of all remote sensing instruments (39 years), MODIS time series (operational since 2000) are used far more often for forecasting (~63% of cases), probably due to the higher spatial resolution of the instrument.Further details regarding the use of specific sensor systems and their properties will be discussed in Section 3.5.Despite EO time series continuously expanding, easier access, and increasing computational power, we do not observe a clear increasing trend of scientific studies that utilize time series for forecasting in the past decade (Figure 7).Rising numbers of publications in 2018 and 2019, however, may indicate a recent increase in the use of time series.It remains to be seen whether this trend is continued in the coming years.
A forecast based on fewer observations (mono-temporal, bi-temporal) is possible but is only made in few and special cases.We observed mono-temporal forecasts only in studies, in which projected non-EO data is combined with EO-derived recent or future LULC maps [65,78,134,138,143,[156][157][158]168].The same occurs in two bi-temporal projection-based studies by Patil et al. [175] and Yao et al. [177], while others employ transition probabilities or transition vectors between two time steps to forecast future conditions [65,81,83,174].
Figure 10 shows the forecasting lead time per model application and category.Most of the reviewed studies either had a short lead time of less than a year (83) or a very long forecasting horizon of more than 10 years (72 cumulated).This also reflects the distribution of the most frequently applied methods of the time-lagged and iteration-based model categories.While RT methods are only capable of short-term forecasts, SI can be used for predicting land surface dynamics up to 50 years in advance.Forecasts based on projections have even been conducted with over 70 years lead time [130,134,155,156,168,175,177].Circa 70% of the short-term forecasts up to one year have been conducted using a time-lagged approach (RT), while time-series modeling (ST) was used in ~24% of the cases.The lead time of hydrological forecasts, for example, is determined by the river flow velocity and can therefore be modeled as a regression with a time lag of a few days between upper river measurements and downstream water level or discharge estimations [147][148][149][150][151]154].In crop yield forecasting, time lags are determined between the harvest date and, for example, vegetation index measurements that in the most cases are taken one to three months in advance [84,[86][87][88][91][92][93][94][95][96][97][98][99][100][101][102][103][104][105]107,109,111,112,114,115].Near-real-time forecasts with lead times of only a few hours have been conducted by Sun et al. [138], who modeled the spread of fire using a cellular automaton, as well as Licciardi et al. [173] and Urbich et al. [174] forecasting solar irradiance.
For medium lead times between one and five years, time-lagged regression was used only in one case, with Cunha et al. [113] having predicted wine yield with a lead time of ~1.5 years.Selflearning iterative and time series methods are used more frequently.While time series-based methods and the hybrid NARX model are used especially for shorter forecasting horizons up to six years [160,161,179], SI are applied for longer lead times up to ten years [52,56,65,66,69,74,75,80,82,83,132,133,157].
Even longer lead times are achieved especially with SI and RP methods.Transition probabilities of MC models are generally calculated between two images or maps acquired five or ten years apart.Each iteration of the model that updates the map according to these probabilities then models the change for a similar time interval.A few iterations of the model thus result in long lead times.It must be noted, however, that changes that are predicted far into the future are in general based on present change rates and the model accuracy is likely to deteriorate with an increasing number of iterations.The lead time of models using projections, in contrast, is limited by the temporal extent of the projections used for the forecast.As different warming and concentration pathways have been modeled up to the year 2100, the forecasts with the longest lead times are achieved with models using projections.
Mathematically, the lead-time capability of ST methods is not a result of the absolute length of the input time series but rather of the ratio between time series length and observation spacing: A short, dense time series will have a shorter forecast horizon than a long time series with large observation increments.In order to accurately model seasonality, however, the observation density in turn must be able to represent intra-annual variances.As a consequence, in the ST category, lead times of more than 10 years have only been achieved using conventional linear trend interpolation [119,164,178] which does not reflect seasonal variances.Those have been be modeled only for short lead times using Box-Jenkins forecasting approaches.

Technical Aspects
We analyzed the importance of EO data in each forecast application to examine which methods can be employed using primarily EO data and which methods rely heavily on additional non-EO datasets.To do so, we computed the ratio of the number of EO datasets to the number of total datasets used in each forecast and classified the results according to the share of EO data (Figure 11).Note that we count images from a single sensor as one data set, so that for example a time series of 100 MODIS images counts only as one single dataset.Around 20% of the forecasts use EO data almost exclusively (> 75% share), while another ~15% use less than 25% of EO data.Naturally, autoregressive time series methods (ST) in many cases have no need for external data and can thus solely rely on EO data.In contrast, forecasts using projections are often based on climate models, which we do not consider EO data in a stricter sense, and thus have a lower share of EO data.In cases where forecasts are based on projected LULC, however, even RP methods can be based primarily on EO data [143,144].In theory, SI methods can be applied using only EO data.They are, however, able to integrate external data to improve modeling results, which is done to various degrees.Models in time-lagged regression forecasting (RT) are usually built using in-situ data such as crop yields or water gauges and therefore usually rely on EO data to a medium degree (25-75%).Next, we analyzed how often sensor types (Figure 12a) and which remote sensing sensor systems (Figure 12b) were used for forecasting.Again, one reviewed study may employ more than one sensor, and since some sensors can produce multiple data types (e.g., MODIS optical and thermal), the total counts between sensors and sensor types differ slightly.Optical EO data was used 160 times, followed by SAR (17), thermal (10), passive microwave (9), and altimetry data (8).Gravimetry and scatterometer data have been used in one case each.The reviewed studies used 36 different sensor systems, which reflects the wide range of various applications of EO-based forecasts (c.f.Section 3.1.).The four most frequently used systems, Landsat (57, ~29%), MODIS (45, ~23%), SPOT (18, ~9%), and AVHRR (10, ~5%), are optical and offer at least 20 years of continuous observation and high-level use-ready products.Landsat sensors (MSS, TM, ETM+, OLI) are primarily used for thematic LULC forecasting with SI methods, where a high spatial resolution is more important than a high frequency of observation.In contrast, MODIS and AVHRR are mainly used in RT, RP, and ST type forecasts, where a high density of observations outweighs spatial resolution.SPOT is used in all forecast types, as it offers both high spatial resolution and, in the case of the SPOT VEGETATION products, a high temporal-resolution at medium spatial resolution.In theory, SAR sensors should be ideal for time-series or regression-based forecasting due to their ability to image the Earth surface independently from cloud cover and solar illumination.In practice, however, few SAR systems are operational for more than a decade, processing can be computationally demanding especially when including the complex phase information, and there is little use of high-level products that could facilitate forecasting.The most frequently used SAR sensor is Sentinel-1, which has been operational since 2014, has a revisit time of six days and whose data products are freely available.

Discussion
EO sensors have been monitoring the Earth's surface for over 40 years.EO data archives are continuously expanded not only through new image acquisitions by operational remote sensing systems but also by regular launches of new EO sensors.Therefore, long time series of observations of a multitude of land surface processes are now available and enable the detection of long-term trends that can be interpolated into the future.At the same time, the spatial dimension and coverage is the great strength of EO data.For example, it does not only allow the estimation that a city will grow in the future but also enables the modeling of probable spatial trajectories of urban growth.The resulting spatially explicit forecast data can inform planning and decision-making processes and thus help tackle the challenges posed by global change.Therefore, we see great potential especially in short to medium-range forecasts up to five years, which would align with the often limited terms of office of policy and decision makers.For these lead times we consider auto-regressive time series approaches to be particularly promising, as they enable both the detection and interpolation of longterm trends and can model seasonal dynamics on a finer temporal scale.
Developing forecasting methods for lead times of more than one year inherently poses challenges in terms of validation.Naturally, independent observations of a future variable are impossible at the time when the forecast is made.Usually, a validation is therefore performed by excluding the most recent values of the forecast variable from the training or calibration dataset.Then the values of this variable are forecast using the model, and eventually the modeled results are compared to the observed values.This approach is especially feasible for short-term forecasts.For long lead times, however, the validation period can be relatively short compared to the overall forecasting horizon.For example, for LULC forecasting using iterative self-learning approaches, stable change rates, even for long lead times, are assumed in order to define transition probabilities.These probabilities in turn strongly depend on the chosen observation interval, which can be comparably short.As a result, the deterioration of forecasting accuracy with increasing lead time is difficult to quantify accurately during the validation.An exact validation of the results based on independent observations in this case would only be possible in follow up studies years after the forecast is made.We did not identify any such studies during our literature review.However, we do not necessarily consider perfectly accurate forecasts mandatory nor realistically possible, since unexpected disruptions can always occur.We rather see the need for the reliable detection and interpolation of approximate spatio-temporal trends.To achieve this, long time series of observations are required and by now available from some remote sensing sensors.Nevertheless, in EO-based forecasting, the potential of long-term time series datasets is not being utilized to the extent that it could be.This is particularly true in the case of thematic LULC forecasting: Especially in an urban context, LULC dynamics are generally forecast at finer spatial resolutions and based on much fewer observations, e.g., when Landsat data is used.Usually, the data input then comprises only a few observations that are made between five and ten years apart.It has been demonstrated, however, that these forecasts can profit from a high temporal resolution that accurately models the history of each pixel [74].Furthermore, in order to apply time series forecasting methods on thematic data, the data first has to be converted into a numerical format.To achieve this, it would be necessary to temporally aggregate the discrete information, e.g., into "days of coverage" within a certain time interval.This, again, would require the use of high temporal resolution time series data.
Even with the increasing availability of data products from long-term operating remote sensing systems, generating use-ready products is a challenge.Forecasting happens as the last step of the entire remote sensing processing chain.Satellite imagery, sometimes from multiple data sources, first has to be acquired, calibrated, geometrically and atmospherically corrected, and image information extracted into analysis-ready datasets.Especially in time series generated from optical sensors, data gaps mainly resulting from cloud cover have to be closed.To perform the entire preprocessing of the data in order to generate a validated high-level product fit for forecasting is often not possible within the scope of a single scientific project or for policy and decision makers in governmental institutions.Hence, for applied forecasting there is a great need for use-ready, validated, high-level remote sensing data products with a high temporal resolution.There is an increasing number of such datasets available, but only the index and geophysical parameter time series from optical sensors such as MODIS and AVHRR are already widely used in EO-based forecasting, while thematic LULC products are usually still produced individually in each study.Despite the global coverage of many analysis-ready datasets, our review identified no study that attempted forecasting on a global scale.On the contrary, in most cases, forecasting was limited to local case studies, leaving the potential of global EO datasets yet unharnessed.
In summary, next to the inherent validation problem, EO-based forecasting especially faces challenges in terms of computational demand due to the need for long-time series of use-ready and multi-dimensional data products.Since there has been considerable progress in the field of Big Data in the past years, we see the potential for a rapid development of EO-based forecasting on large spatial scales in the near future.For example, the application of machine learning algorithms, such as decision and regression trees, artificial neural networks and deep learning, is becoming increasingly popular.That these methods are employed in all of the identified forecasting method categories indicates their versatility.In EO-based forecasting they are used for multivariate regressions to estimate crop yield or identify drivers of city growth, to establish regressions between time series of climate and vegetation data, as well as for auto-regressive time series modeling.At the same time, especially the ability to include multiple data sources, EO and non-EO, as well as the spatial information makes the use of machine learning popular.The example of the NARX neural network furthermore demonstrates that even the integration of multivariate regression and univariate auto-regressive time series modeling is possible, making use of both increased data variety and temporal coverage for accurate short-term to medium-range forecasts.The data required for such analyses is continuously expanded by the establishment of new remote sensing missions such as the satellites of the European Copernicus program as well as the continuation of already operational long-term missions such as MODIS, Landsat, and AVHRR.As it is independent from illumination and cloud conditions, we consider especially SAR data, e.g., from the Sentinel-1 sensor, to be a potential source of gap-free time series.Finally, the easy and fast processing of these multiple and large datasets will be increasingly facilitated by the further establishment of cloud computing solutions such as Google Earth Engine.
In this review we put a focus specifically on EO data and forecasting of parameters of the land surface and limited the scope of the literature review to journals with an EO and remote sensing focus accordingly.Due to the inconsistent terminology in many EO-based forecasting studies we had to include a wide variety of keywords in our search query, resulting in over 5000 potentially relevant items that we then had to screen manually for their actual relevance.We are aware of the fact that other relevant studies may have been published in journals that pertain to particular applications or have a broader scientific scope.We considered the inclusion of these journals, which would have increased the number of papers that would have to be screened disproportionally, though.We refrained from excluding some journals based on arbitrarily defined thresholds of citation indices such as the impact factor.Eventually, we considered our focus on EO and remote sensing journals to be a good compromise that well fits the topic of our review paper.We further argue that by choosing a review scope based on input data reflects an appropriate cross-section of applications and methods and allows the interested reader to dive deeper into a specific topic by consulting the literature presented in this review.

Conclusions and Outlook
By now, remote sensing sensors offer multi-decadal observations of almost the entire Earth's land surface.This allows the identification of long-term trends of land surface dynamics and, in the next step, enables estimations about possible future spatio-temporal trajectories.In this review, we therefore comprehensively assessed EO data-based scientific forecasting studies pertaining to all aspects of the land surface within the last decade.We identified a total of 143 relevant papers, among them four review papers focusing on certain applications and methods, in a selection of 16 high ranking remote sensing journals.We analyzed the studies with regard to the research focus and area, author affiliation, the forecasting methods applied, the temporal scope, and the type and the importance of EO input data.Our main findings according to our research questions outlined in detail in Section 1.2.are:


The strong impact of anthropogenic processes on the land surface is reflected in the research foci of the reviewed EO-based forecasting studies.In 57% of the cases land surface dynamics of the anthroposphere were the research focus of the reviewed studies, followed by applications pertaining to the biosphere (19%), hydrosphere (12%), lithosphere (6%), cryosphere (3%), and energy flux (3%).EO data is particularly frequently applied in the future modeling of crop yields and LULC dynamics, oftentimes in an urban context.In the biosphere, predominantly vegetation indices and parameters are forecast, while in hydrology the use of EO data is established in short-term flood forecasting.Further applications include the future modeling of permafrost conditions, shoreline dynamics and solar irradiance.


Researchers affiliated with institutions in the U.S. and China are the main contributors in EObased forecasting.As a consequence, a major part of the reviewed studies are conducted in either of these two countries, followed by India, Brazil, and Iran.In general, forecasts have been conducted especially in countries with high land surface dynamics or outspoken economic interests in these forecasts.For example, LULC change studies have been performed in countries with a high rate of urban development such as China, India or Iran, while crop yield forecasting is especially dominant in regions with large scale agriculture such as the U.S., Brazil or Ukraine.Due to pressing concerns regarding urban sprawl, flood and drought risk and food security, as well as a general data scarcity, we see great potential for EO-based forecasting especially in African countries.This potential remains still untapped which is reflected in the low number of authors affiliated with African research institutions identified in our review.Additionally, the potential of global datasets remains still unused.87% of all reviewed studies have been conducted on the local scale, while 7%, 4%, and 1% of the studies pertained to a national, regional, and continental scale, respectively.We identified no EO-based forecasting study on a global scale in this review.We attribute this fact to the computational challenges that still arise from the processing of large-scale geospatial data sets but expect that current developments in cloud computing and machine learning will facilitate global forecasting in the near future.


We identified a multitude of different forecasting techniques that made a categorization based upon input data and temporal forecasting mechanisms necessary to gain meaningful insight.Except for three studies all identified methods fit into our categorization scheme.The choice of which method to apply strongly depends on the variable that is to be forecast: While thematic variables such as LULC are predominantly modeled with self-learning iterative MC-based approaches, numerical variables such as indices or geophysical parameters can be forecast in time-lagged regressions, regressions based on external a priori projected data, or in autoregressive time series models.Time-lagged regressions (~33% of all applications) and selflearning iterative methods (~37%) are most frequently applied and are the dominant approaches for crop yield and LULC forecasting, respectively.We identify a trend towards the increased use of machine learning methods such as artificial neural networks and deep learning.We attribute their popularity especially to their ability to handle large multi-source datasets for multivariate modeling.Despite the increasing availability of EO time series, ML is more popular than univariate Box-Jenkins time series modeling approaches in which we see potential for accurate short to medium-term forecasts considering intra-annual variances.


For EO-based forecasting, either multi-temporal (~62%) or time series data (~30%) is used.A high number of observations, however, is used for establishing robust regressive relationships between multiple variables rather than for auto-regressive time series modeling and trend interpolation.Consequently, we see a still unused potential in spatio-temporal forecasting based on long remotely sensed time series products.EO-based forecasts have been made for lead times between a few hours up to nearly one hundred years.The forecasting horizon strongly depends on the forecasting method employed.Very short lead times based on time-lagged regressions as well as long lead times achieved with self-learning iterative methods and projection-based regressions dominate in the literature.However, we observed a lack of medium-range forecasts between one and ten years.We consider this forecasting range important because of two reasons: Firstly, this time span is the ideal range for policy and decision makers to act upon.Secondly, in this time range auto-regressive time series-based approaches are still capable of accurate intraannual modeling and could provide forecasts depicting seasonal dynamics. Forecasts are often accomplished by combining EO and non-EO data.This is especially true for regressive forecasts, while the share of EO data is higher in self-learning methods.Data from optical sensor systems dominate in EO-based forecasting.Especially data from sensor families that offer a long observation time and use-ready products like Landsat, MODIS, and AVHRR is employed.Due to their ability to generate gap free time series, we see great potential for the application of SAR data in EO-based forecasting once high-level products for long observation times are available.
In summary we identify three recent developments that may foster the research on EO-based forecasting in the near future: (1) Recent advances in data science have resulted in the development of innovative data analysis methods like machine learning, enabling new data-driven insights into the continuously growing wealth of EO data.(2) While the continuity of long-term operational EO missions is ensured, new sensor systems are launched regularly and add to the variety of EO data on nearly every aspect of the Earth's surface.(3) Recent advancements in cloud computing enable the fast processing of large datasets and will facilitate the EO-based forecasts on large spatial scales.As a result, we expect rapid developments in the field of EO-based forecasting in the coming years that will eventually enable planners and policy and decision makers to better deal with the challenges posed by global change.

Figure 1 .
Figure 1.The relevance of Earth Observation (EO) data for the forecasting of land surface dynamics.Global change impacts the Earth's surface in many ways.These dynamics directly affect the livelihoods of people and can be monitored with remote sensing satellites which generate time series of geospatial EO datasets.These can yield valuable insight into processes on the Earth's surface and enable forecasting.Reliable forecasts can inform policy and decision makers, planners, as well as society as a whole to take action to mitigate global change itself and its impact on the land surface.Several symbols modified courtesy of the Integration and Application Network, University of Maryland Center for Environmental Science (ian.umces.edu/symbols/).

Figure 2 .
Figure 2. Research topics of EO-based forecasting studies: (a) Number of studies published per year according to application sphere; (b) Share of reviewed studies according to each application sphere.

Figure 3 .
Figure 3. Number of EO-based forecast studies per country, geographical location and spatial scale of the respective study areas.One study may have been conducted on multiple study areas.

Figure 4 .
Figure 4. Number of EO-based forecast studies according to spatial scale of the study area.

Figure 5 .
Figure 5. Nationality of the first authors: (a) Number of first authors by nationality; (b) Total number of authors per nationality and continent.Nationality is determined by the research facility the first author is affiliated to.One author may have multiple affiliations and therefore nationalities.

Figure 6 .
Figure 6.Number of forecasting methods used, categorized by method type.One research article may employ multiple forecasting methods.

Figure 7 .
Figure 7. Number of forecasting methods employed by type per publication year.One research article may employ multiple forecasting methods.The red line shows the number of studies in each year using remotely sensed time series data as input (c.f.Section 3.4.).

Figure 8 .
Figure 8. Number of forecasting methods applied in each category of remotely sensed feature types.Examples for feature types are given in Table3.

Figure 9 .
Figure 9. Temporal properties of the input data in EO-based forecasts: (a) Temporality of input data per forecasting method; (b) Number of applications of time series input data per time series length in years.

Figure 10 .
Figure 10.Number of forecast applications and forecasting lead times according to the forecasting method categories.

Figure 11 .
Figure 11.Importance of EO data as percent of datasets used as input into the forecasting model.Categorized per forecasting method.

Figure 12 .
Figure 12.Remote sensing sensor types and sensor systems used for EO-based forecasting: (a) Number of sensor types applied; (b) Number of sensor system families applied.All aerial data used in the reviewed studies is optical.Some sensor systems offer multiple sensor types which is considered in (a) but not in (b).

Table 1 .
List of SCI journals included in the literature review and counts of relevant publications between 2010 and 2020.Numbers in bold contain review articles.Impact Factor according to Web of Science[39].

Table 2 .
Examples of reviewed EO-based forecast applications according to thematic spheres.

Table 3 .
Examples of forecast variables for each feature type.