Earth Observation Data Supporting Non-Communicable Disease Research: A Review

Abstract: A disease is non-communicable when it is not transferred from one person to another. Typical examples include all types of cancer, diabetes, stroke, and allergies, as well as mental diseases. Non-communicable diseases have at least two things in common: environmental impact and chronicity. These diseases are often associated with reduced quality of life, a higher rate of premature deaths, and negative impacts on a country's economy due to healthcare costs and a reduced workforce. Additionally, they affect the individual's immune system, which increases susceptibility toward communicable diseases, such as the flu or other viral and bacterial infections. Thus, mitigating the effects of non-communicable diseases is one of the most pressing issues of modern medicine, healthcare, and governments in general. Apart from the predisposition toward such diseases (the genome), their occurrence is associated with environmental parameters that people are exposed to (the exposome). Stressors such as poor air or water quality, noise, extreme heat, or an overall unnatural surrounding all affect the susceptibility to non-communicable diseases. In the identification of such environmental parameters, geoinformation products derived from Earth Observation data acquired by satellites play an increasingly important role. In this paper, we present a review on the joint use of Earth Observation data and public health data for research on non-communicable diseases. We analyzed 146 articles from peer-reviewed journals (Impact Factor ≥ 2) from all over the world that included Earth Observation data and public health data for their assessments. Our results show that this field of synergistic geohealth analyses is still relatively young, with most studies published within the last five years and within national boundaries. While the contribution of Earth Observation, and especially of remote sensing-derived geoinformation products on land surface dynamics, is on the rise, there is still a huge potential for transdisciplinary integration into studies. We see the necessity for future research and advocate for the increased incorporation of thematically profound remote sensing products with high spatial and temporal resolution into the mapping of exposomes and thus the vulnerability and resilience assessment of a population regarding non-communicable diseases.


Relevance of Public Health Research
Public health is a crucial focus of every national economy, as the status of public health directly impacts the workforce and thus the productivity of a country, as well as its resilience and social stability [1]. Poor public health or high mortality rates impact every national economy negatively. The relevance of this topic is reflected in the UN's Sustainable Development Goals (SDGs) of 2015. With health directly referenced in the third goal, "Good Health and Well-Being", and linked to other goals, the member states made clear that immediate action must be taken to improve health globally [2].
Non-communicable diseases (NCDs) are an especially great burden on society. These are diseases that have no transmitting vector that spreads them from one individual to another. Typical NCDs are cancer, diabetes, stroke, allergies, and mental diseases [3]. The World Health Organization (WHO) reported that 71% of global deaths in 2016 were due to NCDs [4]. Furthermore, the economic losses that can be attributed to NCDs are enormous. The organization speaks of a likely US$7 trillion loss over the time span from 2011 to 2025 in low- and middle-income countries alone. As a result, global targets were set in 2014 to decrease NCD mortality and promote healthier lifestyles, which also serves the SDG requirements. While some countries have made progress in the fight against NCDs, the WHO emphasizes the need for high-impact interventions in all countries. Explicitly proposed were national NCD targets and policies alongside monitoring frameworks to track progress over time [5]. Even though populations of low- and middle-income countries are especially vulnerable to NCDs, everyone is at risk regardless of age, country, or region [4]. The key risk factors can generally be split into behavioral risk factors (such as alcohol consumption or tobacco smoking) and metabolic risk factors (such as high blood pressure and obesity). Metabolic risk can partly be attributed to genetics, but unhealthy lifestyles and especially environmental influences can add to that risk [6].
To effectively decrease the risk of NCDs, next to educational measures that lower behavioral risk, we need to analyze human exposure to health-affecting environmental parameters. In particular, air, ground, and water pollution and the exposure to radiation (both ionizing and non-ionizing) impact people's health negatively [7]. Furthermore, there is evidence that exposure to noise and nocturnal light pollution has negative impacts on the immune system. The urban heat island effect and poor access to recreational green spaces have also been linked to increased NCD health risks. Exposure to environmental risk factors and exposure to behavioral risk factors are often coupled. For example, the exposure to strong noise might lead to sleep deprivation and changed sleep-wake rhythms [8,9]. Another example is a behavioral risk factor: physical inactivity. Highly urbanized, high-income countries exhibit the highest ratio of affected people per capita in this regard (the top three affected are Saudi Arabia: 61%, Kuwait: 56.6%, Qatar: 41.6%) [5]. To mitigate this risk factor, the walkability of cities could be increased, which in turn would also mean less air pollution from traffic and increased neighborhood attractiveness [5].

The Role of Earth Observation Data in Public Health Research
Satellite remote sensing (RS) is experiencing a golden age in times of high computing capacities for big data processing and a large amount of multi-sensor data available in archives that stretch back for several decades (e.g., >40 years for sensors such as Landsat [10] and the Advanced Very High Resolution Radiometer (AVHRR) [11], or 20+ years for the Moderate Resolution Imaging Spectroradiometer, MODIS). Additionally, many newly emerging remote sensing sensors allow for an unprecedented level of spatial detail and temporal repetition. In particular, the availability of European Sentinel mission data (Sentinel-1, Sentinel-2, Sentinel-3, and Sentinel-5 Precursor) allowed for a big leap in this respect. Today, satellite-based remote sensing can provide global coverage at a high spatial resolution (<10 m) with near-daily revisit time [12].
Environmental parameters that are relevant for public health can be monitored via a number of sensors, depending on the research focus and the required spatial and temporal resolution. Air pollution parameters are traceable through atmospheric measurements of, e.g., the Total Ozone Mapping Spectrometer (TOMS), the Ozone Monitoring Instrument (OMI), the Tropospheric Monitoring Instrument (TROPOMI), and to some extent even the Moderate Resolution Imaging Spectroradiometer (MODIS). Atmospheric sensors provide high temporal resolution imagery but usually offer low spatial detail [13][14][15][16]. Land surface parameters (e.g., greenness) can be monitored at higher spatial resolution based on sensors such as Landsat and Sentinel-2. Furthermore, additional geoinformation products that were derived from Earth Observation (EO) data can contribute to the mapping of an individual's exposure risk. For example, the OpenStreetMap (OSM) dataset is a global public crowdsourcing effort based on the highest resolution satellite data, which would be of unique value to the mapping of noise exposure. Therefore, information gathered from EO data can be used to analyze the individual's exposure to NCD-promoting factors and supports the derivation of exposure or risk maps for entire regions. This is part of the extensive monitoring that is needed to better understand and lower our susceptibility to NCDs, especially in terms of environmental risk determinants [5].
In our review, we present the advances in this field over the past two decades. We concentrate on research that includes both EO data and health data to find answers to these research questions:
• How has the synergistic research field of joint analyses of NCD and EO data evolved over the last 20 years?
• Where are the global investigation hotspots regarding this topic?
• What time frames and temporal resolutions are commonly covered in NCD studies incorporating EO data?
• On what spatial scale do the investigations take place?
• How relevant is EO in existing research efforts on the impact of the environment on NCD risk, and what are the major patterns with respect to the similarities and outcomes found in these studies?

Review Focus
In this review, we examine interdisciplinary approaches of NCD research using EO data and EO-derived geoinformation products. We specifically focus on studies that integrated remotely sensed Earth Observation data and categorize the studies based on the environmental parameters investigated, the spatial and temporal resolution, the sensors used, and the role of Earth Observation data within the overall mix of data available in the studies, as well as observable trends. There are some reviews that cover related topics. Some advocate the importance of EO data for public health in general [17] or NCD research specifically [18], while others reviewed methodologies for the analysis of health-relevant environmental parameters using EO. Specifically, we encountered reviews on the modeling of anthropogenic air pollution with high spatial precision [19][20][21][22] and on the impact of noise on the cardiovascular and metabolic system [23].
With an EO-based perspective and broad focus on all aspects of exposure risk, we reviewed all available articles on the topic that were published over the last 20 years following the methodology outlined in Section 2. Our results are presented in the subsequent Section 3. In Section 4, we discuss our findings before summarizing and concluding the review in Section 5.

Methodology of the Review
For this review, we concentrated on papers that had NCDs as their focus. We included all high-impact articles that based their findings jointly on public health data (mostly in the form of cohort studies) and on Earth Observation data or geoinformation products derived from them. As indicated in Figure 2, we used the Web of Science platform to run the search strings that we formulated based on certain prerequisites:

1. The topic must be related to EO, remote sensing, or GIS
2. The topic must also be related to NCDs
3. Publication type must be "article"
4. Language must be English
5. Publication date must be between 2000 and 2020

Of this requirements list, the first and second points define the subject of interest. Selecting articles by adding "NCD" or "non-communicable disease" to the search string did not yield satisfactory results because of the different spellings and synonyms used for NCDs in journal articles (e.g., noncommunicable, non-communicable, chronic diseases, non-transmittable diseases). We instead used a workaround by searching for common vocabulary in NCD papers such as "exposure", "health", or "allergies" and excluded articles that focused on infectious diseases. The literature search was conducted in the first quarter of 2020; therefore, only articles published up to 20 March 2020 were included. Since the number of database entries that we obtained based on our search strings was quite large (n > 1000), the preselected results were further filtered based on the Web of Science categories given in Figure 2. In this way, we sorted out articles that were unrelated to the underlying subjects of spatial sciences and health sciences. We subsequently filtered the obtained studies based on the impact factor of the publishing journal (threshold: 2.0) and by screening title and abstract, reducing the number of considered articles further (n = 251). The final number of reviewed papers (n = 146) was obtained after excluding approaches that did not use both health and EO data.
Remote Sens. 2020, 12, 2541
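The formal part of the screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` fields, the helper `passes_screening`, and the sample records are hypothetical and do not reflect an actual Web of Science export or the exact procedure used for this review. The topic search, the Web of Science category filter, and the title and abstract screening are not modeled here.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record layout; the field names are illustrative only and do
# not correspond to an actual Web of Science export format.
@dataclass
class Record:
    title: str
    doc_type: str
    language: str
    published: date
    journal_impact_factor: float
    uses_eo_data: bool
    uses_health_data: bool

SEARCH_START = date(2000, 1, 1)
SEARCH_END = date(2020, 3, 20)   # cut-off date of the literature search
MIN_IMPACT_FACTOR = 2.0          # impact factor threshold of the review

def passes_screening(rec: Record) -> bool:
    """Return True if a record meets the formal inclusion criteria."""
    return (
        rec.doc_type.lower() == "article"
        and rec.language.lower() == "english"
        and SEARCH_START <= rec.published <= SEARCH_END
        and rec.journal_impact_factor >= MIN_IMPACT_FACTOR
        and rec.uses_eo_data
        and rec.uses_health_data
    )

# Invented sample records, each failing at most one criterion.
records = [
    Record("Greenness and asthma", "Article", "English", date(2018, 5, 1), 3.1, True, True),
    Record("Air quality modelling only", "Article", "English", date(2019, 2, 1), 4.0, True, False),
    Record("Published after cut-off", "Article", "English", date(2020, 6, 1), 5.2, True, True),
    Record("Low-impact venue", "Article", "English", date(2015, 9, 1), 1.4, True, True),
]
selected = [r for r in records if passes_screening(r)]
print([r.title for r in selected])
```

Keeping each criterion as a separate condition makes it easy to report how many records drop out at each step, as done with the intermediate counts (n > 1000, n = 251, n = 146) in the text.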
We analyzed the remaining 146 studies to identify the parameters relevant for this review. These parameters are presented in depth in the following section. Not all studies explained their methods in equal detail. Therefore, we had to generalize certain aspects to make them comparable (e.g., all Landsat satellites were pooled into one joint "Landsat" category; the MODIS sensors were pooled into one joint "MODIS" category, regardless of platform). Environmental parameters and health outcomes that were only included in very few studies (<5% of all investigated articles) were grouped into catch-all groups labeled as "Other".
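The two generalization steps described above, pooling sensors into families and collecting rarely used categories under "Other", could look roughly like the following sketch. The function names, the prefix-based matching, and the sample labels are assumptions made for illustration; the review does not state how the pooling was implemented.

```python
from collections import Counter

# Sensor families that are pooled regardless of satellite or platform.
SENSOR_FAMILIES = ("Landsat", "MODIS")

def pool_sensor(name: str) -> str:
    """Map a specific sensor name to its family, if it belongs to one."""
    for family in SENSOR_FAMILIES:
        if name.upper().startswith(family.upper()):
            return family
    return name

def group_rare(labels_per_study, min_share=0.05):
    """Count labels over studies; labels used in fewer than `min_share`
    of all studies are merged into a catch-all "Other" group.

    Note: a study with two rare labels contributes twice to "Other".
    """
    n_studies = len(labels_per_study)
    counts = Counter(
        label for labels in labels_per_study for label in set(labels)
    )
    grouped = Counter()
    for label, n in counts.items():
        key = label if n / n_studies >= min_share else "Other"
        grouped[key] += n
    return grouped

# Invented example: 40 studies, one of which uses a rare parameter.
studies = [["NDVI"]] * 30 + [["PM2.5"]] * 9 + [["Noise"]]
print(pool_sensor("Landsat 8 OLI"), pool_sensor("MODIS Terra"), pool_sensor("OMI"))
print(group_rare(studies))
```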

Temporal Development of Studies Published over the Last Two Decades
Overall, non-communicable disease research employing both health data and Earth Observation-derived geoinformation on environmental parameters is clearly a relatively novel field; however, it is advancing rapidly. Figure 3 depicts the number of studies that were published per year within our review timeframe. While we considered articles published between 2000 and March 2020, the oldest study matching the review criteria described in Section 2 was published in 2003. Most reviewed publications are from within the last decade. Starting with the year 2011, we see publications for each year with overall increasing numbers. Although the number of published research efforts fell from 2016 (n = 19) to 2017 (n = 9), we see an increase of publication numbers on the topic over the last years. The ongoing year 2020, for which only articles published in the first quarter were considered, already has the fourth-highest number of reviewed publications per year (n = 10). Here, about 40 publications can be expected by the end of the year. This indicates a continuation of the positive trend we see for recent years.
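The expectation of about 40 publications for 2020 is consistent with a simple linear extrapolation of the first-quarter count. The sketch below assumes a constant publication rate over the year, which is our simplifying assumption and not a method stated in the text; the function name is hypothetical.

```python
def extrapolate_annual(count_so_far: int, fraction_of_year: float) -> float:
    """Scale a partial-year count to a full year, assuming a constant rate."""
    if not 0 < fraction_of_year <= 1:
        raise ValueError("fraction_of_year must be in (0, 1]")
    return count_so_far / fraction_of_year

# 10 reviewed articles in the first quarter (one-fourth of the year).
expected_2020 = extrapolate_annual(10, 0.25)
print(expected_2020)
```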

Spatial Distribution of Studies Investigated
We analyzed the distribution of publications globally, based on the first institution of the first author (Figure 4A). At first glance, North America and East Asia stand out as hotspot regions of research. However, taking a closer look at Europe (Figure 4B), we see considerable research activity on the subject as well, albeit with much lower publication numbers for the individual countries compared to the top publishing countries: the USA holds the largest share of articles (n = 62), followed by China (n = 31) (Taiwan counted separately) and Canada (n = 18) (Figure 4A). Figure 4C shows a slightly different image of the world. It depicts the countries that were investigated, as the country of affiliation of the first author and the focus country of the study are often not the same. The highest values can still be seen for the USA, followed by China, and then Canada. However, in general, the values seem to be spread more evenly. Looking closely, portions of the African continent, Central Asia, South Asia, and South East Asia also appear in this graphic, though with low numbers of articles.
In the zoomed-in regional view of Europe (Figure 4D), we see higher numbers than in Figure 4B, indicating that European countries have more often been the subject of investigation than the home of the investigating institution. Especially the United Kingdom and Spain show a high number of published articles in which they were picked as the main area of interest. The difference in the representation of countries in Figure 4A may be explained by differences in the availability of medical cohort data and underlying differences in data protection regulations among countries. Large-scale accessible patient databases such as the Chinese National Scientific Data Sharing Platform for Population and Health (NSDSPPH) [24], for example, could hardly be realized in accordance with privacy regulations in countries such as Germany.

Discipline of Studies Investigated
Public health research is a field that has its roots in medicine. By incorporating spatial data into the investigation, such as environmental parameters, it becomes a transdisciplinary subject. In Figure 5, we outline the disciplinary focus of publishing journals of reviewed papers. We used the disciplinary focus of the publishing journal as an indicator of the scientific background of studies. The reviewed studies were published in 46 journals with foci on joint multi-disciplinary, exclusively medical, or exclusively geospatial research.
The majority of the reviewed articles were published in multi-disciplinary journals (Figure 5B). In particular, the journals "Environment International" and "Environmental Health Perspectives" are frequent platforms. While most studies were published in multi-disciplinary journals, only 14 different journals with this focus were chosen for publication. The variety of medical journals contributing to this review is larger (n = 28) (Figure 5A).

Temporal Coverage of Studies Investigated
We identified the temporal grain and extent of studies, based on the EO datasets used. The temporal grain is depicted in Figure 6A. We differentiated between uni-temporal, bi-temporal, and multi-temporal approaches. Analyses with only one observation or only one long-term average were labeled as uni-temporal. Those with two distinct observations were labeled as bi-temporal. Approaches with more than two time steps were labeled as multi-temporal. Figure 6B shows the time span covered per article. The oldest studies are listed at the top and the most recent ones are listed at the bottom; the start and end year are indicated on the x-axis. The color of each dumbbell indicates whether an article is classified as uni-, bi-, or multi-temporal.
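The labeling rule described above can be written as a small function. This is a minimal sketch: the function name is hypothetical, and treating a single long-term average as one observation follows the definition given in the text.

```python
def temporal_grain(n_observations: int) -> str:
    """Label a study by the number of distinct EO observations it uses.

    A single long-term average counts as one observation.
    """
    if n_observations < 1:
        raise ValueError("a study needs at least one EO observation")
    if n_observations == 1:
        return "uni-temporal"
    if n_observations == 2:
        return "bi-temporal"
    return "multi-temporal"

# One long-term average, a before/after comparison, and a daily time series.
print(temporal_grain(1), temporal_grain(2), temporal_grain(365))
```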
While most studies incorporate temporal changes by using multiple RS observations, over one-third of the investigated approaches use only uni-temporal EO products. As depicted by Figure 6B, the temporal coverage of studies (indicated by line length) mostly does not serve as an indicator for temporal resolution (indicated by color shading). An exception are research efforts that extend further back than 1990. These are all multi-temporal [25][26][27][28][29]. As an observable tendency, studies with longer time frames appear to be more often multi- than uni-temporal. Yet, while some studies investigate data that was accumulated over a long time via averaged values, other studies rely on time series with a high repetition rate over a relatively short time span [57][58][59]. Both the length of the line and its color give an indication of the research focus. While some studies investigate chronic diseases and their association with long-term exposure to certain environmental parameters (e.g., [41,51,55,60,61,62]), others concentrate on the health response to acute environmental changes such as heat waves [57,58]. While Figure 6B incorporates the recency of studies, it should be noted that the investigated time period of a study does not necessarily coincide with its publication year. Furthermore, even though some studies rely on data of ongoing investigations (e.g., cohort studies) or describe the intermediate status of ongoing research efforts, the end-year marker in Figure 6B is always earlier than or equal to the publication year. Beyond the tendency noted above, no clear patterns are visible that would associate temporal extent with grain in the reviewed studies.

Spatial Coverage of Studies Investigated
The spatial coverage of a study depends chiefly on the spatial coverage and resolution of its underlying medical and EO data. We categorized the studies we analyzed into broad groups based on geographic coverage, namely, from small to large, into "local", "regional", "national", "continental", and "global".
• "local": studies that concentrate on one specific location (e.g., a city with at most suburbs in the direct vicinity)
• "regional": studies that include multiple cities, counties, or even federal states/provinces but not an entire country
• "national": studies that include at least one entire country but not an entire continent
• "continental": studies that include an entire continent in their research
• "global": studies based on data of more than one entire continent
The classes serve as indicators of spatial extent and administrative scale of studies, although it must be noted that some regional studies cover a larger study area than certain national studies. For example, while Huss et al. cover the Canadian province of Québec (approximately 1.6 million km²) [63], which falls into the category of a regional study, Marinaccio et al. cover the entire country of the Netherlands [64], putting the study into the "national" category, although the covered area is much smaller (approximately 0.04 million km²).
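The class definitions above can be read as a decision rule that depends on administrative completeness rather than on area. The sketch below encodes that reading; the function and its three inputs are hypothetical simplifications, and the two area values are the approximate figures quoted in the text.

```python
def spatial_scale(entire_continents: int, entire_countries: int,
                  multiple_localities: bool) -> str:
    """Assign a study to a spatial class from what its data fully covers."""
    if entire_continents > 1:
        return "global"
    if entire_continents == 1:
        return "continental"
    if entire_countries >= 1:
        return "national"
    if multiple_localities:
        return "regional"
    return "local"

# Approximate study areas in million km², as quoted in the text.
quebec = {"scale": spatial_scale(0, 0, True), "area_mio_km2": 1.6}
netherlands = {"scale": spatial_scale(0, 1, True), "area_mio_km2": 0.04}

# The class reflects administrative scale, not area: the regional study
# covers a far larger area than the national one.
print(quebec, netherlands)
```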
An overwhelming number of analyses were conducted on a local (approximately 24%), regional (42%), or national scale (approximately 31%). Very few studies (approximately 2%) work on a continental or even global scale. Of the included research, only Garland et al. and Butland et al. analyzed NCDs globally, and even in their papers, not every country is represented through medical data [65,66]. The lack of available health data makes large-scale studies particularly challenging, as suggested by Jia et al. [67].
EO data is available globally with good spatial and temporal coverage. However, since most reviewed studies rely on public health data such as cohort studies, and most cohort studies have investigation areas ranging from local to national scale, most investigations are confined to national boundaries. Studies conducted on a larger scale rely on the few international project datasets available [66,68,69] or use aggregated data from various sources [65].

Remote Sensing Satellite Sensors Used within the Studies Investigated
All reviewed studies used EO or GIS products and combined those with medical data to gain insight into the effect of the environment on public health. All approaches rely on EO data, either by directly acquiring satellite data and working with custom-built products or through analysis-ready data or models that were derived using satellite data. Figure 7 outlines the popularity of different sensor types (Figure 7A) and individual sensors or sensor families (Figure 7B) in reviewed studies. Broadly, EO sensors are either active or passive. Active sensors emit electromagnetic (EM) radiation themselves toward Earth and detect the backscattered signal for analysis. Passive sensors, on the other hand, measure incoming EM radiation reflected or emitted from the Earth's surface. They do not emit signals themselves for measurement but instead gather reflected energy that stems from the sun (mostly in visible and near-infrared wavelengths) or was emitted from Earth (mostly thermal infrared) [70]. While the inner ring of Figure 7A presents the usage of active and passive sensors in the studies we reviewed, the outer ring subdivides used sensor types into optical and thermal (passive) versus radar (active) instruments. Furthermore, this figure shows which studies used multiple sensor types. Other sensor types that are not depicted in Figure 7 are not used in any of the reviewed studies.
All reviewed studies rely on passive sensors, in nearly all cases exclusively (Figure 7A). Only 1% (n = 2) additionally incorporate data from active sensors [56,71]. All active sensors used are synthetic aperture radar (SAR) sensors (Figure 7A). These emit and detect EM signals in the microwave range [70].
Most approaches in the studies we analyzed rely on a combination of passive optical and thermal detectors (approximately 84%). A much smaller percentage of studies exclusively use optical sensors (approximately 15%). All studies that used SAR data incorporate data from optical and thermal sensors as well (Figure 7A). Figure 7B shows the popularity of sensors (and sensor families) as relative shares of the total number of reviewed papers; the colors of the individual columns represent the sensor type. MODIS in particular stands out as a frequently used sensor. Over 70% of included articles used MODIS data or products derived from it. The second most used sensors are the ones onboard the Landsat satellites. About 17% of the considered research efforts use Landsat data. MODIS and the Landsat sensors all cover optical as well as thermal wavelengths. It must be noted that not all studies including sensors with thermal detection capabilities actually used temperature data in their analyses. A closer look at derived parameters will be given in the following subsection.

Environmental Parameters Analyzed in the Studies Investigated
To describe the environmental influence on NCD occurrence frequency, studies rely on spatial data in the form of EO data and often additionally include other geoinformation products. However, included covariates, such as socioeconomic status, lifestyle and demographic details, or the medical history of cohort participants for example, mostly exist as non-spatial datasets. Most studies working with cohort data incorporate EO-based environmental parameters in the form of point measurements. The coordinates of these measurement points are defined by geotagged locations of cohort participants. Depending on the study, participant locations mostly coincide with home or work addresses. Most studies using these coordinates associate them with individual pixel values from the EO-derived geoinformation products. Only a few studies produce area-wide exposure models based on EO parameters and validate those using NCD-occurrence information.
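The association of participant coordinates with individual pixel values, as described above, can be illustrated with a minimal Python sketch. It assumes a north-up raster held in a NumPy array whose upper-left corner lies at (`x_origin`, `y_origin`) and whose pixels are square with edge length `pixel_size`, with points given in the raster's coordinate system; the function name `sample_raster` and all values are hypothetical and not taken from any reviewed study.

```python
import numpy as np


def sample_raster(grid, x_origin, y_origin, pixel_size, xs, ys):
    """Return the pixel value under each (x, y) point, NaN outside the grid.

    grid[0, 0] is the upper-left pixel; rows increase southward.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # Convert map coordinates to integer column/row indices.
    cols = np.floor((xs - x_origin) / pixel_size).astype(int)
    rows = np.floor((y_origin - ys) / pixel_size).astype(int)
    inside = (
        (rows >= 0) & (rows < grid.shape[0])
        & (cols >= 0) & (cols < grid.shape[1])
    )
    values = np.full(xs.shape, np.nan)
    values[inside] = grid[rows[inside], cols[inside]]
    return values


# Toy 3 x 4 raster with 1-unit pixels and upper-left corner at (0, 3).
grid = np.arange(12).reshape(3, 4)
exposure = sample_raster(grid, 0.0, 3.0, 1.0,
                         xs=[0.5, 3.2, 5.0], ys=[2.5, 0.1, 5.0])
```

Sampling a single pixel per address is the simplest possible exposure assignment; buffers around an address or area-wide exposure models would require aggregating several pixels instead.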
Many of the reviewed studies rely on non-spatial data alongside EO-derived and other spatial data in their analyses (Figure 8A). In fact, less than one-third (approximately 28%) of the reviewed texts outlined methodologies that were purely based on spatial data. The rest included some form of non-spatial data. The spatial datasets used, which are presented in Figure 8B, can be divided into atmospheric and land surface datasets. The share of studies that include both atmospheric and land surface measures (approximately 45%) is larger than the share of either group focusing solely on one of those environmental aspects. Furthermore, we see that approximately 76% used atmospheric data, and only slightly fewer approaches (approximately 69%) included land surface parameters.
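The three rounded shares reported above are mutually consistent, which can be checked with a short inclusion-exclusion calculation. The sketch below is plain arithmetic on the approximate percentages given in the text; the variable names are introduced here for illustration.

```python
# Approximate shares (percent of reviewed studies) as stated in the text.
atmospheric = 76   # studies using atmospheric data
land_surface = 69  # studies using land surface data
both = 45          # studies using both domains

# Studies restricted to a single domain.
only_atmospheric = atmospheric - both
only_land_surface = land_surface - both

# Inclusion-exclusion: share of studies using at least one domain.
at_least_one = atmospheric + land_surface - both
```

Under these rounded figures, the purely atmospheric and purely land surface groups are each smaller than the combined group, in line with the statement above.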
Of the studies that included atmospheric measurements in their analysis, 93% concentrate on air pollution parameters. Considerably fewer studies included atmospheric measurements purely in the form of meteorological data [8,9,48,49,56,72,73]. Only one of the reviewed studies measured pollen concentration, which does not belong to either of the two introduced sub-groups [74].
An impression of the usage frequency of individual environmental parameters is given in Figure 8C. The environmental parameters are color-coded according to their broader categorization in Figure 8B. Not all parameters used in the reviewed studies are represented individually in Figure 8C. Measures that were included in fewer than 5% of reviewed studies were assigned to catch-all groups labeled "Other". The atmospheric sub-groups each have their own "Other" class in this figure. The group "Other atmospheric parameters" directly translates to pollen data. Land surface parameters were not subdivided into smaller categories; here, measures used in fewer than 5% of included articles were aggregated in the class "Other land surface parameters".
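The 5% rule used for the "Other" classes can be sketched as follows. This is a minimal illustration with made-up parameter names and counts; the function `aggregate_rare` is hypothetical, and summing counts in "Other" counts usage instances rather than distinct studies, which is a simplification.

```python
def aggregate_rare(usage_counts, n_studies, threshold=0.05,
                   other_label="Other"):
    """Merge parameters used in fewer than `threshold` of all studies."""
    result = {}
    other = 0
    for name, count in usage_counts.items():
        if count / n_studies < threshold:
            other += count  # usage instances, not distinct studies
        else:
            result[name] = count
    if other:
        result[other_label] = other
    return result


# Made-up counts for 100 hypothetical studies.
counts = {"param_a": 60, "param_b": 30, "param_c": 4, "param_d": 3}
grouped = aggregate_rare(counts, n_studies=100)
```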
Concentrating on individual measures, we see the highest usage frequency for the atmospheric air pollution measure PM2.5 (particulate matter with an aerodynamic diameter of 2.5 µm or less, also referred to as fine particulate matter). This parameter is included in approximately 69% of all reviewed texts. The second most frequently incorporated parameter is aerosol optical depth (AOD), with approximately 54% of the approaches including this measure. Both parameters belong to the same sub-category of atmospheric measures. The third most used parameter is land use, which belongs to the group of land surface parameters. It is included in approximately 36% of reviewed studies.
Taking a closer look, we see that within the air pollution sub-group, only the two parameters AOD and PM2.5 are the focus of more than 10% of reviewed studies. Additionally, the two parameters are usually coupled, with approximately 70% of articles including both parameters. This is because AOD is often incorporated as a satellite-derived proxy of air pollution, and especially of PM. Many studies include satellite-based AOD and calibrate the data with ground-based PM2.5 measurements for large-scale analyses of PM abundance [25,28,29,31,34,35,37–44,50,52,54,55,59,60,66,68,69,72,…]. This is often done following an approach of van Donkelaar et al. [119,120] that additionally includes meteorological parameters and atmospheric models for higher accuracy. In total, over 90% of studies including AOD use it to approximate PM concentration by calibrating against ground measurements. Only a small minority directly assesses PM-induced adverse health effects using AOD without prior calibration using ground measurements [121]. Along with AOD and PM, some studies enhance their analyses of atmospheric pollution by using O3 and/or NO2 data as well [28,36,39–42,55,59,62,66,69,71,85,86,90–93,97,105,110,113,122–132]. Few studies consider other air pollution indicators, such as SO2 or PM1 [55,97,110,113,133]. Those parameters were aggregated into the category "Other air pollution parameters" in Figure 8C.
The studies that included meteorological data mostly do not specify the exact composition of that dataset. Among the studies that do go into detail on the meteorological data used, air temperature is reported most often (in approximately 9%), while all other meteorological parameters do not exceed the 5% threshold mentioned earlier and are therefore aggregated in the class "Other meteorological parameters" (Figure 8C).
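The idea of calibrating satellite-based AOD against ground-based PM2.5 measurements can be illustrated with a deliberately simplified Python sketch. It fits a plain linear relation by ordinary least squares on synthetic values; this is a stand-in for illustration and not the approach of van Donkelaar et al. [119,120], which additionally uses meteorological parameters and atmospheric models. The function names and all numbers are hypothetical.

```python
import numpy as np


def fit_aod_calibration(aod, pm25):
    """Least-squares fit of pm25 ~ slope * aod + intercept."""
    aod = np.asarray(aod, dtype=float)
    pm25 = np.asarray(pm25, dtype=float)
    design = np.column_stack([aod, np.ones_like(aod)])
    (slope, intercept), *_ = np.linalg.lstsq(design, pm25, rcond=None)
    return slope, intercept


def predict_pm25(aod, slope, intercept):
    """Apply the fitted relation to AOD values without ground stations."""
    return slope * np.asarray(aod, dtype=float) + intercept


# Synthetic, noise-free example: co-located AOD and station PM2.5 (µg/m³).
station_aod = [0.1, 0.2, 0.4, 0.8]
station_pm25 = [10.0, 15.0, 25.0, 45.0]
slope, intercept = fit_aod_calibration(station_aod, station_pm25)
estimate = predict_pm25([0.5], slope, intercept)
```

In practice, the relation between AOD and surface PM2.5 varies with season, humidity, and boundary layer height, which is why the reviewed studies typically rely on more elaborate models than a single global slope.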
Of the studies making use of land surface parameters, most approaches focus on the parameters of "Greenness" or "Land use" (Figure 8C).
The category "Greenness" represents several measures that essentially focus on the same parameter. While most studies including greenness rely on the normalized difference vegetation index (NDVI) [26,43,45,46,53,56,59,62,71,73,74,81,91,97,113,122,124,126,128–130], some opt for additionally including the enhanced vegetation index (EVI) [152], or pre-analyzed EO products that feature information on vegetation presence and type [56,159]. Land use is the most popular land surface parameter in the reviewed articles. Land use classifications not only discriminate by physical parameters such as land cover, but also provide information on the use of the classes found. This additional thematic depth apparently makes land use information more popular than any other land surface parameter (Figure 8C). Land use products are also used for regression-based modeling of air pollution, which usually yields daily predictions with R² values of 0.7-0.85 and Root Mean Square Errors (RMSEs) of approximately 2.5-15 µg/m³, depending on the model used [28,37–39,43,59,71,72,75,76,80,81,92,99,100,102,104,113,117,118,124–126]. Fewer investigated studies use land cover information (approximately 18%). Land cover products portray the Earth's surface in the form of abstracted classes, which are categorized based on measurements of remote sensing sensors according to the classification scheme used.
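For readers less familiar with the two accuracy measures quoted for the regression-based air pollution models, the following Python sketch shows one common way of computing R² (as the coefficient of determination) and the RMSE from observed and predicted concentrations. The reviewed studies may use other variants (e.g., cross-validated R²); the values below are invented for illustration.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


# Invented daily PM2.5 values (µg/m³) at four monitoring sites.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
model_r2 = r_squared(observed, predicted)
model_rmse = rmse(observed, predicted)
```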
The remaining land surface parameters are used considerably less often in reviewed studies. Land surface temperature (LST) is the remaining parameter with the largest share (approximately 10% of reviewed studies) and often takes on a key role in the analyses of heat-related mortality [56–58,64,77,81,109,134,160,161]. Elevation, which is featured in a similar share of studies (approximately 8%), on the other hand, is exclusively used as a supporting land surface parameter and is never singled out as chiefly important for NCD occurrence [25,37,71,80,88,89,99,104,113,123,149,158]. The two remaining individually counted land surface parameters, light pollution (approximately 7%) and traffic volume (approximately 6%), are investigated even less often. However, most methodologies including these strongly associate them with investigated NCDs [8,9,33,54,56,63,71,75,81,124,141,145,148,162–165]. Only one study investigated breast cancer incidences and light pollution and could find neither a correlation nor association [166].
Depending on the products needed in the analyses, researchers use different EO sensors that fulfill spatial, temporal, and spectral resolution requirements. We generally see that most studies chose optical and thermal sensors (presented in Figure 7).
Patterns we could reveal when looking at the measures derived from the most used sensors (all above 5% threshold) are displayed in Figure 9. Notice that the columns show portions of all usage cases of a certain sensor. They do not indicate differences in the usage frequency of individual sensors, as shown in Figure 7B.
Concentrating on the broader view provided in Figure 9, we can see that all sensors except for one are exclusively used for either atmospheric or land surface parameters. The sensors are listed by usage frequency. MODIS, the most used sensor, is the only sensor that is used both for atmospheric and land surface parameters. We see that a clear majority of MODIS use cases is concerned with atmospheric products. Products of Landsat and the Defense Meteorological Satellite Program Operational Linescan System (DMSP OLS) are purely land surface-oriented, while those of the Multi-angle Imaging SpectroRadiometer (MISR) and Ozone Monitoring Instrument (OMI) are purely atmosphere-oriented.

Disease Groups in Focus within the Studies Investigated
All the reviewed research includes health data, and most of the studies reviewed had access to cohort studies and provide insights regarding possible associations of NCDs with environmental parameters. We present diseases that are investigated in at least 5% of the approaches individually in Figure 10. Investigated NCDs are often grouped together in reviewed studies (e.g., all types of respiratory diseases are simply addressed as respiratory diseases). We apply the same categorization in this review. For simplicity, we will be using the term NCD instead of NCD groups from now on. Figure 10A shows how often certain NCDs are part of the research focus. The presented NCDs show interrelations, and being affected by one can raise the susceptibility toward another [167–173]. Therefore, some studies focused on multiple NCDs at once [45,54,77,87,102,112,135,137,174]; thus, the sum of the columns of Figure 10A exceeds 100%.
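Why shares computed per NCD can add up to more than 100% is easiest to see with a small example. The Python sketch below uses four invented studies, two of which address more than one NCD; the function name `focus_shares` is hypothetical.

```python
from collections import Counter


def focus_shares(study_foci):
    """Percentage of studies addressing each NCD (multi-label counting)."""
    counts = Counter(ncd for foci in study_foci for ncd in set(foci))
    n_studies = len(study_foci)
    return {ncd: 100.0 * count / n_studies for ncd, count in counts.items()}


# Four invented studies; each entry lists the NCDs one study focuses on.
studies = [
    ["CVD"],
    ["CVD", "respiratory"],
    ["mental health"],
    ["respiratory", "diabetes"],
]
shares = focus_shares(studies)
total = sum(shares.values())
```

Because every study is counted once for each NCD it addresses, the denominator stays at the number of studies while the numerators overlap, so the total exceeds 100% as soon as a single study covers two NCDs.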
Concentrating on Figure 10A, we see that no individual NCD is the focus of more than 20% of reviewed studies. The most investigated disease group is cardiovascular diseases (CVDs) (approximately 17%). CVDs span all NCDs that affect the heart and vascular system [175]. Specific outcomes that belong to the CVD category are heart failure (e.g., [38,87,157]) and stroke (e.g., [35,50,82,93,110,131]) as well as hypertension (e.g., [51,59,78,107,158]). The second most investigated disease group is respiratory diseases (approximately 15%). These include NCDs affecting the lungs or the respiratory tract in general, for example asthma (e.g., [54,85,86,100,102,162]) and constricted breathing in general (e.g., [37,94]). The third most researched NCD group is mental health (approximately 14%). Studies included in this category focus either on a mental condition (e.g., [45,104,117,151,153,154]) or on improving mental health (e.g., [91,138,140,155,176]). Nearly as many studies (approximately 13%) focus on mortality related to environmental parameters. While mortality is not an NCD, premature deaths are often linked to extreme events (e.g., heat waves) and existing NCDs (e.g., [56,58,89,109,160]). Adverse birth outcomes, which include all types of negative effects on childbirth associated with environmental parameters that are not attributed to infectious diseases, are investigated in approximately 10% of reviewed studies. Examples range from relatively mild effects such as those on height-for-age (e.g., [46,115,122,128,177,178]) to extreme cases leading to perinatal death [68,179]. The next smaller category is "Cancer" (approximately 9%). We included every study investigating cancer without further categorization, since some studies analyzed the environmental impact on total cancer risk [32], while others concentrated on specific types such as breast cancer [33], lung cancer [28,180], or skin cancer [65]. The last individual category is "Diabetes" (investigated in approximately 7% of reviewed articles), in which studies focus on insulin resistance, prediabetes, gestational diabetes, or diagnosed diabetes of any type [34,39,52,111,181]. All other diseases (together accounting for approximately 27% of the studies) are not shown here because of the small number of research articles in which they were investigated.
Figure 10B shows the share of studies investigating a certain NCD that incorporate only spatial data or spatial and non-spatial data together. We see that for all NCDs, most studies use both spatial and non-spatial data for their analyses. Especially high shares of approaches combining spatial and non-spatial data are found for mental health, cancer, and diabetes research, while the highest share of purely spatial data-oriented research is found for CVD studies.
Figure 10C presents the usage of environmental data from the atmosphere or land surface domain per NCD. Most NCDs in Figure 10C show higher shares of investigations using atmospheric data than land surface-based approaches. Respiratory diseases, adverse birth outcome, and cancer research use atmospheric data in a large majority of cases (each > 85%), indicating an importance that has been found and accepted within the research community. In the case of diabetes- and CVD-oriented research, many studies rely purely on atmospheric data. On the other hand, approaches focusing on mental health are mostly land surface-oriented. In fact, purely land surface-oriented methodologies have the absolute majority here (they make up 55% of reviewed mental health-oriented works).
Looking at the individual measures used per NCD (Figure 10D), we see that AOD and PM2.5 are among the most included parameters across the board. However, taking a closer look, certain patterns can be identified. For CVD, PM2.5 and its RS proxy AOD are the most included EO parameters, with approximately 70% of investigated studies including either parameter. In the land surface domain, the most used parameter is "Land use" (approximately 48%), followed by "Greenness" (approximately 40%). Studies investigating respiratory diseases, diabetes, or the aggregated "Other" NCD class show similar patterns in terms of the most used atmospheric and land surface parameters. Mental health research, which is dominated by land surface-oriented approaches, often incorporates greenness measures (approximately 60%) or land use (approximately 50%). All-cause mortality shows an exceptionally high share of studies including land surface temperature, as a considerable portion of papers concentrating on this topic investigate heat-related mortality [56–58,134,160,161]. Cancer research shows the highest diversity in atmospheric parameter usage. AOD and PM2.5 are not as predominant here as in, e.g., CVD research, and a particularly large number of approaches include NO2 or O3. Furthermore, cancer research shows a high share of studies including the land surface parameter "Light pollution" (approximately 15%).

EO Data Relevance within Studies Investigated and Major Patterns of Outcome
In the following, the EO datasets used per study are compared to the other measures or datasets included. With this, we aim at assessing the impact and relevance that EO data has for the study. The number of EO datasets used in comparison to all data used does not give information regarding the actual value EO provides per study. However, we assume that a larger number of included EO datasets points toward a larger explanatory impact of this data.
To account for EO data usage intensity, the data sections of reviewed articles were analyzed. This quantitative measure is described in the following and presented in Figure 11.

At first glance, we see that most approaches (approximately 62%) use fewer EO-derived datasets than other datasets (Figure 11A). About 25% use equally many EO and non-EO datasets, while approximately 13% include more EO than non-EO data.
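The comparison of EO and non-EO dataset counts per study amounts to a three-way classification followed by a share calculation. The Python sketch below illustrates this with four invented studies; the function names and labels are hypothetical and the exact counting rules applied to the reviewed articles are not reproduced here.

```python
from collections import Counter


def eo_relevance(n_eo_datasets, n_other_datasets):
    """Classify a study by comparing its EO and non-EO dataset counts."""
    if n_eo_datasets < n_other_datasets:
        return "less EO"
    if n_eo_datasets == n_other_datasets:
        return "equal"
    return "more EO"


def relevance_shares(studies):
    """Percentage of studies per class; `studies` holds (n_eo, n_other)."""
    counts = Counter(eo_relevance(n_eo, n_other) for n_eo, n_other in studies)
    total = len(studies)
    return {label: 100.0 * count / total for label, count in counts.items()}


# Invented (EO, non-EO) dataset counts for four studies.
example_studies = [(1, 3), (2, 2), (1, 4), (3, 1)]
example_shares = relevance_shares(example_studies)
```

As noted above, such a count is only a proxy: a single EO dataset may carry most of the explanatory weight in a study even when it is outnumbered by other datasets.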
As visible in Figure 11B, EO data contributes a roughly equal share to atmospheric and land surface-related studies. A very slight tendency toward less EO data integration can be seen for atmospheric studies.
In Figure 11C, EO data contribution in atmosphere and land surface studies is presented in depth with respect to the individual parameters. We see that in atmosphere-related studies, EO data is used slightly less for all parameters than in land surface studies, but that analyses related to atmospheric temperature and PM10 in particular seem to be more EO data-intensive. For land surface-related studies, the share of EO data is usually above 50% in analyses focusing on LST, whereas for the other parameters it is around 50%.
Figure 11D depicts EO data use with respect to individual disease groups. The distribution of shares is not as homogeneous as that in Figure 11B, and EO data integration varies more markedly. Studies dealing with all-cause mortality and adverse birth outcomes used a large amount of EO data. For all-cause mortality studies, over 40% of investigations used more EO datasets than other datasets. In contrast, very little EO-derived information is integrated into diabetes-related studies, for example.
In the following, we categorize the major findings and overall outcomes of the studies investigated. We assess the impact that environmental parameters (both atmospheric and land surface) have on NCD risk. For that, we looked at how often past studies associated investigated NCDs with individual environmental parameters. This is displayed in Figure 12. Environmental parameters are listed on the left side of the graphic, and NCDs are listed on the right.
The NCDs listed on the right side of Figure 12 are now addressed one by one. Starting with respiratory diseases, we see prominent associations with air pollution parameters such as NO2, O3, and PM2.5. All air pollution parameters that are associated with respiratory diseases have a negative influence on public health. The design of studies focusing on respiratory diseases is quite similar. For example, the majority of studies focus on big cities such as Beijing, Boston, Mexico City, or Melbourne, and they usually had access to large cohort datasets of several tens of thousands of participants. Often, pregnant women and children were a focus within the cohort, as they are considered more vulnerable. Especially in children, respiratory illnesses are strongly associated with PM, even more so when high PM2.5 exposure occurred during a sensitive prenatal time window in mid-pregnancy. Apart from air pollution, respiratory diseases can also be related to vegetation exposure, especially when associated with exposure to vegetation-related allergens [63,126]. For example, this was found by Andrusaityte et al. [126], who analyzed the health data of 1500 children in Lithuania. The authors related EO-derived greenness to asthma occurrence, finding a slightly increased risk in regions with high greenness. This is backed up by Lambert et al. [74], who investigated NDVI data and data on asthma occurrence and also found strong correlations between the two. Furthermore, Dadvand et al. [137] found similar associations, but confined to specific types of green spaces (for example, city parks). The study suggested that the vegetation composition in such green spaces (e.g., a majority of grasses) can be less favorable to human health than naturally vegetated areas.
The NCD group "Cancer" is mostly connected to PM2.5 exposure, UV irradiation, and light pollution. All of the cancer-oriented studies in which PM2.5 is identified as an associated factor agree on its negative health impact [40,41,88,92,112,125]. These authors all investigated different types of cancer, which indicates that PM2.5 has negative health consequences for the entire human organism (i.e., not only for lung cancer occurrence, as might be expected).
The environmental parameter second most frequently attributed to the Cancer group is UV irradiation. Among studies that outline an association of UV irradiation and cancer occurrence, different authors come to slightly differing conclusions. All authors agree that exposure to UV radiation increases the risk of skin cancer, and most authors solely investigated skin cancer as a possible consequence [25,32,65]. Only Lin et al. further investigated the possibility of a generally increased cancer risk; based on an extensive cohort dataset including over 400,000 participants in the USA, they found that UV radiation in fact only increases the risk of skin cancer. Other cancer types show no or only a very slight correlation with UV [30]. With respect to light pollution, it is debated whether nocturnal light pollution increases breast cancer risk [33,182] or not [166].
Figure 12. Correlations between EO-based environmental parameters and investigated NCDs. Note that atmospheric optical depth (AOD) is not shown on its own; it is accounted for by PM2.5 and PM10. Similarly, air and land surface temperature are shown as one group. Investigations mostly measured land surface temperature (LST) and calibrated the results using air temperature measurements, resulting in one joint temperature product.
In particular, Bauer et al. point out that although a correlation between nocturnal light pollution and breast cancer incidence exists, causality cannot automatically be assumed. They argue that differences in exposure due to different lifestyles may play an important role for breast cancer risk [33].
Diabetes investigations often find PM2.5 values to be associated with cohort study outcomes. High PM2.5 concentrations are identified as a negative influence on health, increasing the risk of developing the disease [34,39,52,55,111]. It is worth noting that this risk is increased even when the values are below the allowed thresholds of the cohorts' individual countries (mostly China and Canada). However, here again, the difference between correlation and causality needs to be taken into account. While PM2.5 exposure definitely increases the risk, many other factors, especially behavioral ones (e.g., nutrition), play a role. On the other hand, vegetation exposure was identified as a mitigating factor that decreases the risk for diabetes [59,78]. This might be attributed to the usually higher rates of physical activity among people living near green spaces.
Cardiovascular diseases (CVDs) exhibit similar patterns. All studies that outline the association of PM2.5 with CVDs agree that high exposure increases the risk of CVDs [29-31,38,50,53,55,78,87,107,111-113,158]. While all demographic groups are affected by this, studies found especially significant associations between PM2.5 exposure and strokes in young adults. On the other hand, all studies that investigate the influence of green spaces on CVD occurrence agree that vegetation in the participants' vicinity decreases the risk of CVD onset [59,71,135,148,157,174]. Other environmental parameters have seldom been associated with CVDs. However, one particularly noteworthy study investigates the association of heat waves with CVD incidence. In this study, correlations between CVD risk symptoms and the ownership of air conditioning were analyzed. The authors conclude that, due to lower heat exposure indoors, people who own air conditioning are less accustomed to acclimatizing to high temperatures. This may leave them vulnerable to high temperature events outside, triggering CVD risk symptoms when they are exposed to sudden heat stress [129]. Finally, the impact of traffic noise on CVD risk was investigated. Studies were mainly undertaken in Europe, and each was based on a study cohort of approximately 4000 participants. An effect of traffic noise on the study population was seen, but the correlation with CVD risk was not significant [164,165].
Adverse birth outcomes are strongly associated with PM2.5 concentration in all studies focusing on air pollution parameters [48,75,79,106,115,177]. The studies were mostly based on large (>100,000 participants), nationwide Chinese cohorts with a focus on pregnant women. There seem to be a number of possible effects that may stem from high PM exposure. Fetus size in particular may be affected. Further studies see an increase in preterm births associated with PM2.5 exposure [48,106,133,177]. As for other NCDs, surrounding greenness had a generally positive health effect, mitigating the risk of adverse birth outcomes [46,122,124,142,150]. However, most of these studies took place in regions where high neighborhood greenness is generally associated with high socioeconomic status (USA, Canada, Europe).
All-cause mortality was mainly attributed to PM2.5 and heat events. All studies that focus on the relationship between PM2.5 and all-cause mortality report an increasing mortality risk from PM2.5 exposure [38,60,72,83,108,112,130,179]. Goyal et al., who investigated PM2.5 exposure and also the specific composition of the particulate matter, observed that only carbonaceous PM2.5 has significant negative health impacts. In their study, naturally occurring dust and sea salt with the same grain sizes could not be linked to higher mortality [179]. Of the studies focusing on temperature as an influential environmental factor, a majority found clear associations with all-cause mortality [56-58,86,134,160]. This is especially the case when temperatures remain high overnight, which is particularly likely in urban environments because of the urban heat island effect. Dousset et al. point out that observed mortality rates respond principally to increases in nighttime temperature [134]. Likewise, temperature variance has been found to be associated with an increase in mortality. All investigations concerned with the effect of high temperature on the human body point in the same direction: sudden stress that forces abrupt acclimatization is most harmful and therefore associated with negative health outcomes. This holds true especially when there are no sufficiently long phases of relaxation between stressing events. Vegetation is reported to have a mitigating influence on mortality risk, which may also be due to its cooling effect on the urban heat island [56,130,143].
In mental health research, PM2.5 was always identified as a risk factor [42,44,98,102,117]. Negative effects associated with PM2.5 exposure include depressive symptoms and autism. The latter was found especially in children. Findings were based on cohorts from around the world; notably, Al-Hamdan et al. [44] base their findings on a large study cohort of >1,000,000 US children. Exposure to green spaces is generally identified as an environmental parameter that improves mental health [26,45,73,104,136,140,144,146,147,149,151,153,154,176]. One Danish study stands out in this regard because of its extremely large medical dataset of approximately 1,000,000 participants (nearly 20% of the entire country's population). The authors found clear associations with good mental health in people who had high vegetation cover in their vicinity while growing up. It is pointed out that the strong associations found remain significant even after correcting for socioeconomic factors and parental medical history. This excludes possible cross-correlations with behavioral and genetic risk that might influence the study's outcome [26].
The next NCD group in Figure 12 ("Other") consists of multiple smaller NCDs that were each addressed in only a single study (e.g., individual studies on short-sightedness, sleep deprivation, or obesity). In all these individual studies, PM2.5 proved to increase NCD risk. Furthermore, as for other NCDs, vegetation cover decreased NCD risk, especially in the elderly [45,141,147,152]. In two individual studies focusing on sleep deprivation (insomnia), light pollution was found to be an environmental parameter increasing NCD risk [47,163]. Both studies based their results on observations from large cohort studies (each >100,000 participants) in highly urbanized environments in Korea and the USA, respectively.

Discussion
In this review, we analyzed 146 studies that investigated non-communicable diseases (NCDs) as well as the human exposome strongly impacting health, with the help of Earth Observation (EO)-derived geoinformation products. The studies were selected based on publication type, impact factor, and data sources. Details on methodology and research foci were aggregated and categorized to work out the similarities and differences. Results show a strong increase in research activity regarding the subject with a high count of articles from the USA and China, but also a rising number of publications from Europe. Whereas before 2010, only 1-2 papers per year were published on the synergistic analyses of health and EO data, in 2019, already 38 studies that matched our search criteria can be found. The chosen selection criteria for reviewed studies were developed to present the most prominent scientific efforts in the field over the last 20 years. However, it has to be noted that this methodology may have omitted individual studies on the topic that do not fulfill these criteria. It is possible that the number of scientific works would be higher if, for example, all publishing platforms and languages were considered.
That being said, we still clearly see the signs of an emerging field where more and more research teams develop expertise in both associated fields (environmental medicine as well as mapping the exposome supported by Earth Observation data). Most of the studies investigated are based on medical data in the form of cohort studies. These studies largely incorporate environmental parameters in the form of point measurements. The coordinates of these measurement points are defined by geotagged locations of cohort participants. Depending on the study, participant locations mostly coincide with home or work address. Therefore, most studies used these coordinates and associated them with individual pixel values from the EO-derived geoinformation products (NDVI, LST, etc.) without generating geoinformation with full spatial coverage of the cohort extent. While this approach can provide accurate results for investigations with a focus on exposure levels per individual, exposure maps for entire areas are rarely generated, but they may be valuable for future NCD research.
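The point-based extraction described above can be illustrated with a minimal Python sketch. It assumes a north-up raster with square pixels and a known upper-left corner; the grid values, coordinates, participant identifiers, and the `sample_raster` helper are hypothetical and do not stem from any reviewed study.

```python
def sample_raster(grid, origin_x, origin_y, pixel_size, x, y):
    """Return the pixel value covering map coordinate (x, y), or None if outside.

    grid is a list of rows (row 0 is the northernmost row), origin_x/origin_y
    is the upper-left corner, and pixel_size is the edge length of a pixel.
    """
    col = int((x - origin_x) // pixel_size)
    row = int((origin_y - y) // pixel_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical 3 x 3 NDVI grid with 250 m pixels (illustrative values only).
ndvi = [
    [0.2, 0.4, 0.6],
    [0.3, 0.5, 0.7],
    [0.1, 0.2, 0.8],
]

# Hypothetical geotagged participant locations in map units (metres).
participants = {"A": (100, 700), "B": (600, 100), "C": (900, 100)}

# One exposure value per participant; None marks a location outside the grid.
exposures = {
    pid: sample_raster(ndvi, 0, 750, 250, x, y)
    for pid, (x, y) in participants.items()
}
print(exposures)
```

The sketch returns one value per participant and nothing for the remaining pixels, which is why such workflows do not by themselves produce area-wide exposure maps.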
So far, the most used EO datasets in reviewed studies are freely and globally available MODIS or Landsat data. Both data sources have rich archives with over 20 years of continuous data, which makes them especially useful for this subject. Newer sensor fleets, such as Sentinel-2 and Sentinel-3, offer similar data at a higher resolution, although the operation time span of these sensors is still comparatively short. For future research, Sentinel products can nevertheless be a great alternative to the more conventional products because of their higher spatial and temporal resolution. Especially in heterogeneous environments, such as urbanized regions, this may be beneficial for the spatial accuracy of EO-based environmental parameters and therefore the reliability of the NCD associations found.
However, when working with high-resolution spatial information on health-affecting environmental parameters, ethical challenges must be considered. Especially when working with medical data such as cohort studies, the resulting products must not compromise privacy or patient anonymity. In addition, risk maps and exposure models are sensitive data that may, for example, have an impact on the real estate sector. On this basis, low-exposure neighborhoods may see rapid gentrification and rises in real estate value, while high-exposure areas may be faced with the opposite situation. Challenges that come with such information should be kept in mind. The higher the studies' spatial resolution, the more this applies.
Another point worth discussing is the difference between correlation and causality. Most studies find clear positive or negative correlations between disease occurrences and environmental parameters. However, this does not necessarily mean that the environmental parameters are the explanatory variable for the disease; correlation by no means indicates causality. For example, in many places of the world, better access to inner city green spaces has been found to be beneficial for overall health. At the same time, we have to be aware that urban areas with much green space are usually higher income areas, where factors such as higher education levels, higher quality nutrition, larger living space, etc., might also have positive impacts on the overall health condition. Many authors of the studies investigated are aware of these facts, and several try to rule out the impact of socioeconomic factors via normalization procedures. Nevertheless, the large mix of influences of the exposome, genome, and behavioral factors is so complex that as much additional data as possible needs to be integrated to move from correlation to causality.
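The confounding problem described above can be made concrete with a small synthetic example. The Python sketch below is not the normalization procedure of any reviewed study; it uses partial correlation as one simple stand-in for adjusting for a socioeconomic factor, and all variables (`income`, `greenness`, `health`) are simulated under the assumption that income drives both greenness and health while greenness has no direct effect.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear influence of z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic cohort (assumption): income influences both neighborhood
# greenness and health; greenness itself has no direct effect on health.
rng = random.Random(0)
income = [rng.gauss(0, 1) for _ in range(2000)]
greenness = [v + rng.gauss(0, 1) for v in income]
health = [v + rng.gauss(0, 1) for v in income]

raw = pearson(greenness, health)
adjusted = partial_correlation(greenness, health, income)
print(f"raw correlation: {raw:.2f}")
print(f"adjusted for income: {adjusted:.2f}")
```

In this constructed case the raw correlation between greenness and health is clearly positive, while the income-adjusted value is close to zero. Real cohorts involve many more interacting factors, so such an adjustment can only reduce, not remove, the gap between correlation and causality.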
Finally, NCD research using EO parameters is a strongly transdisciplinary topic, but most studies still integrate only a small number of EO-derived environmental parameters. Although only a relatively low number of articles have been published in the field, a rapid increase of interest over the last few years was observed. We interpret this as a sign of an emerging research topic with a large, so far untapped potential that relies on transdisciplinary expertise and data. So far, most studies that focus on this topic approach it from a medical point of view. Fewer studies investigate the relationship between NCDs and the environment from an EO perspective and provide real, area-covering spatial information such as exposure risk maps. Research groups that focus on both NCD research and Earth Observation are still relatively few. However, especially in current times, the need to establish a relationship between geospatial data and disease data is more pressing and relevant than ever. We expect, not only for NCDs but for all types of diseases, a further rising demand and increasing necessity to integrate Earth Observation data into disease analyses and models.

Conclusions and Outlook
The exposome is defined as the sphere that a human is exposed to, including all environmental factors that might impact health in a negative way. Environmental risk factors with adverse health effects range from air pollution and extreme heat to strong noise exposure. The number of studies on the topic is somewhat limited by the availability of medical data that can be combined with EO data to find associations of environmental parameters with NCDs. This is also reflected in the fact that the majority of studies address a local or regional scale (e.g., communal, federal, or provincial level), a considerable number address the national level, and only very few studies cover an even larger scale. This can be attributed to the fact that the medical data included in the reviewed studies are for the most part restricted to a national or even smaller scale. Research with a continental or global focus is rare because national health data are seldom comparable: methodologies and standards of sampling and health data collection differ from country to country.
In our research, we found that a considerable number of studies use EO data on both atmospheric and land surface parameters to map the exposome of the sampling group, and that many studies even accounted for the temporal variability of environmental factors. EO data from optical and thermal sensors are clearly favored for this subject, whereas SAR data are only seldom included in reviewed approaches.
Atmospheric parameters are slightly more often included in the joint analyses than land surface parameters. Typical atmospheric parameters integrated into NCD studies are atmospheric optical depth, ozone, nitrogen oxides, dust and fine dust concentrations, and air temperature. Typical land surface parameters investigated are land cover, land use, greenness, light pollution, land surface temperature, and elevation, to name the most relevant ones. Then, the environmental risk associated with these parameters (e.g., extreme levels of fine dust or extreme heat events) is investigated with respect to the occurrence of NCDs. Typical NCDs investigated in the studies are cardiovascular diseases, respiratory diseases, cancer, diabetes, mental health disturbances, and adverse birth effects, as well as overall mortality.
In addition to health information from cohort studies and EO-derived geoinformation, other data are often integrated into the analyses (socioeconomic data on income, education level, and nutrition). On average, and simply based on the number of datasets that were used in the investigated studies, 40% of data stem from Earth Observation satellites. Over the two decades investigated, we see an increasing trend in this proportion.
Nearly all studies integrating atmospheric parameters dealt exclusively with anthropogenically induced air pollution (AOD, O3, PM2.5, PM10), whereas only one focused on pollen concentration (not considered an air pollutant). Common to these studies is that air pollution measures are overwhelmingly associated with negative health impacts, predominantly the occurrence of respiratory and cardiovascular diseases. Bad air quality is also correlated with the occurrence of cancer, diabetes, and mental health problems; however, there are fewer studies focusing on the association of air pollution with these NCDs. Correlations that were found between these atmospheric parameters and negative health impacts were overall very strong and significant. Elevated air temperature, especially during extreme heat events in the summer, was often associated with higher all-cause mortality, especially in the elderly living in urban environments. Studies with a focus on temperature usually integrated atmospheric and land parameters, namely air temperature and land surface temperature (LST). Interestingly, it was found that nighttime air temperature and LST play an especially crucial role. Vegetation cover is mostly identified as a mitigating factor regarding NCD risk. It was found that the greener the neighborhood of a person, the lower the risk for disease. In particular, the occurrence of mental health-related diseases increases markedly if people live in environments with no access to green space.
Even though most reviewed studies found associations between environmental parameters and NCD risk, we see that some of the potential provided by Earth Observation has been left untapped. Datasets with higher spatial and temporal resolution than the ones used in many studies are available and should see an increase in use over the next years. Furthermore, we see a high potential in employing EO data time series over several years or even decades and correlating these with long-term cohort studies to model the development of health risk under a changing exposome over time.
We postulate that the potential lying within the joint analyses of medical data and EO data in a synergistic manner is still relatively untapped. Even if these transdisciplinary approaches are clearly on the advance, novel pathways need to be explored. The association of environmental parameters and health outcomes is currently often analyzed with a focus on exposure values for individuals rather than an area-wide focus. This means that in most studies pixel-related information was extracted per cohort participant (e.g., PM, NDVI, LST, etc.), but no area-wide mapping took place.
Advanced spatial models and predictions validated by medical data are a possible future trend. The few research efforts that already incorporate EO into NCD research are able to achieve area-wide monitoring of health-affecting environmental parameters. We are certain that the resulting products will be important not only for the health sector but also for decision makers beyond it. Applications of such exposome monitoring include the prediction of in- and outpatient visits in hospitals attributable to environmental triggers as well as risk minimization efforts of health insurance companies or individuals. However, in the complex landscape of the health sector, such an advance could trigger a variety of adjacent processes, both desirable (e.g., individualized patient care in hospitals, intensification of green spaces to increase public health, etc.) and undesirable (e.g., intrusion into patient privacy, neglect of neighborhoods with high environmental pollution, etc.).
We see the need for more research focusing on the development of area-wide exposure maps and models, as high-resolution geoinformation layers for large areas can easily be derived. Then, cohort localities of individuals can be placed within these layers. However, this has to be done in a fashion that respects patient anonymity and with respect to the societal sensitivity that such accurate exposure models hold.
Author Contributions: C.K. and C.T.-H. jointly developed the idea for this research and developed the structure of the manuscript. P.S., supervised by C.K. and C.T.-H., created the literature database, analyzed the articles, generated the figures, and wrote an initial version of the manuscript. All authors subsequently jointly discussed, iterated, and rewrote sections of the manuscript until finalization. All authors have read and agreed to the published version of the manuscript.
Funding: This work was undertaken in the context of the Helmholtz-Climate Initiative Project (HI-CAM), funded by the Helmholtz Association, HGF.