1. Introduction
The Normalized Difference Vegetation Index (NDVI) is arguably the most widely implemented remote sensing spectral index for monitoring Earth’s land surface. Since its earliest reported use in 1973 [1,2], the term NDVI has appeared in nearly 121,000 scientific articles, conference papers, and books (Google Scholar). The index capitalizes on the optical properties of leaves: the photosynthetic pigments (chlorophyll, associated light-harvesting pigments, and accessory pigments) efficiently absorb radiation in the visible range of the spectrum (to power photosynthesis), while the cellular structure of leaves strongly reflects radiation in the near-infrared (NIR) range. With its simple formula and direct relationship to vegetation photosynthetic capacity, NDVI serves as a proxy for a wide range of essential vegetation characteristics and functions (e.g., fraction of photosynthetic radiation absorbed by the canopy, leaf area, canopy “greenness”, gross primary productivity), with countless applications in agriculture, forestry, ecology, biodiversity, habitat modeling, species migrations, land surface phenology, earth system processes (nutrient cycling, net primary productivity, evapotranspiration), and even economic, social, and medical sciences.
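For readers less familiar with the index, the standard NDVI formula, (NIR - Red) / (NIR + Red), can be sketched in a few lines of Python. This is a minimal illustration only: the function name `ndvi` and the reflectance values in the usage lines are hypothetical and are not taken from the product described here.

```python
from typing import Optional


def ndvi(nir: float, red: float) -> Optional[float]:
    """Return NDVI = (NIR - Red) / (NIR + Red) for one pixel.

    Inputs are surface reflectance values in the NIR and red bands.
    Returns None when the denominator is zero and NDVI is undefined.
    """
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


# Hypothetical reflectance values, for illustration only.
dense_canopy = ndvi(nir=0.50, red=0.10)  # strong NIR reflectance, strong red absorption
bare_surface = ndvi(nir=0.20, red=0.20)  # equal reflectance in both bands gives 0.0
```

Values near 1 indicate dense green vegetation, while values near or below 0 indicate sparse vegetation, bare ground, water, or snow.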
Satellite remote sensing (SRS) allows for the calculation of NDVI globally at a range of temporal intervals and spatial resolutions dependent on sensor characteristics and the satellite orbit, with a common inverse relationship between temporal and spatial resolutions. The Landsat Mission, with its first sensor launched in 1972, provides the only uninterrupted long-term (>30 years) high-resolution remote sensing dataset that can supply a continuous historic NDVI record globally. The Landsat record at 30-m resolution is ideally suited for local or regional scale time-series applications, particularly with the recent release of higher-level surface reflectance products from Landsat sensors 5 TM, 7 ETM+, and 8 OLI from 1984 to present. Utilizing these products across scenes and through time, however, is not without complications [3], particularly for users without GIS and remote sensing training and resources. To create consistent mosaics or long-term time series, users must account for data record gaps, radiometric differences across sensors [4], scene overlaps, malfunctions (e.g., the Landsat 7 scan line corrector malfunction), and inherent noise (due to clouds, atmospheric contamination, missing auxiliary data, etc.). As the region of interest and temporal extent increase, data volume and processing needs present significant barriers to the many users who lack access to high performance computing facilities or the skills necessary to manipulate such data. These limitations often prevent the use of such a dataset in ecological studies, conservation monitoring efforts, or teaching exercises, despite the clear value of its application.
The rise of high performance computing clusters, public access to supercomputing facilities, and cloud computing and storage removes many of the computational barriers associated with Landsat data. The ability to create user-friendly applications that interact with these computing services eliminates additional barriers associated with data manipulation and enables users with minimal coding skills to access and process data. We capitalize on the abilities of high performance computing resources and web-based software to provide a Landsat derived conterminous U.S. (CONUS), 30-m resolution, NDVI product (Figure 1). We use Landsat 5 TM, 7 ETM+, and 8 OLI sensors, with a user-specified climatology (historic NDVI value limited by a user-defined time period) for temporal smoothing, and Google Earth Engine (a cloud-based geospatial platform for planetary-scale data analysis) for rapid data processing and visualization [5], to produce 16-day NDVI composites from 1984 to 2016. We validate the NDVI product by comparing it against other established remote sensing products across multiple spatial scales. The resulting NDVI record enables greater use of Landsat data in answering crucial ecological questions across broad spatio-temporal scales at a higher level of spatial detail than is possible with other currently available NDVI products. While Landsat composite products exist (e.g., the Web Enabled Landsat Data product [6] and the ability to create simple mean/median/max composites), our product improves upon these with novel gap-filling and smoothing approaches (Figure 2). Additionally, we make the composites available through a dynamic web application, allowing users to customize key parameters to produce NDVI composites better suited to specific regions or ecological questions.
4. Discussion
The first-ever 16-day continuous and customizable Landsat derived NDVI composites produced here (30 m resolution for CONUS; 1984–2016) overcome many of the previous barriers of working with Landsat imagery (e.g., obtaining current or historical images; managing overlapping scenes; image storage and processing), permitting ecologists to focus time and effort on specific questions rather than data/imagery manipulation. The composites are well correlated with other observational benchmarks, including in situ phenocam observations of local vegetation conditions and coarser satellite observations from MODIS (MOD13Q1), demonstrating product capabilities for tracking greenness trends from local to regional extents. Fine spatial resolution products such as these, with a longer historical record (Figure 3), open the door to numerous analytical possibilities and applications, ranging from change detection (Figure S3) to conservation monitoring to ecosystem assessment [32,33,34]. The ability to customize the NDVI composite lets users apply a priori knowledge of the region to obtain the most suitable composite for the question at hand, producing an application-ready product without the need for post-processing.
As with all remotely sensed products, Landsat derived NDVI has limitations and is best suited for local or regional applications, where the smaller spatial extent minimizes incomplete data. Due to the infrequent return time of Landsat observations, data may be limited during the 16-day compositing period; cloudy pixels or the lack of surface reflectance images will reduce the overall data available for the composite. Additionally, due to the orbital paths of the Worldwide Reference System 2, a composite may be created from multiple scenes obtained on different dates within the 16-day period (e.g., different scenes that intersect an area of interest but are acquired at the beginning and end of the 16-day period). If data are incomplete within these scenes (e.g., cloudy pixels or scan line corrector errors of Landsat 7 ETM+), two adjacent pixels may represent two different acquisition dates; if no data for the period are available, then a climatology is used for gap filling, further distancing the dates used in the composite. The frequency of gap filling varies both geographically and seasonally, and gap filling is more likely when only a single Landsat sensor is operational. Furthermore, gap filling with climatology may produce anomalies, particularly during unusually wet or dry years, yielding systematically low or high values, respectively. These caveats may result in visual artifacts in areas with incomplete data or along scene edges.
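The gap-filling behavior discussed above can be illustrated with a short Python sketch. It assumes, purely for illustration, that the climatology is the median of valid NDVI values for the same 16-day period over a user-defined number of preceding years. The names `climatology` and `gap_fill`, the data layout, and the values are hypothetical and do not reproduce the Google Earth Engine implementation.

```python
from statistics import median
from typing import Dict, Optional, Tuple

# Hypothetical layout: (year, period index) -> NDVI, with None for "no valid observation".
Series = Dict[Tuple[int, int], Optional[float]]


def climatology(series: Series, year: int, period: int, n_years: int) -> Optional[float]:
    """Median NDVI for `period` over the `n_years` years preceding `year` (assumed definition)."""
    history = [series.get((y, period)) for y in range(year - n_years, year)]
    valid = [value for value in history if value is not None]
    return median(valid) if valid else None


def gap_fill(series: Series, year: int, period: int, n_years: int = 5) -> Tuple[Optional[float], bool]:
    """Return (NDVI, was_filled); the climatology is used only when no observation exists."""
    observed = series.get((year, period))
    if observed is not None:
        return observed, False
    return climatology(series, year, period, n_years), True


# Illustrative values: three typical years followed by a year with no valid data.
pixel: Series = {(2010, 10): 0.40, (2011, 10): 0.44, (2012, 10): 0.42, (2013, 10): None}
value, was_filled = gap_fill(pixel, year=2013, period=10, n_years=3)
```

If 2013 were an unusually wet year with a true NDVI well above 0.42, the filled value would be biased low, which is the anomaly described above. Returning the `was_filled` flag alongside the value is one way to let downstream analyses identify climatology-derived pixels.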
The real power of emerging big data, cloud and web-based applications, and technologies (e.g., Google Earth Engine, GeoTrellis, GeoMesa, Apache Spark) is our new-found ability to create customizable geospatial products. Publicly available applications may be built upon these technologies, ultimately allowing users greater flexibility to provide input data, set spatial or temporal restrictions, modify parameters of algorithms, or perform on-the-fly testing and validation before final analysis. Such capabilities change the paradigm from static geospatial products to dynamic geospatial products, where the output depends on the user’s knowledge of both the system and the question. Although this requires products to be generated as needed, it provides the ability to create a much more appropriate product for any given system and question. The Landsat NDVI product and its associated web application (http://ndvi.ntsg.umt.edu/) provide a glimpse into this reality of dynamic geospatial products.
5. Conclusions
The present work introduces a unique approach to creating and disseminating high resolution, spatially and temporally continuous Landsat derived NDVI. Our motivation is to remove the barriers to using these datasets in order to further conservation and ecological research. Sixteen-day composites are created by selecting the best available pixels during each 16-day composite period from all available Landsat sensors. Missing values, due to unprocessed scenes, atmospheric contamination, or sensor malfunction, are gap filled with a user-defined climatology. The resulting NDVI time series is then smoothed to approximate natural vegetative phenology. We validate the NDVI dataset using established remote sensing products at multiple scales, demonstrating the effectiveness of our approach. We provide open access to the dataset through a simple web application (http://ndvi.ntsg.umt.edu/), enabling ecologists, land managers, conservationists, and others (who may not have the computing capacity or technical skills) to process massive amounts of remote sensing data. This process is simplified with Google Earth Engine, an advanced planetary-scale cloud-based geospatial processing platform, which processes and distributes the product. Each 16-day composite for CONUS requires processing of at least 2700 individual Landsat scenes (more if the climatology is used for gap filling). The web application permits on-the-fly processing with customizable parameters, eliminating the need to store large amounts of data. Although we limit this study to CONUS, the framework can be expanded beyond CONUS wherever Landsat surface reflectance data are available, extended to include other useful vegetation indices (e.g., EVI, SAVI), updated to accommodate updates or reorganization of the Landsat archive (e.g., Collection 1), or modified to utilize other satellite remote sensing datasets.
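The compositing and smoothing steps summarized above can be sketched for a single pixel as follows. This is a hedged illustration rather than the method used in the product: the text does not define "best available" here, so the sketch substitutes maximum-value compositing over clear observations, and it substitutes a centered moving average for the smoother. The names `Observation`, `best_pixel`, and `smooth` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Observation(NamedTuple):
    ndvi: float
    clear: bool  # True when the pixel passed cloud/quality screening (assumed flag)


def best_pixel(observations: List[Observation]) -> Optional[float]:
    """Stand-in selection rule: highest NDVI among clear observations in one 16-day period."""
    clear_values = [obs.ndvi for obs in observations if obs.clear]
    return max(clear_values) if clear_values else None


def smooth(series: List[float], window: int = 3) -> List[float]:
    """Stand-in smoother: centered moving average whose window shrinks at the series edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


# Illustrative period with observations from two sensors; the cloudy value is ignored.
period_obs = [Observation(0.31, True), Observation(0.90, False), Observation(0.47, True)]
composite_value = best_pixel(period_obs)
```

A period for which `best_pixel` returns None corresponds to the missing-value case that would be gap filled with the climatology before smoothing is applied to the full time series.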