The Global Land Surface Satellite (GLASS) Remote Sensing Data Processing System and Products

Using remotely sensed satellite products is the most efficient way to monitor changes in global land, water, and forest resources, which are believed to be the main factors for understanding global climate change and its impacts. A reliable remotely sensed product should be retrieved quantitatively through models or statistical methods. However, producing global products requires a complex computing system and massive volumes of multi-sensor and multi-temporal remotely sensed data. This manuscript describes the ground Global LAnd Surface Satellite (GLASS) product generation system, which can be used to generate long-sequence time series of global land surface data products from various remotely sensed data. To ensure that the system runs stably and efficiently, we used task management, parallelization, and multiple I/O channels. An array of GLASS remote sensing products related to global land surface parameters is currently being produced and distributed by the Center for Global Change Data Processing and Analysis at Beijing Normal University in Beijing, China. These products include Leaf Area Index (LAI), land surface albedo, and broadband emissivity (BBE) from 1981 to 2010, and downward shortwave radiation (DSR) and photosynthetically active radiation (PAR) from 2008 to 2010.


Introduction
Satellite remote sensing is the main method for measuring and monitoring changes in the earth's surface and atmosphere. A series of operating satellites from different countries are producing tremendous volumes of data at significantly higher levels of measurement precision [1,2]. Along with the application of the Advanced Very High Resolution Radiometer (AVHRR) [3] and Moderate Resolution Imaging Spectroradiometer (MODIS) [4] data, time series remote sensing data are now available in several spatial and temporal resolutions and have been used for various purposes [5][6][7][8][9][10]. The effective utilization of time series of remote sensing data is an important area in current research, concurrent with the development of domestic and overseas satellite technology [11][12][13]. There exist many satellite remote sensing standard data products [3,4,14,15] and distributed remote sensing data systems [16,17] that scientists have been using in their studies of the atmosphere, biosphere, cryosphere, land surface and oceans. Remote sensing-based long-sequence time series data have become a powerful tool for researching long-term global changes [8,18,19].
Although some experimental high-level products were being generated, before 2009 China had not developed a product generation system to generate and distribute long-sequence time series of global land products. To achieve this goal, China launched the 863 key project entitled "Generation and application of global products of essential land variables" in 2009. The project is one of the 863 programs funded and administered by the government of the People's Republic of China, which are intended to stimulate the development of advanced technologies in a wide range of fields. The central component of this project is the development of the ground Global LAnd Surface Satellite (GLASS) product generation system. In order to process and release GLASS products, Beijing Normal University established the Center for Global Change Data Processing and Analysis and built the GLASS system there. The purpose of the center is to manage, process, analyze and release global change data. All GLASS products were processed and released by the center. The GLASS system can be used to generate five land surface remote sensing products: Leaf Area Index (LAI), shortwave broadband albedo, broadband emissivity (BBE), downwelling shortwave radiation (DSR), and photosynthetically active radiation (PAR). The newest algorithms that utilize multi-source remote sensing data, such as MODIS and AVHRR, to generate GLASS products are integrated into the system.
The GLASS product generation system was designed to implement high-performance computing (HPC) in a clustered environment. The system achieves relatively high efficiency by using distributed and parallel computing techniques. Five GLASS products have been generated with the system: eight-day LAI, albedo and BBE products with resolutions of 5 km and 1 km for the years before and after 2000, respectively, and three-hour DSR and PAR datasets with a 5-km resolution from 2008 to 2010.
The main objective of this manuscript is to describe the structure, configuration, performance and products of the GLASS system. The remainder of the paper is organized as follows. Section 2 describes the structure and performance of the GLASS product generation system, which involves HPC. Section 3 focuses on the five GLASS product datasets. Sections 4 and 5 describe the components of quality control and product service. Concluding remarks are briefly stated in Section 6.

Product Generation System
The GLASS product generation system, similar to the EOS Data Information System (EOSDIS) and MODIS data processing system [4,14,20], is designed to provide the computing and network facilities that are essential in supporting various research activities, including processing, distributing, and archiving data.

System Structure and Configuration
The system's hardware primarily consists of processing, management, and database servers. The GLASS product generation system network structure is shown in Figure 1. The processing servers, comprising the high-performance computing (HPC) system, are used to obtain the preprocessing data (PRE), LAI, albedo, BBE, DSR and PAR data products. In this research, HPC refers to a computing environment with several computers in a cluster. The management and database servers provide the system operation management and data storage services, respectively. The system's software generally comprises production task management, HPC distribution, data quality verification and system monitoring.

High-Performance Parallel Computing System
Because they improve computational efficiency and increase the feasible computation scale, the parallel computing techniques associated with HPC are adopted for retrieval algorithms that need more computing time. One of the most popular high-performance parallel computing schemes is the message passing interface (MPI). This interface exchanges work requests among the nodes via message passing and thus provides a simple method for work creation [21]. A typical scheduling map of high-performance parallel computing is shown in Figure 2. Thread0 is the main thread, which is responsible for distributing, receiving and summarizing data, while the other threads are concerned with computation.

Multitask Management and Operation
Due to the massive number of products and the limited number of computational nodes, a unified task management system is required to successfully create GLASS products in the GLASS system. Multitask management refers to task-based realization and management of the production of various product levels. A task refers to the production steps formulated according to specific needs, which mainly involve such requirements as production type, input data, time, and algorithm version. Typical product generation procedures on the GLASS system are as follows:
* A product generation job can be created in the user interface and be sent to a task server. A job is often more of an application-level term and simply refers to some script that is executed to do a specific set of tasks.
* A task server creates tasks, dispatches them to cluster nodes, and processes returned tasks. A task is often a part of a job and sometimes is the only part. The Simple Linux Utility for Resource Management (SLURM) Version 2.2 has been used as the cluster resource manager. SLURM is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters [22]. It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes.
* Every task will run in its own thread. While the tasks within the parallel task are being run, the main thread will be blocked and will wait for all the child threads to be completed. In Figure 2, thread0 is the main thread, which is responsible for distributing, receiving and summarizing data, while the others are concerned with computation. All of these threads are optimized according to their computing time, which means that when one task is finished, its thread can be reused.
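The threading behavior described in the last step can be illustrated with a minimal sketch. It uses Python's standard thread pool rather than the MPI and SLURM setup of the GLASS system, so it is only an analogy: the function names, the tile identifiers and the worker count are assumptions made for illustration and are not taken from the system itself.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def run_task(tile_id):
    # Stand-in for a retrieval step on one tile (hypothetical workload).
    # Returns the tile and the name of the worker thread that handled it.
    return tile_id, threading.current_thread().name

def run_parallel(tile_ids, max_workers=4):
    # The main thread submits every task, then blocks until all of them
    # have completed; leaving the "with" block waits for the workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_task, tile_ids))

# Hypothetical tile identifiers in the HRRVPP style of the filenames.
tiles = ["H%02dV%02d" % (h, v) for h in range(3) for v in range(4)]
results = run_parallel(tiles)
workers_used = {name for _, name in results}
```

Twelve tasks are handled by at most four worker threads, so each thread is reused for a new task as soon as its previous task is finished, and the results come back in the order in which the tasks were submitted.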

Performance Measurements
The GLASS product generation system achieves relatively high efficiency by using distributed and parallel computing techniques. High-performance parallel computing is accomplished by splitting up large and complex tasks across multiple processors or cluster nodes. A job server, a task server and cluster nodes work together to carry out product generation. A job, for example, to generate one year of LAI products for 2008, can be created by the job server. As soon as the job is examined and verified, it is sent to the task server. The task server then creates tasks and dispatches them to cluster nodes according to the tiles of MODIS data. Each node periodically communicates with the task server to report completed work and to get new work until all tasks are accomplished.
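The decomposition of a job into per-tile tasks can be sketched as follows. This is an illustrative model only: the task fields, the tile identifiers, the round-robin polling loop and the assumption that eight-day composites start on day 1 of each year are all hypothetical and are not taken from the GLASS task server.

```python
from collections import deque

def eight_day_doys():
    # Assumed eight-day composite start days: 1, 9, ..., 361 (46 per year).
    return list(range(1, 366, 8))

def make_tasks(product, year, tiles):
    # One task per (composite date, tile) pair for a one-year job.
    return [{"product": product, "year": year, "doy": doy, "tile": tile}
            for doy in eight_day_doys() for tile in tiles]

def dispatch(tasks, n_nodes):
    # Simulate nodes taking turns to ask the task server for new work
    # until the queue is empty; returns the task count per node.
    queue = deque(tasks)
    done = {node: 0 for node in range(n_nodes)}
    while queue:
        for node in range(n_nodes):
            if not queue:
                break
            queue.popleft()
            done[node] += 1
    return done

tasks = make_tasks("LAI", 2008, ["H25V05", "H26V05"])
per_node = dispatch(tasks, n_nodes=10)
```

With two tiles, the one-year job expands into 92 tasks, which the ten simulated nodes share almost evenly.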
In order to compare the efficiency between parallel and nonparallel computing environments, we made performance measurements of the product generation using the GLASS system. All measurements were made with a Hewlett-Packard Cluster with the same configuration. Based on the features of the GLASS products' retrieval algorithms [23], a high-efficiency parallel algorithm was established on the hardware platform of the existing high-performance parallel computer system with software support. As the distributed and parallel computing technology was used, the run time of the program decreased considerably. For example, it took 14.4 h to generate one year of LAI products from MODIS data in a nonparallel computing environment with 10 nodes, while it only took four hours when the cluster with 10 nodes was used in the parallel processing program. Thus the computation time can be reduced sharply. The speedups attained by using the cluster are shown in Table 1.
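As a worked example of the LAI figures quoted above, the speedup can be computed as the ratio of the nonparallel run time to the parallel run time. The helper names below are illustrative, and only the two run times come from the text.

```python
def speedup(t_nonparallel_h, t_parallel_h):
    # Speedup = nonparallel run time / parallel run time.
    return t_nonparallel_h / t_parallel_h

def time_saved_fraction(t_nonparallel_h, t_parallel_h):
    # Fraction of the original run time that is saved.
    return 1.0 - t_parallel_h / t_nonparallel_h

lai_speedup = speedup(14.4, 4.0)
lai_saved = time_saved_fraction(14.4, 4.0)
print(round(lai_speedup, 2), round(lai_saved, 2))  # prints: 3.6 0.72
```

For the one-year LAI job, this corresponds to a speedup of about 3.6 and a reduction in run time of roughly 72%.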

Product Format and Metadata
The current GLASS products are LAI, albedo, BBE, DSR and PAR. The file format of all GLASS products is HDF-EOS Version 2.16, which carries the geometric information that relates data to specific earth locations. The HDF-EOS data format is extensively used in MODIS products [24], so certain MODIS tools, such as the MODIS Reprojection Tool, can be used to process the GLASS products. The GLASS product filenames follow a naming convention that provides useful information regarding the specific product. For example, the filename GLASSXXNYY.VXX.AYYYYDDD.HRRVPP.YYYYDDD.hdf indicates the following:
* GLASS stands for the Global LAnd Surface Satellite products.
* XX stands for the type of GLASS product: 01, LAI; 02, albedo; 03, BBE; 04, PAR; 05, DSR.
* N stands for the spatial resolution, which is labeled A, B, or C.
* YY stands for a code for different types of GLASS products, which depends on the algorithm and data sources of the product.
* AYYYYDDD stands for the time of the product, where YYYY is the year and DDD is the Julian day.
* HRRVPP stands for the tile ID (path/row number), where HRR is the row number and VPP is the path number. HRRVPP is not valid when the product is generated from AVHRR data.
* YYYYDDD stands for the processing time of the product.
* hdf stands for the data format (HDF-EOS Version 2.16).
GLASS products have two sources of metadata: the embedded HDF metadata and the external metadata. The HDF metadata contains valuable information, including global attributes and data set-specific attributes pertaining to the granule. The external metadata is an xml file that is delivered to the user along with the GLASS products.
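A filename that follows this convention can be split into its fields with a regular expression, as in the sketch below. Several details are assumptions rather than facts from the text: that every XX, YY, RR and PP field is exactly two digits, that the VXX field holds a version number, and that the example filename, which is invented for illustration, resembles a real one.

```python
import re
from datetime import date, timedelta

# Product codes as listed in the naming convention.
PRODUCTS = {"01": "LAI", "02": "albedo", "03": "BBE", "04": "PAR", "05": "DSR"}

# Field widths are assumed from the template GLASSXXNYY.VXX.AYYYYDDD.HRRVPP.YYYYDDD.hdf
PATTERN = re.compile(
    r"^GLASS(?P<product>\d{2})(?P<res>[ABC])(?P<code>\d{2})"
    r"\.V(?P<version>\d{2})"            # assumed to be a version field
    r"\.A(?P<year>\d{4})(?P<doy>\d{3})"
    r"\.H(?P<h>\d{2})V(?P<v>\d{2})"
    r"\.(?P<pyear>\d{4})(?P<pdoy>\d{3})\.hdf$"
)

def doy_to_date(year, doy):
    # Convert a year and Julian day (day of year) to a calendar date.
    return date(year, 1, 1) + timedelta(days=doy - 1)

def parse_glass_name(name):
    m = PATTERN.match(name)
    if m is None:
        raise ValueError("not a GLASS-style filename: %r" % name)
    return {
        "product": PRODUCTS.get(m["product"], "unknown"),
        "resolution_label": m["res"],
        "type_code": m["code"],
        "version_field": m["version"],
        "acquired": doy_to_date(int(m["year"]), int(m["doy"])),
        "tile": "H%sV%s" % (m["h"], m["v"]),
        "processed": doy_to_date(int(m["pyear"]), int(m["pdoy"])),
    }

# Hypothetical filename, constructed only to exercise the parser.
info = parse_glass_name("GLASS01A01.V03.A2008209.H25V05.2012150.hdf")
```

For AVHRR-derived products, where the tile field is not valid, the `tile` entry returned by this sketch should be ignored.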

Preprocessed Data
In the GLASS system, MODIS and AVHRR data were used as input data to generate GLASS products. The data quality of the MODIS and AVHRR images was greatly influenced by clouds, cloud shadows, snow, and other abnormal climate conditions, which hindered the surface reflectance inversion and further impacted the GLASS products' quality. Some data, such as AVHRR, MOD09A1, MOD09GA, MCD43B3 and MOD02, were preprocessed before they were used to produce the GLASS products. To improve the data quality, the existing MODIS snow and cloud mask and the reflectance characteristics of the non-snow/cloud pixels were used in combination to identify pixels of snow, clouds and abnormal values. All of the identified values were filled with the clear pixel values over a long period of time to remove the effects of snow, clouds and cloud shadows. Figures 3 and 4 show an example of preprocessed results. Figure 3 is a raw AVHRR image, while Figure 4 is the preprocessed image.
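The filling step can be illustrated for a single pixel with the sketch below. It assumes that flagged observations are replaced by the mean of the clear observations in the same time series. The text states only that clear pixel values over a long period are used, so the choice of a simple mean, the function name and the sample values are all assumptions.

```python
def fill_flagged(series, flagged):
    # series:  reflectance values of one pixel over time
    # flagged: True where the value was identified as snow, cloud or abnormal
    clear = [v for v, bad in zip(series, flagged) if not bad]
    if not clear:
        # No clear observation to fill from; return the series unchanged.
        return list(series)
    clear_mean = sum(clear) / len(clear)  # assumed fill value
    return [clear_mean if bad else v for v, bad in zip(series, flagged)]

# Hypothetical red-band reflectances with two cloud-contaminated dates.
raw = [0.10, 0.80, 0.12, 0.90, 0.14]
mask = [False, True, False, True, False]
filled = fill_flagged(raw, mask)
```

Clear observations pass through unchanged, and only the flagged dates are replaced.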

LAI Products
LAI, which is defined as one-half the total green leaf area per unit of horizontal ground surface area [25], is an important property of vegetation and has the strongest effect on overall canopy reflectance. The LAI retrieval algorithm integrated into the GLASS system employs General Regression Neural Networks (GRNNs) to generate a long time series of global LAI data with spatial and temporal continuity from time series of remote sensing observations. The neural networks were trained by the fusion of the LAI from MODIS and CYCLOPES LAI products with the reprocessed land surface reflectance over the BELMANIP sites during the 2001-2003 period. Reprocessed MODIS and AVHRR land surface reflectance data of a whole year were input simultaneously, allowing a one-year LAI profile to be reconstructed by the GRNNs. The comparison between the GLASS LAI derived from AVHRR and MODIS surface reflectance data indicates a consistent LAI product from different sensor data. The inter-comparison of the GLASS LAI and other global LAI products shows that the GLASS LAI is spatially complete and temporally continuous [23].
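A GRNN predicts by taking a Gaussian-weighted average of the training targets, where the weights depend on the distance between the input and each training sample. The sketch below shows this core computation for a scalar output. It is a generic illustration and not the GLASS implementation, which takes a full year of reflectance as input and returns a one-year LAI profile. The training samples and the smoothing parameter are invented for illustration.

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    # Gaussian kernel weight for each training sample.
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase sigma")
    # Weighted average of the training targets.
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical samples: (red, near-infrared) reflectance -> LAI.
train_x = [[0.05, 0.30], [0.04, 0.45], [0.10, 0.20]]
train_y = [2.0, 4.5, 0.5]

near = grnn_predict([0.04, 0.45], train_x, train_y, sigma=0.01)
smooth = grnn_predict([0.04, 0.45], train_x, train_y, sigma=100.0)
```

A small `sigma` makes the prediction follow the nearest training sample, whereas a very large `sigma` drives it toward the mean of all training targets.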
The GLASS system produces an eight-day LAI product over a period of 30 years from MODIS and AVHRR data at different spatial resolutions. The spatial resolution of the product from 1981 to 2000 is 5 km, and that from 2000 to 2010 is 1 km. Figure 5 displays the GLASS LAI global product for Julian day 209 in 2008.

Albedo Products
Land surface albedo, defined as the ratio of reflected solar shortwave radiation from a surface to that incident upon it [26], is one of the most important parameters affecting the radiative energy budget of the earth's surface [27,28]. The daily albedo is retrieved by the Angular Bin (AB) algorithm, which uses a simple linear equation to calculate the broadband albedo directly from the multi-band surface directional reflectance or the TOA (top of the atmosphere) directional reflectance [29]. To correct for the solar/view angle dependence of surface or TOA reflectance, the AB algorithm divides the space of the solar/view geometry into small grids, which are called angular bins, and it derives different linear regression coefficients for each of the angular bins. As an empirical algorithm, the AB algorithm has the advantages of simplicity, speed and low input data requirements, while maintaining relatively good accuracy. The AB algorithm comprises two sub-algorithms, AB1 and AB2, which calculate the white-sky and black-sky shortwave (0.3-3 µm) albedo from the daily surface reflectance and the daily TOA reflectance, respectively. In the AB2 algorithm, the atmospheric correction for the input data is bypassed, thus avoiding the errors introduced by imperfect atmospheric correction.
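The angular-bin idea can be sketched as a table lookup followed by a linear combination of band reflectances. In this sketch the 10-degree bin width, the presence of an intercept term, the three-band input and the coefficient values are all hypothetical. The actual bins and regression coefficients are given in [29].

```python
def bin_index(sza, vza, raa, step=10.0):
    # Map solar zenith, view zenith and relative azimuth angles (degrees)
    # to an angular bin. The bin width "step" is an assumed value.
    return (int(sza // step), int(vza // step), int(raa // step))

def ab_albedo(reflectances, sza, vza, raa, coeff_table, step=10.0):
    # Look up the regression coefficients of the bin that contains the
    # observation geometry, then apply the linear equation.
    intercept, slopes = coeff_table[bin_index(sza, vza, raa, step)]
    if len(slopes) != len(reflectances):
        raise ValueError("one coefficient is needed per band")
    return intercept + sum(c * r for c, r in zip(slopes, reflectances))

# Hypothetical coefficients for a single bin and three bands.
coeff_table = {(3, 1, 9): (0.01, [0.3, 0.4, 0.2])}
albedo = ab_albedo([0.1, 0.3, 0.2], sza=35.0, vza=12.0, raa=95.0,
                   coeff_table=coeff_table)
```

Because each bin has its own coefficients, the angular dependence of the reflectance is absorbed by the lookup rather than by an explicit model of the surface.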
The outputs of the AB algorithm are intermediate GLASS albedo products, which have a one-day temporal resolution. The GLASS final albedo product in eight-day steps is obtained by merging the results obtained from AB1 and AB2 in a 16-day time window after 2000 or in a 32-day time window before 2000 through a Statistics-Based Temporal Filtering (STF) algorithm [30]. The STF detects and eliminates the occasional abnormal values in the intermediate products, and it fills missing values through temporal filtering supported by a prior database. The GLASS final albedo product is gapless and continuous in the spatial and temporal domains.
The GLASS system produces an eight-day albedo product over a period of 30 years from MODIS and AVHRR data at different spatial resolutions. Figure 6 presents an example of the GLASS final albedo product on Julian day 115 of 1985. The data source is the daily AVHRR surface reflectance product released by the Long-Term Land Data Record (LTDR) project of NASA [31]. The spatial distribution pattern of the global land surface albedo is satisfactorily captured by the GLASS product. For details of the accuracy of the GLASS albedo product, see [32].

BBE Products
The land surface BBE is an important parameter for studies of surface upward longwave radiation, which is an important component of the surface radiation budget and an important parameter for numerical weather prediction and hydrological models [33,34]. The GLASS emissivity is a BBE (8-13.5 μm) product that was derived from AVHRR and MODIS optical data with our newly developed algorithms [35,36]. GLASS BBE is composed of two parts: the first is the global eight-day 1-km land surface BBE retrieved from MODIS albedo ranging from 2000 to 2010; the second is the global eight-day 5-km land surface BBE retrieved from the AVHRR VNIR reflectance data of 1981-1999. In the algorithm used to generate the GLASS BBE, the land surface was classified into five types based on the Normalized Difference Vegetation Index (NDVI) threshold values: water, snow/ice, bare soils, vegetated areas, and transition zones. The BBE of water and snow/ice was assigned as 0.985 by combining the BBE calculated from the emissivity spectrum in the ASTER spectral library [37] and the MODIS UCSB spectral library [38], as well as that simulated by radiative transfer models [39]. The BBEs of bare soils, vegetated areas, and transition zones were formulated as the linear function of seven MODIS narrowband black-sky albedos and the nonlinear function of the AVHRR channel 1 and 2 reflectances, respectively. The BBE derived from the MODIS albedo was validated by field measurements conducted over deserts in the United States and China, and the absolute difference was found to be 0.02. The BBE derived from the AVHRR was consistent with that derived from the MODIS data. The mean bias and RMSE of the difference between these two BBEs were on the order of 0.001 and 0.01, respectively [36].
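The MODIS branch of this scheme can be sketched as a classification step followed by a per-class linear equation. Only the value 0.985 for water and snow/ice comes from the text. The NDVI thresholds, the regression coefficients and the use of an external flag to mark water and snow/ice are hypothetical placeholders, and the real values are given in [35,36].

```python
WATER_SNOW_ICE_BBE = 0.985  # value stated in the text

def classify_land(ndvi, soil_max=0.15, veg_min=0.25):
    # Hypothetical NDVI thresholds separating the three land classes.
    if ndvi <= soil_max:
        return "bare_soil"
    if ndvi >= veg_min:
        return "vegetated"
    return "transition"

def bbe_from_albedo(ndvi, albedos, coeffs, is_water_or_snow=False):
    # Water and snow/ice receive a fixed emissivity.
    if is_water_or_snow:
        return WATER_SNOW_ICE_BBE
    if len(albedos) != 7:
        raise ValueError("seven narrowband black-sky albedos are expected")
    # Linear function of the seven narrowband albedos, per land class.
    intercept, slopes = coeffs[classify_land(ndvi)]
    return intercept + sum(c * a for c, a in zip(slopes, albedos))

# Hypothetical coefficients: (intercept, seven slopes) for each class.
coeffs = {
    "bare_soil": (0.95, [0.01] * 7),
    "transition": (0.96, [0.005] * 7),
    "vegetated": (0.97, [0.002] * 7),
}
soil_bbe = bbe_from_albedo(0.10, [0.1] * 7, coeffs)
water_bbe = bbe_from_albedo(0.0, [], coeffs, is_water_or_snow=True)
```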
The GLASS system generates eight-day surface 8-13.5 μm emissivity products for a 30-year period at different spatial resolutions. The spatial resolution of the product from 1981 to 2000 is 5 km, and that from 2000 to 2010 is 1 km. Figure 7 presents an example of the final GLASS BBE product, which is for Julian day 161 in 2008.

DSR and PAR Products
DSR, defined as the solar radiation incident at the surface from 300 to 2,500 nm, is an important parameter in the surface radiation budget [40,41]. PAR, known as the solar radiation available for photosynthesis, is primarily in the visible part of the spectrum [42]. The two products are important physical and ecological parameters in the total energy exchange process between the atmosphere and the land surface [43,44]. In the GLASS system, the global DSR and PAR products were generated based on an improved look-up table method using both polar-orbiting and geostationary satellite data, including MODIS, Meteosat Second Generation (MSG) SEVIRI, the Multi-functional Transport Satellite (MTSAT)-1R, and the Geostationary Operational Environmental Satellite (GOES) Imager [45].
The basic procedure for generating the GLASS DSR and PAR products comprises three steps: (1) establish the look-up table at different atmospheric conditions and geometries for each satellite dataset under cloudy and cloud-free conditions; (2) estimate the surface DSR and PAR from the surface reflectance and TOA radiance data for each satellite dataset; and (3) map the DSR and PAR globally by using the MODIS data to derive the solar radiation at latitudes higher than 60° north and south, and calculate the radiation at the lower latitudes through a combination of polar-orbiting and geostationary satellite-derived radiation products.
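Step (3) can be illustrated with a small sketch that selects the data source by latitude. The equal-weight average used at the lower latitudes is an assumption made for illustration, since the actual combination rule is described in [45] and not here. The function name and sample values are likewise hypothetical.

```python
def fuse_dsr(lat_deg, modis_dsr, geo_dsr=None, polar_limit=60.0):
    # Poleward of the limit, or where no geostationary estimate exists,
    # use the MODIS (polar-orbiting) estimate alone.
    if abs(lat_deg) > polar_limit or geo_dsr is None:
        return modis_dsr
    # At lower latitudes, combine both estimates (assumed equal weights).
    return 0.5 * (modis_dsr + geo_dsr)

# Hypothetical DSR estimates in W/m^2.
high_lat = fuse_dsr(70.0, 100.0, 200.0)
low_lat = fuse_dsr(30.0, 100.0, 200.0)
no_geo = fuse_dsr(-10.0, 120.0)
```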
The global fusion products of the DSR and PAR images are mapped in a sinusoidal projection at a 5-km spatial resolution and three-hour temporal resolution from 2008 to 2010. Figure 8 shows the global combined DSR product retrieved using data from multiple polar-orbiting and geostationary satellites on 12 November 2008.

Quality Control
The purpose of quality control for GLASS products is to ensure that users are provided with high-quality products after examination and assessment. Quality control consists of two components, namely, a spatial quality check and a numerical accuracy assessment. The spatial quality check involves screening for missing data or bands, mosaics, and any spatial discontinuity problems arising from the lack of raw data. The numerical accuracy assessment compares the product to the data measured on the ground to evaluate the product's accuracy. All GLASS products have already been examined by spatial quality verification processes. More than 10 institutions or departments have contributed toward the accuracy assessments of all GLASS products.

GLASS Products Service
To provide GLASS products to users, the user services team monitors the integrity of the archive and ensures that the GLASS products arriving from the production system are stored and properly inventoried. User services are responsible for maintaining the data search and order web interface [46], which is able to supply large amounts of data online to be downloaded by multiple users. The site serves as an interactive system that provides information on queried products. Additional questions can be addressed to: datacenter@bnu.edu.cn.

Summary
The newly developed GLASS system is capable of automatically generating remote sensing parameters based on processing global satellite data, remote sensing products, and ground observations. The GLASS data processing system is designed to provide the computing and network facilities that are essential in supporting various research activities, including processing, distributing, and archiving data. The parallel computing techniques of MPI associated with HPC are adopted for product retrieval algorithms.
An array of GLASS products related to global land surface parameters is currently produced with the GLASS system. These GLASS products include LAI, albedo, BBE, DSR and PAR. The first three products span from 1981 to 2010 at spatial resolutions of 1 to 5 km and an eight-day temporal resolution. The last two products span from 2008 to 2010 and have a three-hour temporal resolution and a 5-km spatial resolution. The first three products are mainly based on AVHRR and MODIS data, and the two radiation products combine data from five geostationary satellites and MODIS.

Figure 1. GLASS product generation system network structure.

Figure 2. Scheduling map of high-performance parallel computing.

Figure 5. GLASS LAI map for Julian day 209 in 2008.

Figure 6. GLASS albedo map for Julian day 209 in 2008.

Figure 7. GLASS BBE map for Julian day 161 in 2008.

Table 1. GLASS system performance measurements.