Evaluation of Sub-Pixel Cloud Noises on MODIS Daily Spectral Indices Based on in situ Measurements

Cloud contamination is one of the severest problems for the time-series analysis of optical remote sensing data such as vegetation phenology detection. Sub-pixel clouds are especially difficult to identify and remove. It is important for accuracy improvement in various terrestrial remote sensing applications to clarify the influence of these residual clouds on spectral vegetation indices. This study investigated the noises caused by residual sub-pixel clouds on several frequently-used spectral indices (NDVI, EVI, EVI2, NDWI, and NDII) by using in situ spectral data and sky photographs at the satellite overpass time. We conducted in situ continuous observation at a Japanese deciduous forest for over a year and compared the MODIS spectral indices with the cloud-free in situ spectral indices. Our results revealed that residual sub-pixel clouds potentially contaminated about 40% of the MODIS data after cloud screening by the state flag of MOD09 product. These residual clouds significantly decreased NDVI values during the leaf growing season. However, such noises did not appear in the other indices. This result was thought to be caused by the different combination of wavelengths among spectral indices. Our results suggested that the noises by residual sub-pixel clouds can be reduced by using EVI, NDWI, or NDII in place of NDVI. OPEN ACCESS Remote Sens. 2011, 3 1645


Introduction
Remotely sensed spectral indices calculated from a combination of band reflectances are widely used to monitor biophysical quantities related to the terrestrial ecosystem, such as leaf area index [1][2][3][4] and leaf onset/offset phenology [5,6].However, occasionally, clouds contaminate these data, and therefore, obscure the monitoring by optical remote sensing satellites.In particular, clouds smaller than a satellite image pixel, i.e., sub-pixel clouds, often hinder the retrieval of biophysical parameters of the ecosystem from spectral indices.Although cloud masks have been produced and demonstrated to be effective [7], they are not always able to completely offset the contamination caused by clouds [8].Noise due to irremovable clouds is one of the severest problems for time-series monitoring using satellite sensors with low-to-medium spatial resolution (i.e., 250-1,000 m) such as the Terra/Aqua moderate resolution imaging spectroradiometer (MODIS).
To completely remove cloud noise from time-series spectral indices data, we should first understand the behavior of cloud noise in several frequently used spectral indices.Normalized difference vegetation index (NDVI) [9] is the most widely used spectral index for monitoring vegetation changes.Clouds generally show high reflectance in the visible and near-infrared domains (300-1,000 nm), which results in lower NDVI values in comparison to vegetation [10].Based on this knowledge, several cloud-screening techniques for time-series NDVI have been developed [11][12][13].The numerical simulation for the effect of sub-pixel clouds showed that it also led to a measurable decrease of NDVI [14].However, the actual impact of sub-pixel clouds is not entirely clear since it is difficult to quantify clouds at the sub-pixel scale.Furthermore, for other frequently used spectral indices such as the enhanced vegetation index (EVI) [15], 2-band EVI (EVI2) [16], normalized difference water index (NDWI) [17], and normalized difference infrared index (NDII) [18,19], the effects of cloud contamination are not well understood.We hypothesize that different spectral indices may also show different responses to cloud contamination because the reflectance of different wavelength domains may show different responses to cloud contamination.
In this study, we investigated the effects of cloud contamination on several widely used spectral indices measured by the MODIS sensor: NDVI, EVI, EVI2, NDWI, and NDII.To evaluate the sub-pixel cloud noise in satellite data, we conducted highly-frequent automatic in situ measurements using spectral radiometers and digital cameras.These measurements provided a set of hyper-spectral reflectances of the ground surface and hemispherical photos of the cloud conditions in the sky at the satellite overpass time.Several papers have demonstrated the usefulness of these monitoring systems when evaluating ecological remote sensing [20][21][22][23].We evaluated the residual sub-pixel cloud noise by comparing the MODIS data with the in situ data.

Study Site
The study site is a cool-temperate deciduous broadleaved forest near Takayama City in central Japan (137.4231°E, 36.1462°N,1,420 m a.s.l.(above sea level) in WGS84).The annual mean air temperature and annual precipitation from 1980 to 2002 were 7.2 °C and 2,275 mm, respectively.The site is covered with snow from around December to April.Tree census and flux measurements were carried out over a 10-year period [24][25][26][27][28].The forest canopy is dominated by Erman's birch (Betula ermanii) and Japanese oak (Quercus crispula) [29].The height of the dominant canopy trees ranges from 13 to 20 m.The forest floor is covered with evergreen dwarf bamboo (Sasa senanensis) with a height of 1.0 to 1.5 m.This site belongs to several monitoring networks: the Phenological Eyes Network [30] (http://www.pheno-eye.org/),JapanFlux network (http://www.japanflux.org/), and JaLTER (Japan Long-Term Research Network; http://www.jalter.org/).

In situ Measurements
We periodically and automatically observed spectral irradiance of the sky and forest canopy by using two hemispherical spectroradiometers (HSSRs) MS-700 and MS-712 (EKO Instrument Co. Ltd., Tokyo, Japan) (Table 1).We used the automatic rotating stage CHS-AR (Hayasaka Rikoh Co. Ltd., Sapporo, Japan), which can flip back and forth and is driven by an electric motor, to enable an HSSR to observe both the downward and upward directions in alternating fashion.The HSSRs and rotating stages were installed atop an 18-m canopy-access tower facing southeast (Figure 1(a)), and they were controlled by a personal computer and manufacturer-supplied software.We setup the HSSRs to observe both the downward and the upward directions at 10-min intervals from 10:00 to 14:00 Japan Standard Time (JST) every day.Data obtained from DOY120, 2008 (29 April)   We also monitored the sky and forest canopy conditions by using automatic-capturing digital fisheye cameras (ADFC) (CoolPix-4500 digital camera, Nikon Corporation, Tokyo, Japan) with a fish-eye lens FC-E8 (Nikon Corporation, Tokyo, Japan; Equisolid Angle Projection) stored in a custom-built housing SPC31A (Hayasaka Rikoh Co., Ltd., Sapporo, Japan).We installed the ADFCs for the downward and the upward directions atop the 18-m canopy-access tower.The downward ADFC was installed next to the HSSRs to monitor their field of view.To control the camera functions (e.g., parameter configuration, image capture, and data retrieval), we used the free software "photopc" (http://photopc.sourceforge.net/)on the personal computers.The configuration of the cameras was as follows: mode, programmed auto; white balance, sunny or auto; fisheye mode, fisheye-1 (circular); image size, 2,272 × 1,704 pixels; and file format, JPEG.We set the cameras to capture the sky and the forest canopy at 2-min and 90-min intervals, respectively.
The area for a 250 m radius around the canopy-access tower was almost covered with deciduous forest according to the high-resolution satellite image shown in Figure 1(b).Ohtsuka et al. [24] reported that the basal area of deciduous broadleaved trees accounted for approximately 95% of the total basal area in a 100 m × 100 m plot adjacent to the northwest of the canopy-access tower.The field view of the HSSRs, which was largely consistent with the area shown in the ADFC photos (the area about a 10 m radius in following figure), also consisted of the same deciduous broadleaved trees.

Calculation of Spectral Reflectance from in situ Data
To correct the instrumental error between the MS-700 and MS-712, the spectral irradiance by MS-712 was adjusted to the spectral irradiance by MS-700 by comparing the data for the 900-1,000 nm wavelength domain (i.e., empirical linear regression).We then converted the spectral irradiance into band-averaged irradiance by using the following weighted function [31]: where I i is the spectral irradiance at each wavelength, I band is the band-averaged irradiance, and w i is the relative spectral response (RSR) of Terra MODIS bands 1-6 (Figure 2) as a function of wavelength λ.We did not use the RSR of Aqua MODIS because the RSRs of Terra MODIS and Aqua MODIS were not very different, as shown in Figure 2. The irradiances of bands 1, 2, 3, and 4 were obtained from the MS-700 data and the irradiances of bands 5 and 6 were obtained from the MS-712 data.We obtained the daily band-averaged irradiance by averaging the band-averaged irradiance measured from 10:00 to 14:00 each day because this time slot was less affected by low sun elevation angles and a shadow of the canopy-access tower.We then calculated the daily band-averaged reflectance by dividing the daily band-averaged irradiance of the ground by that of the sky.By referring to the ADFC images, data taken under rainy conditions were discarded and only data collected under sunny or cloudy conditions were used for analysis.Because the canopy-access tower also comes into the field of view (FOV) of the HSSRs, we corrected the contamination of the tower reflectance.Using the linear mixture model theory [32], we assumed that the reflectance measured by the HSSRs (ρ HSSR ) can be expressed by the following equation: where f t is the coverage of the tower to the FOV of the HSSRs, ρ tower is the reflectance of the tower, and ρ canopy is the reflectance of the forest canopy.For simplicity, we did not consider the directional variation of the reflectance (i.e., bi-directional reflectance function) from the canopy and tower.We obtained ρ canopy by substituting the values of ρ HSSR , ρ tower , and f t into the above equation.For ρ HSSR , we used the daily band-averaged reflectances measured by the HSSRs.For ρ tower , we used the reflectance of the tower's iron pipe measured by the Fieldspec FR spectroradiometer (ASD Inc., Boulder, CO, USA; Figure 3).For f t , we used f t = 0.11, which was determined by the following ADFC image analysis.We first extracted the pixels of the canopy-access tower to meet both of the following criteria determined through trial and error: where DN blue and DN green are the 1-byte digital numbers (0-255) of blue and green, respectively.We then calculated the ratio of the number of extracted pixels to the number of total pixels, which was regarded as the coverage of the canopy-access tower (Figure 4).

Processing of Terra/Aqua MODIS Data
We used the data from the MODIS sensor onboard the Terra and Aqua satellites.We downloaded the products of atmospherically corrected surface reflectance data (MOD09GA and MYD09GA; daily temporal resolution; 500-m and 1-km spatial resolution; collection 5) from the NASA Land Processes Distributed Active Archive Center (LP DAAC, https://lpdaac.usgs.gov/).The MOD09GA product was obtained by the Terra MODIS in the morning during its descending orbit, and the MYD09GA product was obtained by the Aqua MODIS in the afternoon during its ascending orbit.In the 36 spectral bands of MODIS, we used the data of bands 1-3, 5, and 6 with 500-m spatial resolution.The conversion of map projections was performed by using the MODIS Reprojection Tool provided by the LP DAAC.The integrated sinusoidal (ISIN) projection of the original data was converted into a geographic projection (datum, WGS84; sampling protocol, nearest neighbor; output pixel size, 0.004167°).We then extracted the values of the band reflectance, quality control (QC) flag, state flag (including a cloud mask), and sensor zenith angle of the pixel over the study site.By referring to the QC flag, data with sensor troubles were discarded.To evaluate cloud noise that could not be removed by the cloud mask, only data with a 'clear' cloud state flag were used; data with 'cloudy', 'mixed', and 'not set' flags were not used.We also only used the data taken when the sensor zenith angle was less than 50° to avoid degradation of the spatial resolution due to large sensor zenith angles.

Cloud Coverage at the MODIS Overpass Time
We assessed sky conditions at the overpass times of Terra/Aqua MODIS using the ADFC images.The Terra/Aqua overpass time each day was obtained from the L1A geolocation fields product (MOD03/MYD03, 1-km resolution, collection 5).The MOD03/MYD03 products were downloaded from the Level 1 and Atmospheric Archive and Distribution System (LAADS, http://ladsweb.nascom.nasa.gov).The original map projection was converted into a geographic projection (datum, WGS84; sampling protocol, nearest neighbor; output pixel size, 0.008333°) using the MODIS Swath Reprojection Tool provided by the LP DAAC.
We only used the data with a 'clear' cloud state flag for the MOD09/MYD09 products, but several data were still measured under cloudy conditions according to the ADFC sky images.Therefore, we classified the data into three categories in terms of the cloud coverage in the ADFC image at the overpass time as interpreted by the human eye (Figure 5): • Category 1: cloud coverage of the sky was less than 20%; • Category 2: cloud coverage was more than 20%; • Category 3: cloud coverage was unknown because of missing ADFC images.
We included thin clouds over the sky in category 2 because the coverage was difficult to estimate with the human eye.

Results
Seasonal changes in the MODIS spectral indices were roughly in accordance with the in situ spectral indices (Figure 6).All of the in situ spectral indices rapidly increased during the leaf green-up period in spring (DOY 130-150) and decreased during the leaf senescence period in autumn (DOY 270-300).When the ground was covered with snow, NDVI, EVI, and EVI2 showed their minimum values for the year (about 0.1 for NDVI and EVI); on the other hand, NDWI and NDII showed maximum values for the year (about 0.3 for NDWI, 0.6 for NDII).These characteristics were also found in the spectral indices observed by the MODIS sensors, but the MODIS spectral indices were sometimes systematically higher than the in situ spectral indices at the start and end of the growing period.Figure 7 shows the seasonal changes in the band reflectances observed by MODIS and in situ HSSRs.For both the MODIS data and the in situ data, the reflectance of bands 1 and 3 became relatively low during the growing period; on the other hand, the reflectance of bands 2, 5, and 6 became relatively high during the growing period.The MODIS measured reflectances showed some fluctuations around the in situ reflectances.values with the MOD09GA product (Terra MODIS) and MYD09GA product (Aqua MODIS) were not much different.However, the number of data points of Aqua's band 6 (n = 44) was much smaller than that of Terra's data (n = 136) because of several inoperable detectors in band 6 of Aqua MODIS [36,37].Because Aqua's band-5 data also fluctuated more than Terra's data, we did not use the Aqua MODIS data for the statistical analysis described below.
The MODIS spectral indices shown in Figure 6 occasionally showed sudden decreases and became much smaller than the in situ spectral indices, especially during the growing period.The mismatches in NDVI were bigger than that of the other indices.The scatter plots between the Terra MODIS and in situ spectral indices (Figure 8) clearly showed that the MODIS NDVI was lower than the in situ NDVI at high values of the latter from 0.6 to 0.9.The MODIS EVI and EVI2 also sometimes showed sudden decreases, but it was less apparent than in the MODIS NDVI.NDWI and NDII did not show such clear differences.
According to the ADFC sky photos, the sudden decrease in MODIS NDVI mainly occurred when cloud coverage was greater than 20% (category 2).In almost all cases of category 2, small patchy clouds (cumulus clouds) or cirrus clouds were distributed in the sky.Category 2 made up 40.0% of all data for NDVI, EVI, and EVI2, 41.0% for NDWI, and 39.5% for NDII (Table 2).The statistics of the relationships between the Terra MODIS and in situ spectral indices are shown in Table 2.The relative root mean square deviation (RRMSD) and relative mean bias (RMB) are calculated as follows: where i = 1, 2, …, n is the number of data, x i is the in situ spectral index, y i is the MODIS spectral index, and x max and x min are the maximum and minimum values of the in situ spectral index during the study period.The RRMSD and RMB are expressed as percentages of the range of the in situ spectral index during the study period.Perfect agreement between x and y would be expressed by RRMSD = 0 and RMB = 0. R 2 is the determination coefficient for linear regression.For all data, NDVI showed larger RRMSD and lower R 2 than the other indices.RMB was within ±0.05 for all indices, and it did not greatly vary among the spectral indices.For category 1 (cloud cover <20%), RRMSD and R 2 were almost same among the spectral indices.All indices showed relatively high R 2 values for category 1 (more than 0.7).For category 2 (cloud cover ≥20%), NDVI, EVI2, EVI, and NDII showed larger RRMSD and lower R 2 in comparison to their values for category 1, but the differences in EVI and NDII were relatively small.Such a difference was not clear in NDWI.The RMB of category 2 was lower than that of category 1 for all indices, and NDVI showed a relatively big difference in RMB in comparison to the other indices.

Discussion
Both of the Terra and Aqua MODIS reflectances showed good correspondence with the in situ reflectance under non-cloud conditions (RRMSD < 20%, RMB < 10%, R 2 > 0.75, p < 0.0001).However, about 40% of the MODIS data after cloud-screening by the MOD09 state flag was potentially contaminated with irremovable small clouds.Since the fluctuations (expressed by RRMSD and R 2 ) under cloudy conditions were higher than those under cloud-free conditions, the irremovable small clouds caused the errors in the spectral indices.This error differed among the spectral indices.For NDVI, the fluctuations and biases under cloudy conditions differed widely from those under non-cloudy conditions.In particular, the noise cannot be negligible during the leaf-growing season (NDVI > 0.5) as shown in Figure 6.EVI2 also showed the relatively big differences in fluctuations and biases between under cloudy conditions and non-cloud conditions, but it was less than in NDVI.For EVI, NDWI, and NDII, the fluctuations and biases were relatively small in comparison to NDVI and EVI2.
This difference is possibly due to the difference in combination of the wavelength between the spectral indices.When considering clouds covered with f c percent of a pixel of the homogeneous ground surface, according to the spectral linear mixture model, the satellite-observed reflectance of band i, or ρ i , is expressed as follows: where ρ gi and ρ ci represent the reflectances of ground surface and clouds for band i, respectively.By using this expression, the normalized difference index, NDI, using bands i and j reflectances (e.g., when i = 2 and j = 1, NDI is equal to NDVI) can be expressed by the following equation: If cloud coverage f c is 0, then we can obtain the value under cloud-free conditions, which is expected to be equal to the in situ observed value, When ρ ci /ρ gi is nearly equal to ρ cj /ρ gj , Equation ( 11) can be approximated by Equation ( 12) through canceling the terms related to cloud contamination (f c , ρ ci , and ρ cj ).This means that the cloud noise on a spectral index can be canceled depending on the combination of band wavelengths.As shown in Table 3, ρ c /ρ g is about 8-13 for visible bands (MODIS bands 1, 3, and 4) and about 2 for infrared bands (MODIS bands 2, 5, and 6).The relationship between the cloud and forest canopy reflectances is totally different depending on wavelengths, especially visible versus infrared.NDVI is calculated from the combination of visible and infrared reflectances; therefore, we consider that NDVI cannot reduce the noise caused by small sub-pixel clouds.On the other hand, because NDWI and NDII are calculated from the near-infrared and short-infrared reflectances, they can cancel most of the cloud noise.EVI was slightly decreased by the contamination of small clouds, but the mismatches between the MODIS data and in situ data were smaller than those of NDVI.Compared to EVI, EVI2 showed bigger mismatches between the MODIS data and the in situ data, but it was also smaller than those of NDVI.This means EVI and EVI2 were less sensitive to small cloud contamination in comparison to NDVI.This cannot be explained by the abovementioned logic for NDVI, NDWI, and NDII because of the different form of equation.A possible reason is that the parameters in the EVI and EVI2 equation reduce the small cloud noise.For the differences between EVI and EVI2, Jiang et al. [16] suggested that it is mostly due to residual aerosol and cloud influences that remain after atmospheric correction of MODIS data.In our study, as shown in Figure 9, the MODIS data under cloudy conditions only showed the big difference between EVI and EVI2.This also suggested that a blue band (MODIS band 3) in the EVI equation plays an important role in reducing the sub-pixel cloud noise.NDWI showed relatively low RRMSD and high R 2 for both cloudy and non-cloudy conditions, but unlike the other indices, NDWI of cloudy conditions showed lower RRMSD and higher R 2 than that of non-cloudy conditions.We cannot currently provide a conclusive reason for this..One possible reason is sampling bias between the cloudy data and the non-cloud data.More data points are needed to make our results more reliable.Furthermore, since the measurement was conducted at only one forest site, further investigation at various sites is needed.
The MODIS indices were systematically slightly higher than the in situ indices at the start and end of the growing season.This may be due to the difference in tree phenology between the MODIS pixel area and the in situ observed area.Although we had already checked the land-cover homogeneity in a MODIS pixel when referring to a high-resolution image, the timing for spring leaf green-up and autumn leaf senescence for individual trees can differ some extent [29], which can affect the timing of the spring increase and autumn decrease of spectral indices.In other seasons, the seasonal changes in MODIS reflectances and spectral indices agreed closely with the in situ indices.This result suggests that MODIS radiometric correction is effective in at least this area.
In this study, we clarified the response to cloud contaminations of several well-used spectral indices based on in situ continuous measurements.Our results revealed that the sub-pixel clouds can potentially act as severe noise to time-series NDVI data, which can be improved without data loss by using other indices such as EVI, NDWI, and NDII.For NDVI, errors of greater than 25% of the seasonal dynamic range were present in 16% of the total data; however, for EVI, NDWI, and NDII, these errors were present in less than 5% of total data.Various spectral indices have recently been used for time-series retrieval of various ecosystem attributes such as vegetation phenology [38][39][40][41][42][43], leaf area index [4,44], and vegetation water content [33][34][35]45].The availability of the data without cloud noise is one of the key points to improve the accuracy of these terrestrial ecosystem monitoring.Our results can provide the reference information to select the best spectral index for these applications from the viewpoint of the effects of cloud noise.

Figure 1 .
Figure 1.(a) Photo of the in situ observation system taken on 7 June 2010.(b) Pan-sharpened image by Quickbird on 3 October 2002 (image source and copyright: Digital Globe, CO, USA).The spatial resolution of the image is 0.6 m.The yellow arrow in the image indicates the location of the canopy-access tower.

Figure 3 .
Figure 3. Spectral reflectance of the tower's iron pipe as measured by the Fieldspec (FR) spectroradiometer on DOY 301, 2004, and the corrected and non-corrected spectral reflectances measured by MS-700 and MS-712 on DOY 278, 2008.

Figure 4 .
Figure 4. Automatic-capturing digital fisheye camera (ADFC) image used for extracting the area of canopy-access tower.This image was taken on DOY 220, 2008.The red pixels in the right image were extracted as the canopy-access tower.We applied the extraction of the tower pixels for the region enclosed by the dotted line in the right image.

Figure 5 .
Figure 5. Examples of cloud classification for ADFC sky images at the MODIS overpass time.

Figure
Figure Seasonal changes in spectral indices as measured by the ground spectroradiometers (HSSRs) and Terra/Aqua MODIS: (a) NDVI, (b) EVI, (c) EVI2, (d) NDWI, and (e) NDII.The top horizontal box diagram represents the condition of the site observed from the canopy photos.We discarded the data with sensor trouble or cloud contamination with reference to the QC flag and state flag; we also discarded data captured when the sensor zenith angle was more than 50°.

Figure 7 .
Figure 7. Seasonal changes in MODIS band reflectances measured by the ground spectroradiometers (HSSRs) and Terra/Aqua MODIS: (a) band 1, (b) band 2, (c) band 3, (d) band 5, and (e) band 6.The top horizontal box diagram represents the condition of the site observed from the canopy photos.We discarded the data with sensor trouble or cloud contamination with reference to the QC flag and state flag; we also discarded data captured when the sensor zenith angle was more than 50°.

Figure 8 .
Figure 8. Scatter plots between the ground-based spectral indices and the Terra MODIS (MOD09GA 500 m) based spectral indices during the study period: (a) NDVI, (b) EVI, (c) EVI2, (d) NDWI, and (e) NDII.Black dots represent the data with less than 20% cloud cover in sky photographs at the satellite overpass time.White dots represent the data with more than 20% cloud cover.Solid lines represent the linear regression lines for all data.

Figure 9 .
Figure 9. Scatter plots between EVI2 and EVI: (a) in situ data and (b) Terra MODIS data.

Table 2 .
Number of data (n), relative root mean deviation (RRMSD), relative mean bias (RMB), and determination coefficient (R 2 ) for linear regression of the relationships between the Terra MODIS-based and in situ-based spectral indices.

Table 3 .
Average Terra MODIS (MOD09GA) band reflectance for the data with "clear-sky" and "cloudy" state flag and ρ c /ρ g values during the leaf growing season (DOY130-DOY300 in 2008 and 2009).