Object-Based Greenhouse Mapping Using Very High Resolution Satellite Data and Landsat 8 Time Series

Greenhouse mapping through remote sensing has received extensive attention over the last decades. In this article, the innovative goal relies on mapping greenhouses through the combined use of very high resolution satellite data (WorldView-2) and Landsat 8 Operational Land Imager (OLI) time series within a context of an object-based image analysis (OBIA) and decision tree classification. Thus, WorldView-2 was mainly used to segment the study area focusing on individual greenhouses. Basic spectral information, spectral and vegetation indices, textural features, seasonal statistics and a spectral metric (Moment Distance Index, MDI) derived from Landsat 8 time series and/or WorldView-2 imagery were computed on previously segmented image objects. In order to test its temporal stability, the same approach was applied for two different years, 2014 and 2015. In both years, MDI was pointed out as the most important feature to detect greenhouses. Moreover, the threshold value of this spectral metric turned to be extremely stable for both Landsat 8 and WorldView-2 imagery. A simple decision tree always using the same threshold values for features from Landsat 8 time series and WorldView-2 was finally proposed. Overall accuracies of 93.0% and 93.3% and kappa coefficients of 0.856 and 0.861 were attained for 2014 and 2015 datasets, respectively.


Introduction
Greenhouse mapping via remote sensing is a challenging task due to the fact that the spectral signature of the plastic-covered greenhouse can change drastically [1][2][3][4][5][6].In fact, different plastic materials with varying thickness, transparency, ultraviolet and infrared reflection and transmission properties, additives, age and colors are used in greenhouse coverings.Moreover, as plastic sheets are semi-transparent, the changing reflectance of the crops underneath them affects the greenhouse spectral signal reaching the sensor [2].
Over the last decade, greenhouse detection has mainly been addressed by using different pixel-based approaches supported by single satellite data, such as Landsat Thematic Mapper (TM) [7], Landsat 8 OLI/TIRS [8], the Terra Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) [9], IKONOS or QuickBird [1,3,[10][11][12] and WorldView-2 [13,14].The first paper using object-based image analysis (OBIA) approach for greenhouse detection was published in 2012 by Tarantino and Figorito [5], in this case working with digital true color (RGB) aerial data.Aguilar et al. [6] also applied OBIA techniques on WorldView-2 and GeoEye-1 stereo pairs.Furthermore, a comparison between traditional per-pixel and OBIA greenhouse classification was carried out by Wu et al. [15], concluding that the OBIA scheme resulted in a significant improvement.
Monitoring plastic-mulched land cover through per-pixel techniques is attracting a great deal of attention nowadays, particularly in China.In this way, Hasituya et al. [16] monitored plastic-mulched farmland by a single Landsat 8 OLI image using spectral and textural features.Working with multi-temporal imagery, Lu et al. [17] achieved a very simple but consistent decision tree classifier for extracting transparent plastic-mulched land cover from a very short Landsat 5 TM time series composed by only two images during an agricultural season.They proposed a plastic-mulched land cover index (PMLI) by using the reflectance values of red band (b3) and SWIR1 (b5).Furthermore, large time series of MODIS surface reflectance daily L2G (250 m ground sample distance, GSD) covering the cotton crop period from the 85th day to the 150th day were also used by Lu et al. [18] for plastic-mulched land cover extraction.A simple threshold model based on the temporal-spectral features of plastic-mulched land cover in the early stage of a growing season after planting was designed with the number of days (d) when the normalized difference vegetation index (NDVI) value was larger than a threshold value (x) as the discriminator.This rule achieved very good results along three different years.
Coming back to greenhouse mapping, Aguilar et al. [19] went one step further.In this work, the authors addressed the identification of greenhouse horticultural crops that were growing under plastic coverings.To this end, OBIA and a decision tree classifier (DT) were applied to a Landsat 8 OLI time series and a single WorldView-2 satellite image.They used a sample of 694 individual greenhouses whose information such as type of crop, date of planting and date of removal were known.They achieved an overall accuracy of 81.3% in identifying four of the most popular crops cultivated under greenhouse in Almería (Spain).However, in that research, two important issues remained outstanding.On the one hand, Aguilar et al. [19] decided to conduct a manual digitizing over the WorldView-2 image in order to obtain the best possible segmentation on their 694 reference greenhouses.On the other hand, the authors did not deal with greenhouse/non-greenhouse pre-classification.In other words, all 694 segments had already been pre-classified as greenhouse.
The multi-resolution segmentation algorithm (MRS) [20] included in the eCognition Developer software (Trimble, Sunnyvale, California, United States) is one of the most widely used methods to carry out the image segmentation [21].It is the first step and a crucial process in the OBIA approach.MRS has already been successfully performed on plastic greenhouses by Tarantino and Figorito [5] and Wu et al. [15].However, in both cases the three key parameters of MRS (i.e., scale, shape and compactness) were set up through a systematic trial-and-error approach only validated by visual inspection.At this point, it is important to note that segmentation evaluation plays a critical role in controlling the quality of OBIA workflow.Moreover, among a variety of segmentation evaluation methods and criteria, discrepancy measurement is believed to be the most useful and is therefore one of the most commonly employed techniques in many applications [22][23][24].
The pre-classification of segments as greenhouse or non-greenhouse has to be carried out as a previous step headed up to identify greenhouse horticultural crops growing under plastic coverings.In this way, it is important to point out two critical issues: (1) the shape of the real individual greenhouses should be represented in the best way by the segments to classify; and (2) the classification should be simple, accurate, robust, temporally stable and fast (i.e., using a small training sample size or even without training).In this way, it could make sense to follow similar approaches to those used by Lu et al. [17,18] but, in this case, through applying an OBIA workflow.
The innovative goal tackled in this paper relies on the combined use of very high resolution satellite data and multi-temporal Landsat 8 OLI imagery, within the context of an OBIA approach, to map and detect individual greenhouses through decision tree classifier.So far, and to the best knowledge of the authors, this is the first research work that deals with this topic using OBIA techniques and Landsat 8 time series.More specific subjects regarding the greenhouse mapping method addressed in this work are: (1) identifying the optimal segmentation focused on individual greenhouses by means of segmentation evaluation methods; (2) evaluating the most important and useful features for plastic greenhouse detection (basic spectral information, textural features, several spectral and vegetation indices, seasonal statistics and spectral metric); (3) comparison between the classification accuracies achieved from single satellite imagery and through time series; and (4) determining a simple decision tree temporally stable by using robust threshold values based on Landsat 8 time series and WorldView-2 features for greenhouse classification.

Study Area
This investigation was conducted in Almería, southern Spain, which has become the site of the greatest concentration of greenhouses in the world, known as the "Sea of Plastic" or the "Poniente" region.The study area comprises a rectangle area of about 8000 ha centered on the WGS84 geographic coordinates of 36.7824˝N and 2.6867 ˝W (Figure 1).useful features for plastic greenhouse detection (basic spectral information, textural features, several spectral and vegetation indices, seasonal statistics and spectral metric); (3) comparison between the classification accuracies achieved from single satellite imagery and through time series; and (4) determining a simple decision tree temporally stable by using robust threshold values based on Landsat 8 time series and WorldView-2 features for greenhouse classification.

Study Area
This investigation was conducted in Almería, southern Spain, which has become the site of the greatest concentration of greenhouses in the world, known as the "Sea of Plastic" or the "Poniente" region.The study area comprises a rectangle area of about 8000 ha centered on the WGS84 geographic coordinates of 36.7824°N and 2.6867°W (Figure 1).The types of greenhouse plastic covers in the study area are very variable.The most common materials are polyethylene films (e.g., low density, long life, thermic, with or without additives) and ethylene-vinyl acetate copolymer, also known as EVA.Furthermore, both materials can present different thickness (180 μm or 200 μm) and colors (white, yellow or green).Furthermore, the changing reflectance of the horticultural crops that are growing under semi-transparent plastic coverings during an agricultural season affects the greenhouse spectral signal reaching the sensor.In that sense, the intensive agriculture in Almería was mainly dedicated to tomato, pepper, zucchini, cucumber, eggplant, green bean, melon, watermelon and Chinese cabbage.

WorldView-2 Data
WorldView-2 (WV2) is a very high resolution (VHR) satellite launched in October 2009.This sensor is capable of acquiring optical images with 0.46 m and 1.84 m GSD at nadir in panchromatic (PAN) and multispectral (MS) mode, respectively.Moreover, it was the first VHR commercially available 8-band MS satellite: coastal (C, 400-450 nm), blue (B, 450-510 nm), green (G, 510-580 nm), yellow (Y, 585-625 nm), red (R, 630-690 nm), red edge (RE, 705-745 nm), near infrared-1 (NIR1, 760-895 nm) and near infrared-2 (NIR2, 860-1040 nm).The types of greenhouse plastic covers in the study area are very variable.The most common materials are polyethylene films (e.g., low density, long life, thermic, with or without additives) and ethylene-vinyl acetate copolymer, also known as EVA.Furthermore, both materials can present different thickness (180 µm or 200 µm) and colors (white, yellow or green).Furthermore, the changing reflectance of the horticultural crops that are growing under semi-transparent plastic coverings during an agricultural season affects the greenhouse spectral signal reaching the sensor.In that sense, the intensive agriculture in Almería was mainly dedicated to tomato, pepper, zucchini, cucumber, eggplant, green bean, melon, watermelon and Chinese cabbage.
Two single WV2 bundle images (PAN + MS) in Ortho Ready Standard Level-2A (ORS2A) format were acquired for this work.The images over the study area were taken on 30 September 2013 and 5 July 2015.The WV2 image taken in 2013 presented a final product GSD of 0.4 m (PAN) and 1.6 m (MS), while 0.5 m and 2 m were the pixel size for the image taken in 2015.All delivered products were ordered with a dynamic range of 11-bit and without the application of the dynamic range adjustment preprocessing.
From these WV2 ORS2A bundle images, two pan-sharpened images with 0.4 m (2013 WV2 image) and 0.5 m (2015 WV2 image) GSD were attained by means of the PANSHARP module included in Geomatica v. 2014 (PCI Geomatics, Richmond Hill, ON, Canada).The coordinates of seven ground control points (GCPs) and 32 independent check points (ICPs) obtained by differential global positioning system were used at this stage.We used the seven GCPs to compute the sensor model based on rational functions refined by a zero order transformation in the image space (RPC0).A medium resolution 10 m grid spacing DEM with a vertical accuracy of 1.34 m (root mean square error (RMSE)), provided by the Andalusia Government, was used to carry out the orthorectification process.The planimetric accuracy (RMSE 2D ) attained on the orthorectified pan-sharpened images was of around 0.5 m in both dates.
Furthermore, two MS orthoimages with 1.6 m (2013 WV2 image) and 2 m (2015 WV2 image) GSD containing the full 8-band spectral information were also produced.The same seven GCPs, RPC0 model and DEM were used to attain the MS orthoimage.It is worth noting that atmospheric correction is especially important in those cases where multi-temporal or multi-sensor images are analyzed [25][26][27][28].Thus, in this case, and as a prior step to the orthorectification process, the original WV2 MS images were atmospherically corrected by using the ATCOR (atmospheric correction) module included in Geomatica v. 2014.This absolute atmospheric correction algorithm involves the conversion of the original raw digital numbers to ground reflectance values, and it is based on the MODTRAN (MODerate resolution atmospheric TRANsmission) radiative transfer code [29].Finally, two atmospherically-corrected WV2 MS orthoimages with a RMSE 2D of around 2.20 m were generated for both dates.
Twenty cloud-free multi-temporal L8 OLI scenes type Level 1 Terrain Corrected (L1T) located at the footprint Path 200 and Row 34, each one comprising the whole working area, were downloaded from the U.S Landsat archive.The L8 scenes were taken with a dynamic range of 12 bits from February 2014 to November 2015.These images were really taken as two different time series: One, corresponding to the year 2014, was composed by ten scenes (19 February,7  We clipped the original images, both PAN and MS, according to the study area.Bearing in mind that spatial resolutions equal to or better than 8-16 m are recommended by Levin et al. [2] to work on greenhouse detection via remote sensing, the fusion of the information of PAN and MS Landsat 8 bands was performed through applying the PANSHARP module included in Geomatica v. 2014.The pan-sharpened images were generated with a GSD of 15 m, 12-bit depth and eight bands (C, B, G, R, NIR, CI, SWIR1 and SWIR2).We applied the ATCOR module (Geomatica v. 2014) to the 20 L8 pan-sharpened images to obtain ground reflectance values.More details about this third step can be obtained from the research published by Aguilar et al. [19].The last process involved the rectification and co-registration of the L8 images.Although all acquired L8 L1T images have been previously orthorectified and corrected for terrain relief [30], a subsequent co-registration is needed, since an extremely accurate spatial matching among multi-temporal images is crucial in this type of work [31,32] for both pixel and object-based image analysis.The pan-sharpened WV2 2013 orthoimage with a GSD of 0.4 m was used as the reference image.Each pan-sharpened L8 image was rectified with this reference image using 50 ground control points evenly distributed over the working area and a first order polynomial transformation.The resulting RMSE 2D for the geometrically-corrected images ranged from 5.29 m to 6.91 m.

Segmentation
As shown in the above section, two temporally different datasets are going to be analyzed: (1) 2014 L8 time series and 2013 WV2 MS image; and (2) 2015 L8 time series and 2015 WV2 MS image.The original approach relied on employing the WV2 MS geometrically and atmospherically corrected imagery from 2013 and 2014 to obtain the ideal segmentation for each time series (i.e., 2014 and 2015).Unfortunately, there was no availability of a VHR in 2014, so we decided to use the image taken in September 2013 as the closest date to the 2014 L8 time series.
We utilized the multi-resolution (MRS) segmentation algorithm included into the OBIA software eCognition v. 8.8.This segmentation approach is a bottom-up region-merging technique starting with one-pixel objects.In numerous iterative steps, smaller objects are merged into larger ones [20].However, this task is not easy, and it highly depends on the desired objects to be segmented [33].The outcome of the MRS algorithm is controlled by three main factors: (1) the homogeneity criteria or scale parameter (SP) that determines the maximum allowed heterogeneity for the resulting segments; (2) the weight of color and shape criteria in the segmentation process (Shape); and (3) the weight of the compactness and smoothness criteria (Compactness).The optimal determination of these three somewhat abstract terms is not easy to carry out.However, a detailed review of the MRS algorithm is beyond the scope of this paper.More information about its mathematical formulation can be found in the literature [20,33,34].
Several combinations of SP, Shape and Compactness parameters were tested in this work for the two WV2 images.According to Kavzoglu and Yildiz [35], five values {0.1, 0.3, 0.5, 0.7, 0.9} were selected as alternative weights for both Shape and Compactness parameters.The SP values were ranging from 30 to 60 in increments of one.The MRS outputs were always computed by taking into account the eight equal-weighted bands corresponding to the WV2 MS orthoimages.
At this point, a segmentation evaluation based on supervised segmentation quality metrics was carried out in order to choose the best segmentation for each WV2 image.Unlike previous studies that used only 30 reference polygons per class [23,36], in our research, 60 greenhouses were manually digitized working on the WV2 pan-sharpened orthoimages.In this work, and to quantitatively assess the goodness of the different segmentations, the outputs were compared to these 60 manually delineated reference polygons representing greenhouses.
Concretely, the supervised discrepancy measure known as Euclidean Distance 2 (ED2) recently proposed by Liu et al. [23] was the method used in this work to find out the best estimate segmentation.In a nutshell, ED2 considers, in a two dimensional Euclidean space, both the geometrical discrepancy (by mean of the potential segmentation error, PSE) and the arithmetic discrepancy between image objects and reference polygons (by using the number-of-segmentation ratio, NSR).

Features Used to Carry Out Object-Based Classification
Several groups of features were retrieved from both L8 time series and a WV2 single image.Thus, the eight bands provided by the WV2 MS single orthoimage and L8 time series were used to obtain the object-based features shown in Tables 1 and 2, respectively.The selection of the features was mainly based on the results of previous researches [6,19], although a few new object-based characteristics were also tested.In addition to the object-based features containing spectral information (mean and standard deviation values of all pixels within an object) and several spectral and vegetation indices, the capabilities of the spectral metric Moment Distance Index (MDI) [37,44,45] was explored in this study as one of the predictor variables for greenhouse mapping.To the best knowledge of the authors, it is the first time that this index is tested in plastic covering mapping.The MDI exploits the available bands of the remote sensing image by analyzing the shape of the reflectance spectrum, and at each composite, calculating the moment distances among the bands through simple geometric operations.The robustness of the method in defining the shape of the spectral curve derives from the refereed distances from two point locations designated as shorter pivot and longer pivot wavelength region, assuming that the reflectance curve is displayed in Cartesian coordinates with the abscissa displaying the wavelength (λ) and the ordinate displaying the reflectance (ρ).The subscript LP denotes the left pivot (located in a shorter wavelength) and subscript RP denotes the right pivot (located in a longer wavelength).λ LP and λ RP are the wavelength locations observed at the left and right pivots, respectively, where left (right) indicates a shorter (longer) wavelength.In our case, all the eight available bands of L8 and WV2 images, given by normalized reflectance values ranging from 0 to 1 and wavelengths expressed in µm, were used to compute MDI.
The moment distance from the left pivot (MD LP ) to the right one (MD RP ) is the sum of the hypotenuses constructed from the left pivot to the value at successively longer wavelengths (index i from λ LP to λ RP ); one base of the triangle is the difference from the left pivot (i-λ LP ) along the abscissa and the other is simply the value at i (Equation ( 1)).Similarly, the moment distance from the right pivot (MD RP ) is the sum of the hypotenuses constructed from the right pivot to the value at successively shorter wavelengths (index i from λ RP to λ LP ); one base of the triangle is the difference from the left pivot (λ RP ´i) along the abscissa and the other is the value at i (Equation ( 2)).Finally the MDI can be computed as shown in Equation (3).More details about MDI formulation can be found in Salas et al. [45].
In regard to GLCM features, only three out of the 14 originally proposed by Haralick et al. [43] were finally considered in the case of WV2 imagery.Homogeneity, dissimilarity and entropy texture features were always computed considering all of the directions over the eight WV2 bands.Moreover, most of the indices tested in the case of WV2 imagery were adapted for this satellite from the work published by Oumar and Mutanga [39].
Lastly, simple statistical seasonal features were also computed from the 10 images of both Landsat 8 time series for the years 2014 and 2015.Those features were MAX (maximum), MIN (minimum) and DIF (difference between MAX and MIN) values from the whole time series images for each feature shown in Table 2.The average for the eight Mean DIF values (AVG.DIF 8) computed for each time series was also included as seasonal statistic feature.

Extraction of Features
The OBIA software eCognition v. 8.8 was employed for the extraction of the object-based features from both aforementioned L8 and WV2 orthoimages.In the case of WV2 features, the best segmentations attained from WV2 MS 2013 (1.6 m GSD) and WV2 MS 2015 (2 m GSD) orthoimages were used to extract the object-based features dataset.The best WV2 2013 segmentation was transferred to the ten 2014 L8 pan-sharpened orthoimages through the chessboard segmentation algorithm included in eCognition.In the same way, the vector file containing the ideal segmentation from WV2 2015 was used on the ten L8 images from 2015.In other words, object-based techniques were applied on L8 pan-sharpened imagery but working on the WV2-based segmentation.Two different eCognition projects were conducted for 2014 and 2015.It is important to note that the 15 m GSD of the L8 pan-sharpened orthoimages was increased to 1.875 m by halving three times the original pixel size.Thus, the L8 segmentation for 2014 and 2015 time series fit much better to the segmentations undertaken on the WV2 2013 and 2015 orthoimages.This way to combine L8 and WV2 information using an OBIA approach was already reported by Aguilar et al. [19].

Decision Tree Modeling and Classification Accuracy Assessment
A decision tree (DT) classifier based on the algorithm proposed by Breiman et al. [46] has been used in this work.This classifier presents a very clear structure composed of several splits with single threshold values.In this sense, the DT classifier could be implemented directly in eCognition by means of rule sets to determine simple and robust feature thresholds.DT is a non-parametric rule-based method in which each node makes a binary decision that separates either one class or some of the classes from the remaining class (or classes).Classification through the DT algorithm has already been applied in greenhouse mapping via remote sensing with very good and robust results [17,19].Moreover, it has been the selected classifier in recent OBIA investigations focused on outdoor crop identification [47,48].Other advantages of DT classifiers are: (1) they fit well to the OBIA procedure; (2) DTs are computationally fast and make no assumptions regarding the distribution of the data; (3) they are able to take numerous input variables and perform rapid classification without being severely affected by the "curse of dimensionality"; and (4) the DT methods provide quantitative measurements of each variable's relative contribution to the classification results, allowing users to rank the importance of input variables.
From the two ideal segmentations attained by using the WV2 MS 2013 and 2015 orthoimages, two different datasets composed by 3000 objects were selected as the object-based ground truth.In both cases, 1500 segments for the "Greenhouse (GH)" class and 1500 for "Non-Greenhouse (Non-GH)" class were extracted by means of a visual-based sampling procedure carried out over the pan-sharpened WV2 orthoimages.All these objects represented meaningful objects for each class (i.e., segmented objects which delineate well their corresponding real objects), avoiding mixed segments.Finally, and according to Peña-Barragán et al. [47], the 3000 sampling objects for each ground truth were randomly separated into two sets: 50% for model training and 50% for model validation.Both datasets contained 750 GH and 750 Non-GH segments.
STATISTICA v10 was used for the classification and regression decision tree analysis (StatSoft Inc., Tulsa, OK, USA).Briefly, the DT algorithm seeks to split the data into segments that are as homogeneous as possible with respect to the dependent or response variable.In this case, the categorical dependent variable was the class (GH or Non-GH).The DT node-splitting rule was the Gini index, which is a measure of impurity for a given node [49].Its application attempts to maximize the homogeneity of the child nodes with respect to the values of the dependent variable.The experimental design was implemented within the STATISTICA environment through a stratified 10-fold cross-validation procedure performed on the training dataset (i.e., 750 GH and 750 Non-GH).The computed DT model was finally applied to the validation dataset to obtain the corresponding confusion matrices.
The DT classifier also provides an assessment of the relative importance of the different features or variables during the classification process.This aspect is useful for multi-source studies, where data dimensionality is very high and it is important to know how each predictive variable influences the classification model in order to select the best ones [50].To assess the importance of each feature, the DT switches one of the input random variables while keeping constant the remaining ones.
However, and because of the aforementioned confusion matrix was based on meaningful objects, the errors related to segmentation process and/or mixed objects were not considered.Thus, a new classification accuracy assessment based on the real ground truth for the whole study area was carried out.Hence, two real ground truths for 2014 and 2015 were manually digitized over the WV2 pan-sharpened orthoimages from 2013 and 2015 (Figure 2).These ground truths were finally exported as raster files with 1.6 m and 2 m pixel size for 2014 and 2015, respectively.In this way, a new pixel-based set of confusion matrices were attained to show the real classification accuracy.In this case, the DT models were developed for different datasets (WV2, L8 time series and WV2 + L8 time series) by using all the 3000 reference and meaningful objects as training set.Moreover, the original feature space was significantly reduced to the ten most important features for each case.The threshold rules provided by the DT models were implemented into eCognition and the classification results were matched with the real ground truth by means of a pixel-based accuracy assessment.
The classification accuracy assessment in this work was based on the two aforementioned error matrices (i.e., 3000 based-object error matrix and the pixel-based one computed over the manually digitized whole study area).The accuracy measures extracted from the confusion matrices were user's accuracy (UA), producer's accuracy (PA), overall accuracy (OA) and kappa coefficient (kappa) [51].Additionally, in order to offer significance to the given results in the based-object error matrix, confidence intervals (p < 0.05) were calculated by applying the so-called Exact method [52].It corresponds to the maximum likelihood estimate (i.e., the actual value of the estimated accuracy OA), even when it is not symmetrical.

Segmentation
In the case of the MRS carried out on the WV2 MS orthoimage taken in September 2013, the minimum value of ED2 (0.11) was reached for the following setting: SP = 49, Shape = 0.3 and Compactness = 0.5.This segmentation was composed of 12,676 objects.Regarding the optimal segmentation for WV2 MS orthoimage from July 2015, 12,975 objects were delimited by using a combination of 34, 0.3 and 0.5 values for SP, Shape and Compactness parameters, respectively.In this case, the ED2 value was 0.29.

Object-Based Accuracy Assessment
In this section, the goal was to carry out an accuracy assessment based on the 3000 objects per year (1500 for training and the other 1500 for validation).The accuracy results (i.e., OA, kappa, PA for Greenhouse class (PA_GH) and UA for Greenhouse class (UA_GH)) shown in Figure 3

Segmentation
In the case of the MRS carried out on the WV2 MS orthoimage taken in September 2013, the minimum value of ED2 (0.11) was reached for the following setting: SP = 49, Shape = 0.3 and Compactness = 0.5.This segmentation was composed of 12,676 objects.Regarding the optimal segmentation for WV2 MS orthoimage from July 2015, 12,975 objects were delimited by using a combination of 34, 0.3 and 0.5 values for SP, Shape and Compactness parameters, respectively.In this case, the ED2 value was 0.29.

Object-Based Accuracy Assessment
In this section, the goal was to carry out an accuracy assessment based on the 3000 objects per year (1500 for training and the other 1500 for validation).The accuracy results (i.e., OA, kappa, PA for Greenhouse class (PA_GH) and UA for Greenhouse class (UA_GH)) shown in Figure 3  noteworthy that our DT classifier was always computed using the prune based on misclassification error as the stopping rule.In the same way as Breiman et al. [46], minimal cost-complexity crossvalidation pruning was performed to find the "right-sized" tree from amongst all the possible trees.Usually, only two or three split features were selected for each DT.
Regarding the OA and kappa values (Figure 3a,c), although the accuracies reached from the single L8 images resulted to be very changeable, the results from 2015 single L8 images were overall better than the ones attained from 2014.We have to bear in mind that the segmentation and the class assigned to the 3000 objects were based on the WV2 image taken in September 2013 (i.e., close but out of 2014).Without any doubt, this fact has come to penalize the final accuracy attained from the single images taken in 2014.
When the statistical seasonal features from the L8 time series of 2014 and 2015 (2014 TS and 2014 TS) were considered, the accuracy was improved, especially in 2015.The OA values were 97.3% and 98.8% for 2014 TS and 2015 TS, respectively.Furthermore, these values turned out to be statistically significant (p < 0.05) when the confidence intervals were calculated according to [52].
In the case of WV2 single images, slightly higher OA and kappa values were achieved by using the WV2 2013 image.The OA for WV2 2013 took a value of 98.3%, statistically varying from 97.5% to 98.9% (confidence interval at a significance level of p < 0.05), while in the case of WV2 2015, the OA was 97.7%, with a confidence interval ranging from 96.8% to 98.4%.This tiny and non-significant difference in OA might be due to the different GSD (i.e., 2 m in 2015 and 1.6 m in 2013).When the information contained in WV2 single images was added to the L8 time series, the accuracy only improved in 2014 while no changes were registered in 2015.In other words, WV2 2015 was not able to add more information to the 2015 TS strategy.In this case, the OA values reached from both strategies were 98.4% and 98.8% for 2014 WV2 + TS and 2015 WV2 + TS, respectively (statistically non-significant differences at p < 0.05).
Figure 3c,d shows the Producer's and User's accuracy for the Greenhouse class from all the strategies.The PA_GH (Figure 3c) is an accuracy measure related to error of omission (i.e., the probability of leaving without classifying an object belonging to the Greenhouse class).The UA_GH indicates the expected error for the Greenhouse class when using the classification map in field or, in It is important to highlight that the DT model computed for 2015 TS and 2015 WV2 + TS strategies were exactly the same.Moreover, only two features were considered in these cases (i.e., the minimum value of MDI (MIN MDI) and the maximum value of NDVI (MAX NDVI) for 2015 L8 time series).Thus the dashed and solid red lines appear overlapped in Figure 3.At this point it is noteworthy that our DT classifier was always computed using the prune based on misclassification error as the stopping rule.In the same way as Breiman et al. [46], minimal cost-complexity cross-validation pruning was performed to find the "right-sized" tree from amongst all the possible trees.Usually, only two or three split features were selected for each DT.
Regarding the OA and kappa values (Figure 3a,c), although the accuracies reached from the single L8 images resulted to be very changeable, the results from 2015 single L8 images were overall better than the ones attained from 2014.We have to bear in mind that the segmentation and the class assigned to the 3000 objects were based on the WV2 image taken in September 2013 (i.e., close but out of 2014).Without any doubt, this fact has come to penalize the final accuracy attained from the single images taken in 2014.
When the statistical seasonal features from the L8 time series of 2014 and 2015 (2014 TS and 2014 TS) were considered, the accuracy was improved, especially in 2015.The OA values were 97.3% and 98.8% for 2014 TS and 2015 TS, respectively.Furthermore, these values turned out to be statistically significant (p < 0.05) when the confidence intervals were calculated according to [52].
In the case of WV2 single images, slightly higher OA and kappa values were achieved by using the WV2 2013 image.The OA for WV2 2013 took a value of 98.3%, statistically varying from 97.5% to 98.9% (confidence interval at a significance level of p < 0.05), while in the case of WV2 2015, the OA was 97.7%, with a confidence interval ranging from 96.8% to 98.4%.This tiny and non-significant difference in OA might be due to the different GSD (i.e., 2 m in 2015 and 1.6 m in 2013).
When the information contained in WV2 single images was added to the L8 time series, the accuracy only improved in 2014 while no changes were registered in 2015.In other words, WV2 2015 was not able to add more information to the 2015 TS strategy.In this case, the OA values reached from both strategies were 98.4% and 98.8% for 2014 WV2 + TS and 2015 WV2 + TS, respectively (statistically non-significant differences at p < 0.05).
Figure 3c,d shows the Producer's and User's accuracy for the Greenhouse class from all the strategies.The PA_GH (Figure 3c) is an accuracy measure related to error of omission (i.e., the probability of leaving without classifying an object belonging to the Greenhouse class).The UA_GH indicates the expected error for the Greenhouse class when using the classification map in field or, in other words, the commission error.Both Producer's and User's accuracy for the Greenhouse class achieved very good results when WV2 + TS strategy was applied.The reliability of the classification (UA) was very high for the Greenhouse class, with a probability of success greater than 98.9% for both years.

Importance of Features
The rankings of the 10 most important features used in both WV2 and L8 single images classification are depicted in Table 3.The ranking for WV2 is achieved by combining the importance of features for the 2013 and 2014 images.In the case of L8, it is performed by merging the importance of features from the 20 single images.For this comparison, the DT projects were carried out using the 3000 objects as training set in order to be more accurate in finding the most important features for each strategy.
The most important features for WV2 single images resulted to be the object-based Mean Coastal and Mean Blue features.These two features were also ranked as very relevant (second and third place) in the case of L8 single images.However, the most significant finding consisted of the MDI feature was selected the first and the most important split feature in 18 out of the 20 DT computed for the single L8 images.Moreover, the threshold or cutting value for this feature was extremely robust with respect to time.In fact, the mean and standard deviation (SD) values of MDI computed thresholds for nine out of 10 images taken in 2014 were 4.33 and 0.10, respectively, while in the case of the 2015 single images, very similar values of 4.38 and 0.08 (mean and SD) were achieved.

Pixel-Based Accuracy Assessment and Proposed Threshold Model
After analyzing the most relevant features and the most temporally stable cutting values for all the studied cases, a threshold model (TM) for greenhouse detection was proposed.The selected features were MIN MDI and DIF SWIR1 derived from the L8 time series, and the single value of MDI for the WV2 image considered for each year (Figure 4).It is worth noting that the classifier training phase is not required when using this approach.This TM was again implemented into eCognition to perform the pixel-based accuracy assessment.
Figure 5 shows the OA values for different accuracy assessment schemes studied in this work: (1) object-based using the 3000 reference and meaningful objects as training set and showing the OA confidence intervals (p < 0.05); (2) pixel-based scheme based on the ground truths manually digitized for the whole study area; and (3) the proposed TM in Figure 4.It is important to note that in this section we only used the top ten features shown in Tables 3 and 4. As expected, the OA values were always much higher in the case of object-based accuracy assessments than for the pixel-based ones.Figure 5 depicts these real accuracy results (pixel-based) where the OA values were ranging from 92.3% to 93.8%.When all available information was used (WV2 + TS), the achieved accuracies were 93.8% and 93.1% for 2014 and 2015, respectively.On the other hand, the worst (although quite good) results were attained only using the Landsat data.A very accurate and temporally stable classification can be achieved by only using the TM, including features from L8 and WV2 (TM L8 + WV2).OA values of 93.0% and 93.3%, and kappa coefficients of 0.856 and 0.861, were attained for 2014 and 2015 datasets, respectively (Figure 5).
Figure 6 presents the classification process sequence by means of the TM for a detailed area and each year.The first step (Figure 6a,b for 2014 and 2015) was addressed to obtain the best segmentations from WV2 MS images.In the second step, the two ground truths were manually digitized on the WV2 orthoimages for each year (i.e., Figure 6c for 2013 and Figure 6d for 2015).The reader can make out that there are few changes between the land cover of both ground truths.After applying the TM through an OBIA approach, a pixel-based comparison was performed between the ground truth and the classification map for each period (Figure 6e,f).A very accurate and temporally stable classification can be achieved by only using the TM, including features from L8 and WV2 (TM L8 + WV2).OA values of 93.0% and 93.3%, and kappa coefficients of 0.856 and 0.861, were attained for 2014 and 2015 datasets, respectively (Figure 5).
Figure 6 presents the classification process sequence by means of the TM for a detailed area and each year.The first step (Figure 6a,b for 2014 and 2015) was addressed to obtain the best segmentations from WV2 MS images.In the second step, the two ground truths were manually digitized on the WV2 orthoimages for each year (i.e., Figure 6c for 2013 and Figure 6d for 2015).The reader can make out that there are few changes between the land cover of both ground truths.After applying the TM through an OBIA approach, a pixel-based comparison was performed between the ground truth and the classification map for each period (Figure 6e,f).A very accurate and temporally stable classification can be achieved by only using the TM, including features from L8 and WV2 (TM L8 + WV2).OA values of 93.0% and 93.3%, and kappa coefficients of 0.856 and 0.861, were attained for 2014 and 2015 datasets, respectively (Figure 5).
Figure 6 presents the classification process sequence by means of the TM for a detailed area and each year.The first step (Figure 6a,b for 2014 and 2015) was addressed to obtain the best segmentations from WV2 MS images.In the second step, the two ground truths were manually digitized on the WV2 orthoimages for each year (i.e., Figure 6c for 2013 and Figure 6d for 2015).The reader can make out that there are few changes between the land cover of both ground truths.After applying the TM through an OBIA approach, a pixel-based comparison was performed between the ground truth and the classification map for each period (Figure 6e,f).

Discussion
The first and crucial step in the OBIA approach is the image segmentation.ED2 metric presented a very good relationship with the visual quality of the greenhouse segmentations.Moreover, this measure was clearly related to SP and Shape (values of Shape should be lower than 0.5), but had no

Discussion
The first and crucial step in the OBIA approach is the image segmentation.ED2 metric presented a very good relationship with the visual quality of the greenhouse segmentations.Moreover, this measure was clearly related to SP and Shape (values of Shape should be lower than 0.5), but had no relationship with Compactness.In fact, the weight of this last parameter has been generally set to a fix value of 0.5 [53,54].Although the attained segmentations based on ED2 were of high quality, further works are needed on this issue.Concretely, the optimal combination of bands, the use of VHR pan-sharpened images instead of MS ones and the possible introduction of elevation data should be investigated.
It is important to highlight that in Figure 3a,b, the accuracy was significantly improved by means of statistical seasonal features from the L8 time series.The seasonal features had already been used by Zillmann et al. [55] in grassland mapping from multi-sensor image time series.They reported that seasonal statistics should be more consistent over years than snapshot values attained from the original single images.
The texture features based on the Haralick's matrices tested in both WV2 single images were located at the final positions of the ranking of the importance of features (Section 5.3), being their contribution for the classification of plastic greenhouses very low.In this sense, Agüera et al. [3] reported that the inclusion of texture information in the per-pixel classification of greenhouses from VHR imagery did not improve the classification quality.More recently, Wu et al. [15], working with a single Landsat 8 pan-sharpened image within the context of an OBIA approach, found that texture information had very little influence on greenhouse detection.By contrast, the most important feature for greenhouse mapping based on single L8 images was the MDI.In fact, a feature specifically aimed at detecting plastic covering such as PLMI [17] showed much less relevance than MDI.
Figure 5 shows that the OA values were much better when the object-based accuracy assessment was used instead of the pixel-based ones.In that sense, the reader should bear in mind that the 3000 objects selected as ground truth were always meaningful segments.In other words, there were no mixed samples (objects) containing both greenhouse and non-greenhouse land covers.The misclassification problems due to errors in the segmentation stage were also avoided in the case of the object-based accuracy assessment.Moreover, the always very high OA values might undergo small changes as a result of the features selected by the DT model and the objects considered in the evaluation set (see OA in Figure 3a).Also, no measures were taken to test, reduce and minimize the effect of spatial autocorrelation in the datasets composed of 3000 objects.Thus, spatial autocorrelation may lead to inflated accuracy.
The TM developed in this work and shown in Figure 4 achieved very accurate and temporally stable classification results.In fact, these accuracy values resulted much more stable over the time than those reported by Lu et al. [17,18] working with time series from Landsat 5 TM and MODIS.The main source of error of the proposed TM was located at elongated objects situated between greenhouses where Non-GH objects were classified as GH (see objects in brown in Figure 6e,f).According to Wu et al. [15], mixed pixels are still commonly present in the pan-sharpened L8 data as a result of the heterogeneity of the landscape and the limitations imposed by the 15 m spatial resolution of the image.This fact is extremely important in the case of the outstretched objects placed between greenhouses.
As it was aforementioned, the use of seasonal statistics from L8 time series allowed us to improve the classification results based only on a single image.For example, the difference of SWIR1 band (DIF SWIR1) seems to be related to the fact that plastic sheets are usually painted white (whitewashed) during summer to protect plants from excessive radiation and to reduce heat inside the greenhouse [6].This unique whitewashing management provokes high reflectance values in the SWIR1 bands mainly in the months of July and August, and thus a high value of DIF SWIR1 in the time series (higher than 15 in the TM shown in Figure 4).Based on this fact, the already reported misclassification between white buildings and plastic greenhouses [6,56] can be avoided.In other words, white buildings can show high values of SWIR1 but not very variable over the time.Figure 6a,b shows two industrial white buildings marked with red letters (B), which are finally well classified thanks to the DIF SWIR1 threshold.Regarding the MDI object-based feature computed from both WV2 and L8 sensors, it turned out to be the most important feature related to plastic greenhouse detection.Moreover, it showed extreme robustness over time.All of the single L8 images presented an almost constant threshold value of MDI of around 4, while for WV2 single images the MDI cutting value was located between 0.5 and 0.55.Without any doubt, this has been one of the most important findings in this work.Moreover, the discriminatory power of the MDI was improved by using the MIN MDI in the case of L8 time series.
MDI was developed to take advantage of the information latent in the shape of the reflectance curve that is not available from other spectral indices [44].This spectral metric has been able to discriminate between GH and Non-GH from an OBIA approach.However, we have detected a few confusion problems mainly with building class.These misclassification errors can be solved using seasonal statistic features.Really, in this paper, we are only scratching the surface, so further investigations should be needed in order to understand the relationship between the MDI and plastic greenhouses.Furthermore, the threshold values of MDI or MIN MDI for this study were obtained from the same study area and should be further tested in other greenhouse areas and even using other sensors.

Conclusions
In this work, a new and accurate approach to address the problem of greenhouse mapping was presented.It consisted of the combined use of VHR satellite data (W2 Multispectral imagery in this case) and L8 OLI pan-sharpened time series within the context of an OBIA approach.Additionally, the classification results were assessed for two different years in order to test the desirable temporal stability.
The first stage was focused on achieving the optimal segmentation for individual greenhouses by using MRS algorithm on the single WV2 MS orthoimages.In this sense, ED2 metric presented a very good relationship with the visual quality of the greenhouse segmentations so allowing select the main MRS parameters (i.e., Scale, Shape and Compactness) for each WV2 orthoimage.
In the second step, the DT classifier was selected to determine the best object-based features for greenhouse mapping and estimate their more adequate threshold values.Many strategies were carried out involving different combinations of single images from L8 and WV2.L8 time series were also tested allowing use seasonal statistics, which significantly improved the accuracy attained through single L8 images.
Finally, a spectral metric feature, never before tested in greenhouse detection or in OBIA approaches, turned out to be the most important characteristic related to plastic greenhouse detection for both WV2 and L8 sensors.Moreover, it showed an extremely robustness over the time.In fact, a TM only comprising three features was proposed for the two considered years.The first two selected features, the minimum value of MDI (MIN MDI) and the difference between the maximum and minimum value of SWIR1 (DIF SWIR1), were derived from the L8 time series.The last one was the single value of MDI for the WV2 image.In this sense, it was proven that a very accurate and temporally stable classification of individual greenhouses can be achieved by only using the proposed TM in which the classifier training phase is not required.Very stable over the time OA values of 93.0% and 93.3% were attained for 2014 and 2015 datasets, respectively.
These findings are quite promising, though they should be further contrasted working on other greenhouse or plastic-mulched land cover locations and also testing others medium spatial resolution satellites such as the recently launched Sentinel-2.

Figure 1 .
Figure 1.Location of the study area on a RGB Landsat 8 image path 200 and row 34, covering the "Poniente" region.Coordinate system: ETRS89 UTM Zone 30°N.

Figure 1 .
Figure 1.Location of the study area on a RGB Landsat 8 image path 200 and row 34, covering the "Poniente" region.Coordinate system: ETRS89 UTM Zone 30 ˝N.

Figure 2 .
Figure 2. Manually digitized ground truths in ETRS89 UTM coordinate system: (a) Ground Truth for September 2013; and (b) Ground Truth for July 2015.

Figure 2 .
Figure 2. Manually digitized ground truths in ETRS89 UTM coordinate system: (a) Ground Truth for September 2013; and (b) Ground Truth for July 2015.

Figure 3 .
Figure 3. Accuracy assessment based on the 1500 segments belonging to the validation set for each considered year and strategy: (a) Overall Accuracy (OA); (b) Kappa coefficient (k); (c) Producer's accuracy for the Greenhouse class (PA_GH); and (d) User's accuracy for Greenhouse class (UA_GH).

Figure 3 .
Figure 3. Accuracy assessment based on the 1500 segments belonging to the validation set for each considered year and strategy: (a) Overall Accuracy (OA); (b) Kappa coefficient (k); (c) Producer's accuracy for the Greenhouse class (PA_GH); and (d) User's accuracy for Greenhouse class (UA_GH).

Figure 4 .
Figure 4. Threshold Model proposed for classifying greenhouse (GH) based on temporally stable features.

Figure 5 .
Figure 5. Overall accuracy achieved for each year through object-based (3000 segments and presenting confidence intervals), pixel-based (ground truth manually digitized) and Threshold Model using features from Landsat 8 and WV2 (TM L8 + WV2).

Figure 4 .
Figure 4. Threshold Model proposed for classifying greenhouse (GH) based on temporally stable features.

Figure 4 .
Figure 4. Threshold Model proposed for classifying greenhouse (GH) based on temporally stable features.

Figure 5 .
Figure 5. Overall accuracy achieved for each year through object-based (3000 segments and presenting confidence intervals), pixel-based (ground truth manually digitized) and Threshold Model using features from Landsat 8 and WV2 (TM L8 + WV2).

5 .
Overall accuracy achieved for each year through object-based (3000 segments and presenting confidence intervals), pixel-based (ground truth manually digitized) and Threshold Model using features from Landsat 8 and WV2 (TM L8 + WV2).

Figure 6 .
Figure 6.Pixel-based accuracy assessment process attained through the threshold model for a detailed area of 950 m × 900 m for each year: (a) segmentation on WV2 MS 2013 orthoimage.B letters indicate white buildings; (b) segmentation on WV2 MS 2015 orthoimage.B letters indicate white buildings; (c) manually digitized ground truth for 2013 (used for 2014 TS) with Greenhouses in red and Non-Greenhouses in green; (d) manually digitized ground truth for 2015 with Greenhouses in red and Non-Greenhouses in green; (e) classification results using the TM approach for 2013-2014 with errors in brown; and (f) classification results using the TM approach for 2015 errors in brown.

Figure 6 .
Figure 6.Pixel-based accuracy assessment process attained through the threshold model for a detailed area of 950 m ˆ900 m for each year: (a) segmentation on WV2 MS 2013 orthoimage.B letters indicate white buildings; (b) segmentation on WV2 MS 2015 orthoimage.B letters indicate white buildings; (c) manually digitized ground truth for 2013 (used for 2014 TS) with Greenhouses in red and Non-Greenhouses in green; (d) manually digitized ground truth for 2015 with Greenhouses in red and Non-Greenhouses in green; (e) classification results using the TM approach for 2013-2014 with errors in brown; and (f) classification results using the TM approach for 2015 with errors in brown.

Table 1 .
‚2015 WV2: A red square depicts the accuracy assessment attained from the WV2 image taken in July 2015 using all features shown in Table1.‚2014 TS: A horizontal blue dashed line depicts the results from the complete L8 2014 time series.In this case, only the statistical seasonal features for the 2014 L8 time series were considered.
‚2015 L8 SI: Single L8 images from 2015 time series (10 images).A red solid circle represents each accuracy value for each single image from the 2015 time series.In this case, all the features for L8 shown in Table2were applied on every image.‚2013 WV2: A blue square depicts the accuracy assessment attained from the WV2 image taken in September 2013 using all features shown in

Table 3 .
The top 10 features for WV2 and L8 single images ranked based on importance.Features extracted from WV2 are shown in italics.

Table 4
depicts the most important features for L8 time series when only statistical seasonal features were considered.Three object-based features (MIN MDI, AVG.DIF 8 and DIF SWIR1) stand out over the rest.Regarding the information added by WV2 single images to the L8 time series, only two features (Mean Coastal and Mean Blue) slipped into the top 10 (Table4).

Table 4 .
The top 10 features for L8 time series considering only statistical seasonal features without and with the WV2 information ranked based on importance.Features from WV2 are shown in italics.