The Doñana temporary marshes are one of the largest protected wetlands in Europe, with an extent of 340 km², of which 300 km² are included in the Doñana National Park. With a typical Mediterranean climate, the flooding cycle starts in September and usually reaches maximum flooding levels at the end of boreal winter, mainly driven by rainfall [1
]. The dominant clay and silt substrates of Doñana marshes are soaked with the first showers and a shallow water layer (0.51 m average water column) spreads over the floodable area (Figure 1
, solid blue area). In late spring, evaporation becomes the most important factor in the hydrological balance and the marshes dry up almost completely by the end of July. During summer, groundwater plays an important role in maintaining certain humidity levels at seepages [2
Under natural conditions, the interannual variability of continental wetlands is driven by meteorological conditions (rainfall or water table oscillations). However, throughout the last century the Doñana marshes have experienced continuous and severe transformation and degradation as a result of human intervention (channelling of river arms, river course dredging, wetland drainage for farming and aquaculture, widespread groundwater extraction, pollution spills, waterway shipping, etc.). The Doñana marshes have undergone not only a drastic reduction of their original extent (around 1500 km²
]) but also a permanent threat to the protected area, due to the progressive reduction of both water supply and quality. Overall, changes in the original marsh area at the beginning of the 20th century have led to the current 270 km² occupied by shallow waters (with an estimated loss of 80% from its original extent). In addition to wetland area reduction, tidal influence, one of the main flooding drivers, was also controlled in 1984 by the construction of a dike along the right bank of the Guadalquivir River inside Doñana (Figure 1
). Amid such dramatic changes, a massive toxic spill into the Guadiamar River (the major tributary) in April 1998 led the scientific community and policy-makers to set up and implement an ambitious restoration program, named ‘Doñana 2005’, aimed at recovering the quantity and quality of the tributary waters entering these vast wetlands [1]. The project is still ongoing, and the success of the restoration actions is being assessed.
Now more than ever we need to know the historical flooding patterns, both spatial and temporal, and their relationships with natural climatic variability or anthropic modifications (land use changes and water demand). Details on the flooding process, such as interannual and seasonal variations, as well as the influence of human transformations, have long been demanded by decision-makers in order to apply scientifically based management to the Doñana marshes. So far, hydrological management has been conducted on an “event–reaction” basis, leading to temporary solutions that afterwards became part of new problems [5
]. Hydroperiod, the length of time each point remains flooded along an annual cycle [6
], is a critical ecological parameter that shapes aquatic plants’ and animals’ distribution and determines available habitat for many of the living organisms in the marshes [7
]. Annual hydroperiod is also the ultimate result of flooding dynamics due to human activities and global trends.
Thus, the sustainable management of the Doñana wetlands requires a better understanding of the structure and dynamics of the flooding process. This knowledge can be achieved through systematic long-term monitoring, although access limitations may make the cost of direct monitoring prohibitive. Remote sensing is cost-effective in terms of the extent surveyed and has the potential to scale up ground-based surveys and studies [9
] by repeatedly imaging large areas. Many methods are available to represent accurately on-the-ground information and phenomena through collected imagery. One of the main advantages of remote sensing information sources comes from the possibility to access, for any place in the world, historical scenes revealing past land uses and cover.
Optical images have been widely proved to be an excellent tool to discriminate continental water bodies [10
] from other targets, or to monitor inland water characteristics [12
], but their use is less frequent for shallow wetlands covered by aquatic plants [17
]. Many previous studies have used optical images to reveal and assess the damage caused by catastrophic floods as well [19
]. Low reflectance in the Short Wave Infrared (SWIR) region (1.4–3 µm) is characteristically related to the presence of water [21
]. However, several water characteristics such as depth and turbidity, as physical constraints, reduce the ability to discriminate accurately, in a simple way, shallow water’s extent [19
]. In addition, shallow and clean waters make the bottom visible, which increases radiometric signal from the soil. Turbid waters usually show reflectance peaks between 700 and 900 nm, related to the presence of phytoplankton and/or suspended sediments [22
]. Moreover, the presence of aquatic plants contributes to the noise when delineating wetlands due to their heterogeneous distribution. These constraints enhance the challenge of systematically producing flood maps for shallow wetlands such as the Doñana marshes.
Alternatively, many studies have relied on microwave imagery to map flood levels [24
]. Radar sensors have several advantages over optical and thermal ones that make them particularly applicable to monitoring hydrologic conditions of wetlands. Radar imaging operates independent of cloud cover and solar illumination, and microwaves are very sensitive to the moisture content of the elements being imaged. However, wetlands usually present complex vegetation cover and structure patterns and differences in surface roughness that affect the way the microwave radiation interacts with the target and ultimately how much energy is returned to the satellite [28
]. In addition, roughness caused by aquatic plants or wind on the water surface in shallow marshes may contribute to backscattering signal [29
] according to the SAR band used. Despite such limitations, radar is being widely used for continental wetlands delineation in many studies [30
Long time series of images have been stored and made available for several remote sensing sensors, allowing researchers to revisit any place in a former time. Moreover, for Landsat platforms, different sensors have been acquiring comparable images since 1972 [32
]. This valuable spatial and temporal information is enhanced by spectral resolution, which makes Landsat remote sensing images a very useful tool for land cover change mapping [33
]. In the case of large study areas, the cost-benefit rate rises when dealing with long-term monitoring programs and land use change analysis [34
In this work we assess the performance of different digital image analysis techniques with Landsat images seeking to accurately determine flooded areas in the Doñana marshes. Different empirical modeling techniques and supervised classification methods were evaluated. Method selection was based on ground-truth assessment of different flood classes. Once the optimal method was selected, our main goal was to implement a semiautomatic procedure to produce systematically reliable and comparable flood maps of the Doñana marshes for the last 40 years [36
]. The selected method should be consistent enough to be applied over a radiometrically normalized long time series of Landsat images. The resulting flooding masks allowed us to reconstruct historical flooding patterns and temporal trends for every single 30-m pixel in the Doñana marshes. Systematic flood mapping has enabled the reconstruction of an annual hydroperiod for every flooding cycle. This critical and ecologically meaningful variable is being used to assess the effects of global change processes and human management on the Doñana marshes.
2. Material and Methods
2.1. Study Area
The study area is located at the estuary of the Guadalquivir River (37°0′N 6°37′W), a large and shallow floodplain in southwestern Spain (Figure 1). The Doñana marshes are a vast seasonal freshwater wetland of international importance, considered Western Europe’s largest sanctuary for migratory birds. They were declared a national park in 1969, a Biosphere Reserve in 1980, an Important Wetland Site under the Ramsar Convention in 1982, and a Natural World Heritage Site in 1994 [1
]. Doñana is also recognized as a Long-Term Socio-Ecological Research (LTSER) platform integrated in the LTER-Europe network [37] by applying harmonized protocols for long-term socio-ecological research.
The climate of the area is subhumid Mediterranean with Atlantic influence. The mean annual temperature is 17 °C, the coldest month is January with 10 °C, and the warmest month is July with 24 °C. The average annual rainfall is 550 mm, with dramatic interannual variation, going from 170 mm (2004–2005) to 1000 mm (1995–1996). Rainfall concentrates between October and April, whereas the dry season occurs from May to September [38
]. The variable winter rainfall floods the Doñana marshes, forming a wide floodplain that dries up during summer. Flooded areas are variable in depth, turbidity, and vegetation cover, being driven mainly by the amount and seasonal pattern of rainfall. In dry years, flooding occurs only in the sparsely distributed depressions (lucios) and historical tidal courses (caños). Maximum water volume may reach 265 hm³ during exceptionally wet flooding cycles such as the 1995–1996 one, reaching 2.47 m of water column at the deepest points, although the average annual maximum depth is about 0.51 m for the marshes as a whole. Later in the season, the evaporation rate, which rises from 2 mm/day in January to 10 mm/day in July, controls the persistence of the water bodies [39
2.2. Satellite Imagery
We acquired all available cloud-free Landsat MSS, TM and ETM+ scenes for Doñana National Park (Path: 202, Row: 34). Altogether, 462 images covering the 1974–2014 period form the time series for this study (70 MSS, 262 TM and 130 ETM+). Of these, 71 scenes were discarded due to various unexpected acquisition problems such as missing lines and columns, radiometric inconsistencies, line shifts, and cloud cover over the study area. The minimum number of scenes per year was three (1980), while the maximum was 27 (2009); the average was 10 valid scenes per year. The final number of valid scenes was 391, evenly distributed along the flooding cycle (December to May, 45% of total available scenes).
2.3. Image Processing
In order to make the time series comparable, a semi-automatic robust and coherent pre-processing protocol was applied. The complete procedure included metadata retrieval from raw 1G data images, geometric and atmospheric corrections, followed by time series radiometric normalization using ordinary least squares regression towards a reference cloud-free and clear atmosphere scene (18 July 2002 for Landsat ETM+ and TM images and 28 August 1985 for Landsat MSS images) using pseudo-invariant areas [40
The panchromatic band (0.52–0.90 µm) of a Landsat 7 ETM+ image from 18 July 2002 was georeferenced with 100 ground control points (GCP) using the 1998–1999 1:60,000 aerial orthophoto as reference [42
]. The rest of the TM and ETM+ images were automatically co-registered to this reference image with RMSE < 1 pixel. The resampling method was cubic convolution in order to improve spatial coherence of final classification maps [43
]. Average RMSE calculated with an independent set of GCP on co-registered images was 17.5 m. For MSS images we followed the same approach, using as reference a Landsat 4 summer scene acquired in 1985 by both sensors, MSS and TM, aboard the same platform. MSS output pixel size was set to 60 m × 60 m from the nominal 79 m × 57 m. RMSE for MSS co-registration was independently estimated at 23 m. For Landsat 7 ETM+ SLC-off images acquired after 31 May 2003 we applied gap filling based on the segmentation model approach [44
]. Such a method does not use external data to interpolate gaps; instead, it relies on segments derived from the original image elements. Using a similar method, Chen et al. [45] obtained land cover classification accuracies similar to those achieved with gap-free scenes.
Images were then atmospherically corrected and transformed into reflectance values using the Pons and Solé-Sugrañes [46
] method based on the dark object model [47
]. No topographic correction was applied since in the study area relief is negligible (the study area range being less than 10 m a.s.l.).
Finally, all the images were radiometrically normalized to allow comparisons across different sensors, dates, and atmospheric conditions by using a set of pseudo-invariant areas [40
]. Pseudo-invariant areas comprised over 60,000 pixels inside the scene, covering the full reflectance range, and selected from eight area types: deep sea, reservoirs, sand dunes, rocky outcrops, bare soils, airport runways, urban areas, and open mines [22
]. They were confirmed to show little seasonal change in reflectance and to be present from the beginning of the image time series. The method includes a cloud masking process based on empirical slicing of the Landsat thermal and red bands. Accordingly, pseudo-invariant pixels were rejected whenever covered by clouds. We used the same Landsat 7 ETM+ image as the radiometric reference image, which provided a high sun angle and overall good radiometric quality. For every combination of scene and band, we fitted a linear regression model using all pseudo-invariant pixels. Pixels with regression residuals greater than one standard deviation were assumed to have changed significantly in reflectance between the scenes and were therefore rejected. Once these pixels were removed, a new regression model was fitted. Offsets and gains from the fitted regression models were used to normalize the time series of images to the reference one (band-by-band processing). More than 97% of the scene bands were fitted with R² values greater than 0.90. As expected, offset values were consistently lower for winter scenes. Certain bands of up to 18 scenes could not be normalized due to radiometric inconsistencies and were discarded.
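The two-pass regression described above can be illustrated with a minimal Python sketch. It is not the processing chain used in the study: the function names `fit_normalization` and `normalize_band` are hypothetical, NumPy is assumed to be available, and the inputs are assumed to be reflectance values extracted at the same cloud-free pseudo-invariant pixels in the scene to be normalized and in the reference scene.

```python
import numpy as np

def fit_normalization(subject, reference, n_sd=1.0):
    """Fit gain and offset mapping one band of a scene onto the reference scene.

    subject, reference: 1-D reflectance arrays sampled at the same
    pseudo-invariant pixels (hypothetical inputs for this sketch).
    """
    subject = np.asarray(subject, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # First pass: ordinary least squares on all pseudo-invariant pixels.
    gain, offset = np.polyfit(subject, reference, 1)
    residuals = reference - (gain * subject + offset)
    # Reject pixels whose residual exceeds n_sd standard deviations.
    keep = np.abs(residuals) <= n_sd * residuals.std()
    # Second pass: refit the model on the retained pixels only.
    gain, offset = np.polyfit(subject[keep], reference[keep], 1)
    fitted = gain * subject[keep] + offset
    ss_res = np.sum((reference[keep] - fitted) ** 2)
    ss_tot = np.sum((reference[keep] - reference[keep].mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return gain, offset, r2

def normalize_band(band, gain, offset):
    """Apply the fitted gain and offset to a whole band."""
    return gain * np.asarray(band, dtype=float) + offset
```

In such a sketch the fit would be repeated for every scene and band, and the returned R² could be used to flag bands that cannot be normalized reliably.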
2.4. Ground Truthing
The Doñana marshes are a highly dynamic seasonal wetland. They may appear as a dry cracked clay soil in summer, bare water pools during the flooding season, or pools densely covered by emergent and floating plants. In addition, the marshes usually exhibit patchy turbidity patterns due to differences in suspended sediment concentrations (Figure 2
). In order to capture such variability, we carried out 31 extensive ground-truth field campaigns coincident with 29 different Landsat overpasses (17 TM and 12 ETM+) from 2003 to 2013.
Ground-truth transects were designed to cross heterogeneous flooded areas and were sampled simultaneously with or shortly after a satellite overpass. Sampling points were taken every 60 m in homogeneous covers, collecting field information within a radius of 15 m, representative of the 30 m × 30 m Landsat TM and ETM+ pixel size. In order to capture the complete flooding gradient, we recorded four different flooding classes (Figure 2
) in relation to the percent of soil covered by water: dry soil (0%), wet (1%–25%), waterlogged (25%–75%), and flooded (>75%). Additionally, we recorded ancillary information on many other relevant variables such as water turbidity, water depth, percent of bare ground, plant and open water cover, and plant species abundance and dominance (Table 1
). Such complementary information might help in assessing the discriminative ability of flooded pixels under different conditions. Geolocation of every point was recorded by means of PDA-GPS units with less than 4 m horizontal position error on average.
A total of 6005 field sampling points were collected by transects on foot and on horseback—and, in certain flood occasions, by canoe—to extend the ground-truth dataset. Ground-truth data were used to train (80% of the sample) and validate (20%) the classification procedures.
2.5. Flood Mapping and Accuracy Assessment
Clear water bodies have a very low reflectance in the optical spectrum, especially in the Near and Mid Infrared bands (bands 4, 5, and 7 of TM and ETM+). Reflectance in the visible and IR regions depends on the reflectance of the submerged soil, the water depth, the amount of suspended particles, and their optical properties [50
]. The abundance of optically active components, such as phytoplankton, suspended minerals, and dissolved organic carbon directly affect water turbidity and colour [52
]. The more turbid water bodies have higher reflectance values in the green and red visible bands [22
Several procedures have been proposed to identify shallow inundated areas based on the low reflectance values of water bodies in the SWIR (Short Wave InfraRed) region. We tested various optical indices proposed in the literature ([53
]) and several empirical classification approaches (Maximum Likelihood, Discriminant Function Analysis, and Logistic Regression using all six Landsat optical bands, and Classification Trees [18
] using Landsat TM and ETM+ band 5) to automatically determine the flooding level in Landsat scenes. Kyu-Shun et al. [56
], working in shallow wetlands with different turbidity levels, showed that Landsat TM band 5 (1.55–1.75 µm) is less sensitive to sediment-charged waters and therefore the best at delineating the limits between water and dry ground in turbid waters. Similar results were found by Bustamante et al. [57
], showing that a classification tree on band 5 of Landsat TM and ETM+ sensors performed better when discriminating the Doñana shallow flooding levels.
As our study is focused on flood and hydroperiod mapping, we generated a simple binary flooding mask from the four flooding classes by merging the dry and wet classes into a non-flooded class and the waterlogged and flooded classes into a flooded class.
We tested the different classifiers on a set of eight Landsat TM and ETM+ scenes (between 2004 and 2007) with coincident ground-truth data identifying flooded and non-flooded classes. The set of scenes was selected in order to use a balanced sample of ground-truth points (flooded vs. non-flooded) and to work with exact matching between acquisition dates and sampling dates (see Table S1
for acquisition dates and ground-truth data). We built up classification trees with every band 5 of the training images by using a subset of ground-truth data as training sample (80% of points). For every scene we obtained different thresholds to provide the two classes: flooded and non-flooded. Classification trees on band 5 for every training date provided slightly different threshold values. Classification performance was assessed with coincident ground-truth observations for every single scene. Predictive accuracy was assessed using a random set from 20% of all sampling points (183 samples) and by computing the Cohen’s Kappa statistic [58
] in order to measure the degree of agreement between classified and observed data and to test whether the prediction exceeded one that could be produced by chance alone (where k = 1 means perfect agreement and k = 0 no more agreement than by chance alone). Additionally, confusion matrices were calculated to retrieve Global Accuracy Values for every single classification tree, as an additional measure of the total accuracy of the classified maps [59
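As an illustration of the two agreement measures mentioned above, the following Python sketch computes a confusion matrix, the global accuracy, and Cohen's Kappa for a binary flooded/non-flooded classification. The function names are hypothetical and the labels are assumed to be coded as integers (0 = non-flooded, 1 = flooded); this is a sketch of the standard formulas, not the software used in the study.

```python
import numpy as np

def confusion_matrix(observed, predicted, n_classes=2):
    """Rows are observed classes, columns are predicted classes."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    for o, p in zip(observed, predicted):
        m[o, p] += 1
    return m

def global_accuracy(m):
    """Proportion of validation points on the matrix diagonal."""
    return np.trace(m) / m.sum()

def cohen_kappa(m):
    """Agreement corrected for the agreement expected by chance alone."""
    total = m.sum()
    p_observed = np.trace(m) / total
    # Chance agreement from the products of row and column marginals.
    p_expected = np.sum(m.sum(axis=0) * m.sum(axis=1)) / total**2
    return (p_observed - p_expected) / (1.0 - p_expected)
```

With balanced classes, as sought in the scene selection, chance agreement is close to 0.5, so Kappa is noticeably lower than the global accuracy for the same matrix.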
However, in order to select the best threshold for the whole time series, we carried out a global accuracy assessment by following a Jackknife inter-scene cross-validation approach [18
] (Figure 3
). The proposed threshold should be consistent across years, sensors, illumination angles, and land cover changes. The Jackknife procedure estimates the classification agreement for each subsample of images, omitting the ith observation, to estimate the previously unknown agreement value. This inter-scene cross-validation consisted of three steps:
(1) Computing classification trees using the whole set of training images except one, which was held out each time as the validation image.
(2) Computing the classification tree using all the training images.
(3) Evaluating the accuracy of the threshold obtained from the classification tree built with all training images over the validation image, using the ground-truth data for the corresponding date.
Once the optimal threshold was selected we applied it to the whole time series of images. Finally, as our study area corresponds to the Doñana marshes, we built up a background mask with the limits of the marshes, excluding the surrounding area where a network of water ponds may appear as flooded. Such a binary mask was systematically applied over every resulting flooding mask.
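One possible reading of the leave-one-scene-out part of this cross-validation is sketched below in Python. Several points are assumptions made for illustration only: a classification tree on a single band is approximated by a depth-1 split chosen to maximize training accuracy, pixels at or below the threshold are labelled as flooded (consistent with the low SWIR reflectance of water), and the function names and the `scenes` dictionary layout are hypothetical. The sketch also assumes at least two distinct band 5 values in each training set.

```python
import numpy as np

def best_threshold(band5, flooded):
    """Depth-1 split on band 5: reflectance <= threshold means flooded."""
    band5 = np.asarray(band5, dtype=float)
    flooded = np.asarray(flooded, dtype=bool)
    values = np.unique(band5)
    # Candidate cut points are midpoints between consecutive observed values.
    candidates = (values[:-1] + values[1:]) / 2.0
    accuracies = [np.mean((band5 <= t) == flooded) for t in candidates]
    return float(candidates[int(np.argmax(accuracies))])

def leave_one_scene_out(scenes):
    """scenes: dict of scene id -> (band 5 values, flooded labels)."""
    results = {}
    for held_out in scenes:
        # Train on every scene except the held-out one.
        train = [scenes[s] for s in scenes if s != held_out]
        x = np.concatenate([b for b, _ in train])
        y = np.concatenate([f for _, f in train])
        threshold = best_threshold(x, y)
        # Validate on the ground truth of the held-out scene.
        xv, yv = scenes[held_out]
        predicted = np.asarray(xv, dtype=float) <= threshold
        accuracy = float(np.mean(predicted == np.asarray(yv, dtype=bool)))
        results[held_out] = (threshold, accuracy)
    return results
```

A threshold whose held-out accuracy stays stable across scenes is the kind of consistency across years, sensors, and illumination angles that the selection step looks for.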
2.6. Hydroperiod Mapping and Trend Analysis
Once the best flood mapping method was identified, we applied a simple method to retrieve the hydroperiod value (H) per pixel for every flooding cycle (Hc). We computed Hc according to the following equation:
stands for Day of Cycle (flooding cycle is set to 1 on the 1st of September of the previous year and to 365 on 31st of August of the year that follows). For every flooding cycle, the ith image corresponds to the first date the pixel was classified as flooded, while the n image is the last date when the pixel was labelled as flooded. The n value may eventually equal the total available number of flooding masks per cycle when a pixel is classified as flooded at every flooding mask for the complete flooding cycle. The procedure assumes constant flooding for a pixel classified as flooded between two consecutive available scenes. We calculated two different indexes in order to assess the representativeness of Hc for inter-cycle comparisons:
A low number of flood masks per flooding cycle produced a systematic bias in permanent inundated areas like the ocean or the river. In order to remove such bias we applied a linear stretch to the Hc estimate multiplying Hci by 365 and dividing by Hcmax, with Hci the value of the pixel and Hcmax the maximum value of Hc obtained for permanent waters like the sea. Such post-processing eliminates this systematic bias for the initial flooding cycles in the time series with a low number of flood masks.
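The per-pixel accumulation and the linear stretch can be sketched in Python as follows. This is not the study's equation, only an illustration under stated assumptions: the interval between two consecutive scenes is counted as flooded when the pixel is flooded in both scenes (one way of reading the constant-flooding assumption), and the names `hydroperiod`, `stretch`, and `permanent_water` are hypothetical.

```python
import numpy as np

def hydroperiod(masks, doc):
    """Accumulate flooded days per pixel for one flooding cycle.

    masks: (n_scenes, rows, cols) boolean flood masks sorted by date.
    doc: day of cycle of each scene (1 = 1 September, 365 = 31 August).
    """
    masks = np.asarray(masks, dtype=bool)
    doc = np.asarray(doc, dtype=float)
    # Days elapsed between consecutive scenes.
    gaps = np.diff(doc)
    # Assumption: an interval counts as flooded when the pixel is flooded
    # in both of the scenes that bound it.
    both = masks[:-1] & masks[1:]
    # Weighted sum of flooded intervals along the scene axis.
    return np.tensordot(gaps, both.astype(float), axes=(0, 0))

def stretch(h, permanent_water):
    """Rescale so that the maximum over permanent waters equals 365 days."""
    h_max = h[permanent_water].max()
    return h * 365.0 / h_max
```

With only a few scenes in a cycle, the unstretched sum for a permanently flooded pixel stays well below 365 days, which is the bias the linear stretch is meant to remove.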
Final hydroperiod maps indicate the number of days a pixel was detected as flooded (from 0 to 365) during the flooding cycle. Independent accuracy assessment of hydroperiod estimates by remote sensing was carried out using historical readings of water column values from a network of permanent limnimetric scales distributed across the Doñana marshes (see Figure S1
containing a location map of the scales in the marsh together with an example of a full flooding cycle reading values). Fixed limnimetric scales are periodically visited and the water column recorded by national park rangers. We selected the readings of the N28 limnimetric scale (Figure 4
) since it has been agreed to be the most representative of the flooding levels for the whole marshland. In addition, N28 is the one with the longest continuous record of readings (1394 readings from 1995 to 2009). From such data, the ground-truth hydroperiod was calculated by subtracting the first observed flooding day (water column reading > 0 cm) from the last flooding day (water column reading = 0 cm).
Time series of annual hydroperiod maps were used to calculate the Theil–Sen median trend, which uses a robust non-parametric trend operator highly recommended for assessing the rate of change in short or noisy time series [63
]. Theil–Sen analysis calculates the slope of the trend as the median of the slopes between every pairwise combination of cycles. An interesting feature of the median trend is its breakdown bound. The breakdown bound of a robust statistic is the proportion of wild values that can occur within a series before the statistic is affected. For the median trend, the breakdown bound is approximately 29%. Thus, the trends expressed in the image must have persisted for more than 29% of the length of the series in flooding cycles (i.e., more than nine years for Doñana marshes).
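For a single pixel, the median trend can be illustrated with the short Python sketch below. The function name `theil_sen_slope` is hypothetical, and cycles are assumed to be equally spaced unless explicit times are given; it shows the general estimator rather than the software used for the maps.

```python
import numpy as np

def theil_sen_slope(values, times=None):
    """Median of the slopes between every pair of observations."""
    values = np.asarray(values, dtype=float)
    if times is None:
        # Assume one value per consecutive flooding cycle.
        times = np.arange(values.size, dtype=float)
    else:
        times = np.asarray(times, dtype=float)
    # Index pairs (i, j) with i < j.
    i, j = np.triu_indices(values.size, k=1)
    slopes = (values[j] - values[i]) / (times[j] - times[i])
    return float(np.median(slopes))
```

Because the estimate is a median, a single anomalous cycle (for example an exceptionally wet year) changes only a minority of the pairwise slopes and leaves the trend unaffected. SciPy's `scipy.stats.theilslopes` offers a comparable estimator with confidence bounds.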
Finally, in order to assess significant interannual changes, we computed hydroperiod anomalies for every flooding cycle as the difference between Hc and the mean hydroperiod of the whole time series [64]. For comparison purposes, MSS flooding masks were resampled to 30 m pixel size.
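Both operations can be sketched in a few lines of Python. The sketch assumes that the annual hydroperiod maps are stacked as an array of shape (cycles, rows, columns) and that the 60 m MSS grid is aligned with the 30 m grid, so that each MSS pixel maps onto a 2 × 2 block; the function names are hypothetical and the resampling actually used in the study may differ.

```python
import numpy as np

def hydroperiod_anomalies(stack):
    """Difference between each cycle and the mean of the whole series."""
    stack = np.asarray(stack, dtype=float)
    # Per-pixel mean hydroperiod across all flooding cycles.
    mean = stack.mean(axis=0)
    return stack - mean

def resample_60m_to_30m(mask):
    """Nearest-neighbour replication of each 60 m pixel into 2 x 2 pixels."""
    mask = np.asarray(mask)
    return np.repeat(np.repeat(mask, 2, axis=0), 2, axis=1)
```

Positive anomalies mark cycles in which a pixel stayed flooded longer than its long-term mean, and negative anomalies mark shorter-than-average flooding.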