Checking the Consistency of Volunteered Phenological Observations While Analysing Their Synchrony

Mehdipoor, Hamed; Zurita-Milla, Raul; Augustijn, Ellen-Wien; Van Vliet, Arnold J. H.

doi:10.3390/ijgi7120487

Open AccessArticle

Checking the Consistency of Volunteered Phenological Observations While Analysing Their Synchrony

by

Hamed Mehdipoor

^1,*

,

Raul Zurita-Milla

¹

,

Ellen-Wien Augustijn

¹ and

Arnold J. H. Van Vliet

²

¹

Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands

²

Environmental Systems Analysis Group, Wageningen University, P.O. Box 47, 6700 AA Wageningen, The Netherlands

^*

Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2018, 7(12), 487; https://doi.org/10.3390/ijgi7120487

Submission received: 21 October 2018 / Revised: 9 December 2018 / Accepted: 13 December 2018 / Published: 19 December 2018

(This article belongs to the Special Issue Volunteered Geographic Information: Analysis, Integration, Vision, Engagement (VGI-ALIVE))

Download

Browse Figures

Versions Notes

Abstract

The increasing availability of volunteered geographic information (VGI) enables novel studies in many scientific domains. However, inconsistent VGI can negatively affect these studies. This paper describes a workflow that checks the consistency of Volunteered Phenological Observations (VPOs) while considering the synchrony of observations (i.e., the temporal dispersion of a phenological event). The geographic coordinates, day of the year (DOY) of the observed event, and the accumulation of daily temperature until that DOY were used to: (1) spatially group VPOs by connecting observations that are near to each other, (2) define consistency constraints, (3) check the consistency of VPOs by evaluating the defined constraints, and (4) optimize the constraints by analysing the effect of inconsistent VPOs on the synchrony models derived from the observations. This workflow was tested using VPOs collected in the Netherlands during the period 2003–2015. We found that the average percentage of inconsistent observations was low to moderate (ranging from 1% for wood anemone and pedunculate oak to 15% for cow parsley species). This indicates that volunteers provide reliable phenological information. We also found a significant correlation between the standard deviation of DOY of the observed events and the accumulation of daily temperature (with correlation coefficients ranging from 0.78 for lesser celandine, and 0.60 for pedunculate oak). This confirmed that colder days in late winter and early spring lead to synchronous flowering and leafing onsets. Our results highlighted the potential of synchrony information and geographical context for checking the consistency of phenological VGI. Other domains using VGI can adapt this geocomputational workflow to check the consistency of their data, and hence the robustness of their analyses.

Keywords:

volunteered geographical information; citizen science; climate change; phenology; synchrony; geocomputation; data quality; constraint satisfaction

Graphical Abstract

1. Introduction

1.1. VGI Attribute Consistency

Because of the progress in information, communication and mobile location-aware technologies, the use of volunteered geographic information (VGI) has dramatically increased in recent times [1,2]. VGI is useful since it permits citizens to act as human sensors and to, for instance, contribute to environmental research. However, the quality of the information provided by volunteers remains a concern [3,4]. In VGI, contributors produce geographic information without necessarily being experts in the study domain. Furthermore, volunteers have different levels of expertise in recognizing geographical phenomena [5]. As a result, the quality of VGI has always been a concern [6,7]. An essential VGI concern is information consistency [8].

VGI consistency refers to the degree of adherence to logical rules of data structure, attribution and relationships as described in the information specifications [4,9,10]. Attribute consistency refers to the consistency of attributes associated with specific locations, objects or events placed in geographic space, such as time (e.g., date). Traditionally, VGI attributed consistency is assessed using a direct comparison with reference data collected with defined quality standards [4,6,11]. For instance, Jokar Arsanjani et al. [7] investigated the attribute consistency of OpenStreetMap (www.openstreetmap.org) project data using the Corine Land Cover database and the pan-European Global Monitoring for Environment and Security Urban Atlas (GMESUA) dataset as the authoritative reference.

Yet, for time-critical domains such as disaster response [7] and nature observations, reference datasets are not always available [7]. An alternative approach consists of evaluating the available data along three intrinsic dimensions [12]: (1) crowdsourcing evaluation, (2) social measures and (3) geographic context. In crowdsourcing evaluation, VGI is manually checked and edited by multiple contributors to ensure the consistency. Social measures focus on the assessment of the contributors themselves as a proxy measure to the quality of their contributions. For example, the number of contributors [13] and positive and negative edits [14] are used as a measure of VGI consistency. Although both dimensions contribute to checking VGI consistency, they only rely on information about VGI contributors and disregard the geographic context in which the data was collected. The third dimension relates to Tobler’s first law of geography, which states that “all things are related, but nearby things are more related than distant things” [15].

The geographic context approach allows considering spatial aspects in the consistency checks. For example, Schlieder and Yanenko [10] built a framework to determine the probable spatial inconsistencies in social reporting and Bonter and Cooper [16] tested the use of a smart filter system that uses the geographic location of species observations to identify counts of species that are too high or species that do not frequently appear on standard lists. Existing approaches to check consistency using the geographic context often use the geographic distance between VGI as the main and only parameter to include in the checks. Yet, contextual geoinformation are available now more than ever before, and they can be used to improve the consistency checks. For instance, checking VGI that is collected for ecological studies, such as phenology, can benefit from existing environment contextual geoinformation such as weather datasets.

1.2. Synchrony of Phenological VGI

Phenology is the study of periodic plant and animal life cycle events (phenophase) and how seasonal and inter-annual variations in weather conditions affect them [17,18,19]. Phenological studies benefit from VGI developments to acquire observations that support climate change studies at various scales [20,21,22]. National and regional phenological networks promote phenological research and curate ever-increasing collections of Volunteered Phenological Observations (VPOs) collected by large crowds of volunteers [1,2]. VPOs are available at fine spatial scales and provide timely and low-cost information [23,24,25]. Hence, VPOs support novel phenological studies that study the impact of climate change on plants and animals [26].

The analysis of the temporal dispersion of a phenophase across individuals of the same species, or phenological synchrony, could provide information about how context derives the timing of phenophase. Changes in phenological synchrony have ecological consequences for individual survival and ecosystem stability [27,28,29]. For example, low synchrony in plant flowering can hamper the expected random mating pattern because early bloomers will likely get pollinated by early plants, and late plants by late plants [30]. In regions with a marked seasonality, synchrony is strongly controlled by temperature variability [21,31]. Phenological synchrony is typically quantified by the standard deviation of Day of Year (DOY) of all the observations collected in a given area and year [21,32,33,34]. This quantification method is sensitive to inconsistent VPOs [34,35]. Although VPOs support climate change studies such as synchrony analysis, the consistency of the observations provided by volunteers remains a concern.

Synchrony analysis is sensitive to anomalously early or late VPOs regarding their associated environmental conditions [35,36]. These so-called inconsistent VPOs can bias the results of trend analysis and modelling because they are not representative and decrease the ratio of signal to noise [25,37]. Inconsistent VPOs are expected because volunteers may perform observations at locations with environmental conditions that are not representative of the phenophases being monitored (e.g., they might report data for an individual plant growing under a unique micro-climate). Most VPO consistency checks rely on applying a threshold to identify outliers. These checks use only VPOs and they assume that most of the observations are consistent and can be used to estimate the (frequency) distribution of the reported dates and to recognize outliers [38]. However, this assumption is not always correct when working with volunteered observations.

The commonly used Tukey boxplot [39] uses 1.5 times the absolute value of the difference between the first and third quartiles of the annual DOYs to highlight outliers. This is a crucial step of abovementioned checks. The integration of historical and independent sources of geoinformation with VGI has been found to be an effective approach to improve the quality of VGI [40,41]. In phenology, there is contextual environmental information (e.g., temperature) that plays a key role in many phenological studies. Yet, this information is not used in consistency checking [37,42]. This contextual information has the potential to check the consistency of VPOs by integrating assumptions about the effect of context on VPOs.

This study checks the consistency of VPOs while taking the effect of inconsistent observation on phenological synchrony into account. In the next section, we describe a geocomputational workflow that uses the geographic location and DOY of VPOs, and the associated daily temperature at the locations where the observations were made. Here, we tested the workflow using VPOs from the Dutch phenological network and daily temperature time-series from the Dutch national weather service. The added value of our workflow is evaluated by comparing it with the results of a more classical outliers identification method, namely the Tukey boxplot.

2. Materials and Methods

2.1. VPOs and Temperature Datasets

This study is illustrated with VPOs from the Dutch phenological network Natuurkalender (www.natuurkalender.nl) (Nature’s Calendar). This volunteer-based network was established in 2001 to monitor phenophases for a wide range of species in the Netherlands [43]. Every volunteer that can recognize plants and phenophases can contribute his/her observations via the website of the project. The number of observers and the locations of the VPOs vary from year to year. For more details regarding the VPOs and the available datasets, see Vliet et al., 2009. Considering the spatio-temporal coverage of the available observations, spring phenophases of four plants species were selected for this study: flowering of lesser celandine (Ficaria verna Huds), wood anemone (Anemone nemorosa L.), and cow parsley (Anthriscus sylvestris (L.) Hoffm), and leafing of pedunculate oak (Quercus robur L.). For each species, the annual number of VPOs is presented in Table 1.

Lesser celandine and wood anemone are herbs that grow in forests, grasslands, and next to roads and waterways. Both herbs are intensively studied in Europe as an indicator of global environmental changes on the distribution and abundance of forest understorey [44]. Lesser celandine starts to flower early in March and wood anemone flowers around mid-March. Cow parsley is also an herb, which grows in rich, moist soils along roadsides and forest paths. It flowers from late March onwards. Pedunculate oak is a tree, often planted on sandy soils, which starts leafing around late April. Cow parsley and pedunculate oak have been found to be indicators of climate change by studies performed in Europe. For the period 2003–2015, the Natuurkalender database has a total of 3042, 1298 and 1811 flowering observations for lesser celandine, wood anemone and cow parsley, and 586 leafing observations for pedunculate oak. These observations only contain the geographic location provided by a volunteer (in the Dutch National Coordinate System; EPSG: 28992) and the date that a phenophase was first observed in a given year.

In the Netherlands and other temperate regions, temperature is the main driver of plant phenophases [33,45,46]. Hence, the annual temperature regimes of each observation were characterized using 1 km gridded estimates of daily average temperature, as downloaded from the Royal Netherlands Meteorological Institute (KNMI; https://data.knmi.nl/datasets). These grids, created by interpolating daily temperature records collected by 150 meteorological stations distributed across the country, were used to calculate Growing Degree Days (GDDs) by adding up the daily average temperature (T) above a base temperature (

T_{b}

) from the first of January of the year of the observation until the reported date of observation of each VPO [47]. We selected 0 °C as the base temperature (Equation (1)). This base temperature has been used for modelling the timing of leafing and flowering and their regional variation for several species in Western Europe [48,49,50].

GDD = (\begin{matrix} 0 & if T < T_{b} \\ \frac{(T_{Max} + T_{Min})}{2} - T_{b} & if T \geq T_{b} \end{matrix})

(1)

2.2. Analysing Consistency and Synchrony of VPOs

The proposed workflow consists of four steps (Figure 1) and uses the geographic location of the VPOs, their observed DOY and the GDDs accumulated until that DOY. The first three steps are used to check annual VPOs consistency. These steps connect, model and compare VPOs that are near to each other to identify sets of inconsistent observation. The fourth step focuses on modelling synchrony and optimizing the identification of inconsistent VPOs. The next paragraphs describe each of the steps in more detail.

In the first step, the geographic locations of the annual VPOs were used to spatially group observations by connecting observations that are near to each other. This creates a “network graph”. This approach was chosen because near observations locations are more likely to have similar weather and observations that are further apart. The spatial variability of meteorological parameters, other than temperature, is small across the Netherlands because of the relatively homogeneous topography (maximum difference in elevation of about 300 m), and a gradual and smooth steepness [51]. These annual graphs were made using a Delaunay triangulation and edges longer than a given threshold distance were pruned [52]. This triangulation method was chosen because it is computationally efficient and avoids long edges [9,53]. The pruning distance was set to 100 km because this value ensured good connectivity (each observation was connected to at least two other observations).

In the second step, the difference in the observed DOY of connected observations was modelled using their corresponding difference in GDDs. For each year, the Pearson product-moment correlation coefficient was calculated and a linear regression model was fitted [21,46]. The slopes of the regression lines (which represent the rate of spatial change in DOY per unit of GDDs) are used to define consistency constraints. More precisely, constraints were defined by fixing a maximum difference in DOY (ΔMax) for observations performed under similar environmental conditions. In practical terms, this indicates that for a given difference in GDDs, the difference in DOY should not exceed ΔMax, otherwise, the two reported DOYs are inconsistent to each other.

In the third step, the consistency constraint was checked by using a range of ΔMax values. Since ΔMax cannot be known a priori, the chosen range varies from one week to one month in steps of one day. The minimum ΔMax was set to one week to take genetic variation into account because two individuals of the same species that grow at the same location (i.e., under the same temperature regimes) may not start flowering (or leafing) at the same DOY. For each species and year, the inconsistent observations corresponding to each ΔMax value (i.e., VPOs refuted by more than one connected VPO) were highlighted and excluded. This resulted in 24 sets of consistent VPOs for each species. The optimal ΔMax for each species, and consequently the final set of inconsistent VPOs, was selected after performing the phenological synchrony analysis that takes place in the last step (step 4—Figure 1) of the workflow.

In the fourth step, the synchrony of consistent VPOs was analysed and modelled using the inter-annual variability in temperature as an explanatory variable. Wang et al. (2016) and Thackeray et al. (2016) proved that warmer days in late winter and early spring lead to less synchronous flowering and leafing onsets than colder transitions [34,54]. They showed that the annual standard deviations of flowering and leafing onset are significantly higher in years with a lower rate of change of temperature from winter to spring. In such years, the number of days prior to flowering and leafing onset that have high temperature increases. This results in a larger average GDD than in years with a higher rate of temperature change. For each set of consistent VPOs, the Pearson product-moment correlation coefficient between the annual standard deviation of the DOYs and the annual average of their GDD values was calculated. For each species, the optimal ΔMax corresponds to the value that maximizes the correlation. A linear regression to the annual standard deviation of the DOYs and average GDD of the final set of consistent VPOs was also fitted to model synchrony.

The fitness of this model was evaluated by its coefficient of determination (R-squared). The R-squared values of the synchrony model driven with consistent VPOs were compared with the R-squared values produced by the models fitted using: (1) the original and (2) the outlier-free (boxplot outliers excluded) VPOs. This comparison helps to understand the added value of the proposed workflow. The percentage of annual inconsistent and outlier VPOs was also calculated to compare the amount of data lost through the cleaning of observation by the proposed workflow and by the outlier-filtering method (based on the boxplot).

3. Results

The pruned Delaunay triangulation of VPO locations results in annual graphs of VPOs locations. For lesser celandine (Figure 2), wood anemone (Figure A1) and cow parsley (Figure A2), these graphs cover the whole of the Netherlands. However, the VPOs of pedunculate oak (Figure A3) are mostly located in the centre of the country. For all species, a large dispersion of VPO locations from year to year was observed, which is intrinsic to volunteer monitoring network data. For example, the spatial variability found for lesser celandine shows that volunteers observed this species almost everywhere in the Netherlands however, in some years (e.g., 2007 and 2008) observations are more clustered in the western part of the country (Figure 3 and Figure A4, Figure A5 and Figure A6). The four largest cities of the country (Amsterdam, Rotterdam, The Hague and Utrecht) are located in the western part. The level of spatial connectivity of VPOs is higher surrounding these cities. In these areas, the results of consistency check are more reliable because a larger number of VPOs are available to confirm or refute the consistency of the connected observations.

The correlation between the difference in DOYs and the difference in GDDs was significant for all species. The average correlation coefficient was 0.91 for lesser celandine (Figure 3), 0.90 for wood anemone (Figure A4), 0.93 for cow parsley (Figure A5) and 0.90 for pedunculate oak (Figure A6). This indicates the considerable influence of the accumulated daily temperature on the timing of flowering. The slopes of the fitted regression lines show the rate of change of the difference in reported DOYs per unit of difference in GDD (i.e., the steeper the slope, the larger the difference in the timing of the phenophase). The comparison of the annually modelled and reported difference in DOYs is used to identify 24 sets of inconsistent observations.

The correlation coefficient between the annual standard deviation of consistent DOY corresponding to each ΔMax value and the annual average GDD were calculated for lesser celandine (Figure 4), wood anemone, cow parsley and pedunculate oak (Figure A7). The ΔMax that lead to the phenological synchrony model with the largest coefficient of determination was 13 days for lesser celandine, 20 days for wood anemone, seven days for cow parsley and 15 days for pedunculate oak. For example, the optimal ΔMax value indicated that the maximum difference in the timing of flowering of two lesser celandine plants (located less than 100-km apart and growing under similar temperature regimes) was 13 days. The small value of ΔMax for cow parsley flowering, a plant with a shallow root system, might be caused by the impact of other environmental parameters such as soil moisture and light intensity [55].

For all species, optimal inconsistent VPOs show a large difference in the reported DOY compared to their surrounding VPOs (Figure 5). For all species, except cow parsley, the annual percentages of VPOs, highlighted as possible inconsistent observations, is smaller than the annual percentages of boxplot outliers (Figure 6 and Figure A8). Inconsistent VPOs refer to unusually early or late DOYs with respect to the regional temperature regime of the observation site whereas boxplot outliers only identify very early or late observations for a given set of annual observations and, in consequence, do not consider the effect of regional contextual information. The annual boxplots for lesser celandine (Figure 7d), wood anemone, cow parsley and pedunculate oak (Figure A8) show that outliers are mostly located below the lower whisker of the boxplot. This indicates that the distribution of observed DOYs is not normally (Gaussian) distributed as the boxplot assumes. As a result, boxplot filters out several early VPOs that are consistent observations. Such consistent early VPOs can provide reliable information about advancement in the timing of phenophases.

The synchrony analysis resulted in models that predict the standard deviation of the DOY of the selected phenophases using the annual average GDD. For consistent VPOs, the correlation coefficient between the standard deviation of the DOY and the annual average GDD was 0.78 for lesser celandine, 0.63 for wood anemone, 0.61 for cow parsley and 0.60 for pedunculate oak. These results suggest that the timing of flowering and leafing onsets for the species under study is more synchronous in cold late winters and early springs than in warm ones. The comparison between the synchrony models, made from original, outlier-free and consistent VPOs, shows that using boxplots negatively impacts the quality of the model (Figure 7 and Figure A9, Figure A10 and Figure A11). For all species, the R-squared value of the models based on consistent VPOs is larger than that of the models based on outlier-free data (Table 2). Moreover, the models based on consistent VPOs are more in line with the models made using original VPOs. Removing a large number of outliers (Figure 6) by using the Tukey boxplot method leads to strongly distorted models at large geographical scales (c.f. Figure 7b). The application of our workflow improves the quality of the model (notice the improved R-squared values in Figure 7c).

4. Discussion

This study presents a workflow to check the consistency of VPOs. Unlike purely statistical methods, our workflow uses the geographic location and the corresponding accumulation of daily temperature as independent sources of information. The workflow defines and evaluates consistency constraints based on the correlation between VPOs synchrony and the rate of change of temperature from winter to spring. We used the workflow to filter out phenological observations that do not provide regionally representative species-specific flowering and leafing DOYs and labels them as “inconsistent”. Unlike existing threshold-based outlier detection methods, our workflow is based on a geographic context approach to identify neighbours of VPOs. Moreover, the workflow not only considers the geographic context of VPOs, as used by Schlieder and Yanenko [11] and Bonter and Cooper [18], but it integrates this information with environmental contextual information to check the VPOs consistency. As a result, the workflow avoids filtering unusually early or late observations that are consistent with their environment.

Inconsistent VPOs can be caused by either species and/or phenophase misidentification or they can be true observations influenced by a microscale temperature regime that is hard to model using 1 by 1 km temperature data. In either case, inconsistent observations are not representative of the phenology of this species in the Netherlands. The high correlation found between the difference in DOYs and the difference in GDD of near observations on flowering and leafing onsets indicates that daily temperature is indeed relevant for the analysis of the selected species and phenophases. This is in line with the fact that daily temperature is a dominant factor for plant phenology in temperate and boreal regions [56]. This highlights the importance of storing metadata about volunteered observations to improve the temporal consistency in phenological databases.

Considering ΔMax as a proxy for the spatial variability of the timing of a phenophase under similar temperature conditions helps to constrain the temporal window in which the occurrence of a VPO is consistent. As ΔMax takes into account the geographic context, a quality control mechanism of VPOs based on this metric outperforms alternative methods solemnly based on data distributions. Given the increasing popularity of citizen science networks, we expect to get better estimates of ΔMax in the near future. The ΔMax metric can also help to understand the phenology of species. For instance, the relatively small value of ΔMax for cow parsley (seven days) indicates that the flowering onset of this species is controlled more strongly by temperature as opposed to the other selected species or phenophases. In consequence, this species shows the highest correlation between the difference in DOYs and the difference in GDD of VPOs. The initial ΔMax value (i.e., one week) that sets a maximum difference in DOY for observations performed under similar environmental conditions is not known. This value might be different in other study areas or for other plant species.

In this study, we used a pruned Delaunay triangulation to connect VPOs to their closest neighbour [57,58,59]. This approach has advantages over using a distance-based method, which connects all VPOs within a specific distance to the under-check observation. In the distance-based approach, a distance threshold has to be chosen. However, such a distance threshold is not always known and selection of an arbitrary distance threshold could have a negative impact on the analysis. Moreover, a distance-based approach would potentially lead to VPOs with no neighbours within their vicinity. However, the applied Delaunay triangulation method is more flexible and data-driven. It uses closer neighbours when available (leading to smaller triangles) and more distant VPOs when needed (leading to larger triangles).

Here, we assume that GDD accumulations drive the synchrony of the selected phenophases within a 100-km distance. The annual number of observations does not substantially affect the synchrony model of species. For instance, the annual number of observations of cow parsley is higher than the number of observations of wood anemone and pedunculate oak. Yet, the R-squared of the synchrony model found for this species is smaller than that of wood anemone and pedunculate oak. In the Netherlands, the level of spatial connectivity of VPOs using this distance threshold is high, however, this might not be the case when analysing larger areas. In such areas, the analysis might be hampered by annual graphs with a low level of spatial connectivity. Thus, advanced spatial point pattern tests could help to evaluate the spatial distribution of VPOs as well as to identify the similarity of VPOs distributions over the study period. For example, Andresen and Malleson provide various tests to measure the degree of density and of similarity of spatial point patterns [60].

Other temperature-driven metrics than GDD such as the average temperature, could be tested with our workflow. For example, Calinger et al. [61] showed the suitability of average monthly temperatures during the month of the phenophase and some number of months prior to the event to model phenological responses to temperature across many species. Moreover, GDD accumulations may not be the only or main driver of flowering and leafing onset in other study areas. The length of the chilling period [62,63], photoperiod [24,63], and precipitation and elevation [64] might also drive flowering onset in spring. For example, Studer et al. [65] used a multivariate regression to model the timing of wood anemone flowering as a function of temperature and precipitation. A similar model could be used during the consistency check phase of our workflow because it is generic enough to accommodate other phenological drivers.

Our workflow works for events that are synchronized, yet, this is the case for several ecological phenomena. In citizen science, several networks monitor environmental events that are weather-driven. Examples of such types of monitoring are the reporting of tick and mosquitoes bites and observation of migrating birds. For these types of phenomena, different types of weather data can be used to find inconsistent observations. The developed workflow can also be useful in these domains.

5. Conclusions

VGI has greatly contributed to phenological studies, leading to an improved understanding of plant and animal seasonality across the globe. In this respect, checking the consistency of volunteered phenological observations or VPO is a pre-requisite to ensure the validity and representativeness of VPO-based results. In this paper, we present a workflow designed to use geographical and contextual information associated with phenological observations to check the consistency of observations while analysing their synchrony. This workflow was used to improve our knowledge of the local impact of inter-annual temperature variations on the consistency and synchrony of VPOs from various plant species in the Netherlands.

Our results reveal that the most common method (boxplot) to filter outliers in VPOs substantially biased synchrony analysis of the timing of the spring flowering and leafing. Our results indicate that climate change and inter-annual weather variability determine changes in the synchrony of spring plant phenology. Given that several national and international initiatives facilitate and actively support the collection of VGI for ecological studies and that the open data movement is resulting in more contextual environmental information becoming available, the proposed workflow provides a unique opportunity to check the consistency of volunteered observations. Considering its general character, we think that this geocomputational workflow could be adapted to other kinds of VGI, hence contributing to the curation of this interesting source of geospatial data.

Author Contributions

Conceptualization, H.M., R.Z.-M. and E.-W.A.; methodology, H.M., R.Z.-M., E.-W.A. and A.J.H.v.V.; implementation, H.M.; original draft preparation, H.M.-M., R.Z., E.-W.A.; writing—review and editing, H.M., R.Z.-M., E.-W.A. and A.J.H.v.V.

Funding

This research received no external funding.

Acknowledgments

We thank the Royal Netherlands Meteorological Institute (KNMI) for kindly providing the temperature datasets used in this study. The Dutch phenological network Natuurkalender and the many volunteers who contribute phenological observations to this network are deeply thanked. We also thank J.A. Horn MPhil for his careful reading of our early draft of our manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Figure A1. Annual graphs for wood anemone flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned.

Figure A2. Annual graphs for cow parsley flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned.

Figure A3. Annual graphs for pedunculate oak flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned.

Figure A4. Linear regression fit between the difference in observed flowering DOY (number of days) and the difference in modelled GDD (number of degree days) for pairs of spatially connected observations of wood anemone flowering onset (see Figure A1). Correlation coefficient and rate of spatial change in DOY per unit of GDDs (i.e., slope of regression line) are given for each observation year.

Figure A5. Linear regression fit between the difference in observed flowering DOY (number of days) and the difference in modelled GDD (number of degree days) for pairs of spatially connected observations of cow parsley flowering onset (see Figure A2). Correlation coefficient and rate of spatial change in DOY per unit of GDDs (i.e., slope of regression line) are given for each observation year.

Figure A6. Linear regression fit between the difference in observed flowering DOY (number of days) and the difference in modelled GDD (number of degree days) for pairs of spatially connected observations of pedunculate oak leafing onset (see Figure A3). Correlation coefficient and rate of spatial change in DOY per unit of GDDs (i.e., slope of regression line) are given for each observation year.

Figure A7. Correlation coefficient (r) between the standard deviation of the reported DOYs for the flowering onset of wood anemone, cow parseley and pedunculate oak and the average GDD for ΔMax values varying from one week to 30 days.

Figure A8. Annual percentages of inconsistent observations and of boxplot outliers in volunteered observations of wood anemone, cow parseley and pedunculate oak. Inconsistent observations are unusually early or late DOYs with respect to the regional temperature regime of the observation sites, while the outliers are only very early or late DOY.

Figure A9. Wood anemone flowering onset synchrony models for (a) original, (b) outlier-free and (c) consistent observations. Panel (d) shows annual boxplots of the reported DOYs for the original observations.

Figure A10. Cow parsley flowering onset synchrony models for (a) original, (b) outlier-free and (c) consistent observations. Panel (d) shows annual boxplots of the reported DOYs for the original observations.

Figure A11. Pedunculate oak leafing onset synchrony models for (a) original, (b) outlier-free and (c) consistent observations. Panel (d) shows annual boxplots of the reported DOYs for the original observations.

References

Beaubien, E.G.; Hamann, A. Plant phenology networks of citizen scientists: recommendations from two decades of experience in Canada. Int. J. Biometeorol. 2011, 55, 833–841. [Google Scholar] [CrossRef] [PubMed]
Ferster, C.J.; Coops, N.C. A review of earth observation using mobile personal communication devices. Comput. Geosci. 2013, 51, 339–349. [Google Scholar] [CrossRef]
Ballatore, A.; Zipf, A. A Conceptual Quality Framework for Volunteered Geographic Information. In Spatial Information Theory; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2015. [Google Scholar]
Senaratne, H.; Mobasheri, A.; Ali, A.L.; Capineri, C.; Haklay, M. A review of volunteered geographic information quality assessment methods. Int. J. Geogr. Inf. Sci. 2017, 31, 139–167. [Google Scholar] [CrossRef]
Feick, R.; Roche, S. Understanding the Value of VGI. In Crowdsourcing Geographic Knowledge; Springer: Dordrecht, The Netherlands, 2013; pp. 15–29. [Google Scholar]
Antoniou, V.; Skopeliti, A. Measures and indicators of VGI quality: An overview. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, II-3/W5, 345–351. [Google Scholar] [CrossRef]
Arsanjani, J.J.; Mooney, P.; Zipf, A.; Schauss, A. Quality Assessment of the Contributed Land Use Information from OpenStreetMap Versus Authoritative Datasets. In OpenStreetMap in GIScience; Springer: Cham, Switzerland, 2015. [Google Scholar]
Elwood, S.; Goodchild, M.F.; Sui, D.Z. Researching Volunteered Geographic Information: Spatial Data, Geographic Research, and New Social Practice. Ann. Assoc. Am. Geogr. 2012, 102, 571–590. [Google Scholar] [CrossRef]
Yanenko, O.; Schlieder, C. Enhancing the Quality of Volunteered Geographic Information: A Constraint-Based Approach. In Bridging the Geographic Information Sciences: International AGILE’2012 Conference, Avignon (France), 24–27 April 2012; Gensel, J., Josselin, D., Vandenbroucke, D., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 429–446. ISBN 978-3-642-29063-3. [Google Scholar]
Schlieder, C.; Yanenko, O. Spatio-temporal proximity and social distance. In Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Location Based Social Networks, San Jose, CA, USA, 2 November 2010; ACM Press: New York, NY, USA, 2010; p. 60. [Google Scholar]
Barron, C.; Neis, P.; Zipf, A. A Comprehensive Framework for Intrinsic OpenStreetMap Quality Analysis. Trans. GIS 2014, 18, 877–895. [Google Scholar] [CrossRef]
Goodchild, M.F.; Li, L. Assuring the quality of volunteered geographic information. Spat. Stat. 2012, 1, 110–120. [Google Scholar] [CrossRef]
Haklay, M. How Good is Volunteered Geographical Information? A Comparative Study of OpenStreetMap and Ordnance Survey Datasets. Environ. Plan. B Plan. Des. 2010, 37, 682–703. [Google Scholar] [CrossRef]
Mooney, P.; Corcoran, P. The Annotation Process in OpenStreetMap. Trans. GIS 2012, 16, 561–579. [Google Scholar] [CrossRef]
Ali, A.L.; Schmid, F. Data Quality Assurance for Volunteered Geographic Information. In Geographic Information Science; Duckham, M., Pebesma, E., Stewart, K., Frank, A.U., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; Volume 8728. [Google Scholar]
Bonter, D.N.; Cooper, C.B. Data validation in citizen science: A case study from Project FeederWatch. Front. Ecol. Environ. 2012, 10, 305–307. [Google Scholar] [CrossRef]
Van Vliet, A.J.H.; de Groot, R.S.; Bellens, Y.; Braun, P.; Bruegger, R.; Bruns, E.; Clevers, J.; Estreguil, C.; Flechsig, M.; Jeanneret, F.; et al. The European Phenology Network. Int. J. Biometeorol. 2003, 47, 202–212. [Google Scholar] [CrossRef] [PubMed]
Mayer, A. Phenology and Citizen Science. Bioscience 2010, 60, 172–175. [Google Scholar] [CrossRef]
Chmielewski, F.M. Phenology in Agriculture and Horticulture. In Phenology: An Integrative Environmental Science; Springer: Dordrecht, The Netherlands, 2013; pp. 539–561. ISBN 978-94-007-6924-3. [Google Scholar]
Doi, H.; Katano, I. Phenological timings of leaf budburst with climate change in Japan. Agric. For. Meteorol. 2008, 148, 512–516. [Google Scholar] [CrossRef]
Gordo, O.; Sanz, J.J. Impact of climate change on plant phenology in Mediterranean ecosystems. Glob. Chang. Biol. 2010, 16, 1082–1106. [Google Scholar] [CrossRef]
Zurita-Milla, R.; Goncalves, R.; Izquierdo-Verdiguier, E.; Ostermann, F.O. Exploring Vegetation Phenology at Continental Scales: Linking Temperature-Based Indices and Land Surface Phenological Metrics. In Proceedings of the 2017 Conference on Big Data from Space, Toulouse, France, 28–30 November 2017. [Google Scholar]
Devictor, V.; Whittaker, R.J.; Beltrame, C. Beyond scarcity: citizen science programmes as useful tools for conservation biogeography. Divers. Distrib. 2010, 16, 354–362. [Google Scholar] [CrossRef]
Ault, T.R.; Schwartz, M.D.; Zurita-Milla, R.; Weltzin, J.F.; Betancourt, J.L.; Ault, T.R.; Schwartz, M.D.; Zurita-Milla, R.; Weltzin, J.F.; Betancourt, J.L. Trends and Natural Variability of Spring Onset in the Coterminous United States as Evaluated by a New Gridded Dataset of Spring Indices. J. Clim. 2015, 28, 8363–8378. [Google Scholar] [CrossRef]
Rosemartin, A.H.; Denny, E.G.; Weltzin, J.F.; Lee Marsh, R.; Wilson, B.E.; Mehdipoor, H.; Zurita-Milla, R.; Schwartz, M.D. Lilac and honeysuckle phenology data 1956-2014. Sci. Data 2015, 2, 150038. [Google Scholar] [CrossRef]
Soroye, P.; Ahmed, N.; Kerr, J.T. Opportunistic citizen science data transform understanding of species distributions, phenology, and diversity gradients for global change research. Glob. Chang. Biol. 2018, 24, 5281–5291. [Google Scholar] [CrossRef]
Ims, R.A. The ecology and evolution of reproductive synchrony. Trends Ecol. Evol. 1990, 5, 135–140. [Google Scholar] [CrossRef]
English-Loeb, G.M.; Karban, R. Consequences of variation in flowering phenology for seed head herbivory and reproductive success in Erigeron glaucus (Compositae). Oecologia 1992, 89, 588–595. [Google Scholar] [CrossRef]
Bolmgren, K.; Eriksson, O. Are mismatches the norm? Timing of flowering, fruiting, dispersal and germination and their fitness effects in Frangula alnus (Rhamnaceae). Oikos 2015, 124, 639–648. [Google Scholar] [CrossRef]
Weis, A.; Kossler, T. Genetic variation in flowering time induces phenological assortative mating: Quantitative genetic methods applied to Brassica rapa. Am. J. Bot. 2004, 91, 825–836. [Google Scholar] [CrossRef] [PubMed]
Both, C.; Van Asch, M.; Bijlsma, R.G.; Van Den Burg, A.B.; Visser, M.E. Climate change and unequal phenological changes across four trophic levels: constraints or adaptations? J. Anim. Ecol. 2009, 78, 73–83. [Google Scholar] [CrossRef] [PubMed]
Henderson, A.; Fischer, B.; Scariot, A.; Whitaker Pacheco, M.A.; Pardini, R. Flowering phenology of a palm community in a central Amazon forest. Brittonia 2000, 52, 149–159. [Google Scholar] [CrossRef]
Menzel, A.; Sparks, T.H.; Estrella, N.; Koch, E.; Aasa, A.; Ahas, R.; Alm-KÜBler, K.; Bissolli, P.; BraslavskÁ, O.; Briede, A.; et al. European phenological response to climate change matches the warming pattern. Glob. Chang. Biol. 2006, 12, 1969–1976. [Google Scholar] [CrossRef]
Wang, C.; Tang, Y.; Chen, J. Plant phenological synchrony increases under rapid within-spring warming. Sci. Rep. 2016, 6, 25460. [Google Scholar] [CrossRef] [PubMed]
Sparks, T.H.; Huber, K.; Tryjanowski, P. Something for the weekend? Examining the bias in avian phenological recording. Int. J. Biometeorol. 2008, 52, 505–510. [Google Scholar] [CrossRef] [PubMed]
Mihorski, M.; Sparks, T.H.; Tryjanowski, P. The weekend bias in recording rare birds: mechanisms and consequencess. Acta Ornithol. 2012, 47, 87–94. [Google Scholar] [CrossRef]
Mehdipoor, H.; Zurita-Milla, R.; Rosemartin, A.; Gerst, K.L.; Weltzin, J.F. Developing a Workflow to Identify Inconsistencies in Volunteered Geographic Information: A Phenological Case Study. PLoS ONE 2015, 10, e0140811. [Google Scholar] [CrossRef] [PubMed]
Kelling, S.; Gerbracht, J.; Fink, D.; Lagoze, C.; Wong, W.-K.; Yu, J.; Damoulas, T.; Gomes, C. eBird: A Human/Computer Learning Network for Biodiversity Conservation and Research. In Proceedings of the Twenty-Fourth Innovative Applications of Artificial Intelligence conference (IAAI 2012), Toronto, ON, Canada, 22–26 July 2012. [Google Scholar]
Frigge, M.; Hoaglin, D.C.; Iglewicz, B. Some Implementations of the Boxplot. Am. Stat. 1989, 43, 50–54. [Google Scholar]
Nasiri, A.; Ali Abbaspour, R.; Chehreghan, A.; Jokar Arsanjani, J. Improving the Quality of Citizen Contributed Geodata through Their Historical Contributions: The Case of the Road Network in OpenStreetMap. ISPRS Int. J. Geo-Inf. 2018, 7, 253. [Google Scholar] [CrossRef]
Sester, M.; Jokar Arsanjani, J.; Klammer, R.; Burghardt, D.; Haunert, J.-H. Integrating and Generalising Volunteered Geographic Information. In Abstracting Geographic Information in a Data Rich World; Springer: Cham, Switzerland, 2014; pp. 119–155. [Google Scholar]
Hochachka, W.M.; Fink, D. Broad-scale citizen science data from checklists: prospects and challenges for macroecology. Front. Biogeogr. 2012, 4. [Google Scholar] [CrossRef]
Van Vliet, A.J.H.; Bron, W.A.; Mulder, S. De Natuurkalender 2004–2008; Wageningen University: Wageningen, The Netherlands, 2009. [Google Scholar]
Baeten, L. Recruitment and Performance of Forest Understorey Plants in Post-Agricultural Forests Lander Baeten. Ph.D. Thesis, Ghent University, Ghent, Belgium, 2010. [Google Scholar]
Jolly, W.M.; Nemani, R.; Running, S.W. A generalized, bioclimatic index to predict foliar phenology in response to climate. Glob. Chang. Biol. 2005, 11, 619–632. [Google Scholar] [CrossRef]
De Frenne, P.; Kolb, A.; Verheyen, K.; Brunet, J.; Chabrerie, O.; Decocq, G.; Diekmann, M.; Eriksson, O.; Heinken, T.; Hermy, M.; et al. Unravelling the effects of temperature, latitude and local environment on the reproduction of forest herbs. Glob. Ecol. Biogeogr. 2009, 18, 641–651. [Google Scholar] [CrossRef]
McMaster, G.S.; Wilhelm, W.W. Growing degree-days: One equation, two interpretations. Agric. For. Meteorol. 1997, 87, 291–300. [Google Scholar] [CrossRef]
Lappalainen, H.; Heikinheimo, M. The effect of temperature on the phenology of perennial plant species. Finn. Res. Program. Clim. 1994, 203–208. [Google Scholar]
Zavalloni, C.; Andresen, J.A.; Flore, J.A. Phenological Models of Flower Bud Stages and Fruit Growth of ‘Montmorency’ Sour Cherry Based on Growing Degree-day Accumulation. J. Am. Soc. Hortic. Sci. 2006, 131, 601–607. [Google Scholar]
Cook, B.I.; Wolkovich, E.M.; Davies, T.J.; Ault, T.R.; Betancourt, J.L.; Allen, J.M.; Bolmgren, K.; Cleland, E.E.; Crimmins, T.M.; Kraft, N.J.B.; et al. Sensitivity of Spring Phenology to Warming Across Temporal and Spatial Climate Gradients in Two Independent Databases. Ecosystems 2012, 15, 1283–1294. [Google Scholar] [CrossRef]
Van Vliet, A.J.H.; Bron, W.A.; Mulder, S. The how and why of societal publications for citizen science projects and scientists. Int. J. Biometeorol. 2014, 58, 565–577. [Google Scholar] [CrossRef]
Paul Chew, L. Constrained delaunay triangulations. Algorithmica 1989, 4, 97–108. [Google Scholar] [CrossRef]
Shi, Y.; Deng, M.; Yang, X.; Liu, Q. Adaptive detection of spatial point event outliers using multilevel constrained Delaunay triangulation. Comput. Environ. Urban Syst. 2016, 59, 164–183. [Google Scholar] [CrossRef]
Thackeray, S.J.; Henrys, P.A.; Hemming, D.; Bell, J.R.; Botham, M.S.; Burthe, S.; Helaouet, P.; Johns, D.G.; Jones, I.D.; Leech, D.I.; et al. Phenological sensitivity to climate across taxa and trophic levels. Nature 2016, 535, 241–245. [Google Scholar] [CrossRef] [PubMed]
Jie, C.; Bao-hua, G.; Ying, G.N.; Yuk-sing Gilbert, C. Comparative studies on phenotypic plasticity of two herbs Changium smyrnioides and Anthriscus sylvestris. J. Zhejiang Univ. Sci. A 2004, 5, 656–662. [Google Scholar]
Shen, M.; Cong, N.; Cao, R. Temperature sensitivity as an explanation of the latitudinal pattern of green-up date trend in Northern Hemisphere vegetation during 1982-2008. Int. J. Climatol. 2015, 35, 3707–3712. [Google Scholar] [CrossRef]
Van de Weygaert, R.; Schaap, W. The Cosmic Web: Geometric Analysis. In Data Analysis in Cosmology; Springer: Berlin/Heidelberg, Germany, 2008; pp. 291–413. [Google Scholar]
Boissonnat, J.-D.; Cazals, F. Smooth surface reconstruction via natural neighbour interpolation of distance functions. Comput. Geom. 2002, 22, 185–203. [Google Scholar] [CrossRef]
Sibson, R. A brief description of natural neighbour interpolation. In Interpreting Multivariate Data; John Wiley & Sons: New York, NY, USA, 1981; pp. 21–36. [Google Scholar]
Andresen, M.A.; Malleson, N. Testing the Stability of Crime Patterns: Implications for Theory and Policy. J. Res. Crime Delinq. 2011, 48, 58–82. [Google Scholar] [CrossRef]
Calinger, K.M.; Queenborough, S.; Curtis, P.S. Herbarium specimens reveal the footprint of climate change on flowering trends across north-central North America. Ecol. Lett. 2013, 16, 1037–1044. [Google Scholar] [CrossRef] [PubMed]
Sogaard, G.; Johnsen, O.; Nilsen, J.; Junttila, O. Climatic control of bud burst in young seedlings of nine provenances of Norway spruce. Tree Physiol. 2008, 28, 311–320. [Google Scholar] [CrossRef]
Laube, J.; Sparks, T.H.; Estrella, N.; H?fler, J.; Ankerst, D.P.; Menzel, A. Chilling outweighs photoperiod in preventing precocious spring development. Glob. Chang. Biol. 2014, 20, 170–182. [Google Scholar] [CrossRef]
Penuelas, J.; Filella, I.; Comas, P. Changed plant and animal life cycles from 1952 to 2000 in the Mediterranean region. Glob. Chang. Biol. 2002, 8, 531–544. [Google Scholar] [CrossRef]
Studer, S.; Appenzeller, C.; Defila, C. Inter-Annual Variability and Decadal Trends in Alpine Spring Phenology: A Multivariate Analysis Approach. Clim. Chang. 2005, 73, 395–414. [Google Scholar] [CrossRef]

Figure 1. Flowchart of the proposed workflow that uses the coordinates of the volunteered phenological observations, their observed date and the GDDs accumulated until that date to check their consistency and to model phenological synchrony.

Figure 2. Annual graphs for lesser celandine flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned.

Figure 3. Linear regression lines fitted between the difference in observed flowering DOY (number of days) and the difference in modelled GDD (number of degree days) for pairs of spatially connected observations of lesser celandine flowering onset (see Figure 1). Correlation coefficient and rate of spatial change in DOY per unit of GDDs (i.e., slope of regression line) are given for each observation year.

Figure 4. Correlation coefficient (r) between the standard deviation of the reported DOYs for the flowering onset of lesser celandine and the average GDD for ΔMax values varying from one week to 30 days.

Figure 5. Examples of inconsistent observations for the flowering onset of lesser celandine. The highlighted observation is refuted by more than one other observation. Given the difference in GDDs between the highlighted observation and those connected to it, their difference in DOY exceeds the predicted difference by about two weeks. The numbers next to each observation show the reported DOYs.

Figure 6. Annual percentages of inconsistent observations and of boxplot outliers in volunteered observations of lesser celandine flowering onset. Inconsistent observations are unusually early or late DOYs with respect to the regional temperature regime of the observation sites, while the outliers are only very early or late DOY.

Figure 7. Lesser celandine flowering onset synchrony models for (a) original, (b) outlier-free and (c) consistent observations. Panel (d) shows annual boxplots of the reported DOYs for the original observations.

Table 1. The annual number of flowering observations for lesser celandine, wood anemone, and cow parsley and of leafing observations for pedunculate oak.

Year	Lesser Celandine	Wood Anemone	Cow Parsley	Pedunculate Oak
2003	160	67	73	23
2004	202	83	154	43
2005	279	105	179	58
2006	309	124	159	41
2007	303	117	157	59
2008	330	118	195	57
2009	259	109	148	50
2010	239	104	118	53
2011	262	121	160	39
2012	197	96	145	32
2013	190	91	118	45
2014	163	90	121	47
2015	149	73	83	40

Table 2. The coefficient of determination (R-squared) of phenological synchrony models driven from original, outlier-free and consistent observations.

	Original	Outlier-Free	Consistent
Lesser celandine	0.54	0.00	0.61
Wood anemone	0.43	0.28	0.40
Cow parsley	0.12	0.07	0.37
Pedunculate oak	0.35	0.16	0.36

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Mehdipoor, H.; Zurita-Milla, R.; Augustijn, E.-W.; Van Vliet, A.J.H. Checking the Consistency of Volunteered Phenological Observations While Analysing Their Synchrony. ISPRS Int. J. Geo-Inf. 2018, 7, 487. https://doi.org/10.3390/ijgi7120487

AMA Style

Mehdipoor H, Zurita-Milla R, Augustijn E-W, Van Vliet AJH. Checking the Consistency of Volunteered Phenological Observations While Analysing Their Synchrony. ISPRS International Journal of Geo-Information. 2018; 7(12):487. https://doi.org/10.3390/ijgi7120487

Chicago/Turabian Style

Mehdipoor, Hamed, Raul Zurita-Milla, Ellen-Wien Augustijn, and Arnold J. H. Van Vliet. 2018. "Checking the Consistency of Volunteered Phenological Observations While Analysing Their Synchrony" ISPRS International Journal of Geo-Information 7, no. 12: 487. https://doi.org/10.3390/ijgi7120487

APA Style

Mehdipoor, H., Zurita-Milla, R., Augustijn, E.-W., & Van Vliet, A. J. H. (2018). Checking the Consistency of Volunteered Phenological Observations While Analysing Their Synchrony. ISPRS International Journal of Geo-Information, 7(12), 487. https://doi.org/10.3390/ijgi7120487

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Checking the Consistency of Volunteered Phenological Observations While Analysing Their Synchrony

Abstract

1. Introduction

1.1. VGI Attribute Consistency

1.2. Synchrony of Phenological VGI

2. Materials and Methods

2.1. VPOs and Temperature Datasets

2.2. Analysing Consistency and Synchrony of VPOs

3. Results

4. Discussion

5. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI