Open Access Article

How Response Designs and Class Proportions Affect the Accuracy of Validation Data

1 Earth and Life Institute–Environment, Université catholique de Louvain, Croix du Sud 2, Louvain-la-Neuve B-1348, Belgium
2 CSIRO Agriculture & Food, Queensland Bioscience Precinct, 306 Carmody Road, St Lucia 4067 QLD, Australia
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Remote Sens. 2020, 12(2), 257; https://doi.org/10.3390/rs12020257
Received: 28 November 2019 / Revised: 24 December 2019 / Accepted: 7 January 2020 / Published: 11 January 2020
(This article belongs to the Section Remote Sensing Image Processing)
Reference data collected to validate land-cover maps are generally considered free of errors. In practice, however, they contain errors despite best efforts to minimize them. These errors propagate during accuracy assessment and bias the validation results. For photo-interpreted reference data, the two most widely studied sources of error are systematic incorrect labeling and vigilance drops. How estimation errors, i.e., errors intrinsic to the response design, affect the accuracy of reference data is far less understood. In this paper, we analyzed the impact of estimation errors for two types of classification systems (binary and multiclass) as well as for two common response designs (point-based and partition-based) with a range of sub-sample sizes. Our quantitative results indicate that labeling errors due to proportion estimation should not be neglected. They further confirm that the accuracy of response designs depends on the class proportions within the sampling units, with complex landscapes being more prone to errors. As a result, response designs where the number of sub-samples is predefined and fixed are inefficient. To guarantee high accuracy standards of validation data with minimum data collection effort, we propose a new method to adapt the number of sub-samples for each sample during the validation process. In practice, sub-samples are incrementally selected and labeled until the estimated class proportions reach the desired level of confidence. As a result, less effort is spent on labeling unambiguous cases and the spared effort can be allocated to more ambiguous cases. This increases the reliability of reference data and of subsequent accuracy assessment. Across our study site, we demonstrated that such an approach could reduce the labeling effort by 50% to 75%, with greater gains in homogeneous landscapes.
We contend that adopting this optimization approach will not only increase the efficiency of reference data collection, but will also help deliver more reliable accuracy estimates to the user community.
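The incremental procedure described above can be illustrated with a minimal Python sketch for the binary, point-based case. It is not the authors' implementation: the stopping rule (stop once one class is the majority of the sampling unit with posterior probability at or above a threshold, under a uniform Beta(1, 1) prior), the function names `prob_majority` and `adaptive_label`, and the default values are all assumptions made for illustration.

```python
from math import comb


def prob_majority(k, n):
    """Posterior probability that the class covers more than half of the unit.

    k of the n labeled sub-samples fell in the class. Assumes a uniform
    Beta(1, 1) prior, so the posterior is Beta(k + 1, n - k + 1). For integer
    parameters the Beta tail equals a binomial sum, computed exactly here.
    """
    a = k + 1          # posterior "successes" parameter
    m = n + 1          # a + b - 1, with b = n - k + 1
    return sum(comb(m, j) for j in range(a)) / 2 ** m


def adaptive_label(next_label, confidence=0.95, max_points=100):
    """Label sub-samples one at a time until the majority class is clear.

    next_label: hypothetical callable returning True/False for the next
    sub-sample point (e.g. one photo-interpreted point).
    Returns (majority_is_class, points_used, posterior_probability).
    """
    k = 0
    for n in range(1, max_points + 1):
        k += 1 if next_label() else 0
        p_class = prob_majority(k, n)
        p_best = max(p_class, 1.0 - p_class)
        if p_best >= confidence:
            break  # assumed stopping rule: confident enough, stop labeling
    return p_class >= 0.5, n, p_best
```

Under these assumptions, a fully homogeneous unit stops after 4 points at the 0.95 level, whereas a unit split evenly between the two classes uses its whole budget. This mirrors the pattern reported above, where the savings are larger in homogeneous landscapes.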
Keywords: validation; reference data; remote sensing; resolution; accuracy assessment; response design; quality control; sub-sampling; overall accuracy
MDPI and ACS Style

Radoux, J.; Waldner, F.; Bogaert, P. How Response Designs and Class Proportions Affect the Accuracy of Validation Data. Remote Sens. 2020, 12, 257.
