Users' Assessment of Orthoimage Photometric Quality for Visual Interpretation of Agricultural Fields

Land cover identification and area quantification are key aspects of implementing the European Common Agriculture Policy. Legitimacy of support provided to farmers is monitored using the Land Parcel Identification System (LPIS), with land cover identification performed by visual image interpretation. While the geometric orthoimage quality required for correct interpretation is well understood, little is known about the photometric quality needed for LPIS applications. This paper analyzes the orthoimage quality characteristics chosen by authors as being most suitable for visual identification of agricultural fields. We designed a survey to assess users’ preferred brightness and contrast ranges for orthoimages used for LPIS purposes. Survey questions also tested the influence of a background color on the preferred orthoimage brightness and contrast, the preferred orthoimage format and color composite, assessments of orthoimages with shadowed areas, appreciation of image enhancements and, finally, consistency of individuals’ preferred brightness and contrast settings across multiple sample images. We find that image appreciation is stable at the individual level, but preferences vary across respondents. We therefore recommend that LPIS operators be enabled to personalize photometric settings, such as brightness and contrast values, and to choose the displayed band combination from at least four spectral bands.

OPEN ACCESS Remote Sens. 2015, 7 4920


Introduction
With the ongoing prevalence of digital sensors, both airborne and space-borne imaging techniques allow efficient identification of land cover. Accurate land cover identification and area quantification are of key importance for implementation of the European Common Agriculture Policy (CAP) [1], because payments to European farmers are area-based. Eligibility for CAP support is managed via the LPIS, as individually implemented by the Member States (MS) of the European Union (EU). LPIS data allow localization of the agricultural parcels claimed by farmers and quantification of their eligible areas [2]. Failures of LPIS may lead to under- or over-declaration, which implies substantial financial risk to the EU [2], as CAP payments amount to around 30% of the EU budget (data from 2011) [3]. In 2013, CAP expenditures totaled roughly 44 billion euros [4], underscoring the importance of LPIS precision. Quality assessment of the identification system for agricultural parcels is regulated by Article 6 of Commission Delegated Regulation (EU) No 640/2014 and is performed by the MS on a yearly basis. In order to assess and improve LPIS quality, the Joint Research Centre (JRC) established the LPIS Quality Assurance (LPISQA) framework in 2010. Within this framework, satellite and aerial orthoimages (acquired every year) are used as the main basis for inspection procedures, which include land cover identification, data description and reporting [2].
Within the LPISQA framework, land cover identification is performed by visual interpretation of orthoimages. Visual image interpretation includes determination of the nature of objects on an image and a judgment on their significance [5]. Object significance is particularly important for parcel delineation and for distinguishing eligible land. The outcome of the inspection process is highly dependent on correct image interpretation, which, in turn, depends on operator skills, experience and knowledge of the area of interest, as well as on orthoimage quality [6]. Currently, most airborne and satellite orthoimages are multispectral. They are provided in digital format, with a spatial resolution of 0.5 m or less. The LPISQA protocol requires that orthoimages be of sufficient quality to allow determination of the nature of objects and, especially, identification of eligible land cover types.
The quality of imagery data and their fitness for use are influenced by many factors. Among these are spatial resolution, which affects the distinguishability of features and the scale at which the orthoimage can be displayed, and geometric quality, which affects geolocation accuracy [7]. Geometric orthoimage quality is well understood and can be described using standardized measures such as mean-squared error [8]. The LPISQA guidelines recommend that the spatial resolution of geometric orthoimages be 1 m or less and that the visual scale at which image interpretation is performed be larger than 1:5000 [9]. Temporally, orthoimages are acquired on a yearly basis in the crop-growing season. Although the LPISQA framework does not define a required spectral resolution, in practice four bands are used for visual image interpretation of agricultural land: blue, green, red and near infrared [10,11].
The LPISQA guidelines recommend a radiometric resolution of orthoimages of at least 8 bits, but 10-11 bits per channel is strongly advised. Imagery quality is also addressed by the minimum look angle, which is recommended to be at least 56 degrees [10–12]. In contrast to the geometric orthoimage quality required for LPISQA applications [11], little is known about the photometric orthoimage quality that is needed [13]. There is no standard measure for photometric image quality, yet data objectivity and their comparison potential depend, among other things, on the photometric quality of the orthoimages [14]. Assessment of the first two years of LPISQA implementation revealed a suboptimal quality of some of the orthoimage data used, e.g., degraded photometric quality or use of a low-quality orthoimage where a better alternative was available [15].
Studies of the photometric quality of images do exist, but the topic is multifaceted and hence complex [16]. While there is a broad literature on radiometric and photometric aspects of images, its main focus is on radiometry during image production [14,17,18]. Typically, quality assessment metrics are designed for natural, close-range images (e.g., tested using the Tampere Image Database 2008 (TID2008)) and intended for full-reference evaluation of images [19] or video [20]. There are also proposals for no-reference image quality assessment models [21–23]. Furthermore, studies in the medical domain have examined, e.g., digital radiography and base image quality parameters applicable to a wide variety of imaging tasks [24]. A study by Pyka [25] considers photometric image quality for orthophoto map production. Another study compares the radiometric quality of satellite images (GeoEye-1 versus WorldView-2) using diverse quality indicators, including visual assessment [26].
As the LPISQA framework offers no standards for orthoimage photometric quality and image processing is managed by the MS, the current study investigates users' assessments of the suitability of orthoimages with various characteristics for the task of visual identification of land cover and agricultural fields within the LPIS context. The aim here is to determine whether there are general preferences for orthoimage brightness and contrast settings, color composite, noise and shadows, and file formats. This research focuses on the image settings chosen by the operator before the actual delineation step. Understanding users' preferences regarding these properties may help in deriving an orthoimage standard for LPISQA applications.

Data and Methods
To investigate users' preferences, we developed an online survey in which multiple orthoimages were presented on a computer screen and respondents were asked to indicate the best and sometimes also the worst orthoimage sample for the purpose of agricultural land delineation. The survey is available as ancillary data to the article. The survey was relatively short (containing 29 questions), so the average time for participants to complete it did not exceed 15 min. Some of the images used were repeated in later questions, presented in a slightly different way. In order to ensure that respondents did not go back and reconsider earlier choices, the process of answering was one-way: answers were saved as they were submitted, and could not be changed later.

Orthoimage Selection and Processing
The aerial and satellite orthoimages included in the survey were sampled from those used for the LPISQA in the years 2011 and 2012. The pixel size of the aerial orthoimages was 0.25 m or 0.50 m; that of the satellite images was 0.50 m. Pansharpened georeferenced satellite orthoimages (WorldView-2, QuickBird-2 and GeoEye-1) were taken from the JRC Community Image Data Portal [27]. Both false color composites and normal color composites were selected, and the orthoimages were presented in either the lossless tagged image file format (TIFF, without compression) or the lossy Enhanced Compression Wavelet (ECW) format, as delivered by the MS. The ECW format compresses large images while retaining their visual quality. TIFF and ECW are the two most commonly used orthoimage formats for LPISQA purposes [28,29]. Orthoimage samples were chosen from the broadest possible range of European landscapes. They were from the following MS: Bulgaria, Germany, France, Ireland, the Netherlands, Poland and Slovenia (Figure 1). The survey asked participants to compare orthoimages displayed with various levels of brightness and contrast and with other enhancements typically used in LPISQA. The selected orthoimages were modified for the survey. The original image, as delivered by the MS, served as the default one, i.e., with image brightness and contrast centered at zero. Brightness and contrast were then varied using the open-source GIMP2 software [30]. Orthoimage enhancement was done using ENVI [31] interactive display functions. All default histogram stretch options were used: linear stretch (using the data minimum and maximum to perform a linear contrast stretch, without clipping), linear 0-255 (with the digital number (DN) values of the pixels displayed as a range from 0 to 255), square root stretch (taking the square root of the values in the input histogram and then applying a linear stretch), and finally the linear 2% stretch (a linear method with a 2% clip on both ends of the distribution of each band).
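The stretch options described above can be illustrated with a small Python sketch. This is not ENVI's implementation: the function names, the percentile interpolation rule and the rounding to 8-bit display values are all assumptions made for illustration, and a band is represented simply as a flat list of non-negative DN values.

```python
import math

def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a non-empty list.
    The interpolation rule is an assumption, not taken from ENVI."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac

def linear_stretch(band, low, high):
    """Map [low, high] linearly onto 0..255, clipping values outside."""
    if high <= low:
        return [0 for _ in band]
    out = []
    for dn in band:
        scaled = (dn - low) / (high - low) * 255.0
        out.append(int(round(min(255.0, max(0.0, scaled)))))
    return out

def min_max_stretch(band):
    """Linear stretch between the data minimum and maximum (no clipping)."""
    return linear_stretch(band, min(band), max(band))

def two_percent_stretch(band):
    """Linear stretch with a 2% clip at both ends of the distribution."""
    return linear_stretch(band, percentile(band, 2), percentile(band, 98))

def sqrt_stretch(band):
    """Square root of the DN values, followed by a linear stretch."""
    return min_max_stretch([math.sqrt(dn) for dn in band])

# Hypothetical DN values for one band.
band = [12, 40, 41, 43, 45, 47, 52, 60, 75, 230]
print(two_percent_stretch(band))
print(sqrt_stretch(band))
```

For a multiband orthoimage, each band would be stretched separately, since the 2% clip is defined on the distribution of each band.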
The survey consisted of four main sections. The first section (seven questions) focused on characteristics of the respondent. Specifically, it queried respondents' level of education, geographic location of practice, years of experience in visual image interpretation, and activities and years of experience with LPIS. Since we were unable to control for color blindness in our sampling, the participants' color vision was tested using part of the Ishihara test [32]. We used three plates, designed to give a quick assessment of color vision deficiency. The second section of the survey (two questions) focused on preferred image brightness and contrast. Seven versions of a single orthoimage were provided in each question, and participants were asked to choose the one they considered best for the purpose of agricultural land delineation. The third section (11 questions) tested preferences for combinations of brightness and contrast. Each question presented two or three renditions of a single orthoimage. Participants were asked either to indicate the best image for the purpose of agricultural land delineation or to choose both the best and the worst image for this purpose. Some of these questions repeated the same orthoimage samples used earlier but with a different color background (either black or white, Figure 2). The fourth section of the survey (eight questions) tested preferences for false color composite or natural color composite images, as well as for TIFF or ECW formats and standard image enhancements.

Sample Selection
Respondents in our sample represented three groups: 1. Technical and administrative staff involved in the LPISQA, 2. Professionals with visual image interpretation experience, and 3. Students.
The majority of respondents were LPISQA technical and administrative staff from a variety of MS. This group represents the main "target group" of the study, as they are the major users of orthoimage evaluations. They completed the survey during an LPISQA workshop in Baveno, Italy, in 2013. At that workshop, three stations were set up for completing the online survey. Each was equipped with a similar laptop using identical screen settings. The second group of respondents consisted of employees of JRC (Ispra, Italy) and Wageningen University (Wageningen, The Netherlands). These were all experts in orthoimage interpretation but not directly involved with LPIS activities. The final group of participants was made up of students from the Laboratory of Geo-information Science and Remote Sensing at Wageningen University and from the Remote Sensing, Photogrammetry and Geoinformation Department of the Krakow Academy of Science and Technology (Krakow, Poland). The survey was administered to all three groups from September to November 2013. During that period, 197 valid complete records were collected (Table 1).
Visual inspection of processed orthoimages is the traditional means of evaluation, as human observers can recognize distortion and degradation of orthoimagery without referring to the original image [33]. Our survey clearly specified the aim of orthoimage interpretation as the delineation of agricultural fields. The aim here was to pinpoint specific issues connected with orthoimage interpretation for this purpose.

Analysis Methods
Table 2 lists the methods used to analyze the survey results. Most analyses were performed in R [34].
The brightness and contrast levels chosen as best in the first questions were later used as the reference values in the checks for individual consistency of choices; they are referred to below as "centers".
For a given question, the estimated Shannon entropy (as a measure of dispersion) was calculated as follows:

$$\hat{H}(X) = -\sum_{i=1}^{g} \hat{p}(x_i) \log \hat{p}(x_i) \qquad (1)$$

where $\hat{H}(X)$ = the estimated Shannon entropy, $x_i$ = the event of choosing image i in the question, $\hat{p}(x_i)$ = the estimated probability mass from the histogram, g = the number of images used in the question (either two or three).
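As an illustration of the entropy measure defined here, the following Python sketch computes the plug-in Shannon entropy from a histogram of choices. The analyses in the study were performed in R; this is only a re-expression under assumptions. The function name and the example counts are hypothetical, and the logarithm base is a parameter because the text does not state which base was used.

```python
import math

def estimated_entropy(counts, base=2.0):
    """Plug-in Shannon entropy of a histogram of choices.

    counts: number of respondents who picked each of the g images.
    base:   logarithm base (an assumption; not stated in the text).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no responses")
    entropy = 0.0
    for count in counts:
        if count > 0:  # the term 0 * log(0) is taken as 0
            p = count / total
            entropy -= p * math.log(p, base)
    return entropy

# Hypothetical counts for a three-image question answered by 197 respondents.
print(estimated_entropy([120, 50, 27]))
```

Entropy is zero when all respondents pick the same image and reaches its maximum when choices are spread evenly over the g images, which is why lower values are read as stronger consensus.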

Table 2. Methods of analysis of the survey results (the survey can be found as ancillary data to the article). Columns: No., Aspect, Items, Analysis Method.

1. Brightness
   Items: Question about the preferred brightness, to be chosen from seven different levels (survey section II, item 1).
   Analysis method:
   - Count and plot (line chart)

2. Contrast
   Items: Question about the preferred contrast, to be chosen from seven different levels (survey section II, item 2).
   Analysis method:
   - Count and plot (line chart)

3. Brightness and contrast combined
   Items: Questions asking the participant to choose the best image or the best and worst images out of two or three samples, each with a different combination of brightness and contrast. Three duplicate questions were included to determine the effect of a white or black background (section III of the survey, items 1-11).
   Analysis method:
   - For choice of best brightness and contrast: count and plot (bar chart)
   - Estimated Shannon entropy [35–37] and its estimated standard deviation [38] (for image triplets, boxplot)
   - Plot (boxplot) of distance from the preference indicated earlier and the answer to the current question for each respondent

4. Format and color composite
   Items: Questions asking the participant to choose the best or the best and worst of two or three images of different format and color composite, all with default (as delivered by the MS) brightness and contrast (survey section IV, items 1, 4 and 6).
   Analysis method:
   - Normalized index (the ratio of the frequency of an image being chosen as best and as worst, normalized to the [0,1] interval)
   - Contingency table (the worst image properties as a function of best ones)

5. Standard enhancement
   Items: Questions asking the participant to choose the best or the best and worst of two or three images with different enhancements:
   - Three questions, each presenting three images with four different types of enhancements used throughout (survey section IV, items 2, 5 and 8)
   - Two questions, each presenting a pair of images, one with default settings (as delivered by the MS) and a second with 2% stretch applied. In one question, part of the border of land under inspection is obscured by a shadow (survey section IV, items 3 and 7)
   Analysis method:
   - Percent of count
   - Estimated Shannon entropy [35–37]

To analyze the precision of the estimated Shannon entropy values, the standard deviation was also calculated [38]. This enabled us to assess the spread of the estimated entropy over the number of possible responses to a question. In Equation (2), $\hat{\sigma}(\hat{H}(X))$ = the estimated standard deviation of the estimated Shannon entropy and N = the sample size (in our study 197).

Orthoimage format and composite appreciation (Table 2; row 4) was calculated for each given image sample as the ratio between the frequency of it being chosen as best and as worst, normalized to the [0,1] interval. Normalization made it possible to compare survey questions that use different numbers of sample images. A contingency table of the image format appreciations was developed, using the least appreciated image properties as a function of the most appreciated ones. The index was calculated using the following equations:

$$c_i = \frac{a_i}{b_i} \quad \text{and} \quad \tilde{c}_i = \frac{c_i}{\sum_{j=1}^{g} c_j} \qquad (3)$$

where $c_i$ = the partial ratio value, $\tilde{c}_i$ = the normalized ratio value, $a_i$ = the number of times the image was chosen as best in response to the question, $b_i$ = the number of times the image was chosen as worst in response to the question.

In the discussion below, the estimated Shannon entropy and its standard deviation are termed simply entropy and standard deviation of entropy, respectively.
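The normalized appreciation index described in this section (the ratio of best to worst choices per image, rescaled so that the values within a question sum to one) can be sketched in Python as follows. The function name and the example counts are hypothetical, and raising an error when an image was never chosen as worst is an assumption, since the text does not say how that case was handled.

```python
def normalized_index(best_counts, worst_counts):
    """Best/worst ratio per image, normalized to sum to 1 within a question.

    best_counts[i]:  times image i was chosen as best (a_i).
    worst_counts[i]: times image i was chosen as worst (b_i).
    """
    if len(best_counts) != len(worst_counts):
        raise ValueError("one best and one worst count is needed per image")
    ratios = []
    for a, b in zip(best_counts, worst_counts):
        if b == 0:
            # Assumption: an undefined ratio is reported rather than patched.
            raise ValueError("image never chosen as worst; ratio undefined")
        ratios.append(a / b)
    total = sum(ratios)
    if total == 0:
        raise ValueError("no image was chosen as best")
    return [c / total for c in ratios]

# Hypothetical counts for a three-image question.
print(normalized_index([110, 60, 27], [15, 52, 130]))
```

Because each value is divided by the sum over the g images of the question, indices from questions with two and with three sample images fall on the same [0,1] scale.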
Preferred brightness and contrast values were expected to be relatively consistent throughout the survey. Both are expressed as the distance from the center, or the preferred value for each respondent. First, the difference was calculated between the preferred brightness and contrast values (chosen in response to the first two questions of Section II of the survey) and the contrast or brightness values chosen in the further questions. Second, within a given question, the difference was determined between the sample image "closest" to the initially chosen brightness and contrast values and these values in the respondent's actual answer.
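The two differences described above can be sketched for a single respondent and a single question as follows. This is an illustrative reading of the procedure, not the authors' code: the function name, the sign convention (value minus center) and the tie-breaking rule (the first listed option wins) are assumptions.

```python
def consistency_offsets(center, options, chosen):
    """Offsets from a respondent's preferred value (the "center").

    center:  brightness or contrast value the respondent initially preferred.
    options: values of the sample images offered in the current question.
    chosen:  value of the image the respondent actually picked.
    Returns (closest_offset, chosen_offset); the choice is fully consistent
    when both offsets are equal.
    """
    if chosen not in options:
        raise ValueError("chosen value must be one of the offered options")
    # Assumption: on ties, the first listed option counts as the closest.
    closest = min(options, key=lambda value: abs(value - center))
    return closest - center, chosen - center

# Hypothetical respondent who preferred contrast +10 and later faced
# images with contrast -20, 0 and +40, picking +40.
print(consistency_offsets(10, [-20, 0, 40], 40))
```

Plotting the first offset against the second for all respondents would place fully consistent choices on the diagonal where the two offsets coincide.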

Results and Discussion
In total, 197 respondents completed the survey. Of these, 11 failed the Ishihara test, indicating color blindness. Thus, fewer than 6% of respondents were likely to have been color blind, a share that corresponds with averages for populations of Northern European origin, among whom the frequency of red-green color vision defects is around 8% for men and 0.5% for women [39]. All responses from respondents who could have been color blind were included in the further analysis.

Preferred Brightness and Contrast Ranges
Figure 3 presents preferences for brightness and contrast modification levels applied to false color composite orthoimages. The central zero values correspond to the default, unprocessed orthoimages as delivered by the MS. Note that, based on a preliminary comparison, the range of contrast modification levels was set to twice the range of brightness modification levels. More than half of the participants preferred the default brightness setting (zero adjustment). On average, there was a tendency to favor slightly reduced brightness settings (−4.2). There was also a slight preference for somewhat increased contrast levels (3.9).
The spread of the preferred brightness and contrast indicates clear variation between participants in their preferences. This suggests that LPIS operators should be enabled to personalize brightness and contrast settings for visual image interpretation. This finding should be taken into account in the LPISQA technical guidelines for delivering georeferenced map orthoimages.

Influence of Background Color on the Orthoimage Brightness and Contrast Appreciation
Figure 4 presents responses to three pairs of questions (A, B and C) in which each pair depicts the same sample images on, respectively, a white background and a black background. Note that Pair A compares brightness values only, as the contrast value was set at default (0) for all of the image samples presented. Pair B compares contrast values only, as here the brightness value was left at default. Against the white background, the orthoimage chosen as best was the one with the higher contrast (set at 20). Against the black background, the most appreciated orthoimage was the one with contrast set at 0. Pair C compares images with default values with those with modified brightness and contrast.
Figure 5 presents the entropy and its standard deviation (see Equation (2)) multiplied by the 97.5% quantile of the Student distribution with 196 degrees of freedom, across all respondents. Each question asks users to indicate the best and the worst orthoimage, first against the white background and then against the black background. The entropy was greater for the choice of best image than for the worst one, meaning that there was more consensus on the choice of the worst orthoimage. In contrast, the standard deviation of entropy was greater for the choice of the worst image, meaning that the entropy value is less precise. The background color (white or black) did not significantly influence the choice of best and worst orthoimages.
The entropy value was highest for Pair B, which means that a less consistent preference was expressed for a specific orthoimage. In this pair, the least appreciated orthoimage against both background colors was the one with contrast reduced to −20. However, the high entropy values indicate that respondents' choices were rather dispersed. The lower entropy values found against a white background indicate more consistency in the choices of best and worst orthoimages there. Entropy was highest for the questions presenting samples against a black background, and there was a more pronounced preference for a higher contrast against a white background. These results suggest that preferences for image brightness and contrast may vary depending on the background color used. Again, this suggests that some means of personalizing brightness and contrast should be made available to operators involved in LPISQA visual image interpretation.

Orthoimage Format and Color Composite Appreciation
For questions on orthoimage format and color composite, Table 3 presents the normalized index of the most appreciated orthoimages, calculated using Equation (3). The most appreciated orthoimage type was a false color composite in TIFF format. The least appreciated type was a natural color composite in ECW.

Table 3. Normalized index of most appreciated orthoimages from survey items testing image format and color composite preferences (FCC = false color composite, NCC = natural color composite, TIFF = tagged image file format, ECW = Enhanced Compression Wavelet).

When the orthoimage chosen as best was a natural color composite (in either ECW or TIFF format), the majority of respondents (72% and 85%) selected the false color composite as the worst orthoimage (Figure 6). This means that in the context of the survey, appreciation of color composite was more decisive in determining overall appreciation than the orthoimage format type. However, when the orthoimage chosen as best was a false color composite TIFF, the ECW was selected as the worst sample image. These results imply a strong recommendation for LPISQA administrators to acquire or order images in at least four bands (visible red, green and blue, plus near infrared) to allow production of false color composites. Beyond having four (or more) bands, operators should be given a means to change the bands displayed.

Orthoimages with Shadowed Areas
One of the biggest challenges in visual image interpretation and mapping is retrieving information from areas obscured by shadows. One of the survey questions included a parcel boundary that was partially shaded. After manipulation of one of the images (application of a 2% stretch), the shadowed border was no longer distinguishable (please consult Section IV, Question 7 of the online survey available as ancillary data).
Table 4 presents respondents' choice of the best orthoimage for two questions in which one of the orthoimages was not stretched and a 2% stretch was applied to the second. In both, the orthoimage chosen as best was the one with the 2% stretch, even if this implied that the parcel border was blurred by shadow. A possible explanation for this rather counterintuitive result is that respondents did not really consider the intended use of the orthoimage, namely, to indicate the land represented by the Reference Parcel. The LPISQA methodology already incorporates an angle of view restriction to minimize areas with occlusions. Although the effect of shadow is recognized, there are as yet no measures addressing shadow length and information loss.

Standard Image Enhancement Appreciation
Image enhancement appreciation was investigated by testing four standard enhancements, ranging from no stretch to square root stretch. Table 5 presents the results.
Table 6 presents the entropy for the best and worst orthoimage choices for the same questions as in Table 5. The most favored enhancement was the 2% stretch, followed by the linear stretch and then the unenhanced orthoimage. The least favored one was the square root stretch. The display enhancement of 2% stretch is commonly used in orthoimage processing and GIS software. It was probably familiar to the respondents and perhaps therefore most appreciated.

Consistency of Brightness and Contrast Preferences
Respondents' brightness and contrast preferences were expected to be relatively consistent throughout the survey. This was measured by examining how far from the "center" respondents' answers were; that is, how similar they were to the brightness or contrast values initially indicated as preferred. First, the difference was calculated between the preferred brightness and contrast values (chosen in response to the first two questions of Section II of the survey) and the image contrast or brightness chosen in the further questions. Second, within a given question, the difference was determined between the sample image closest to the initially chosen brightness and contrast and these values in the respondent's actual answer. This is depicted on the horizontal axis of Figure 7. These values thus show the difference between the closest possible answer in a given question and the preferred brightness or contrast. The size of the point marker in Figure 7 reflects the number of times a choice was made. The gray dashed line indicates the theoretically most consistent choice. Figure 7A shows that the preferred brightness value is well represented in answers to the further questions (in Section III of the survey). Moreover, this same preference at individual level was chosen again in later questions in the majority of cases. For brightness, respondents' second preference (when the favorite was not available) was slightly less bright. The most frequent answers are concentrated around the dashed gray line, confirming a high consistency in choices. Figure 7C shows that the least preferred brightness is quite dispersed, and the images selected as worst differ markedly from the preferred brightness. Furthermore, the least preferred brightness values are higher than the brightness most preferred. Figure 7B,D show similar plots for preferred contrast values. Throughout the survey, the preferred contrast is largely consistent with that chosen as preferred in the second question of Section II (indicated in Figure 7B as the largest value at the 0,0 intersection). We see on closer inspection of Figure 7B that sometimes, even when the preferred contrast value was represented in the question, a higher contrast value than the preferred one was likely to be chosen.
To summarize, the preferred brightness values were not always confirmed in the further questions. Nonetheless, there was substantial consistency in the brightness chosen as preferred. When the preferred brightness value was not represented, a lower brightness value was typically selected. Higher brightness values were less appreciated. There was much consensus among respondents on the preferred contrast values. Moreover, in the majority of cases, the preferred contrast was chosen again and again in the further questions. Otherwise, orthoimages with higher contrast values were preferred over those with lower contrast.
These results demonstrate that image preferences are relatively consistent for an individual respondent. However, they do vary within the group of respondents. This again suggests the merit of providing LPISQA operators a means to personalize settings and make photometric adjustments in images.

Conclusions and Recommendations
This study confirms that brightness and contrast are both important attributes of LPISQA orthoimages. Users generally prefer higher contrast values combined with lower brightness values. Where the background color for the orthoimage comparison is white, an even higher contrast value is preferred. Our findings, furthermore, reveal variety among users in their preferred brightness and contrast, although individual respondents exhibited a high degree of consistency in the choices they made. This suggests the usefulness of providing individual operators with a means to personalize orthoimage settings. Respondents to our survey were quite consistent in choosing a specific brightness and contrast value. Among respondents, there was also overall agreement on the least preferred orthoimages, though there was less consensus on the most appreciated ones.
The false color composite was revealed to be preferred over the natural one. This finding should encourage MS to order four-band orthoimages (with visible bands and near infrared) rather than limiting images to only the visible bands. The preferred orthoimage format was TIFF, which was favored over the lossy ECW.
Considering the standard stretches and orthoimage enhancements, the 2% stretch was preferred. This popular display enhancement is commonly applied in orthoimage processing and GIS software and therefore was probably familiar to the respondents. Where an orthoimage contained a shadowed area, loss of information (in the shadow) appeared less important to the respondents than the overall orthoimage appearance, as the image sample chosen as best was one with a higher contrast and lower brightness, though this rendered the land use delineation indiscernible. This finding suggests that there is still a role for expert input in designing orthoimage quality standards for shaded areas. Orthoimages with shadows require further investigation, as it is crucial to find an optimal balance between visual appreciation of an orthoimage and loss of information.
The results of our survey indicate that, for the purpose of agricultural land delineation, the best orthoimage format is a TIFF, false color composite, with the enhancement of a 2% stretch and providing operators a means to adjust brightness and contrast to their own individual needs. Further research should focus on evaluation of existing metrics for assessing the photometric image quality [25] for LPISQA objectives, or if needed, the design of new metrics.

Figure 1. The orthoimages selected for the online survey are from the zones outlined in yellow (indicated with a black pin).

Figure 4. Respondents' choice of the best and the worst images against the white and the black background (the brightness and contrast combinations used in the sample images are specified). Figures A, B and C each represent a pair of questions using the same sample images against a different background color (best on the left and worst on the right; white column = white background, black column = black background).

Figure 5. Estimated Shannon entropy (points) and its standard deviation times the 97.5% quantile of the Student distribution with 196 degrees of freedom (whiskers) for choices of the most and the least appreciated image triplets, paired with the white background (gray line) and black background (black line). The pairs of questions referred to as A, B and C are the same as those in Figure 4.

Figure 6. Influence of best image format choice on the subsequent worst one (FCC = false color composite, NCC = natural color composite, TIFF = tagged image file format, ECW = Enhanced Compression Wavelet).

Figure 7. Number of choices (indicated by the marker size) for most and least preferred brightness and contrast. The gray dashed line indicates the theoretically most consistent choice possibilities.

Table 1. Three groups of respondents completed our online survey.

Table 4. Percentage of respondents that chose the respective image samples (no stretch versus 2% stretch) as best for the purpose of delineating agricultural fields.

Table 5. Frequency of image samples being chosen as best (%). Each question (I, II and III) offered three orthoimages from which to choose.

Table 6. Estimated Shannon entropy of responses to the three questions referred to in Table 5.