In Field Detection of Downy Mildew Symptoms with Proximal Colour Imaging

This paper studies the potential of on-board colour imaging for the in-field detection of a textbook case disease: grapevine downy mildew. It introduces an algorithmic strategy for the detection of various forms of foliar symptoms on proximal high-resolution images. The proposed strategy is based on structure–colour representations and probabilistic models of grapevine tissues. It operates in three steps: (i) Formulating descriptors to extract the characteristic and discriminating properties of each class. They combine the Local Structure Tensors (LST) with colorimetric statistics calculated in each pixel's neighbourhood. (ii) Modelling the statistical distributions of these descriptors in each class. To account for the specific nature of LSTs, the descriptors are mapped into the Log-Euclidean space. In this space, the classes of interest can be modelled with mixtures of multivariate Gaussian distributions. (iii) Assigning each pixel to one of the classes according to its fit to their models. The decision method is based on a "seed growth segmentation" process. This step exploits statistical criteria derived from the probabilistic model. The resulting processing chain reliably detects downy mildew symptoms and estimates the area of the affected tissues. A leave-one-out cross-validation is conducted on a dataset of one hundred independent images of grapevines affected only by downy mildew and/or abiotic stresses. The proposed method achieves an extensive and accurate recovery of foliar symptoms, with, on average, an 83% pixel-wise precision and a 76% pixel-wise recall.


Introduction
Epidemiological surveillance is a crucial issue related to problems regarding food safety and security, public health and environmental protection. These problems especially concern viticulture, where the use of phytosanitary products can be intensive, staff exposure is frequent and the proximity between residents and vineyards is a cause for concern or even conflict.
Currently, crop protection practices in vineyards rely mostly on preventive chemical controls. Spraying strategies are usually scheduled according to climatic risks and to the health history of the vineyard [1]. Over the last decade, several alternative spraying strategies have been developed to improve input efficiency. These strategies consider local information regarding the vineyard health status [2]. However, the assessment of risks and health scores (such as the frequency and severity of the disease) still requires the mobilisation of experts scouting vineyards for symptoms. This task is inherently time-consuming and labour-intensive and can only provide a partial and sparse picture of the vineyard health status. A more in-depth and complete processing strategy is therefore needed to ensure greater accuracy in the detection of the disease.
The purpose of this work is to propose methodological tools and an image processing chain, both dedicated to the detection of symptoms due to downy mildew (Plasmopara viticola) on proximal colour images acquired directly in the vineyard. These images are obtained with an autonomous embedded device, able to operate in farming conditions with high throughput. This work shows the potential and benefits of "frugal" artificial vision for farming applications. The advantage of this strategy is that it is operable "on the go", i.e., in the same vein as a phytopathology expert who would browse the canopy seeking symptoms. In this case, it is a high-throughput sensor that acquires scenes of vegetation from farming equipment cruising at a conventional working rate. It is designed to identify small symptoms (<5 cm), which sometimes represent only 12-pixel radius patches on the images, spread in a complex pattern of organs and textures. There is no manual extraction of the affected tissues from the plant nor specific targeting to constitute the database. Downy mildew is an interesting case study because it affects the majority of the world's vineyards and represents a substantial financial and logistical cost and a major environmental impact. In addition, this pathology presents a wide variety of symptoms with very discrete forms for early occurrences, which is an ideal context to develop versatile algorithms able to adjust to different environments while being trained on moderate databases.
The proposed methodology relies on the parametric modelling of structure-colour features within leaves affected by downy mildew as well as healthy vine tissues. Based on these models, the detection of symptoms is achieved through a seed growth segmentation. The paper presents a new structure-colour representation called TC-LEST (Tensorial Colour Log-Euclidean Structure Tensor), which is an adaptation of the CELEST representation, previously presented in [20] for the pixel-wise classification of healthy vine organs. In addition, it introduces two statistical criteria ("Within" and "Between"), based on Mahalanobis distances between features and models. These criteria make it possible to determine the affiliation of samples to statistical models. Altogether, these contributions are gathered within a framework dedicated to the application of interest, i.e., the detection of downy mildew symptoms. This strategy is conceived as a relevant alternative to applications relying on deep learning. The purpose is to produce efficient models with minimal data, resulting in agronomically interpretable indicators.

Cultivation Environment
Images were acquired at Le Domaine de la Grande Ferrade, a public experimental facility of INRAE (the French National Institute of Agriculture, Food and Environmental Research) in the area of Bordeaux. Images were taken on two 0.3 ha plots planted with the red wine grape variety Merlot Noir. One of the plots is cultivated with integrated crop protection and the other according to organic standards. For both plots, phytosanitary inputs are reduced to 50% of the conventional prescribed dose. The plants were affected only by downy mildew and abiotic stresses. At the end of July 2018, the plots were extensively photographed weekly with examples of healthy vinestocks and examples of vinestocks with early and late symptoms corresponding to phenological stages between BBCH (Biologische Bundesanstalt, Bundessortenamt und CHemische Industrie) 75 (berries pea-sized, bunches hang) and 79 (majority of berries touching) [21].

Instrumentation
The imaging system is composed of a global shutter 5 Mpx industrial Basler Ace (acA2500-14gc GigE) RGB camera with a 55° horizontal field of view lens. To overcome the weather- and time-dependent variations of illumination in outdoor environments, the imaging system includes a high-power 58GN xenon flash (Neewer speedlite 750ii) used with a short exposure time (250-300 µs). All the components are powered by a 12 V battery. The device is equipped with an on-board industrial computer that simultaneously controls the shooting of the camera and the trigger of the flash, and stores the acquired image data. The computer is built around a low-consumption 4-core ARM chip, robust to vibrations and watertight. The device (Figure 1b) is embedded on a vineyard tractor at 70 cm above ground and at 50 cm from the target (Figure 1a). At this distance, each image (2592 × 2048 px) covers a 1.3 m² area, which approximately captures a vinestock and its full canopy at a resolution of 4 px·mm⁻¹. With this resolution, even some early and discrete symptoms are clearly visible. However, at this scale, they are lost within a complex tangle of tissues (see Figure 2). Acquisitions are adapted to the working speed in vineyards (3-8 km·h⁻¹), i.e., to ensure one image every metre.

Image Processing Pipeline
The image processing pipeline is designed to detect the presence of downy mildew symptoms among a wide range of organs and textures (leaves, fruits, stems, necrosis, etc.). The process is primarily based on the statistical modelling of the local structure-colour properties in each of the considered classes. The statistical models make it possible to compute, for each pixel, its likelihood under each of the considered classes. Then, the likelihoods are evaluated through statistical tests combined with spatial coherence criteria within a seed growth segmentation that determines which pixels could constitute symptoms. The following sub-parts describe the main steps of the proposed processing chain (Figure 3). The first step of the pipeline is the thresholding of images in the Hue Saturation Value (HSV) colour space. The purpose is to discard, on the sole basis of colour, irrelevant pixels that do not constitute plant tissues (such as the sky, poles, wires, etc.). To a lesser extent, it also eliminates overly bright or overly dark pixels. This step is achieved simply by calculating the histogram of hue values and then applying the Otsu threshold [22]. After this step, the core processes are applied to a limited number of relevant pixels. Images are then processed through several filters to extract features, estimate models and eventually determine, iteratively and for each pixel, the affiliation to symptoms of mildew. These processes are illustrated with a practical example in Figure 4. When applied to a single 5 Mpx image, the whole detection process requires a moderate computational cost with a unit execution time below 1 s. While the estimation of models from a hundred 5 Mpx images can require several hours with a standard high-frequency CPU, the application itself can be conducted in real-time once the offline modelling phase is achieved.
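The hue-based pre-filtering described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `hue_histogram` and `otsu_threshold`, the number of bins and the use of normalised RGB triplets are assumptions made for the example. Hue is treated as a linear quantity here, which ignores its circular nature.

```python
import colorsys


def hue_histogram(pixels, bins=180):
    """Histogram of hue values for an iterable of (r, g, b) triplets in [0, 1]."""
    histogram = [0] * bins
    for r, g, b in pixels:
        hue, _, _ = colorsys.rgb_to_hsv(r, g, b)  # hue is returned in [0, 1)
        histogram[min(int(hue * bins), bins - 1)] += 1
    return histogram


def otsu_threshold(histogram):
    """Return the bin index t maximising the between-class variance
    of the two classes {bins <= t} and {bins > t}."""
    total = sum(histogram)
    weighted_sum = sum(i * count for i, count in enumerate(histogram))
    best_t, best_variance = 0, -1.0
    w0, sum0 = 0, 0.0
    for t, count in enumerate(histogram):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, the variance is undefined
        mean0 = sum0 / w0
        mean1 = (weighted_sum - sum0) / w1
        variance = w0 * w1 * (mean0 - mean1) ** 2
        if variance > best_variance:
            best_variance, best_t = variance, t
    return best_t
```

Which side of the threshold corresponds to plant tissue depends on the scene and is not decided by this sketch.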

Pixel-wise parametric classification
The remainder of the section details the three following major steps: LE mapping, modelling and the seed growth algorithm.

Local Structure Tensor: A Tool to Extract and Represent Textural Information
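The LST is a reference tool [23] that extracts geometric information and orientation trends in local patterns within greyscale images. It is commonly defined as the local covariance of gradients [24,25]. The computation of an LST field is a two-step process, starting with the estimation of local gradients in the neighbourhood of every pixel of an image. Given a greyscale image $I$ of size $[M \times N]$, the gradient image $\nabla I$ is estimated as
$$\nabla I = [g_x,\ g_y]^t = \left[ I * \frac{\partial G(x, y)}{\partial x},\ \ I * \frac{\partial G(x, y)}{\partial y} \right]^t$$
where $t$ denotes the matrix transpose operator; $*$ denotes convolution; and $g_x$ and $g_y$ represent, respectively, the estimates of the horizontal and vertical derivatives of image $I$ obtained by applying the Gaussian derivative kernels $\partial G(x, y)/\partial x$ and $\partial G(x, y)/\partial y$. The LST field is then computed by smoothing the outer product $\nabla I\, \nabla I^t$ with a Gaussian filter $W_T$:
$$Y = W_T * \left( \nabla I\, \nabla I^t \right)$$
Thus, for every pixel $(i, j) \in [1, N] \times [1, M]$ there is a corresponding local structure tensor $Y(i, j)$ in the form of a $2 \times 2$ symmetric matrix:
$$Y(i, j) = \begin{bmatrix} Y_{xx}(i, j) & Y_{xy}(i, j) \\ Y_{xy}(i, j) & Y_{yy}(i, j) \end{bmatrix}, \quad Y_{xx} = W_T * g_x^2,\ \ Y_{xy} = W_T * g_x g_y,\ \ Y_{yy} = W_T * g_y^2$$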
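The two-step computation above (Gaussian derivative gradients, then Gaussian smoothing of the outer product) can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions: the scales `sigma_grad` and `sigma_tensor`, the kernel radius of three standard deviations and the zero-padded borders are choices made for the example, not values taken from the paper.

```python
import numpy as np


def gaussian_kernels(sigma):
    """1-D Gaussian kernel and its first derivative, truncated at 3 sigma."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    dg = -x / sigma ** 2 * g  # derivative of the Gaussian
    return g, dg


def separable_filter(image, kernel_x, kernel_y):
    """Convolve with kernel_x along x (axis 1), then kernel_y along y (axis 0)."""
    tmp = np.apply_along_axis(np.convolve, 1, image, kernel_x, mode="same")
    return np.apply_along_axis(np.convolve, 0, tmp, kernel_y, mode="same")


def structure_tensor(image, sigma_grad=1.0, sigma_tensor=2.0):
    """Return the three distinct components (Yxx, Yxy, Yyy) of the LST field."""
    g, dg = gaussian_kernels(sigma_grad)
    gx = separable_filter(image, dg, g)  # horizontal derivative estimate
    gy = separable_filter(image, g, dg)  # vertical derivative estimate
    w, _ = gaussian_kernels(sigma_tensor)  # smoothing window W_T
    yxx = separable_filter(gx * gx, w, w)
    yxy = separable_filter(gx * gy, w, w)
    yyy = separable_filter(gy * gy, w, w)
    return yxx, yxy, yyy
```

For an image that varies only along the horizontal axis, the `yxx` component dominates away from the borders while `yyy` stays close to zero, which is the orientation information the tensor is meant to capture.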

Log-Euclidean (LE) Mapping of LST's
LSTs being covariance matrices, they belong to the Riemannian manifold of Symmetric Positive-Definite (SPD) matrices. The use of standard tools of Euclidean geometry and Gaussian statistics on such variables is not straightforward and must take the properties of the Riemannian manifold into account [26]. The mapping of LSTs into the Log-Euclidean (LE) space, as proposed by [26], enables successful image classification in agricultural applications [19,20,27]. The mapping of a tensor $Y$ onto the LE space is achieved by computing its matrix logarithm. Let us consider the eigendecomposition of an LST $Y$ as
$$Y = R\, D\, R^t$$
where $D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$ is the diagonal matrix of eigenvalues with $\lambda_1 \geq \lambda_2$ and $R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ is the rotation matrix defined by its angle $\theta$. Then,
$$\log_m(Y) = R\, \log(D)\, R^t = R \begin{bmatrix} \log\lambda_1 & 0 \\ 0 & \log\lambda_2 \end{bmatrix} R^t$$
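As an illustration of the mapping above, the matrix logarithm of a 2 × 2 SPD tensor can be computed from its eigendecomposition. This is a minimal sketch: the function name `log_map` and the small floor `eps` applied to the eigenvalues (to avoid taking the logarithm of zero in flat image regions) are assumptions made for the example.

```python
import numpy as np


def log_map(tensor, eps=1e-12):
    """Matrix logarithm of a symmetric positive (semi-)definite matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(tensor)  # tensor = V diag(w) V^t
    eigenvalues = np.maximum(eigenvalues, eps)  # assumed guard against log(0)
    return eigenvectors @ np.diag(np.log(eigenvalues)) @ eigenvectors.T
```

The result is a symmetric matrix that is no longer constrained to be positive-definite, which is what allows Euclidean statistics to be applied to it.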

Rotation Invariance
Here, we propose to express a rotation invariant form of Arsigny's representation in the LE space. Indeed, orientation itself is not relevant information, as any given tissue, healthy or diseased, should be considered the same regardless of its orientation in the image. As rotated versions of a given tensor share the same diagonal matrix of eigenvalues and differ only by their rotation matrices, it is possible to ensure rotation invariance by retaining only the eigenvalues.
Then, a rotation invariant [20] form of the LE representation $\log_m(Y)$ can be easily expressed as the vector
$$\left[ \log\lambda_1,\ \log\lambda_2 \right]^t$$
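The invariance can be checked numerically: rotating a tensor changes its matrix entries but not its eigenvalues. The sketch below is illustrative only; the function name `rotation_invariant_descriptor` and the eigenvalue floor `eps` are assumptions made for the example.

```python
import numpy as np


def rotation_invariant_descriptor(tensor, eps=1e-12):
    """Return [log(lambda_1), log(lambda_2)] with lambda_1 >= lambda_2."""
    eigenvalues = np.linalg.eigvalsh(tensor)[::-1]  # sorted in descending order
    return np.log(np.maximum(eigenvalues, eps))


def rotate(tensor, theta):
    """Express the same tensor in a frame rotated by the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, -s], [s, c]])
    return rotation @ tensor @ rotation.T
```

Two tensors that differ only by their orientation therefore map to the same two-dimensional descriptor.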

Describing Grapevine Healthy and Symptomatic Tissues with LSTs
Some textural differences between healthy limbus and downy mildew foliar symptoms can be highlighted through features derived from LSTs, computed from the luminance of the greyscale image. Figure 5 illustrates the discriminative potential of LSTs for different facies of the disease. The figure presents the eigenvalues of the LST field mapped into the LE space. Eigenvalues are normalised into an 8-bit scale. Three examples are presented: (a) a large circular advanced oil spot, (b) a corolla of small early oil spots and (c) irregular symptoms on edges. In case (a), it is simple to discern the spot, whose medium eigenvalues ([∼100, ∼120]) differ greatly from those of smooth limbus, which presents low values both for λ1 and λ2. In case (b), it is also possible to distinguish some of the symptomatic spots. However, the eigenvalues, especially in the centre of the spots, can be easily mistaken for those of some leaf edges or veinlets. In case (c), the interpretation of eigenvalues is more complex due to the irregular shape of symptoms. The inner parts of symptoms present properties and eigenvalues similar to the previous cases. However, unlike cases (a) and (b), the texture in (c) is not isotropic; thus, in the parts adjacent to edges, the structures present dominant orientations and are indistinguishable from healthy leaf edges or limits between organs. Therefore, the structural information alone is not always sufficient to properly discriminate healthy grapevine tissues from pathological ones [19].

Joint Representation of Structure and Colour
Texture and colour are two naturally related properties. It is this relation that enables the human psychovisual system to construct images [28]. Therefore, several methods were developed to extract and represent texture-colour features. In particular, different colour extended LST or LST defined within colour spaces proved to be relevant for image processing applications [29][30][31][32].
Considering the properties of the existing colour LSTs, and adapting them to the proposed modelling and likelihood-based decisions, a novel structure-colour representation is introduced: TC-LEST (Tensorial Colour Log-Euclidean Structure Tensor). TC-LEST is a refinement of CELEST (Colour Extended Log-Euclidean Structure Tensor), a previous representation proposed in [20]. TC-LEST is obtained by mapping LSTs into the Log-Euclidean metric space and then concatenating local colorimetric information. The originality of TC-LEST is that the colour components are expressed as a tensor. The result is a low-dimensional vectorial representation describing jointly structure and colour that can be modelled and exploited through common statistical and Bayesian tools.
Tensorial Representation of Colour in the HSL Colour Space: TC-LEST Representation
An astute method to concatenate colorimetric data to structural information is inspired by the authors of [31]. They proposed to transform the RGB triplet into a Hue, Saturation, Luminance (HSL) triplet before representing it by an ellipse or, equivalently, by a tensor, i.e., an SPD matrix $Z$. The ellipse is constructed so that its orientation $\phi$, eccentricity and magnitude are given respectively by the H, S and L channels. As for the tensor form $Z$, the orientation $\phi$ and the eigenvalues $\eta_1$ and $\eta_2$ are deduced from the H, S and L channels. The tensorial representation of colours is then expressed as
$$Z = R_\phi \begin{bmatrix} \eta_1 & 0 \\ 0 & \eta_2 \end{bmatrix} R_\phi^t$$
where $R_\phi$ is the rotation matrix of angle $\phi$. In this form, the computation of dissimilarity measurements and likelihoods with colour components is more relevant and consistent with the statistical models of LSTs.
Subsequently, it is proposed to process colour with respect to the properties of SPD matrices, following three steps. (i) The matrix Z is smoothed by convolution with a Gaussian kernel, consistently with the work in [26]. (ii) The smoothed matrix is mapped into the LE space through its matrix logarithm. (iii) The result is expressed in vectorial form. Unlike for the structural component, it would be senseless to produce a rotation invariant colour matrix, as it would discard the information contained in ϕ, i.e., relative to Hue.
Eventually, the TC-LEST representation is obtained by concatenating structure and colour into a single 5-dimensional vector.
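A possible assembly of such a 5-dimensional vector is sketched below, combining the two log-eigenvalues of the structure tensor with three components describing the logarithm of a colour tensor of orientation ϕ and eigenvalues η1 and η2. Several elements are assumptions made for the example and are not taken from the paper: the function names, the ordering of the components, the √2 weighting of the off-diagonal term (a common convention for vectorising symmetric matrices) and the fact that ϕ, η1 and η2 are passed in directly, without smoothing, rather than derived from the H, S and L channels.

```python
import math


def log_spd_vector(angle, eig1, eig2):
    """Vector [L11, L22, sqrt(2) * L12] of L = log(R diag(eig1, eig2) R^t),
    where R is the rotation matrix of the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    log1, log2 = math.log(eig1), math.log(eig2)
    l11 = c * c * log1 + s * s * log2
    l22 = s * s * log1 + c * c * log2
    l12 = c * s * (log1 - log2)
    return [l11, l22, math.sqrt(2.0) * l12]


def tc_lest_vector(lambda1, lambda2, phi, eta1, eta2):
    """Concatenate rotation invariant structure and tensorial colour (hypothetical layout)."""
    structure = [math.log(lambda1), math.log(lambda2)]  # orientation discarded
    colour = log_spd_vector(phi, eta1, eta2)  # orientation (hue) preserved
    return structure + colour
```

With this layout, two pixels with the same texture but different hues share their first two components and differ in the last three.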

Modelling Structure-Colour Features
Several authors successfully modelled different classes of texture with Gaussian probability density functions that describe the distribution of LSTs within each class. These models have been considered both for the matrix form with Gaussian Riemannian models and for the vectorial form in the Log-Euclidean space and have been applied to the classification of remote-sensed natural textures [20,33,34]. In this article, it is proposed to evaluate such models in the LE space for structure-colour variables derived from proximal sensing images.
Within each of the considered classes, the distribution of colour extended LSTs can be described in the LE space by a multivariate Gaussian function. For a given class, the probability density function of a descriptor $Y$ is defined by the following equation,
$$p(Y \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2}\, |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (Y - \mu)^t\, \Sigma^{-1}\, (Y - \mu) \right)$$
where $\mu$ denotes the mean vector of size $[D \times 1]$, $\Sigma$ its $[D \times D]$ covariance matrix and $|\Sigma|$ the corresponding determinant. By nature, the classes of texture are inherently heterogeneous. Assuming that each class results from the grouping of sub-classes (e.g., leaves = upper limbus + under limbus), it seems relevant to consider Gaussian mixtures for stochastic modelling. The probability density function corresponding to a mixture of $K$ multivariate Gaussian distributions is defined by the following equation,
$$p(Y) = \sum_{k=1}^{K} \omega_k\, p(Y \mid \mu_k, \Sigma_k), \qquad \sum_{k=1}^{K} \omega_k = 1$$
where $\omega_k$ denotes the weight of the $k$-th Gaussian component of the mixture, and $\mu_k$ and $\Sigma_k$ denote, respectively, the barycentre and the covariance matrix of the $k$-th component.
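The two densities above can be evaluated with a few lines of NumPy. This is a minimal sketch for illustration: the function names are hypothetical, and the estimation of the mixture parameters (for instance by expectation-maximisation) is not covered.

```python
import numpy as np


def gaussian_pdf(y, mean, cov):
    """Multivariate Gaussian density evaluated at the descriptor y."""
    y, mean, cov = (np.asarray(a, dtype=float) for a in (y, mean, cov))
    dim = mean.shape[0]
    diff = y - mean
    quad = diff @ np.linalg.solve(cov, diff)  # (y - mu)^t Sigma^-1 (y - mu)
    norm = np.sqrt((2.0 * np.pi) ** dim * np.linalg.det(cov))
    return float(np.exp(-0.5 * quad) / norm)


def mixture_pdf(y, weights, means, covs):
    """Density of a Gaussian mixture; the weights are expected to sum to 1."""
    return sum(w * gaussian_pdf(y, m, c) for w, m, c in zip(weights, means, covs))
```

A class such as "leaves" could then be represented by one component per sub-class, each with its own barycentre and covariance matrix.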
As the structure and colour joint representations presented above lie in a high-dimensional (5) LE space, observing the adequacy of their distributions to multivariate Gaussian probability functions is not trivial. A solution is to rely on a remarkable property of the transformation of matrices from the Riemannian manifold to the LE space. Saïd et al. [33] enunciate the following equivalence: if the set of matrices is distributed according to a Gaussian Riemannian function, the distribution of the log-determinants is Gaussian as well and the vectorial representations in the LE space are distributed according to a multivariate Gaussian distribution. Figure 6 presents the distribution of log-determinants of LSTs for the classes "healthy leaves" (a), "healthy berries" (b), "foliar symptoms of downy mildew" (c) and "symptoms on berries" (d). Samples are collected from 100 images with a headcount varying between 1.5 × 10⁴ and 1.9 × 10⁶ depending on the relative abundance of classes within the database. The histograms of the log-determinants computed from the samples are shown in blue together with the Gaussian distributions of equivalent mean and standard deviation (μ, σ) represented in red. Gaussian mixture distributions are shown for the foliar symptoms (c) and the symptoms on berries (d) in teal. Apart from the class "symptoms on berries" (Figure 6d), all distributions can be approximated by Gaussian probability density functions with their respective empirical mean and standard deviation (μ, σ) as parameters. However, the distribution for the class "symptoms on berries" can be represented by a mixture of three Gaussian functions (Figure 6d). In addition, the other classes seem to be better represented with mixture models. Indeed, their histograms present some moderate asymmetries and an offset of the distribution's mode with respect to the empirical mean.
An example of the fitting improvement provided by mixture models (in red) compared to a single Gaussian model (in teal) is shown for the class "foliar symptoms of downy mildew" in Figure 6c.
TC-LEST representations are obtained by concatenating vectorial forms of LSTs in the LE space with colorimetric information. To demonstrate the adequacy of these new variables, it is then sufficient to observe their colorimetric components. The colour component of TC-LEST consists of a vectorial form resulting from the LE transform of matrices holding the same properties as covariance matrices. As for LSTs, the distribution of its log-determinants makes it possible to assess the adequacy to multivariate Gaussian probability functions. Figure 7 presents the distribution of the log-determinants of the colour component of TC-LEST for foliar symptoms and their adequacy to a Gaussian probability density function (a) and to a mixture model (b). In this case, the colour component seems to be globally adequate to a Gaussian model, yet with difficulties in representing the mode and the tails. The mixture model is then much more appropriate.
In conclusion, the structure-colour representation can altogether be modelled in each class of interest with Gaussian mixture probability density functions.

Seed Growth Segmentation
The use of this segmentation method is motivated by the analogy between "artificial vision" and the human psycho-visual perception system [35]. Indeed, although symptoms of downy mildew present some very distinctive properties at a pixel-wise scale, it is mainly a larger pattern, i.e., a "spot", which is recognised by an observer. In addition, some elements within symptoms are not easily differentiable from confounding factors. To address this problem, seed growth segmentation considers "spatial coherence". The purpose is to reconstruct symptoms as continuous connected components constituted of distinctive pixels, arranged coherently in a spatial pattern.
Seed growth [36] is a segmentation method intended to recover connected spaces from a pixel-based classification. This method hinges on two major steps. The first consists in detecting seeds, i.e., the most characteristic pixels within the objects to recover. The seeds are detected by applying restrictive criteria to the decision outcomes of the classification process. The purpose is to maximise the probability that the selected seeds are actually adequate to the model. The second step consists in aggregating new pixels to the seeds thanks to more permissive and relaxed criteria and under the condition of connectivity to the seeds. This second step is intended to propagate confident decisions to recover at best the targeted areas, i.e., foliar symptoms of downy mildew in this case.
In this case, the criteria used are related to the likelihoods between the local structure-colour properties of pixels and the models of the considered classes. These criteria are meant to evaluate Mahalanobis distances [37] between the features describing pixels and the barycentres of the models. These distances convey information very similar to likelihoods. However, in practice, it is much simpler to evaluate and compare distances than likelihoods. Two criteria of the sort are proposed and combined: a "within" criterion and a "between" criterion, which are described in the following.

Within Criteria: Retaining the Most Relevant Pixels of Downy Mildew Symptoms
The within criterion is meant to assess the relevance of a pixel to the class of foliar symptoms. It consists in comparing to a threshold the Mahalanobis distance $d_{mahal}(Y, \mu_{mildew})$ between a feature $Y$ describing a pixel and the barycentre $\mu_{mildew}$ of the model describing the class of downy mildew foliar symptoms.
Under the hypothesis that a pixel described by $Y$ belongs to the mildew class, and under the normality hypothesis, the squared Mahalanobis distance is
$$d^2_{mahal}(Y, \mu_{mildew}) = (Y - \mu_{mildew})^t\, \Sigma_{mildew}^{-1}\, (Y - \mu_{mildew})$$
$\mu_{mildew}$ and $\Sigma_{mildew}$ being, respectively, the barycentre and the covariance matrix of the mildew model [37].
Following this hypothesis,
$$d^2_{mahal}(Y, \mu_{mildew}) \sim \chi^2(N)$$
where $\chi^2(N)$ is a Chi-square distribution with $N$ degrees of freedom and $N$ is defined by the dimension of the considered descriptors. A distance threshold $\delta_\alpha = \chi^2_\alpha$ is then determined so that $p(\chi^2 < \chi^2_\alpha) = \alpha$, where $\alpha \in [0, 1]$ and $\chi^2_\alpha$ is the $\alpha$-quantile of the law $\chi^2(N)$. Discarding instances such that $d^2_{mahal}(Y, \mu_{mildew}) > \delta_\alpha$ is equivalent to retaining only a proportion $\alpha$ of the most significant instances, i.e., those closest to the barycentre of the mildew model:
$$d^2_{mahal}(Y, \mu_{mildew}) \leq \delta_\alpha$$
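The "within" test can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the Chi-square quantile is obtained by bisection on a series expansion of the cumulative distribution function to keep the example self-contained, and the threshold is applied to the squared distance.

```python
import math
import numpy as np


def chi2_cdf(x, dof):
    """Chi-square CDF via the series of the regularised lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    a, half_x = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))


def chi2_quantile(alpha, dof):
    """alpha-quantile of the chi-square law (0 < alpha < 1), found by bisection."""
    low, high = 0.0, 1.0
    while chi2_cdf(high, dof) < alpha:
        high *= 2.0
    for _ in range(100):
        mid = 0.5 * (low + high)
        if chi2_cdf(mid, dof) < alpha:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def mahalanobis_sq(y, mean, cov):
    """Squared Mahalanobis distance between a descriptor and a model barycentre."""
    diff = np.asarray(y, dtype=float) - np.asarray(mean, dtype=float)
    return float(diff @ np.linalg.solve(np.asarray(cov, dtype=float), diff))


def within_criterion(y, mean, cov, alpha):
    """True when y lies among the proportion alpha of instances closest to the barycentre."""
    return mahalanobis_sq(y, mean, cov) <= chi2_quantile(alpha, len(mean))
```

For 5-dimensional descriptors and α = 0.1, the threshold is close to 1.61, so only descriptors very near the barycentre pass the test.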

"Between" Criteria: Discarding Uncertain Pixels
It is possible in some cases that a healthy tissue, yet presenting anomalies, displays properties similar to downy mildew symptoms. In the same way, it is possible that some pixels located at the edges of symptoms resemble healthy pixels in terms of structure-colour properties. In these cases, likelihoods for both healthy and symptomatic classes could be high. Thus, the first criterion cannot prevent such errors.
The "between" criterion then consists in comparing distances between a descriptor and the barycentres of models describing respectively a healthy and a symptomatic class. It makes it possible to identify the pixels for which there is no significant difference in likelihoods between two classes. Considering that symptoms are rare, in such a case where there is a reasonable doubt between two classes, it is more relevant to discard such instances from the mildew class.
This criterion then consists in determining a minimum ratio $R_{min}$, so that if the observed ratio $R_{obs}$ between the squared Mahalanobis distances is lower than this threshold, the instance is discarded from the mildew class. An instance is therefore retained only if
$$R_{obs} = \frac{d^2_{mahal}(Y, \mu_{healthy})}{d^2_{mahal}(Y, \mu_{mildew})} \geq R_{min}$$
Only values such that $R_{min} > 1$ are considered so that the decision criterion always ensures that the maximum likelihood is obtained for the class mildew.
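The ratio test can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, each model is reduced to a single (barycentre, covariance) pair, and the ratio is taken as the squared distance to the healthy model over the squared distance to the mildew model, compared by multiplication to avoid a division by zero.

```python
import numpy as np


def mahalanobis_sq(y, mean, cov):
    """Squared Mahalanobis distance between a descriptor and a model barycentre."""
    diff = np.asarray(y, dtype=float) - np.asarray(mean, dtype=float)
    return float(diff @ np.linalg.solve(np.asarray(cov, dtype=float), diff))


def between_criterion(y, healthy_model, mildew_model, r_min):
    """True when y is at least r_min times closer (in squared distance) to the
    mildew model than to the healthy model. Each model is a (mean, cov) pair."""
    d_healthy = mahalanobis_sq(y, *healthy_model)
    d_mildew = mahalanobis_sq(y, *mildew_model)
    return d_healthy >= r_min * d_mildew
```

With `r_min = 1`, the test only requires the descriptor to be closer to the mildew model than to the healthy one; larger values discard descriptors lying in the ambiguous zone between the two barycentres.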

The Seed-Growth Process
To summarise the approach, for all pixels of a given image, the conditions (i) determined by Equation (16) are checked. Pixels that satisfy this equation are considered as seeds. Then, the conditions (ii) determined by Equation (17) are iteratively checked. At each iteration, the pixels satisfying the growth conditions are incorporated into the seeds for the next step. The process eventually stops when no more pixels satisfy the conditions.
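The iterative process summarised above can be sketched as a breadth-first region growing. This is a minimal sketch, not the authors' implementation: it assumes that the seed conditions and the relaxed growth conditions have already been evaluated for every pixel and stored in two boolean masks (hypothetical names `seed_mask` and `growth_mask`), and it uses an 8-connected neighbourhood.

```python
from collections import deque


def seed_growth(seed_mask, growth_mask):
    """Grow seeds over pixels that satisfy the relaxed conditions and are
    8-connected to an already accepted pixel. Masks are lists of rows."""
    rows, cols = len(seed_mask), len(seed_mask[0])
    region = [[bool(seed_mask[r][c]) for c in range(cols)] for r in range(rows)]
    frontier = deque((r, c) for r in range(rows) for c in range(cols) if region[r][c])
    while frontier:  # stops when no neighbouring pixel satisfies the conditions
        r, c = frontier.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols \
                        and not region[nr][nc] and growth_mask[nr][nc]:
                    region[nr][nc] = True
                    frontier.append((nr, nc))
    return region
```

Pixels that satisfy the growth conditions but are not connected to any seed are never reached, which is how spatial coherence filters out isolated confounding pixels.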

Data Set and Validation Protocol
Results of downy mildew detection are produced from a dataset containing 100 images acquired mid-June 2018 at the stage "pea-sized berries" (BBCH 75). The dataset contains either healthy plants or plants affected only by downy mildew or abiotic stresses expressed as necrosis or yellowing. The dataset is labelled into seven classes: "Limbus", "Leaf edges", "Berries", "Stems", "Foliar mildew", "Berries mildew" and "Anomalies" (i.e., symptoms of abiotic stress). For each image, approximately 5 × 10⁴ pixels were labelled by photo-interpretation and phytopathological expertise. In total, the database includes 4.7 × 10⁷ labelled pixels of which 5.5 × 10⁴ belong to the "Foliar mildew" class. Figure 8 presents the example of an image sparsely labelled into the seven considered classes. In this paper, only the results regarding foliar symptoms are taken into account. Symptoms on berries are too rare and not well enough modelled so far to be considered in the study. This dataset constitutes both the learning and validation set. Validation is conducted through a leave-one-out cross-validation, i.e., to produce a classification for one of the 100 images, the models are estimated from the 99 remaining images of the set.

Determination of the Seeds
The first step of the reconstruction process (seeds detection) is meant to initialise the detection of symptoms. In practice, it consists in determining, for the considered criteria, a pair of thresholds (δ_α, R_min) so that the resulting seeds are both reliable and extensive. The seeds are reliable if they are composed only of pixels within symptoms. They are extensive if there is at least one seed within each symptom to be retrieved. Figure 9 presents a set of precision-recall [38] curves describing the performances for the classification of downy mildew foliar symptoms depending on different combinations of thresholds (δ_α, R_min). Each of the five curves presents the evolution of performances with varying values of δ_α, α ∈ [0.01, 0.3], and for a fixed value of R_min ∈ {1.0, 2.0, 2.5, 3.0, 3.5}. The red dotted curve corresponding to R_min = 1 constitutes a baseline, i.e., no correction from the maximum likelihood.
The reliability of the seeds is then directly described by the precision. However, extensiveness is only indirectly related to the recall metric. Nonetheless, seeds are all the more likely to be extensive as the recall is high. The purpose is then to determine from the figure a couple of thresholds that satisfy both conditions of seeds, i.e., a trade-off allowing to maximise the precision while ensuring a recall sufficient to initiate each symptom with at least one seed.
Concerning the choice of the "between" criterion threshold, the value R_min = 2.5 stands out. Indeed, this value maximises the precision regardless of the value of α, while ensuring the highest recall for any given precision. On another note, the choice concerning the "within" criterion threshold is not trivial. The value for α also has to be determined so that the threshold δ_α maximises the precision while ensuring the extensiveness of seeds. Based on Figure 9, several values for α seem to satisfy the condition of reliability of the seeds. Values of α between 0.01 and 0.15 enable reaching precisions above 80%, but for very variable corresponding recalls, between 3% and 48%. However, pixel-based metrics such as recall cannot fully describe the distribution of seeds within symptoms. For this purpose, it is proposed to evaluate seeds with an additional object-oriented metric. Table 1 enumerates, for the 100 images of the dataset, the number of symptomatic spots which are initialised by at least one seed pixel depending on α values. Thanks to this metric, it can be experimentally determined that the pair of thresholds (R_min = 2.5, α = 0.1) offers the best compromise, with seeds that are reliable (precision = 88%) and nearly extensive (98% of symptoms initialised).
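The object-oriented metric described above (the share of symptomatic spots initialised by at least one seed pixel) can be sketched as follows. This is an illustrative sketch: the function name and the input format are assumptions, namely a label image in which each annotated spot carries a distinct positive integer and 0 denotes the background, together with a boolean seed mask of the same size.

```python
def seeded_spot_fraction(spot_labels, seed_mask):
    """Fraction of labelled spots containing at least one seed pixel."""
    spots, seeded = set(), set()
    for label_row, seed_row in zip(spot_labels, seed_mask):
        for label, is_seed in zip(label_row, seed_row):
            if label:  # 0 is the background
                spots.add(label)
                if is_seed:
                    seeded.add(label)
    return len(seeded) / len(spots) if spots else 0.0
```

Unlike the pixel-wise recall, this value is insensitive to the size of the spots: a large symptom reached by a single seed pixel counts as much as a small one that is fully covered.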

Seeds Growth
The determination of seeds only initialises the process of reconstruction of the symptoms. The following expansion around the seeds is a process consisting in aggregating additional pixels to the existing seeds. The purpose is to supplement each seed to form a connected space that represents at best the area covered by the real symptoms. The process is iterative; initially the connected space is composed only of a seed. At each iteration, pixels within the close neighbourhood of seeds that satisfy the resemblance criteria regarding the model for symptoms are added to the pre-existing seed. In this application, the "8-connectivity" neighbourhood is considered for the expansion of seeds. The process stops when there are no more pixels in the vicinity of the connected spaces that satisfy the criteria.
In this particular case, the "within" and "between" criteria used for the determination of seeds are also used for the expansion of seeds, but with relaxed thresholds. The criteria have to be permissive enough to completely reconstruct the symptoms without downgrading the precision initially obtained for seeds. The new thresholds are then determined experimentally. Figure 10 presents the set of recall-precision curves describing the reconstruction performances corresponding to different pairs of relaxed thresholds (R_min, α) and resulting from seeds obtained with (R_min = 2.5, α = 0.1). The "within" criterion is evaluated for relaxed values of α ∈ [0.1, 0.3]. This means that a greater proportion of the population constituting the model for symptoms is considered relevant, i.e., pixels lying farther from µ_mildew, up to the larger threshold δ_α, can be added to the connected spaces. The "between" criterion is evaluated for relaxed values R_min ∈ [1.0, 2.5]. Figure 10 shows that the reconstruction which is the closest to full completion is obtained for the value R_min = 1. In this case, the "between" criterion does not differ from maximum likelihood with an additional condition of connectivity. For values of R_min ≤ 1, the recalls are slightly better but the precisions drop drastically. Concerning the threshold δ_α, values of α greater than 0.2 are too permissive. A good compromise can be found for the threshold α = 0.2, which results in an 83% precision and a 76% recall. Figure 11 shows a representative result for the reconstruction of symptoms resulting from the thresholds (R_min = 1, α = 0.2) for the expansion and seeds determined with (R_min = 2.5, α = 0.1). The image exhibits 13 symptomatic spots that are all detected (circled in blue). Three errors (false positives framed in red) are produced around some necroses on stems. In terms of area detected for the estimation of sanitary risks, these errors are minor.
However, if the objective was to detect the first early symptoms in a plot, these errors could lead to a very different interpretation of the results in terms of epidemiology.  Figure 12 shows the reconstruction of symptoms with finer details. The larger symptoms, or the most severe ones, are generally better retrieved than the smaller and more discrete or faded symptoms. Moreover, if the general area covered by symptoms is rather well estimated and located, the morphology of the symptoms is not necessarily transcribed accurately. However, in the specific case of downy mildew, the shape of the symptoms is not a crucial parameter, as they can coalesce or grow according to very local shades, humidities or wounds. Furthermore, the reconstruction can result in multiple partial detections of a single symptom as a group of close but unconnected symptoms.
Indeed, for most symptoms, multiple seeds are detected. In some cases, the expansion phase leads to their coalescence (cf. Figure 12a,b). In other cases, the expansion does not fully recover the symptom but only improves the estimate of the area it covers (cf. Figure 12c,d). This phenomenon of multiple partial detections could complicate the epidemiological interpretation of the results when the count of individual symptoms is taken into account.
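The effect of partial detections on symptom counts can be illustrated by counting the 8-connected regions of a detection mask. The sketch below is only an illustration under assumptions: the function name `count_regions` and the toy masks are hypothetical and do not come from the dataset used in this work.

```python
from collections import deque


def count_regions(mask):
    """Count the 8-connected regions of truthy pixels in a 2D list."""
    if not mask:
        return 0
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # A new, not yet visited region starts at (r, c).
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (dy or dx) and 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Toy example: one symptom detected as two fragments, then as one region.
fragmented = [[1, 1, 0, 1, 1]]
coalesced = [[1, 1, 1, 1, 1]]
print(count_regions(fragmented), count_regions(coalesced))
```

In the toy example, the same symptom is counted twice when the expansion leaves a one-pixel gap and once when the fragments coalesce, while the detected area differs by a single pixel.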
When applied to the whole validation set of 100 images, the process appears rather efficient. Table 2 shows the variability of the performance within the dataset. For a global estimation within a plot, the results are satisfactory. With adequate thresholds, symptoms are retrieved with an 83% precision and a 76% recall. The standard deviation (σ) of these performances is quite modest at the considered scale. However, at the scale of a plant, for some images, the process can be incomplete or lacking in precision. Yet these extreme poor performances do not occur together, as shown by the minimum F_1 score of 75%, where $F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$ (1).
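For reference, the pixel-wise scores can be computed from the counts of true positives, false positives and false negatives as sketched below. The function names are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this sketch rather than one stated in this work.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def pixelwise_scores(tp, fp, fn):
    """Precision, recall and F1 from pixel counts.

    tp -- symptom pixels correctly detected
    fp -- pixels detected as symptoms but annotated as something else
    fn -- symptom pixels that were missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# F1 corresponding to the average precision and recall quoted in the text.
print(round(f1_score(0.83, 0.76), 2))
```

Applied to the average precision and recall quoted above (0.83 and 0.76), the formula gives about 0.79. This value is not taken from Table 2 and should not be confused with the minimum per-image F_1 score of 75%, because the F_1 of averaged scores generally differs from per-image F_1 values.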

Conclusions
The purpose of this work was to evaluate the potential of on-board high-resolution colour imaging for the monitoring of cryptogamic diseases affecting grapevine through a case study: downy mildew (Plasmopara viticola). To address the problem, a pixel-wise classification strategy has been validated. First, the relevance of local structure tensors (LSTs) for describing healthy or infected vine tissues was demonstrated. Then, it was proposed to enhance LSTs with colorimetric information. To do so, a novel structure-colour representation (TC-LEST) was developed. In addition, it has been shown that a logarithmic transformation conveniently yields compact vectorial descriptors. These descriptors were shown to be easily modelled within different classes of vine tissues by Gaussian mixture distributions. Finally, these models were used to define statistical criteria that were in turn integrated within a seed growth segmentation process. The proposed strategy was applied to a database of a hundred images of vinestocks. Results were evaluated with a leave-one-out cross-validation process in terms of pixel-wise precision and recall. With the best parameters, classification performance reached 83% precision and 76% recall.
These first promising results show that it is possible to discriminate, count and measure foliar symptoms of downy mildew. This contribution should lead to further validation in conditions closer to the agronomical requirements.

Acknowledgments: The authors would also like to thank the Experimental Unit "Vigne et Vin Bordeaux Grande Ferrade" (UE 1442) and the French National Institute of Vines and Wine (IFV), which made the vineyards and the farming equipment available and also provided data and expertise regarding the monitoring of agronomic parameters of the plots under study.

Conflicts of Interest:
The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript.