Decision Trees to Forecast Risks of Strawberry Powdery Mildew Caused by Podosphaera aphanis

Powdery mildew (Podosphaera aphanis) is a major disease in day-neutral strawberry. Up to 30% yield losses have been observed in Eastern Canada. Currently, management of powdery mildew is mostly based on fungicide applications without consideration of risk. The objective of this study is to use P. aphanis inoculum, host ontogenic resistance, and weather predictors to forecast the risk of strawberry powdery mildew using CART models (classification trees). The data used to build the trees were collected in 2006, 2007, and 2008 at one experimental farm and six commercial farms located in two main strawberry-production areas, while external validation data were collected at the same experimental farm in 2015, 2016, and 2018. Data on proportion of leaf area diseased (PLAD) were grouped into four severity classes (1: PLAD = 0; 2: PLAD > 0 and <5%; 3: PLAD > 5% and <15%; and 4: PLAD > 15%) for a total of 681 and 136 cases for training and external validation, respectively. From the initial 92 weather variables, 21 were selected following clustering. The tree with the best balance between the number of predictors and highest accuracy was built with airborne inoculum concentration and number of susceptible leaves on the day of sampling, and with mean relative humidity, mean daily number of hours at temperature between 18 and 30 °C, and mean daily number of hours at saturation vapor pressure between 10 and 25 mmHg during the previous 6 days. For training, internal validation, and external validation datasets, the sensitivity, specificity, and accuracy ranged from 0.70 to 0.90, 0.87 to 0.98, and 0.82 to 0.97, respectively. The classification rules to estimate strawberry powdery mildew risk can be easily implemented into disease decision support systems and used to treat only when necessary and thus avoid preventable yield losses and unnecessary treatments.


Introduction
In Canada, the province of Quebec is the most important strawberry producer, with 14,117 MT produced from 1921 ha of strawberry plantings, representing 57% of the Canadian production in 2018 [1]. In response to consumer demand for a longer period of availability and for high-quality locally produced strawberries, production has evolved. Until the end of the 1980s, short-day (June-bearing) varieties were grown almost exclusively. Strawberries were harvested during a 3- to 4-week period starting in mid- to late June. This production system was gradually replaced by new production systems such as day-neutral cultivars, winter row covers, and production in tunnels. The combination of these techniques allows for a better distribution of the strawberry harvest throughout the growing season. However, with these new techniques came new challenges, one of them being the emergence of strawberry powdery mildew (SPM) in the early 2000s [2,3].
The disease, caused by the ascomycete Podosphaera aphanis (Wallr.), can affect all aerial parts of the strawberry plant, including leaves, stolons, flowers, and berries [4]. First signs of powdery mildew are usually white patches of mycelium and conidia on the abaxial leaf surface [5]. As the disease progresses, patches can cover entire leaves and lead to index value, a fungicide and an interval between treatments are suggested [26]. With the exception of the DDSS developed by Hoffman and Gubler [26], the decision rules are not available and the DDSS were developed mostly for strawberry production in large tunnels. The DDSS developed by Hoffman and Gubler [26] was evaluated under the conditions of eastern Canada in both June-bearing (Jewel) and day-neutral (Seascape) strawberry production [15]. For the two years of evaluation, the DDSS did not allow for a reduction in the number of fungicide treatments [15].
Risk assessment generally relies on variable-centered statistical techniques such as logistic, linear, or multiple regression to model the relationship between a group of independent variables, often weather variables, and disease risk or severity [27,28]. Unless specified within the model, these methods generally do not allow for consideration of complex interactions among the independent variables or for different assemblages of variables that may lead to disease development. Pattern-centered statistical techniques provide a different approach to determine disease risk by identifying the subgroups within a number of disease observations that share similar characteristics, allowing for the identification of patterns of risk factors. Methods of segmenting disease risk observations into subgroups, such as tree-based models, can be used to derive prediction rules for determining disease risk classes. As opposed to linear and additive models, pattern-centered approaches are nonparametric and relatively easy to implement and interpret. Also, algorithms such as CART models may reveal interactions among variables generally ignored by other modeling approaches [29,30]. The objective of this work was to use weather predictors (variables) alone, or combined with inoculum and with host susceptibility predictors, to forecast the risk of SPM expressed as classes (no, low, moderate and severe risk) using decision tree classification procedures. These classes can be used as early warning, warnings, and action thresholds for strawberry mildew management actions [3].

Description of the Sampling Sites
The data used in this study were described in Carisse et al. [3]. Briefly, data were collected at the Agriculture and Agri-Food Canada experimental farm located in Frelighsburg, Quebec (latitude 45°03′12″ N; longitude 72°51′42″ W), and in commercial strawberry plantings located in Saint-Paul-d'Abbotsford (latitude 45°25′60″ N; longitude 72°52′60″ W) and in Île d'Orléans (latitude 46°55′06″ N; longitude 70°58′35″ W). For the purpose of the study, only data from non-sprayed plots were used. The data were collected at three, five, and four sites in 2006, 2007, and 2008, respectively, for a total of 12 epidemics. At the Frelighsburg site (experimental farm), plots were also established in 2015, 2016, and 2018. At all sites, data were collected in plots planted with the day-neutral strawberry cultivar Seascape, which was planted during the last 2 weeks of May in raised beds set 1.4 m apart, with two rows of strawberry plants per bed covered with black polyethylene mulch. On each bed, strawberry plants were spaced 30 cm apart. At the sites in Frelighsburg, Saint-Paul-d'Abbotsford, and Île d'Orléans, plots comprised 13 raised beds 15 m long, 15 raised beds 30 m long, and 13 raised beds 12 m long, respectively. At all sites, flower trusses were removed until mid-June.

Disease, Host, Inoculum, and Weather Monitoring
At each site, the same 25 plants, randomly selected at the first sampling date, were assessed for powdery mildew severity. Severity was assessed every 2 days from the first week of June to the first week of October as percent leaf area diseased (PLAD) on the three youngest fully expanded leaves. In 2015, 2016, and 2018 disease severity was assessed twice weekly. Severity was estimated using a diagrammatic scale with 5% steps (0%, 5%, 10%, 15% . . . 100%). Host susceptibility was assessed based on the number of susceptible leaves per 1 m of row. A leaf was considered susceptible if leaflets were not completely unfolded [6]. Airborne conidia concentration was assessed every 2 days using two rotating-arm impaction spore samplers placed in the central row, with one at 5, 10, and 4 m upward and the other one at 5, 10, and 4 m downward from the middle of the plots at the Frelighsburg, Saint-Paul-d'Abbotsford, and Île d'Orléans sites, respectively. In 2015, 2016, and 2018 airborne conidia concentration was assessed twice weekly. At all sites, the samplers ran for 20 min every hour (10 min on and 20 min off) from 8:00 a.m. to 8:00 p.m. The number of P. aphanis conidia on the sampling surface was counted under a microscope at ×250 magnification. Conidia of P. aphanis were identified based on their size (20-23 × 13-20 µm), their barrel shape when turgid, and the presence of granules inside the conidia [31]. The concentration of airborne conidia (ACC) was expressed as the number of conidia per cubic meter of air (mean over the two samplers). Weather data were monitored using automatic weather stations (CR-21X; Campbell Scientific Inc., Edmonton, AB, Canada) placed in an unobstructed area 3 m from the plot edge. Data were measured every 15 min and saved as hourly averages or totals (rain). Temperature and relative-humidity probes were positioned in a white shelter at 1.5 m above the ground. 
Rainfall was recorded using a tipping bucket rain gauge (Geneq, Montreal, QC, Canada) at 50 cm above the ground.

Description of the Response Variable and Classification Trees Predictors
The severity of strawberry powdery mildew, expressed as percent leaf area diseased (PLAD, mean over the 25 plants assessed, three leaves per plant), was transformed into severity classes to generate an ordinal response variable, considered as the dependent variable: Class 1: PLAD = 0; Class 2: PLAD > 0 and <5%; Class 3: PLAD > 5% and <15%; and Class 4: PLAD > 15%. The intervals used to create the severity classes were selected based on the reported relationship between leaf disease severity and yield losses [3,32] and on the distribution of cases within each class. The classification tree predictor related to airborne inoculum was airborne conidia concentration (ACC), which was log-transformed and expressed as a continuous variable (log10(ACC + 1)). The predictor related to strawberry leaf receptivity was the number of susceptible leaves per 1 m of row, expressed as a continuous variable (LVS). Weather variables were summarized over a 1- or 6-day period. Air temperature (°C) and relative humidity (%) were expressed as averages, minima, and maxima. Rainfall was expressed as accumulated values (millimeters) or as number of hours with rain (>2 mm). For the 6-day period, temperature and humidity were expressed as averages, duration (hours), or sum within pre-set thresholds [5,12,13,22]. The description of the 94 predictors is provided in Table 1. A total of 681 and 136 disease cases, each one corresponding to a unique combination of year, site, and sampling day, were generated from the first and last three years of the study, respectively. Because classification trees are a supervised learning method, a bootstrap validation (internal validation) method [33] was used to validate the models (trees), and the 136 independent cases collected in 2015, 2016, and 2018 were used to evaluate their prediction accuracy (external validation).
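The response variable and two of the predictor types described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and two points are assumptions made here for illustration: the published class limits leave PLAD values of exactly 5% and 15% unassigned, so the sketch places them in the higher class, and the temperature bounds are treated as inclusive.

```python
import math


def severity_class(plad):
    """Map mean percent leaf area diseased (PLAD) to SPM severity class 1-4."""
    if plad < 0 or plad > 100:
        raise ValueError("PLAD must be between 0 and 100")
    if plad == 0:
        return 1
    if plad < 5:
        return 2
    if plad < 15:
        return 3  # assumption: PLAD of exactly 5% is placed in class 3
    return 4      # assumption: PLAD of exactly 15% is placed in class 4


def log_acc(acc):
    """Log-transform airborne conidia concentration: log10(ACC + 1)."""
    return math.log10(acc + 1)


def mean_daily_hours_in_range(hourly_values, low, high, days=6):
    """Mean daily number of hours with low <= value <= high over the last `days` days.

    hourly_values holds one reading per hour, most recent last.
    Inclusive bounds are an assumption of this sketch.
    """
    needed = 24 * days
    if len(hourly_values) < needed:
        raise ValueError("not enough hourly readings for the requested window")
    window = hourly_values[-needed:]
    hours = sum(1 for value in window if low <= value <= high)
    return hours / days
```

For example, `mean_daily_hours_in_range(temps, 18, 30)` would give a value comparable in form to the predictor for hours at temperature between 18 and 30 °C during the previous 6 days, where `temps` is a list of at least 144 hourly temperatures.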

Development of Decision Trees
We used the decision tree classification technique, a supervised learning method described in detail in several textbooks [34,35]. Briefly, the decision tree is the result of an algorithm that produces a model consisting of a set of classification rules that are represented as a tree [36]. The tree is built following a top-down approach and consists of a root node, which is split into branches. In a tree, each node represents a test on a predictor, and each branch descending from the node corresponds to one of the possible values that the predictor can take. The decision tree was developed in several steps: reducing redundancy, defining classification rules, constructing the decision trees, implementing the decision trees, evaluating the classification results for both training and validation data sets, and determining the reliability of the trees with independent data (2015, 2016, and 2018 data).
First, to facilitate segmentation while developing the decision trees, redundancy among the 94 weather-based predictors (independent variables) was reduced using clustering [37]. Clustering was used to identify the groups (clusters) of highly correlated predictors, with the smallest correlation between groups. Clustering was conducted using the VARCLUS procedure in SAS with an eigenvalue threshold of 0.7 (SAS PROC VARCLUS) [38]. Clustering results were used to select groups of weather-based predictors within a cluster using two approaches. First, within each cluster, Spearman's rank-based correlation coefficients were computed between weather-based predictors and used to identify highly correlated predictors (r > 0.95). Second, a discriminant analysis was performed to identify the predictors that influenced the categories of strawberry powdery mildew severity [38]. All predictors were given an equal weight.
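The intra-cluster correlation step can be sketched in plain Python as follows. This is an illustration only, not the SAS procedure used in the study. The function names are hypothetical, and the rule of keeping the first predictor encountered in each highly correlated group is an assumption, since the text does not state which member of a correlated pair was retained.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no rank information
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_redundant(cluster, threshold=0.95):
    """Keep one predictor per group with Spearman r > threshold.

    cluster maps predictor name -> list of values (same cases, same order).
    Assumption: the first predictor encountered in a group is the one kept.
    """
    kept = []
    for name, values in cluster.items():
        if all(spearman(values, cluster[other]) <= threshold for other in kept):
            kept.append(name)
    return kept


# Illustrative values only; "6TMAX" is a hypothetical predictor name.
example_cluster = {
    "6T": [1, 2, 3, 4, 5],
    "6TMAX": [2, 4, 6, 8, 11],
    "6RH": [5, 1, 4, 2, 3],
}
selected = drop_redundant(example_cluster)
```

In this toy example the second predictor is a monotone function of the first, so only the first and third are kept.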
In this study, the CART algorithm was used to build the decision trees [39]. CART uses binary recursive partitioning to systematically identify the predictor, among all predictors, that best splits the dataset into low- and high-risk groups with respect to the classes of severity of SPM [34,39]. During the process, all possible separations are evaluated for each variable to determine which splits are the best at predicting the classes of severity of SPM. Using this procedure, a decision tree was built using the tree function in DTREG (version 10.9.1). The tree starts with a parent node, which is further split into child nodes based on the next best variable and split criteria. Splitting of the child nodes was continued until within-node deviance was ≤0.01 of the root node [34]. The minimal number of cases in a terminal node was set to 6 [40], and the GINI criterion was used to determine the best split at each node. A 10-fold cross-validation procedure was used to determine the optimal tree by randomly splitting the learning dataset into 10 subsets of data and then repeating the tree-building process ten times [41]. The trees were then pruned to fewer nodes by removing the least-important splits based on the misclassification cost. Because of the differential availability of host, inoculum, and weather data, four decision trees were built using all predictors, weather and inoculum predictors (excluding LVS), weather and host susceptibility predictors (excluding log10(ACC + 1)), and only weather predictors (excluding both log10(ACC + 1) and LVS).
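To make the split search concrete, the sketch below evaluates every candidate threshold of a single continuous predictor with the Gini criterion and a minimum of six cases per child node. It is a simplified illustration of one step of binary recursive partitioning, not the DTREG implementation; the function names and the use of midpoints between adjacent values as thresholds are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels, min_leaf=6):
    """Find the threshold t minimizing the weighted Gini impurity of the
    two child nodes (value <= t versus value > t).

    Returns (threshold, weighted_gini), or None if no split leaves at
    least min_leaf cases in each child node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot separate identical predictor values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint (assumption)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best
```

A full tree would apply this search to every predictor at each node, keep the predictor and threshold with the lowest weighted impurity, and recurse on the two child nodes until the stopping rules are met.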
The performance of the decision trees was assessed based on the reliability of the trees in assessing classes of SPM severity. Within each SPM class, all sampling dates (disease cases) were divided in two groups, with cases and controls defined as the sampling dates associated with SPM severity within the class or not, respectively. Each tree was evaluated for its ability to classify the severity of SPM by recording the number of cases and controls that were correctly classified [27]. For each SPM class and for each tree, true positives (TP) and true negatives (TN) were calculated as the number of cases and controls, respectively, correctly classified by the decision tree, while false negatives (FN) and false positives (FP) were calculated as the number of cases and controls, respectively, incorrectly classified by the decision tree. The true positive proportion (TPP, sensitivity) was calculated by dividing the number of TP classifications by the total number of cases. The true negative proportion (TNP, specificity) was calculated by dividing the number of TN by the total number of controls. The false positive proportion (FPP) was calculated as 1 - TNP, and the false negative proportion (FNP) was calculated as 1 - TPP. The overall accuracy was calculated as the proportion of correct classifications, (TP + TN) divided by the total number of cases and controls [27]. This procedure was conducted using the training, internal validation, and external validation data sets.
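The per-class evaluation described above amounts to a one-versus-rest confusion matrix for each SPM class. A minimal sketch, with a hypothetical function name and made-up example data, could look like this:

```python
def class_metrics(observed, predicted, target):
    """One-versus-rest sensitivity, specificity, and accuracy for one class.

    Cases are sampling dates observed in the target class; controls are
    all other sampling dates.
    """
    tp = fn = fp = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs == target:
            if pred == target:
                tp += 1  # case correctly classified
            else:
                fn += 1  # case incorrectly classified
        else:
            if pred == target:
                fp += 1  # control incorrectly classified
            else:
                tn += 1  # control correctly classified
    cases, controls = tp + fn, fp + tn
    total = cases + controls
    return {
        "sensitivity": tp / cases if cases else float("nan"),
        "specificity": tn / controls if controls else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Made-up observed and predicted classes, for illustration only.
observed = [1, 1, 2, 2, 3, 4, 4, 4]
predicted = [1, 2, 2, 2, 3, 4, 4, 3]
metrics_class_4 = class_metrics(observed, predicted, target=4)
```

The false positive and false negative proportions follow directly as 1 minus specificity and 1 minus sensitivity.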

Results
For the data collected in 2006, 2007, and 2008 used as training and internal validation data, 170, 289, and 222 cases were analyzed, respectively, for a total of 681 cases. Among all cases, 163, 144, 177, and 197 cases fell within the SPM severity classes 1, 2, 3, and 4, respectively. For the data collected in 2015, 2016, and 2018 used as external validation data, 44, 46, and 46 cases were analyzed, for a total of 136 cases from which 29, 26, 31, and 50 cases fell in SPM severity classes 1, 2, 3, and 4, respectively. The distribution of cases for each year, in each class, is presented in Figure 1. From the initial set of 92 weather-based predictors (Table 1), 21 were selected following clustering, intra-cluster correlation, and discriminant analysis (Table 1) for building the trees. The tree with the highest accuracy was developed with the following five predictors: airborne inoculum concentration (log10(ACC + 1)), number of susceptible leaves (LVS), mean relative humidity during the previous 6 days, mean daily number of hours at temperature between 18 and 30 °C during the previous 6 days, and mean daily number of hours at saturation vapor pressure between 10 and 25 mmHg during the previous 6 days (h). The distributions of these predictors for each SPM class are presented in Figures 2 and 3. An increase in predictor values with increasing classes of SPM severity was observed for inoculum and number of susceptible leaves (Figure 2A,B). Average log10(ACC + 1) was 0.09, 0.71, 1.48, and 2.38 log10(conidia per m³ + 1) for classes 1, 2, 3, and 4, respectively (Figure 2A), while the number of susceptible leaves per meter of row was 0.80, 3.40, 12.39, and 17.94 for classes 1, 2, 3, and 4, respectively (Figure 2B).
For the selected weather-based predictors, an increase in the number of hours at temperature between 18 and 30 °C during the preceding 6 days, with increasing class of SPM severity, was observed with 9.25, 11.27, 12.16, and 15.40 h for classes 1, 2, 3, and 4, respectively (Figure 3A). However, such a pattern was not observed for mean relative humidity during the previous 6 days, with 75.06, 77.14, 78.88, and 78.44%, and for number of hours at saturation vapor pressure between 10 and 25 mmHg during the previous 6 days, with 20.24, 20.69, 21.48, and 21.51 h, for SPM classes 1, 2, 3, and 4, respectively (Figure 3B,C). For the training, internal validation, and external validation data, the sensitivity ranged from 0.68 to 0.90, the specificity ranged from 0.87 to 0.99, and the accuracy ranged from 0.82 to 0.97 (Table 2). The tree built with these predictors is represented in Figure 4.
When the classification tree was built with inoculum- and weather-based predictors (excluding LVS), five predictors were retained (log10(ACC + 1), 6T1825, 6RHMAX, 6VP1025, 6RAINH). For the training, internal validation, and external validation data, the sensitivity ranged from 0.56 to 0.88, the specificity ranged from 0.84 to 0.97, and the accuracy ranged from 0.81 to 0.95 (Table 2). The tree built with these predictors is represented in Figure 5. When the classification tree was built with host- and weather-based predictors (excluding log10(ACC + 1)), six predictors were retained (LVS, 6T13, 6RH, 6RAINH, 6T1525, and 6RHMAX). The tree built with these predictors is represented in Figure 6. For the training, internal validation, and external validation data, the sensitivity ranged from 0.59 to 0.89, the specificity ranged from 0.82 to 0.98, and the accuracy ranged from 0.79 to 0.93 (Table 2). When only weather-based predictors were used (excluding both log10(ACC + 1) and LVS) to build the classification tree, 15 predictors were used (6T13, 6RH, 6T1525, 6RAINH, 6T1825, 6RHMAX, 6T, 6VP5, 6T1830, NTMIN, 6VP1025, NRH, RAINH, T, and NT1525). For the training, internal validation, and external validation data, the sensitivity ranged from 0.42 to 0.76, the specificity ranged from 0.81 to 0.93, and the accuracy ranged from 0.81 to 0.89 (Table 2).

Figure 5. Representation of the classification tree built with airborne inoculum concentration (log10(ACC + 1)), mean number of hours at temperature between 18 and 25 °C during the previous six days (6T1825), mean maximum % relative humidity during the previous six days (6RHMAX), mean number of hours at saturation vapor pressure between 10 and 25 mmHg during the previous six days (6VP1025), and mean number of rainy hours during the previous six days (6RAINH).

Figure 6. Representation of the classification tree built with number of susceptible leaves (LVS), mean number of hours at temperature above 13 °C during the previous six days (6T13), mean relative humidity during the previous six days (6RH), mean number of rainy hours during the previous six days (6RAINH), mean number of hours at temperature between 15 and 25 °C during the previous six days (6T1525), and mean maximum relative humidity during the previous six days (6RHMAX).

Discussion
Regardless of the type of disease management, whether it is based on a pre-established schedule, reasoned, integrated or organic, risk estimation is an essential component. In order to ensure the economic, environmental, and social sustainability of agricultural production, it is essential to rationalize the use of pesticides, including fungicides, whether synthetic or not. Strategically, it is expensive to treat when the risk is low or not to treat when it is high [27]. In other words, a risk-estimation tool must optimize the true positives (sensitivity) and the true negatives (specificity). There are several ways to determine a risk, but in most cases the risk is estimated as being below or above a threshold. Thresholds are established from the relationship between disease intensity and economic damage or yield losses [42,43]. Subsequently, it is possible to determine an action threshold corresponding to the moment when, if a treatment is not applied, the cost in yield losses will be higher than the cost of the treatment. In practice, since it is not generally possible to treat instantly when the action threshold is reached, for example because the treatment conditions are not favorable (rain, wind, availability of workers), a warning threshold is used to choose the best time to act.
In the case of strawberry powdery mildew, the relationship between severity of powdery mildew on leaves and losses of fruit yield has been established [3,10]. More recently, Fall and Carisse [32] reported a linear relationship between severity (PLAD) on leaves and yield losses on fruits (%) and determined that yield losses of 1% and 5% corresponded to severities on leaves of 5% and 15%. In other words, the PLAD of 5% and 15% can be used as warning and action thresholds, respectively. It is from these observations that the severity classes were established in the present study as Class 1: PLAD = 0; Class 2: PLAD > 0 and <5%; Class 3: >5% and <15%; and Class 4: PLAD > 15%.
Over the past decades, many disease risk estimation tools have been developed. These tools vary in complexity from simple decision rules to expert systems including forecasting and dynamic simulation models [28,32,44]. Since many factors related to the pathogen, host, or environment influence the epidemic dynamics, some risk-estimation tools are based only on weather conditions, and others on a combination of variables related to weather conditions, susceptibility of the plant (varietal and ontogenic resistance), and population of the pathogenic agent (size of the population and virulence). The development of strawberry powdery mildew is no exception. Van der Heyden et al. [21] reported a linear relationship between P. aphanis airborne conidial concentration and proportion of leaf area diseased (PLAD). Our observations are in accordance with those reported by Van der Heyden et al. [21], with 0.09, 0.71, 1.48, and 2.39 log10(conidia + 1)/m³ for SPM severity classes 1, 2, 3, and 4, respectively (Figure 2A). Leaf disease severity is also influenced by leaf age [6,16]. Very young leaves that are still folded (angle of less than 45° between leaflets) are very susceptible, whereas partially unfolded leaves (angle of more than 45° between leaflets) are moderately susceptible, and completely unfolded, pale-green leaves are practically resistant [6]. In our study we expressed leaf susceptibility in terms of number of susceptible leaves per meter of row; a leaf was considered susceptible if leaflets were not completely unfolded [6]. Considering that the window of susceptibility is narrow, it was expected that leaf susceptibility would be a good indicator of disease risk. In fact, we observed that the number of susceptible leaves per meter of row increased with increasing disease severity class, with an average of 0.80, 3.40, 12.39, and 17.94 for severity classes 1, 2, 3, and 4, respectively (Figure 2B).
For the weather conditions, there were no clear relationships between individual weather variables and disease severity (Figure 3). Nevertheless, weather data averaged over the preceding six days generally had a higher correlation with severity classes than daily values (Table 2, Figure 3).
In practice, although these three groups of variables influence the development of strawberry powdery mildew, several combinations of these variables can cause the same disease severity. For example, conditions characterized by low inoculum and few susceptible leaves but highly favorable weather conditions can cause the same severity as conditions characterized by high inoculum, few susceptible leaves, and less-favorable weather conditions. The interaction between these variables is therefore very important because, in theory, regardless of whether the weather conditions are favorable or not, if the inoculum is absent or very few susceptible leaves are present, the severity of the disease will be zero or very low. Classification trees make it possible to identify all combinations of variables that cause the same severity, expressed as classes [29,30,45]. In this study, the best classification tree, based on accuracy, was built using airborne inoculum concentration, number of susceptible leaves, mean relative humidity (%), mean number of hours at temperature between 18 and 30 °C, and mean number of hours with saturation vapor pressure between 10 and 25 mmHg, with all weather variables being averages over the previous six days (Table 2, Figure 4). The accuracy for this tree ranged from 0.82 to 0.97 (Table 2, Figure 4). The second best tree was built using airborne inoculum concentration, mean number of hours at temperature between 18 and 25 °C, mean maximum relative humidity (%), mean number of hours with saturation vapor pressure between 10 and 25 mmHg, and mean number of rainy hours, with all weather variables being averages over the previous six days (Table 2, Figure 5). The accuracy for this tree ranged from 0.81 to 0.95 (Table 2, Figure 5). The tree built with weather and host susceptibility predictors had a similar overall accuracy, but lower sensitivity, mainly for SPM classes 2 and 3 (Table 2, Figure 6).
Also, for the tree built with only weather predictors, 15 predictors (variables) were needed to classify SPM severities with accuracy ranging from 0.70 to 0.89 (Table 2). Nevertheless, the tree is complex (many branches and splits) which may restrict its implementation and adoption by crop advisors and growers (tree not shown).
Regardless of the complexity of disease management decisions, the key element is the estimation of risk. Informed and rational disease management decisions cannot be made without some knowledge about the risk, which can be defined as the probability that a disease reaches a critical level, usually expressed as potential yield losses. Prediction of risk allows growers to respond in a timely and efficient manner by adjusting their crop management actions. A prediction of high disease risk may result in reduced yield losses, whereas a prediction of low disease risk may result in reduced pesticide applications with positive economic and environmental effects. There are several types of disease prediction models with a range of complexity from rule-based to complex simulation models [23-26,32]. Nevertheless, disease risk prediction models are based on the interactions of all or some of the factors (predictors) that govern epidemic development: the host, the pathogen, and the environment, developed from controlled or field experiments. Strawberry powdery mildew is not an exception, and there are several different types of disease risk prediction models, each one with its advantages and limitations [28]. In this study we explored a relatively new approach for plant disease risk estimation: classification trees. Because we used only field data, the trees were developed and validated using a large number of observations representing different combinations of SPM classes and inoculum-, host susceptibility-, and weather-based predictors (681 and 136 for a total of 817).
Despite the need to monitor airborne inoculum, the tree built with inoculum, host, and weather predictors was the most reliable and the simplest one. Basically, it estimates the risk of SPM first from airborne inoculum and the number of susceptible leaves, and then from humidity and the duration of favorable temperatures and of saturation vapor pressure during the previous 6 days. In other words, the tree is coherent with the epidemiology of strawberry powdery mildew [3,6,10,21–23,26]. Among the four trees, it is the most intuitive and the easiest to interpret (Figure 4). It uses only three weather-based predictors that are readily available from weather stations, simple calculations, or forecasts (Table 2, Figure 4). The second best tree, which was developed without information on the number of susceptible leaves, also required data on airborne inoculum, along with four readily available weather-based predictors (Table 2, Figure 5). The choice of classification tree depends on the objectives of SPM management: some may seek a very high level of control regardless of the cost (high sensitivity), whereas others may seek to reduce management actions as much as possible (high specificity). In general, we are looking for a balance between acting when needed and not acting when not needed (high accuracy).
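To make the distinction between sensitivity, specificity, and accuracy concrete, the following minimal Python sketch computes the three measures for one severity class from a confusion matrix, treating that class against all others (one-vs-rest). This is one common convention and is assumed here for illustration; the function name and the matrix layout (rows are observed classes, columns are predicted classes) are likewise assumptions.

```python
def one_vs_rest_metrics(confusion, k):
    """Sensitivity, specificity, and accuracy for class index k.

    confusion[i][j] = number of cases observed in class i and predicted
    as class j (assumed layout). Class k is compared against all other
    classes pooled together.
    """
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]                          # class k correctly predicted
    fn = sum(confusion[k]) - tp                   # class k cases that were missed
    fp = sum(row[k] for row in confusion) - tp    # other classes predicted as k
    tn = total - tp - fn - fp                     # other classes not predicted as k

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("class k or its complement has no observed cases")

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }
```

With a made-up three-class matrix such as `[[5, 1, 0], [1, 4, 1], [0, 2, 6]]`, calling `one_vs_rest_metrics(matrix, 0)` gives a sensitivity of 5/6, a specificity of 13/14, and an accuracy of 0.9. A manager who prioritizes control would favor the tree with the highest sensitivity for the most severe classes, whereas one who prioritizes fewer treatments would favor the highest specificity.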
In this study, the amount of airborne inoculum and the number of susceptible leaves were monitored. However, depending on the resources available, monitoring efficiency can be improved or values can be estimated from models [44]. In fact, monitoring susceptible leaves was easy and rapid; hence, it can be included in scouting services already available. Monitoring airborne inoculum is more difficult and time-consuming. However, a previous study showed that, because of the low level of spatial heterogeneity in P. aphanis airborne inoculum, it can be estimated using only one sampler per strawberry field [21]. In addition, with the advances in molecular biology, airborne inoculum can be assessed using qPCR, LAMP, or new field DNA analysis technologies, which would allow for simultaneous monitoring of inoculum concentration and fungicide resistance.

Conclusions
The approach used in this study to determine the risk of strawberry powdery mildew has proven to be reliable. Indeed, the development of classification trees has made it possible to highlight all the conditions that lead to the development of the disease. The classification tree, which essentially brings together the classification rules, makes it possible to estimate risks with high reliability (Table 2). Although a large number of cases were used to develop and validate the classification tree, as with all tools, it will need to be validated under commercial conditions, for other strawberry varieties, and under other production conditions. In any case, the rules for estimating the risk of strawberry powdery mildew should be easy to integrate into the various phytosanitary warning platforms already available.