#### 2.1. Literature Search and Data

A database was assembled from a systematic literature review of (i) the Web of Science (WOS) and (ii) the Fungicide and Nematicide Tests (F&N Tests and Plant Disease Management Reports). For (i), publications were selected using the search strings described in Appendix A. For (ii), reports were extracted using multiple keywords (Appendix A). Hard copies of volumes published prior to 2000 were examined directly. Relevant experiments were selected according to the following criteria: (a) the experiment evaluated at least one DSS-based strategy, one calendar-based strategy and an untreated control (so each experiment included at least three sub-experiments, one per type of treatment); (b) all sub-experiments reported disease incidence (i.e., the proportion of diseased organs) and sample size (i.e., the total number of organs examined to evaluate disease incidence). All experiments fulfilling these criteria were included in the meta-analysis. In calendar-based programs, the number and timing of fungicide applications were fixed before the experiment, usually based on standard practices for disease control. In DSS-based programs, the number and timing of fungicide applications were decided during the course of the experiment based on risk indicators.

Each experiment was defined as a unique combination of location (country, state), year, crop, organ evaluated, disease, pathogen, treatment strategy, number of sprays and treatment id (see Table A2 for detailed information about location, crop and disease). Each sub-experiment, in turn, was characterised by the type of treatment (DSS, calendar or untreated), the observed disease incidence, the sample size and the number of sprays. Across both sources ((i) and (ii)), a total of 67 experiments comprising 285 sub-experiments were selected from 19 publications/reports (see Table A3 for further details). The number of independent experiments per publication varied from 1 to 11, and the number of sub-experiments per experiment from 3 to 7 (Table A3). The database included 67 sub-experiments under untreated conditions, 86 under a calendar-based strategy and 132 under a DSS-based strategy (Table A3). An overall and individual (per sub-experiment) description of the dataset is provided in Section 3.1 in terms of the observed disease incidence and the incidence ratios (IRs) of the calendar (Cal./Unt.) and DSS (DSS/Unt.) strategies relative to the untreated control.
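The incidence and incidence-ratio quantities described above can be illustrated with a minimal sketch. The counts below are hypothetical and are not taken from the database; the function names `incidence` and `incidence_ratio` are likewise illustrative assumptions.

```python
def incidence(diseased, sample_size):
    """Disease incidence: proportion of diseased organs in the sample."""
    if sample_size <= 0:
        raise ValueError("sample size must be positive")
    return diseased / sample_size


def incidence_ratio(treated_incidence, untreated_incidence):
    """Incidence ratio (IR) of a treated sub-experiment vs. the untreated control."""
    if untreated_incidence <= 0:
        raise ValueError("IR is undefined when the untreated incidence is zero")
    return treated_incidence / untreated_incidence


# Hypothetical experiment with three sub-experiments:
# (number of diseased organs, sample size) per type of treatment.
sub_experiments = {
    "untreated": (40, 100),
    "calendar": (10, 100),
    "dss": (12, 100),
}

inc = {name: incidence(*counts) for name, counts in sub_experiments.items()}
ir_cal = incidence_ratio(inc["calendar"], inc["untreated"])  # Cal./Unt.
ir_dss = incidence_ratio(inc["dss"], inc["untreated"])       # DSS/Unt.

print(f"Cal./Unt. = {ir_cal:.2f}, DSS/Unt. = {ir_dss:.2f}")
```

An IR below 1 indicates a lower disease incidence under the treatment strategy than in the untreated control.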

#### 2.3. Parameter Estimation: Frequentist vs. Bayesian Approach

Inference for both models (GLM and GLMM) was carried out using frequentist and Bayesian statistics in turn, leading to two different sets of estimated parameters. Frequentist models were denoted GLM_F and GLMM_F and their Bayesian counterparts GLM_B and GLMM_B. Parameters of the frequentist models were estimated by maximum likelihood, through iteratively reweighted least squares and the Laplace approximation, using the `glm()` function and the `glmer()` function of the `lme4` package [17], respectively, in the `R` software [18] version 3.5.1. Likelihood ratio tests were performed to assess the significance of the fungicide treatment effects. Model comparison was based on the Akaike Information Criterion (AIC) [19]; smaller AIC values correspond to preferred models. A rule of thumb outlined in Burnham and Anderson [20] is that models with $\Delta_{i}(\mathrm{AIC}) = \mathrm{AIC}_{i} - \min \mathrm{AIC}$ greater than 10 have essentially no support relative to the model with the minimum AIC value.
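As a small illustration of this rule of thumb, the sketch below computes AIC differences for a set of candidate models. The model names and AIC values are hypothetical placeholders, not results from this study.

```python
def delta_aic(aic_by_model):
    """Return AIC_i - min(AIC) for every candidate model."""
    best = min(aic_by_model.values())
    return {name: aic - best for name, aic in aic_by_model.items()}


# Hypothetical AIC values for three candidate models.
aic_values = {"model_a": 1512.4, "model_b": 1498.5, "model_c": 1503.0}

deltas = delta_aic(aic_values)
# Models whose difference exceeds 10 are treated as having essentially no support.
unsupported = sorted(name for name, d in deltas.items() if d > 10)

for name, d in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name}: delta AIC = {d:.1f}")
print("essentially no support:", unsupported)
```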

In the Bayesian approach, uncertainty about quantities of interest and experimental results is always expressed in probabilistic terms. Inferential processes for learning about a quantity of interest $\varphi$ always start with a prior distribution, $\pi(\varphi)$, which contains all relevant information about $\varphi$. Experimental data are related to $\varphi$ via a sampling model (i.e., the binomial model), which is the basis for computing the likelihood function $\mathrm{L}(\varphi)$. Both elements are formally combined by means of Bayes' rule to obtain the posterior distribution for $\varphi$, $\pi(\varphi \mid \mathrm{Data})$, which synthesizes all the available knowledge about $\varphi$: $\pi(\varphi \mid \mathrm{Data}) \propto \mathrm{L}(\varphi)\,\pi(\varphi)$.

Simple Bayesian inferential processes are based on conjugate families, for which the prior and the posterior distributions belong to the same distribution family. In this specific context the posterior distribution can be calculated analytically. However, for more complex models such as ours, it needs to be approximated with numerical methods such as Markov chain Monte Carlo (MCMC) methods [21]. The posterior distribution of the model parameters is then described by a random sample of parameter values. The same sample of parameters can be used to approximate the posterior distribution of any quantity of interest, for example the disease incidence, the odds ratio (OR) or the incidence ratio (IR). All these distributions can be summarised using point estimates such as the mean, the median or the mode. Furthermore, the Bayesian approach makes it possible to quantify the uncertainty of any posterior distribution by means of credible intervals [22].
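The conjugate case and the sample-based summaries mentioned above can be sketched with a beta-binomial example. This is a deliberately simplified stand-in and not the GLM or GLMM used in this study: it assumes independent Beta(1, 1) priors on the disease incidence of two sub-experiments, hypothetical counts, and direct sampling from the analytical posteriors in place of MCMC.

```python
import random
import statistics


def beta_binomial_posterior(diseased, sample_size, a=1.0, b=1.0):
    """Conjugate update: Beta(a, b) prior + binomial data -> Beta posterior."""
    return a + diseased, b + (sample_size - diseased)


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from a sample of posterior draws."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical counts: 40/100 diseased organs untreated, 12/100 under DSS.
post_unt = beta_binomial_posterior(40, 100)
post_dss = beta_binomial_posterior(12, 100)

# Posterior draws of the incidence ratio (DSS/Unt.), a derived quantity.
ir_draws = [
    rng.betavariate(*post_dss) / rng.betavariate(*post_unt)
    for _ in range(20000)
]

ir_median = statistics.median(ir_draws)
ir_low, ir_high = credible_interval(ir_draws)
print(f"IR median = {ir_median:.3f}, 95% CrI = ({ir_low:.3f}, {ir_high:.3f})")
```

The same pattern applies to draws obtained by MCMC: any function of the sampled parameters yields a sample from the posterior distribution of that function.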

Inference for GLM_B and GLMM_B was performed under several independent prior scenarios. For GLM_B, following the recommendation of Gelman et al. [23], a weakly informative prior scenario with Cauchy distributions was considered. Specifically, the prior of $\beta_{\mathtt{0}}$ was defined by a Cauchy distribution with location parameter $\mu$ fixed at 0 and scale parameter $\sigma = 10$ ($\mathrm{C}(0,10)$). The priors of $\beta_{\mathtt{dss}}$ and $\beta_{\mathtt{cal}}$ were defined by Cauchy distributions also centered at 0, with scale parameter $\sigma = 2.5$ ($\mathrm{C}(0,2.5)$). This default prior scenario implies that model fitting uses an adaptation of the standard iteratively weighted least squares computation [23], which makes this model similar to its frequentist counterpart. In GLMM_B, a standard non-informative prior scenario was defined to give prominence to the data and to make the model comparable with GLMM_F. The priors for $\beta_{\mathtt{0}}^{*}$, $\beta_{\mathtt{dss}}^{*}$ and $\beta_{\mathtt{cal}}^{*}$ were defined independently by means of normal distributions centered at 0 with a wide variance ($\mathrm{N}(0,1000)$). An inverse Wishart distribution ($\mathrm{IW}(\mathsf{\Psi},\nu)$) was used as the prior for the variance-covariance matrix, $\mathsf{\Sigma}$. Specifically, the inverse Wishart distribution is defined by an $r\times r$ scale matrix $\mathsf{\Psi}$, with $r$ equal to the number of random parameters, and by $\nu$ degrees of freedom. $\mathsf{\Psi}$ was specified as a $3\times 3$ identity matrix (values of 1 on the diagonal and 0 elsewhere) and $\nu = 3$. For both models, the posterior distribution was approximated by Markov chain Monte Carlo (MCMC) simulation using the `JAGS` software (version 4.3.0) through the `R2jags` package (version 0.5-7) [24] of the `R` software.

The MCMC algorithm was run with three Markov chains, each including 120,000 iterations after a burn-in period of 20,000 iterations. In addition, the chains were thinned by storing one in every ten iterations in order to reduce autocorrelation in the resulting sample. Convergence was assessed via three different criteria: (i) graphically, by drawing trace plots and checking that the simulated values of the chains overlap one another; (ii) based on the potential scale reduction factor, $\widehat{R}$, whose values must be equal or close to 1; and (iii) by means of the effective number of independent simulation draws, $n_{\mathrm{eff}}$, which must exceed 100 [25]. For Bayesian model choice, the deviance information criterion (DIC) [26] was considered. Smaller values of DIC correspond to preferred models, and a DIC difference of 5 or more is generally regarded as practically meaningful [27].
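The thinning step and the potential scale reduction factor can be sketched as follows. The code uses the classic (non-split) Gelman-Rubin form of $\widehat{R}$, which is an assumption here: the software used in this study may compute a refined variant. The chains are simulated from hypothetical normal distributions rather than produced by `JAGS`, and the chain length is shortened for illustration.

```python
import math
import random
import statistics


def thin(chain, step=10):
    """Keep one draw in every `step` iterations."""
    return chain[::step]


def potential_scale_reduction(chains):
    """Classic (non-split) Gelman-Rubin R-hat for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.fmean(c) for c in chains]
    # W: mean of the within-chain variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the variance of the chain means.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical draws: three chains sampling the same N(0, 1) target.
mixed = [thin([rng.gauss(0.0, 1.0) for _ in range(10000)]) for _ in range(3)]
# Hypothetical failure case: chains centred at different values.
stuck = [
    thin([rng.gauss(shift, 1.0) for _ in range(10000)])
    for shift in (0.0, 3.0, 6.0)
]

rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
print(f"R-hat, overlapping chains: {rhat_mixed:.3f}")
print(f"R-hat, separated chains:   {rhat_stuck:.3f}")
```

Chains that explore the same distribution give a value close to 1, whereas chains that remain in different regions give a value well above 1, which matches the graphical check of overlapping trace plots.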

In addition to the standard model selection criteria (AIC or DIC), the goodness of fit of the different models was evaluated graphically by comparing observed vs. fitted disease incidence on the logit scale. Note that for Bayesian models, the fitted logit values were based on the median of the posterior distribution of the logit disease incidence.