Logistic Regression to Evaluate the Marketability of Pepper Cultivars

The goal of this paper is to show that logistic regression is a useful analytical method for evaluating the marketability of different pepper (Capsicum annuum L.) cultivars. Two studies were conducted on “Italian sweet” pepper cultivars. Fruit samples were placed in storage chambers and kept at 9 °C and 85–95% relative humidity during the study period. The fruits were evaluated individually and periodically by recording the deterioration of fruit quality (rot, ageing, etc.). Categorical explanatory variables (rot, etc.) and continuous explanatory variables (days of storage) were combined to determine the probability of marketability of the fruit. The results show that the binary logistic model is a useful statistical tool for jointly analysing categorical and continuous variables in the study of the marketability of pepper cultivars.


Introduction
All production systems tend to increase their profitability by improving crop productivity along with quality [1], thus reducing the environmental impact associated with food production [2,3]. This trend would be incomplete if the postharvest shelf life of the products were not improved. Many of the fruit and vegetable losses occur between harvesting and consumption [4–6].
To extend postharvest life, it is essential to know what causes the deterioration of fruit quality, and thus the loss of marketability, and, based on this knowledge, to develop new methods and technologies that reduce deterioration to economically acceptable levels [7]. The postharvest life of pepper fruits (Capsicum annuum L.) can vary from a few days to several months [8,9]. The factors causing deterioration are mainly water loss (wilting), rotting, and chilling injury, which affect the visual appearance of the fruit and its nutritional properties and thus decrease its commercial value [10–12].
Work aimed at extending pepper shelf life has focused on maintaining fruit quality attributes such as aroma, flavour, turgidity, colour, or nutritional value [13–17]. Many of these shelf-life studies are based on statistical models. For example, Tijskens and Polderdijk [18] proposed a generic model for the keeping quality of perishable products based on the kinetics of the decrease of individual quality attributes. The model includes the effects of temperature, chilling injury, different levels of initial quality, and quality acceptance limits. This model is complex to apply, so it is not used in practice.
Many pepper postharvest studies, including those on genetic improvement and selection, are based on analysis of variance (ANOVA) and linear or non-linear regression to characterise the relationship between a response variable and a set of explanatory variables. However, these analyses do not consider the joint effect of all the factors causing fruit deterioration [19]. This is because of the lack of mathematical tools that can evaluate the loss of commercial value based on the joint action of categorical and continuous variables on the fruit [20,21].
The binary logistic model is a useful statistical tool to simultaneously analyse both categorical and continuous variables [22–24]. The application of the binary logistic regression model to select extended-shelf-life tomato cultivars has already been demonstrated [20,21]. However, this mathematical approach has not yet been applied to pepper cultivars.
In this study, logistic regression is applied as an analytical method to evaluate the marketability of different pepper (Capsicum annuum L.) cultivars. To achieve this goal, several analytical approaches are described; simple and multiple logistic regression are used as statistical tools to identify more precisely those cultivars with extended shelf life during the storage period from harvesting to consumption (Figure 1).

Figure 1. Rationale for the application of binary logistic regression models to evaluate the probability of marketability in pepper cultivars. π(x) is the probability of marketability of the fruit, "x" is the number of days of storage, "e" is the Euler number, and "α" and "β" are the intercept and slope of the model, respectively.

Plant Materials
Two independent studies were conducted on "Italian sweet" pepper cultivars. One was conducted during the 2017-2018 agricultural campaign on the '43', 'Spiker', and 'Xhanti' cultivars, marketed under the Tribelli XL® trademark (Figure 2 and Table 1); the other was conducted in 2016-2017 on the 'Sixto' and 'Palermo' cultivars (Figure 3 and Table 1). The "Italian sweet" pepper fruits are characterised by having thin flesh and slender fruits (narrow with respect to their length) and being very elongated and pointed.
The fruit samples evaluated came from plants grown in greenhouses located in south-eastern Spain (Almería, Spain). All the crops were transplanted in August, and the agronomic practices (irrigation, fertilisation, pruning, etc.) were the usual ones for pepper cultivation in the producing area. The producers were professionals dedicated to the production and export of pepper, so the state of maturity of the samples was optimal for marketability (Figures 2 and 3).

Experimental Design
Once the samples were harvested from the production plots, they were labelled and transported to the laboratory located at the University of Almería (Almería, Spain). Each sample comprised 200 fruits. An initial commercial quality evaluation was performed on each fruit individually; the fruits were then stored in chambers and kept at 9 °C and 85–95% relative humidity throughout the study period [25,26]. After the initial evaluation, the quality of each fruit of the samples was evaluated every 7 days until all the fruits in the sample lost their marketability (Table 1).
To evaluate the commercial quality of pepper fruits, fruits presenting rot due to fungi or bacteria, overmaturity at harvest, decay, wilting, chilling injury, mechanical damage, bruises, etc., were considered non-marketable. These are the main causes of the commercial depreciation of pepper fruits [27–29].

Statistical Analysis. Logistic Regression
Logistic regression is a statistical method to analyse binary response data in which an event either occurs or does not. Its construction consists of estimating the model parameters that best describe the relationship between the dichotomous categorical response variable (Y = yes-no, 0-1, etc.) and one or several explanatory variables (Xi), which can be categorical or continuous [23,24].
In our research, the logistic regression model was designed to describe the probability of marketability of pepper cultivars as the dependent variable (Y). In the present context, "marketability" is the term chosen to define the quality of the fruits subjected to experimental conditions and their capacity for use, which is a function of the changes in quality detected during postharvest storage. The simplest case of model application was for the independent variable X = days of storage (DOS), using simple binary logistic regression (see Tables 2 and 3). Its expression is shown in Equation (1) [20,22]:

$$\pi(x) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}} \qquad (1)$$

where π(x) is the probability of marketability of pepper fruits, "x" is the number of DOS, "e" is the Euler number, "α" is the intercept, and "β" is the slope. In other more complex cases, multiple binary logistic regression was applied; the mathematical expression is shown in Equation (2) [20,22,24]:

$$\pi(x) = \frac{e^{\alpha + \sum_{i} \beta_i x_i}}{1 + e^{\alpha + \sum_{i} \beta_i x_i}} \qquad (2)$$

where π(x) is the probability of marketability of pepper fruits; "x_i" are the independent variables (DOS, cultivar, and month of evaluation); "e" is the Euler number; and "α" and "β_i" are the intercept and the slopes of the model, respectively.
In this study, multiple models were proposed by following two analytical approaches. The first is to evaluate the explanatory variables cultivar and days of storage as factors influencing the probability of marketability of pepper fruits. The second is to evaluate all the variables considered (DOS, cultivar, and month) to identify those that can predict the probability of marketability with greater precision [20]. Categorical variables in the multiple binary logistic regression analysis were treated as independent dummy variables [21,24].
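The logistic expression described above can be sketched in a few lines of Python. This is only an illustration: the function name `marketability_probability` is an assumption, and the coefficients in the usage lines are the ones the paper reports later for 'Sixto' (α = 11.690, β = −0.720).

```python
import math

def marketability_probability(alpha, betas, xs):
    """Logistic probability pi(x) = e^eta / (1 + e^eta), with eta = alpha + sum(beta_i * x_i)."""
    if len(betas) != len(xs):
        raise ValueError("betas and xs must have the same length")
    eta = alpha + sum(b * x for b, x in zip(betas, xs))
    # Evaluate in a form that avoids overflow of exp() for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

# Simple model (one predictor, DOS) with the 'Sixto' coefficients at 15 days of storage.
p_sixto_15 = marketability_probability(11.690, [-0.720], [15])
print(round(p_sixto_15, 2))  # 0.71, the value quoted in the Results for 'Sixto'
```

The same function covers the multiple model by passing one coefficient and one value per explanatory variable, with categorical variables entered as 0/1 dummies.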
To estimate the parameters α and βi of the simple and multiple regression models, the maximum likelihood method was applied [30]. To verify that the coefficient βi differs from 0, Wald's test was applied, whose statistic is given in Equation (3) [22], where β̂ is the estimate of the parameter β obtained via the maximum likelihood method and SE is the standard error of β̂. Model fit was also checked with the Hosmer-Lemeshow goodness-of-fit statistic [24]. Finally, the odds ratios were determined (Equation (4)) to complement the interpretation of the constructed logistic models.
In Equation (4), "θ" is the odds ratio; the odds of an event are the ratio between the probability that the event will occur (π(x)) and the probability that it will not occur (1 − π(x)), and "π(x)" is the probability of marketability of pepper fruits [22].
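As a rough illustration of how a coefficient and its standard error translate into a Wald test and an odds ratio, the sketch below uses one common convention: z = β̂/SE, with z² referred to a chi-square distribution with one degree of freedom. Whether the authors report z or z² is an assumption here, and the input numbers are hypothetical rather than taken from the tables.

```python
import math

def wald_summary(beta_hat, se, z_crit=1.959964):
    """Wald z statistic, two-sided p-value, odds ratio and its 95% CI for one coefficient."""
    z = beta_hat / se
    # Two-sided p-value under the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    odds_ratio = math.exp(beta_hat)
    ci_low = math.exp(beta_hat - z_crit * se)
    ci_high = math.exp(beta_hat + z_crit * se)
    return {"z": z, "wald_chi2": z * z, "p": p_value,
            "odds_ratio": odds_ratio, "ci95": (ci_low, ci_high)}

# Hypothetical slope and standard error (not values from the paper).
result = wald_summary(beta_hat=-0.22, se=0.02)
print(result)
```

When the confidence interval of the odds ratio excludes one, the Wald test rejects β = 0 at the corresponding level, which is the reading used in the Results.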
The statistical analyses in this study were performed with the Statgraphics Centurion XVII-X64 and IBM SPSS Statistics Version 23 software packages.
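For readers without access to these packages, the idea behind maximum likelihood estimation of the simple model can be sketched with a plain Newton-Raphson iteration. This is not the routine used by Statgraphics or SPSS; the function name and the data are hypothetical and only illustrate the principle.

```python
import math

def fit_simple_logistic(xs, ys, iterations=25):
    """Maximum likelihood fit of pi(x) = 1 / (1 + exp(-(alpha + beta * x))) by Newton-Raphson."""
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        # Gradient of the log-likelihood (g0, g1) and information matrix entries (h00, h01, h11).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    # Approximate standard errors from the inverse information matrix of the last iteration.
    se_alpha = math.sqrt(h11 / det)
    se_beta = math.sqrt(h00 / det)
    return alpha, beta, se_alpha, se_beta

# Hypothetical weekly evaluations: (day of storage, marketable fruits out of 10).
counts = [(0, 10), (7, 9), (14, 6), (21, 2), (28, 0)]
xs, ys = [], []
for day, marketable in counts:
    xs += [day] * 10
    ys += [1] * marketable + [0] * (10 - marketable)

alpha_hat, beta_hat, se_alpha, se_beta = fit_simple_logistic(xs, ys)
print(alpha_hat, beta_hat, se_alpha, se_beta)
```

The sketch assumes the data are not perfectly separated (some marketable and some non-marketable fruits overlap in time); with perfect separation the estimates diverge and a dedicated statistical package should be preferred.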

Storage Time as a Factor That Influences Marketability
One method to study the influence of storage time on the marketability of different plant cultivars is to construct and compare independent simple binary logistic regression models for each cultivar [20,21]. The results obtained for the simple logistic regression models are presented in Table 2 and Figure 4 for the Tribelli XL® pepper cultivars and in Table 3 and Figure 5 for the "Italian sweet" red pepper.
The models were constructed using simple logistic regressions for each pepper cultivar and fitted for the DOS. All coefficients α and β of the logistic models indicate a significant fit for DOS (p < 0.001). Moreover, the Hosmer-Lemeshow test revealed a good fit of the models (p > 0.05). This means that the independent variable (DOS) of the model affects and explains the probability of marketability (dependent variable) for each cultivar. Furthermore, in all cases, there was a negative relationship (β < 0) between the commercial value of the fruit and storage time. This means that as the number of days of storage increased, the overall quality and marketability of the fruits decreased (Figures 4 and 5), as previously indicated by other authors [20,31–33]. The loss of commercial value of the peppers studied refers, fundamentally, to global changes in the visual quality of the fruits under the experimental storage conditions (fruit wrinkling, rot, etc.).
The odds ratios (Exp(β)) and their confidence intervals were also calculated (Tables 2 and 3). The confidence interval for Exp(β) never contains the value one. This means that the DOS variable explains the behaviour of the probability of marketability for each pepper cultivar. The odds ratio is a measure of the association between the logistic model variables and is used to determine the strength of association [22]. For a logistic regression with a single continuous explanatory variable, the odds ratio compares the odds of the phenomenon studied when the explanatory variable takes the value "x + 1" with the odds when it takes the value "x" [22,24]. For the Tribelli XL® pepper, an increase of one unit in storage time reduced the odds of marketability by (1 − 0.803) × 100 = 19.7% in 'Spiker', 18.6% in '43', and 17.6% in 'Xhanti' (Table 2). For the "Italian sweet" red pepper, an increase of one unit of storage time reduced the odds of marketability by 51.3% for 'Sixto' and 40.7% for 'Palermo' (Table 3).
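The percentages above follow directly from the odds ratios. The short sketch below repeats the arithmetic; the function name is illustrative, the value 0.803 is the odds ratio quoted for 'Spiker', and the slopes −0.720 and −0.523 are those reported for 'Sixto' and 'Palermo'.

```python
import math

def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds for a one-unit increase of the predictor."""
    return (odds_ratio - 1.0) * 100.0

# Odds ratio reported for 'Spiker'.
spiker = percent_change_in_odds(0.803)

# Odds ratios obtained from the reported slopes, Exp(beta).
sixto = percent_change_in_odds(math.exp(-0.720))
palermo = percent_change_in_odds(math.exp(-0.523))

print(round(spiker, 1), round(sixto, 1), round(palermo, 1))  # -19.7 -51.3 -40.7
```

Note that these are changes in the odds π(x)/(1 − π(x)); the change in the probability itself depends on the starting value of π(x).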
The fitted models are shown in Figures 4 and 5. These reflect the probability of marketability of each pepper sample as a function of the number of DOS. For example, at the beginning of storage (0 DOS), all 'Palermo' and 'Sixto' fruits were marketable (100%). In contrast, after 15 days of storage, the probability of marketability decreased to 0.71 for 'Sixto' and 0.28 for 'Palermo' (Figure 5). The probability of marketability as a function of the DOS of 'Palermo' and 'Sixto' is given by the following equations:

$$\pi(x)_{\text{Sixto}} = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}} = \frac{e^{11.690 - 0.720x}}{1 + e^{11.690 - 0.720x}} \qquad (5)$$

$$\pi(x)_{\text{Palermo}} = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}} = \frac{e^{6.885 - 0.523x}}{1 + e^{6.885 - 0.523x}} \qquad (6)$$

Figures 4 and 5 also show a line marking the 95% tolerance level of marketable fruits and its intercept with the time evolution curves of the probability of marketability of the pepper cultivars studied. A non-marketable fruit fraction of 5% is a tolerance level allowed for the marketability of pepper in the European Union [34] and is usually allowed by distribution companies [20]. With the Tribelli XL® varieties, a 5% level of non-marketable fruits was reached after 10 days for the '43' and 'Xhanti' cultivars; for 'Spiker', the same level was reached after 9 days of storage (Figure 4). For the "Italian sweet" red pepper, a 5% level of non-marketable fruits was reached after 12 days of storage for the 'Sixto' cultivar and at 7 days of storage for 'Palermo' (Figure 5).
This approach allows us to determine and compare the behaviour of different pepper cultivars and identify those with a greater probability of marketability [20,21]. For example, the 'Sixto' cultivar exhibits a longer shelf life than 'Palermo' (Figure 5).
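The day on which a fitted curve crosses the 95% line can also be obtained by inverting the logistic expression, x = (ln(0.95/0.05) − α)/β. The sketch below applies this to the coefficients of Equations (5) and (6); the function name is an assumption, and the results are approximations of the values read from Figure 5.

```python
import math

def days_to_threshold(alpha, beta, threshold=0.95):
    """Solve pi(x) = threshold for x in the simple logistic model."""
    logit = math.log(threshold / (1.0 - threshold))
    return (logit - alpha) / beta

sixto_days = days_to_threshold(11.690, -0.720)
palermo_days = days_to_threshold(6.885, -0.523)

print(round(sixto_days, 1), round(palermo_days, 1))  # about 12.1 and 7.5 days
```

These values are consistent with the roughly 12 and 7 days reported for 'Sixto' and 'Palermo'.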

Influence of Storage Time and Cultivar on Fruit Marketability
Another analytical use of the binary logistic model is to study the effect of storage time and cultivar as factors influencing the marketability of pepper fruits [20,21]. For this purpose, multiple binary logistic regression models were constructed for each phenotype group studied. Multiple binary logistic regression analyses were performed for the Tribelli XL® pepper cultivars (Table 4) and the "Italian sweet" red pepper cultivars (Table 5). In both cases, the explanatory variables were the pepper cultivars and the number of days of storage (DOS). In both studies, regression coefficients, odds ratios (with confidence intervals), and significance for each variable were calculated.
Likelihood ratio (omnibus, p < 0.000). Hosmer-Lemeshow test (p = 0.121). CI: confidence interval.
In the case of the "Italian sweet" red pepper study, all Wald values were statistically significant. This means that the DOS and cultivar variables explain the behaviour of the probability of marketability and that this relationship is negative (negative β value; Table 5). In contrast, for the Tribelli XL® peppers, the Wald value was not significant for phenotype '43' when 'Xhanti' is considered as the reference (and vice versa); accordingly, the confidence interval (CI) of its odds ratio contains the value one. The other coefficients exhibited significant values of the Wald statistic, with a negative relationship between the number of days of storage and the probability of marketability (parameter β of DOS < 0; Table 4). All statistical models exhibited a good fit to the data, as indicated by the Hosmer-Lemeshow p-value (Tables 4 and 5).
The odds ratio for the continuous variable DOS in a multiple logistic model indicates the change in the odds of marketability when DOS increases by one unit while the other explanatory variables are held fixed [20]. The odds ratio for DOS was 0.814 for the Tribelli XL® peppers and 0.548 for the "Italian sweet" red peppers; these results imply that an increase of one unit of storage time decreases the odds of marketability by 18.6% and 45.2%, respectively (for a given cultivar).
The odds ratio for each cultivar indicates the relationship between the odds of marketability of that cultivar and those of the cultivar taken as the reference [20,21]. This means that, for example, when we consider DOS with a fixed value (say DOS = 1) and 'Xhanti' as the reference, phenotypes '43' and 'Spiker' reduced their odds of marketability by 15.2% and 34.8%, respectively. Moreover, 'Spiker' exhibited the odds ratio farthest from unity and thus the greatest strength of association (Table 4). With the "Italian sweet" red pepper, if we consider the days of storage as a fixed value, 'Sixto' exhibited 6.6-fold greater odds of marketability than 'Palermo' (Table 5).
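The role of the reference category can be illustrated with a small dummy-coding sketch. The slopes below are derived from the odds ratios quoted in the text (0.814 for DOS, and reductions of 15.2% and 34.8% for '43' and 'Spiker' relative to 'Xhanti'); the intercept is hypothetical, since it cancels out when two cultivars are compared at the same DOS.

```python
import math

REFERENCE = "Xhanti"
BETA_DOS = math.log(0.814)
# One coefficient per non-reference cultivar, beta = ln(odds ratio).
BETA_CULTIVAR = {"43": math.log(1 - 0.152), "Spiker": math.log(1 - 0.348)}
ALPHA = 5.0  # hypothetical intercept, not taken from Table 4

def odds(cultivar, dos):
    """Odds of marketability, pi / (1 - pi) = exp(alpha + beta_DOS * DOS + cultivar effect)."""
    if cultivar != REFERENCE and cultivar not in BETA_CULTIVAR:
        raise ValueError(f"unknown cultivar: {cultivar}")
    # Dummy coding: a 0/1 indicator for each non-reference cultivar.
    dummies = {name: 1.0 if name == cultivar else 0.0 for name in BETA_CULTIVAR}
    eta = ALPHA + BETA_DOS * dos + sum(BETA_CULTIVAR[n] * d for n, d in dummies.items())
    return math.exp(eta)

ratio_day1 = odds("Spiker", 1) / odds(REFERENCE, 1)
ratio_day10 = odds("Spiker", 10) / odds(REFERENCE, 10)
print(round(ratio_day1, 3), round(ratio_day10, 3))  # 0.652 0.652
```

The ratio is the same at any fixed DOS, which is why the cultivar odds ratio can be interpreted without specifying the storage day.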
Figures 6 and 7 show the number of days of storage versus the probability of marketability based on the multiple analyses, with the 95% probability line setting the rejection criterion of distribution companies [20]. These figures indicate that the level of 5% non-marketable fruits was reached after 12 days for the 'Sixto' cultivar and 8 days for 'Palermo' (Figure 7). With the 'Spiker', '43', and 'Xhanti' cultivars, the values were 9, 10, and 11 days, respectively (Figure 6). With Tribelli XL®, the greatest difference in reaching a 5% level of non-marketable fruits was two days (between 'Spiker' and 'Xhanti'), so these phenotypes are well suited to being marketed together in tricolour containers, because the shelf life of this sales package (the minimum marketable unit), which includes fruits of each colour, is defined by the cultivar with the shortest shelf life (Figure 2).
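The reasoning about the tricolour pack amounts to taking the minimum over the cultivars. A minimal sketch, using the days quoted above for Figure 6 and illustrative variable names:

```python
# Days to reach 5% non-marketable fruits, as quoted in the text for Figure 6.
days_to_5_percent = {"Spiker": 9, "43": 10, "Xhanti": 11}

# The mixed pack is limited by its shortest-lived cultivar.
pack_shelf_life = min(days_to_5_percent.values())
limiting_cultivar = min(days_to_5_percent, key=days_to_5_percent.get)
largest_gap = max(days_to_5_percent.values()) - pack_shelf_life

print(pack_shelf_life, limiting_cultivar, largest_gap)  # 9 Spiker 2
```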

Influence of All the Variables Studied on Marketability
In a multiple logistic regression based on categorical and continuous explanatory variables, only variables with good correlations are included in the final model [20]. Tables 6 and 7 report the results of the DOS, month, and cultivar variables when subjected to multiple binary logistic regression analysis for the two cultivar groups studied.
With the "Italian sweet" red pepper, all the coefficients were significantly different from 0 according to the Wald test and, therefore, the variables can explain the behaviour of the dependent variable (probability of marketability). In addition, after applying the forward and backward selection methods based on the Wald statistic, none of the variables initially considered (cultivar, DOS, and month) were excluded. Therefore, they can be considered good predictors of the probability of marketability. Furthermore, the odds ratios indicate that the odds of marketability are 17.2-fold greater for 'Sixto' than for 'Palermo' (1/0.058 = 17.2), and that in December the odds of marketability were 27-fold higher than in October (1/0.037 = 27.0) (Table 7).
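The two bracketed calculations swap the reference category by taking the reciprocal of the odds ratio. A minimal sketch, with an illustrative function name and the odds ratios 0.058 and 0.037 quoted in the text:

```python
def invert_odds_ratio(odds_ratio):
    """Odds ratio with the comparison and reference categories swapped."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return 1.0 / odds_ratio

sixto_vs_palermo = invert_odds_ratio(0.058)
december_vs_october = invert_odds_ratio(0.037)

print(round(sixto_vs_palermo, 1), round(december_vs_october, 1))  # 17.2 27.0
```

On the coefficient scale the same operation is a change of sign, since 1/Exp(β) = Exp(−β).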
With the Tribelli XL® peppers, when all the explanatory variables (DOS, month, and cultivar) were considered, the multiple binary logistic model fitted the data poorly, as indicated by the Hosmer-Lemeshow goodness-of-fit test (unpublished data). Therefore, to find a multiple logistic model with a more appropriate fit to the data that still included the pepper cultivars under study, the DOS variable was excluded from the model because of its very high correlation effect and because its combination with cultivar had already been studied (Tables 2 and 4). Multiple general logistic models were fitted excluding DOS (Table 6). The Hosmer-Lemeshow test was not significant (p = 0.139), indicating an adequate fit; furthermore, the model parameters α and βi differed significantly from zero, except for 'Xhanti' when '43' is the reference. This result means that 'Xhanti' and '43' did not differ significantly in their probability of marketability, and that '43' had 1.2-fold higher odds of marketability than 'Spiker' (when the month is held at a fixed value). Finally, the month with the lowest probability of marketability was April (when the phenotype is held fixed).

Limitations and Future Lines of Research
Future work arising from this contribution could apply the logistic regression model to other plant species that may be affected by commercialisation problems during shelf life. It would also be interesting to add parameters related to the loss of colour, flavour, and other quality attributes of the fruits during postharvest storage. The absence of these parameters can be considered the main limitation of our research.

Conclusions
This study demonstrated that simple and multiple logistic regression models are good mathematical tools to evaluate the probability of marketability of pepper cultivars. Simple regression is a useful method to study the influence of storage time on the marketability of individual pepper phenotypes. All simple logistic models exhibited highly significant associations between the probability of marketability of pepper fruits and storage time.
Construction of the multiple logistic regression model, based on categorical and continuous explanatory variables, allows us to select the explanatory variables that best describe the behaviour of the probability of marketability of the fruits. In our study, some of the predictors exhibited a strong correlation, which meant that other predictors with lower correlation were not significant.
Agronomy 2018, 8, x FOR PEER REVIEW 3 of 15

Figure 4. Time evolution of the probability of marketability for each Tribelli XL® pepper cultivar ('43', 'Spiker', and 'Xhanti'). The results come from the simple independent logistic model for each cultivar, as a function of the number of days of storage (DOS).

Figure 5. Time evolution of the probability of marketability for the 'Palermo' and 'Sixto' red pepper cultivars. The results come from the simple independent logistic model for each cultivar, as a function of the number of days of storage (DOS).

Figure 6. Time evolution of the probability of marketability as a function of the number of days of storage (DOS), as affected by the Tribelli XL® pepper cultivars ('43', 'Spiker', and 'Xhanti').

Figure 7. Time evolution of the probability of marketability as a function of the days of storage, as affected by the 'Sixto' and 'Palermo' pepper cultivars.

Table 1. Sampling for each crop cycle and month of evaluation and days in which the samples of each tomato, cucumber, and pepper cultivar were measured in the laboratory.

Table 2. Estimation of independent simple logistic regression parameters for each Tribelli XL® pepper cultivar as a function of the number of days of storage (DOS) as a factor influencing the probability of marketability.

Table 3. Estimation of independent simple logistic regression parameters for each red pepper cultivar, 'Palermo' and 'Sixto', as a function of the number of days of storage (DOS), a factor influencing the probability of marketability.
* Likelihood ratio (omnibus, p < 0.000). Hosmer-Lemeshow test (p > 0.05). The results come from the independent simple logistic model for each cultivar, as a function of the number of days of storage.

Table 4. Estimation of multiple logistic regression parameters for Tribelli XL® pepper cultivars ('43', 'Spiker', and 'Xhanti') and number of days of storage (DOS) as factors influencing the probability of marketability.

Table 5. Estimation of multiple logistic regression parameters for the "Italian sweet" red pepper cultivars ('Sixto' and 'Palermo') and number of days of storage (DOS) as factors influencing the probability of marketability.

Table 7. Estimation of multiple logistic regression parameters for 'Sixto' and 'Palermo' peppers.