Marketability Probability Study of Cherry Tomato Cultivars Based on Logistic Regression Models

The purpose of this study was to demonstrate the value of applying simple and multiple logistic regression analyses to the marketability probability of commercial tomato (Solanum lycopersicum L.) cultivars when the tomatoes are harvested as loose fruit. Fruit firmness and commercial quality (softening or over-ripe fruit, cracking, cold damage, and rotting) were determined at 0, 7, 14, and 21 days of storage. The storage test simulated typical conditions from harvest to purchase and consumption by the consumer. The combined simple and multiple analyses of the primary continuous and categorical variables with the greatest influence on the commercial quality of postharvest fruit allowed for a more detailed understanding of the behavior of different tomato cultivars and identified the cultivars with greater marketability probability. The odds ratios allowed us to determine the increase or decrease in the marketability probability when one cultivar was substituted for a reference cultivar. For example, the marketability probability was approximately 2.59 times greater for ‘Santyplum’ than for ‘Angelle’. Overall, of the studied cultivars, ‘Santyplum’, followed by ‘Dolchettini’, showed greater marketability probability than ‘Angelle’ and ‘Genio’. In conclusion, the logistic regression model is useful for studying and identifying tomato cultivars with good postharvest marketability characteristics.


Introduction
The tomato (Solanum lycopersicum L.) is one of the most popular and in-demand vegetables in the world [1][2][3][4]. However, postharvest losses are a major problem in supply chains [5]. Globally, postharvest losses from tomato supply chains range from 10% to 40% of harvested tomatoes [6,7]. Tomatoes are vulnerable to postharvest losses due to their perishable nature [8]. In recent years, tomato processing firms have encountered increasing difficulties, due both to an increase in raw material costs and market difficulties [9].
Food systems are complex systems that encompass a plurality of processes, from production to processing and retailing, all the way to consumption [10]. The quantitative and qualitative losses of vegetables occur from harvest through handling [11], storage, processing, and marketing to their final delivery to the consumer. Losses in industrialized countries occur at the retail and consumer stages, and in developing countries, losses occur during the production, harvest, postharvest, and processing phases [12].
Tomatoes have a relatively short life over the course of the marketing process [13]. The postharvest handling and transport conditions of fresh products directly affect their shelf life [14]. The longer the time between harvest and consumption is, the greater the loss of quality [15]. The storage and maturation recommendations for tomatoes are well known, but quality problems associated with postharvest mismanagement and transportation continue to occur during distribution [16].
Postharvest losses in quality and quantity are related to crop immaturity, inadequate initial quality control, incidence and severity of physical damage, exposure to improper temperatures, and delays between harvest and consumption [17]. The primary causes of the postharvest loss of commercial value in tomatoes are softening from impact or over-ripe fruit, cracking, water loss, cold damage, and changes in composition and decay [18]. The intrinsic variables in the tomato fruit that can affect the organoleptic quality during marketing are the pH, titratable acidity (TA), total soluble solids (TSS), TSS/TA, glucose, fructose, firmness, and color. Regarding the primary extrinsic factors, those that produce the greatest change in quality are the ripening state and the time and conditions of transport and storage. Of the intrinsic and extrinsic factors, those that have the greatest influence on the tomato marketability probability are the time and storage conditions, together with the fruit firmness [19,20].
In recent decades, the demand for improved storage and transport conditions to maintain the fruit quality (i.e., flavor, color, nutritional aspects, firmness, and shelf life) of fresh products during their shelf life has increased [21]. Strategies to reduce postharvest losses of fruits and vegetables include the development of new cultivars with a longer commercial life, improved handling systems (less aggressive packaging and the maintenance of the cold chain), and the use of modified atmospheres and treatments at the pre- and postharvest stages with substances that reduce the deterioration of the fruits [22][23][24][25][26][27]. Most studies on the commercial life of tomatoes are based on statistical models that estimate the fruit's commercial life based on its initial firmness [28][29][30][31]. However, the improvement of the useful life should be based on studies that include all the factors that cause losses in the commercial value, and therefore, they should be based on analytical techniques that integrate and combine quantitative (e.g., firmness) and qualitative (e.g., cracked and rotten fruits) variables. In addition, these variables should also be simple to determine so they can be used by production and marketing companies as well as for breeding and genetic improvement studies, especially for the selection of precommercial cultivars.
Despite the existence of many studies focused on the development of new tomato cultivars with a longer commercial life based on statistical models, in no case do these models consider all the aspects that lead to commercial loss, as described by Kader [18]. In the development of new tomato cultivars with greater commercial lifespans, there are no studies on the development of models that can be used to predict the marketability probability of stored fresh products. To understand the postharvest behavior of new tomato cultivars, it is necessary to study the commercial life in response to the current needs. Therefore, it is necessary to develop study models that holistically integrate all the causes of commercial loss (Figure 1), such as the studies performed by Melesse et al. [20] on the marketability of tomatoes subjected to different pre- and postharvest treatments and Tolesa et al. [19] on the marketability of tomato fruits harvested at different stages of maturity and subjected to different disinfection and storage conditions.
Logistic regression models are statistical models that determine the relationship between a dichotomous qualitative dependent variable (binary or binomial logistic regression) and one or more independent explanatory variables or covariables, whether qualitative or quantitative (multiple logistic regression) [32]. The logistic regression model is presented as a novel approach for calculating the marketability probability of fresh products that have been subjected to multiple agro-economic practices [19,20,33]. This logistic regression model has not been applied in comparative studies of tomato cultivars (Figure 1).
The objective of the present study is to demonstrate that the application of simple and multiple logistic regression analysis is a useful tool for studying and understanding the marketability potential of tomato cultivars, which allows for the selection of plant materials with longer postharvest commercial lives. In the development of these logistic models, both continuous and categorical variables will be used, representing the primary causes of postharvest loss. Likewise, best-fit models will be selected for predicting the marketability probability of tomatoes stored under conditions that simulate the typical storage conditions from collection to consumer purchase and consumption.
Agronomy 2018, 8, 176

Production and Preparation of the Sampled Tomatoes
This research was performed on four cultivars of cherry tomatoes cultivated in 4 greenhouses in Almería (south-eastern Spain) owned by professional producers who are dedicated to the production and export of tomatoes. The commercial cultivars studied here were 'Santyplum', 'Dolchettini', 'Angelle', and 'Genio'. The evaluated fruits were sampled from fruit batches marketed to different European countries (the fruits were harvested at optimum commercial maturity). All the plants were cultivated under the same standard conditions of fertilization, irrigation, climate control, etc. The plants were transplanted during the second half of August 2017, and the crop cycle ended in May 2018 (a typical growing cycle in south-eastern Spain). The first collections began in October 2017.

Experimental Design
In each crop field, a sample of 200 fruits from each cultivar was collected during the months of November, December, January, and April. Each sample was identified and labeled with the aim of maintaining traceability throughout the study, and each was then moved to a laboratory at the Universidad de Almería (located in Almería, Spain). Each 200-fruit sample was subdivided into 4 subsamples of 50 fruits each to evaluate their commercial quality at different time intervals of 0, 7, 14, and 21 days of storage (T0, T7, T14, and T21). The storage conditions simulated the route taken by the fruit from their collection (in south-eastern Spain) to their purchase by the consumer in north-central Europe. The storage period was divided into a first stage of 7 days of storage in a cold room, in which refrigerated transport was simulated at 12 °C, and a second period from day 7 to day 21 of storage in a chamber at ambient temperature (18-20 °C), simulating the period of fruit exposure and sale to consumers. The activities during each stage of the fruit evaluation were the following:
• T0: At the time of collection, a subsample of 50 fruits from the initial 200 was randomly selected. The commercial quality was measured for each fruit (firmness, state of freshness, presence of fruit anomalies such as splitting, and fungi). The remaining 150 fruits were kept under storage conditions of 12 °C and 70% relative humidity for 7 days.
• T7: After 7 days under these storage conditions, a second subsample of 50 fruits was randomly extracted and evaluated for their commercial quality. The remaining 100 fruits were kept in a chamber at room temperature (18-20 °C), simulating the period of fruit exposure and sale to consumers.
• T14: After 7 days under market conditions (14 days after collection), of the remaining sample of 100 fruits, a third subsample of 50 fruits was randomly extracted, on which the commercial quality parameters were evaluated. The remaining 50 fruits were kept at room temperature (18-20 °C).
• T21: After 7 days (21 days after collection), the last subsample of 50 fruits was evaluated, on which the commercial quality parameters were measured.
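The subsampling scheme described above can be sketched in a few lines of Python. This is only an illustration, assuming that drawing 50 fruits at random at each stage is equivalent to randomly partitioning the 200 labeled fruits up front; the function name `split_sample`, the fruit numbering, and the seed are hypothetical and not part of the study.

```python
import random

# Evaluation stages named in the text (days of storage 0, 7, 14, 21).
STAGES = ("T0", "T7", "T14", "T21")

def split_sample(fruit_ids, stages=STAGES, seed=None):
    """Randomly partition one sample into equal subsamples, one per stage."""
    fruit_ids = list(fruit_ids)
    if len(fruit_ids) % len(stages) != 0:
        raise ValueError("sample size must be divisible by the number of stages")
    rng = random.Random(seed)          # seeded only to make the sketch repeatable
    rng.shuffle(fruit_ids)
    size = len(fruit_ids) // len(stages)
    return {
        stage: fruit_ids[i * size:(i + 1) * size]
        for i, stage in enumerate(stages)
    }

# Hypothetical labels 1..200 for one cultivar, one greenhouse, one month.
subsamples = split_sample(range(1, 201), seed=42)
for stage, ids in subsamples.items():
    print(stage, len(ids))
```

Each fruit ends up in exactly one subsample, so a fruit evaluated at T0 is never re-evaluated at a later stage.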

Data Collection and Laboratory Measurements
The commercial quality was evaluated for each fruit. The storage time at the optimal temperature and the firmness of the stored fruit are the factors with the greatest influence on the marketability probability of tomatoes [19,20]. Therefore, firmness was measured with an AGROSTA 100 digital durometer (Durofel DFT 100, Serqueux, France) designed specifically for fruits and vegetables. The system has a head that measures 2.54 mm in length and is mounted on a precision spring. The harder the measured material is, the further the head is pushed in. The measurement range goes from 0 to 100. A reading of 0 means that the sensor is fully extended, and a reading of 100 means that the sensor is fully inserted. The resolution of the system is 1 graduation and the accuracy is ±1 graduation.
The marketability of the fruit was subjectively evaluated by considering the primary causes of postharvest loss as described by Kader [18] and Valero and Serrano [34], with causes consisting of softening from impact or over-ripe fruits, cracking, cold damage, and rot. These are the primary reasons for claims associated with postharvest problems from distribution companies (wholesalers and retailers) to production and handling companies.

Logistic Regression
Binary logistic regression models are statistical models that can determine the relationship between a dichotomous qualitative dependent variable (yes-no, 0-1, true-false, etc.) and one or more independent explanatory variables, or covariates, whether qualitative or quantitative [32,35].
The logistic regression model in its simplest form (one explanatory variable) is shown in Equation (1) [36]. This regression model was used to analyze the relationship between the days of storage and the marketability of the tomato fruits from each cultivar.
$$\operatorname{logit}[\pi(x)] = \ln\left(\frac{\pi(x)}{1-\pi(x)}\right) = \alpha + \beta x \qquad (1)$$
The logit function transforms the probability scale from the range (0, 1) to (−∞, +∞). The logistic regression formula for an explanatory variable involves the following formula for the probability π(x):
$$\pi(x) = \frac{e^{\alpha+\beta x}}{1+e^{\alpha+\beta x}} \qquad (2)$$
where π(x) is the probability that a tomato fruit is marketable for a value x that can be a categorical variable (e.g., cultivar) or a continuous one (e.g., days of storage (DOS)). The parameters α and β are the intercept and the slope [20,36]. Equation (1) can be generalized to multiple independent continuous and/or categorical explanatory variables. This equation is known as multiple logistic regression, and its expression is as follows [32,33]:
$$\operatorname{logit}[\pi(x)] = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n \qquad (3)$$
where π(x) is the marketability probability of the tomato given a set of n explanatory variables x_i (i = 1,..., n), which can be categorical variables (e.g., cultivar) or continuous ones (e.g., days of storage (DOS)), α is the intercept parameter, and β_i are the slope parameters for the primary effects of each explanatory variable (i = 1,..., n) [19,32]. We used the multiple logistic regression model to evaluate significant predictors of marketability probability for all the explanatory variables considered in the study. Equation (3) was also used to determine the effect of the most influential explanatory variables on the marketability of the tomato cultivars. For the logistic regression analysis, IBM SPSS Statistics Version 23 and Statgraphics Centurion XVII-X64 software were used.
The logistic regression parameters are estimated by finding the parameter values that maximize the likelihood of the observed data. For this purpose, the maximum likelihood estimation method was applied [36].
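As an illustration of maximum likelihood estimation for the one-variable model, the following Python sketch fits α and β by Newton-Raphson iterations on the log-likelihood. It is not the procedure used by SPSS or Statgraphics in this study; the function names are hypothetical, and the marketable counts per storage day are invented solely to have data to fit.

```python
import math

def logistic(alpha, beta, x):
    """Probability of a marketable fruit at x days of storage."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def fit_simple_logistic(xs, ys, max_iter=50, tol=1e-12):
    """Maximum likelihood fit of (alpha, beta) by Newton-Raphson."""
    alpha, beta = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = logistic(alpha, beta, x)
            w = p * (1.0 - p)
            g0 += y - p              # score with respect to alpha
            g1 += (y - p) * x        # score with respect to beta
            h00 += w                 # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det   # solve the 2x2 system by hand
        step_b = (h00 * g1 - h01 * g0) / det
        alpha += step_a
        beta += step_b
        if max(abs(step_a), abs(step_b)) < tol:
            break
    return alpha, beta

# Hypothetical data: marketable fruits out of 50 at each evaluation day.
days = [0, 7, 14, 21]
marketable = [49, 47, 42, 33]
xs, ys = [], []
for d, m in zip(days, marketable):
    xs += [d] * 50
    ys += [1] * m + [0] * (50 - m)

alpha_hat, beta_hat = fit_simple_logistic(xs, ys)
print(alpha_hat, beta_hat)
```

At the maximum, the score equations are satisfied, so the fitted probabilities sum to the observed number of marketable fruits. A negative fitted slope corresponds to marketability declining with storage time.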
The interpretation of a logistic regression model involves odds and odds ratios. For Equation (1), the odds of response 1 (that is, the odds of success) are as follows:
$$\frac{\pi(x)}{1-\pi(x)} = e^{\alpha+\beta x} \qquad (4)$$
$$\frac{\pi(x)}{1-\pi(x)} = e^{\alpha}\left(e^{\beta}\right)^{x} \qquad (5)$$
The interpretation of β in (5) is that for each increase of one unit in x, the odds are multiplied by e^β. That is, the odds at the level x + 1 are equal to the odds at x multiplied by e^β. When β = 0, e^β = 1 and the odds do not change as x changes [32,36]. Odds ratios are measures of association that are widely used, especially in epidemiology [37].
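The multiplicative reading of e^β can be checked numerically. The short Python sketch below uses the odds ratio of 0.816 reported later for DOS in 'Angelle' as the example slope; the intercept value and the function names are hypothetical and chosen only for illustration.

```python
import math

def odds(alpha, beta, x):
    """Odds of a marketable fruit, pi(x) / (1 - pi(x)), at storage day x."""
    return math.exp(alpha + beta * x)

def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds for a one-unit increase in x."""
    return (odds_ratio - 1.0) * 100.0

beta = math.log(0.816)   # slope implied by the reported odds ratio
alpha = 4.0              # hypothetical intercept, not taken from the study

# The ratio of odds one day apart does not depend on alpha or on the day chosen.
ratio = odds(alpha, beta, 8) / odds(alpha, beta, 7)
print(ratio)
print(percent_change_in_odds(0.816))
```

An odds ratio below 1 therefore translates into a percentage decrease in the odds per additional day of storage, while the probability itself changes by a different amount depending on where on the curve the fruit sits.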

Influence of Days in Storage on the Marketability Probability Based on Simple Independent Logistic Regressions for Each Cultivar
The temporal evolution of the marketability probability for the fruits of the cherry tomato cultivars is shown in Figure 2. These models were obtained from the simple logistic regression for each cultivar and adjusted for the days of storage (DOS). The adjusted model predicts the probability that the tomato fruits of each cultivar can be marketed during storage. All the simple logistic regression models were statistically significant (p < 0.05). The regression coefficients, the odds ratios (Exp(β)), and their lower and upper 95% confidence limits were calculated (Figure 2). The marketability probability refers to the global changes in postharvest fruit quality related to the loss of firmness, cracking, rotting, etc. that occur under the experimental storage conditions. Negative regression coefficients show a negative relationship (e.g., Equation (6)), which indicates that as the length of time the fruit remains in storage increases, the marketability probability decreases, as described by other authors [19,20,33]. For example, at 0 days of storage, the marketability probability was 100% for the cultivars 'Angelle', 'Genio', and 'Santyplum', and it reached 99% in 'Dolchettini'. By contrast, after 14 days of storage, the marketability probability was reduced to 97% in the case of 'Santyplum', 90% in 'Dolchettini', 86% in 'Angelle', and 85% in 'Genio' (Figure 2).

Effects of the Cultivar and Days of Storage as Influencing Factors on the Marketability Probability
In EC Regulation No. 182/2011 [38], specific provisions are established for the marketability of certain agricultural products, allowing for a total marketing tolerance of 5% for tomatoes (in number or by weight) in the Extra category and 10% in categories I and II. In practice, a 5% tolerance is usually allowed for the marketing of tomatoes in the European Union. A tolerance level of 5% for noncommercial fruits (95% for commercial fruits) is also included in Figure 2. When comparing the simple logistic regression models for each cultivar with respect to the line representing 95% of marketable fruits, it can be observed that the noncommercial fruits reached 5% after 10 days of storage for 'Genio', 'Angelle', and 'Dolchettini' and at 15 DOS for 'Santyplum' (Figure 2). However, when we performed a multiple analysis based on the continuous variable DOS and the categorical variable cultivar (Figure 3 and Table 1), the 5% tolerance for noncommercial fruits was reached after 10 days in the case of 'Angelle' and 'Genio' and at 13 days for 'Dolchettini' and 'Santyplum'.
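The storage day at which a fitted curve crosses the 95% line follows from inverting the simple logistic model: setting π(x) = 0.95 gives x = (ln(0.95/0.05) − α)/β. The Python sketch below illustrates this; the coefficient values are hypothetical placeholders, not the estimates from Figure 2 or Table 1, and the function names are invented for the example.

```python
import math

def marketable_probability(alpha, beta, x):
    """Fitted probability that a fruit is marketable after x days of storage."""
    z = alpha + beta * x
    return math.exp(z) / (1.0 + math.exp(z))

def days_to_threshold(alpha, beta, threshold=0.95):
    """Storage day at which the fitted probability falls to the threshold."""
    if beta >= 0:
        raise ValueError("a declining curve requires a negative slope")
    return (math.log(threshold / (1.0 - threshold)) - alpha) / beta

# Hypothetical coefficients for one cultivar (assumed, not from the study).
ALPHA, BETA = 5.0, -0.2

day_95 = days_to_threshold(ALPHA, BETA)
print(day_95, marketable_probability(ALPHA, BETA, day_95))
```

With cultivar-specific estimates substituted for the placeholders, the same calculation would reproduce the kind of comparison made against the 5% tolerance line.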
The odds ratios show the decreased marketability probability during the storage of each of the cultivars. The odds ratio of 0.816 for DOS in the cultivar 'Angelle' indicates that for a one-day increase in storage, the odds of marketability decrease by (1 − 0.816) × 100 = 18.4%. In the other cultivars, the decrease was 18.5% for 'Dolchettini', 23.0% for 'Genio', and 32.7% for 'Santyplum'.
The categorical variables included in the model were the tomato cultivars ('Angelle', 'Dolchettini', 'Genio', and 'Santyplum') and the months of the crop cycle during which the fruits were evaluated (November, December, January, and April). The continuous variables considered in the model were the fruit firmness and days of storage (DOS). The high correlative effect of some explanatory variables on the multiple regression models can generate nonsignificant effects in other explanatory variables [19]. In our case, the days of storage (DOS), the firmness, and the months during which the fruits were evaluated (month) were variables with high correlative effects, which, after backward elimination, resulted in a final model that only included variables that were significantly associated with noncommercial fruits. The final model did not include tomato cultivars (cv). This result could be due to the fact that the correlation of the cv with the other variables is significantly lower. Although the model in our study did not include the tomato cultivar variable (cv), there are other variables presenting higher correlations that are interesting (Table 3).
The independent variables DOS, firmness, and month used to predict the marketability probability of the tomato fruits showed a significant fit with the multiple logistic regression model (p < 0.05). The month of collection is an important variable in the model. The month of April is not included in Table 3 because it was used as the reference in the multiple logistic regression model. The odds ratio showed that the odds of marketability in November were 75.239 times higher than those in April when all the other factors were kept fixed. In December and January, the odds of marketability were 125.759 and 12.309 times higher, respectively, than they were in April (Table 3).
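The role of April as the reference month can be illustrated with dummy (indicator) coding. In the Python sketch below, the month coefficients are taken as the natural logarithms of the odds ratios quoted above, which is the standard relationship between a logistic coefficient and its odds ratio; the intercept, the function names, and the `other_terms` argument (standing in for the DOS and firmness contributions) are hypothetical.

```python
import math

# Months that receive an indicator variable; April is the reference level.
MONTHS = ("November", "December", "January")

def month_dummies(month):
    """Indicator coding: all zeros for April, a single 1 otherwise."""
    if month != "April" and month not in MONTHS:
        raise ValueError(f"unknown month: {month}")
    return [1.0 if month == m else 0.0 for m in MONTHS]

def log_odds(intercept, month_coefs, month, other_terms=0.0):
    """Linear predictor of the multiple logistic model for a given month."""
    month_part = sum(b * d for b, d in zip(month_coefs, month_dummies(month)))
    return intercept + month_part + other_terms

# Coefficients implied by the reported odds ratios (beta = ln(odds ratio)).
month_coefs = [math.log(75.239), math.log(125.759), math.log(12.309)]
intercept = -2.0   # hypothetical value, not reported in the text

# With everything else fixed, the odds ratio versus April is exp(beta_month).
or_november = math.exp(
    log_odds(intercept, month_coefs, "November")
    - log_odds(intercept, month_coefs, "April")
)
print(or_november)
```

Because April contributes no indicator term, each month's odds ratio is read directly against it, which is why April has no row of its own in the parameter table.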
Previous studies have applied logistic regression models to postharvest questions in fruits and vegetables. Logistic regression models were considered the best statistical tools for the evaluation of bruising development in tomato cultivars [40]. They have also been used to analyze the effect of storage temperature on the avocado fruit [41], to identify the factors that influence core breakdown in pear fruits [33], to quantify the factors associated with the microbial contamination of the product in the pre- and postharvest phases in 14 types of fruits and vegetables [42], to evaluate the quality of tomatoes subjected to different treatments before and after the harvest [19,20], and to evaluate the loss of quality in chicory associated with the discoloration of the head of the leaves [43]. In these works, the effect of continuous and categorical variables on changes in fruit quality could be identified by logistic regression analysis. In the present work, a similar approach has been applied to identify changes in postharvest quality for different tomato cultivars.

Conclusions
The logistic regression model allows investigators to study and identify tomato cultivars with good attributes in relation to postharvest marketability. The combination of simple and multiple regression analyses of the continuous and categorical variables with the greatest influence on the commercial quality in postharvest tomato fruits can help to determine the behaviors of different cultivars and identify those with the greatest marketability probability. In addition, the logistic model made it possible to determine, for each cultivar, the firmness value below which a fruit would be rejected commercially. The approach of the developed model can be used by companies that are developing new tomato cultivars, by farmers-processors, and by distributors (wholesalers and retailers) since it is based on simple measurement parameters (firmness, cracking, etc.) that represent the primary causes of postharvest loss.
An analysis of the odds ratios can determine whether substituting one cultivar for another improves or worsens the marketability probability. Of the cultivars studied here, 'Santyplum' and 'Dolchettini' had greater marketability than 'Angelle' and 'Genio'. The odds ratios can also determine the marketability probability of the cultivars based on their firmness. In our study, the marketability probability of the 'Santyplum' cultivar had the greatest sensitivity to the fruit firmness.
In a multiple logistic regression based on continuous and categorical explanatory variables, only highly correlated variables are included in the final model. It was found that the firmness, days of storage, and months evaluated are the primary determining factors in tomato marketability. The cultivar variable was not included in the final model because it was weakly correlated with the other variables. Nevertheless, this analysis is interesting in light of the other highly correlated variables.
This type of study can be applied to any type of tomato fruit whose collection and marketability are based on loose fruit. Additionally, it would be interesting to perform these same studies in the context of marketing the complete tomato branch.

Figure 1. Diagram of the interest in and practical application of the logistic regression model in comparative studies of tomato cultivars to identify the cultivars with the longest postharvest commercial lives. π(x) is the probability that a tomato fruit is marketable, x is the number of days of storage, e is Euler's number, α is the intercept, and β is the slope parameter.

Figure 2. Temporal evolution of the marketability probability for the studied cherry tomato cultivars. The results originate from the simple independent logistic model for each cultivar based on the days of storage. The Hosmer and Lemeshow test: p > 0.05.

Table 1. Estimates of logistic regression parameters for cultivars and days of storage (DOS) as influencing factors on the marketability probability.

Table 3. Parameter estimates for the multiple logistic regression model.