Prediction of Resin Production in Copal Trees ( Bursera spp.) Using a Random Forest Model

: Non-timber forest products (NTFPs) are essential for community development, but their enormous demand has posed a serious threat to trees growing in their natural habitat. Copal resin is one of these products, which has a great deal of religious and ceremonial signiﬁcance in Mexico and around the world. Resin extraction from a tree depends on its morphological and physiological characteristics, as well as its physical and sanitary condition. In this study, a methodology was proposed for determining the yield and health status of Copal trees, and a random forest (RF) model was developed to explain their resin production based on their morphological and condition characteristics. The experiment was conducted in the Agua Escondida watershed in Puebla, Mexico. With the training data, the average accuracy of the model was 99%, with a Kappa index of 98%, which is considered an excellent level of agreement beyond chance, and with the validation data, the average accuracy was 71% and 47%, which is considered a good level of agreement beyond chance. Tree condition was the most important factor affecting resin production in Copal trees, followed by stem diameter (33 and 38 cm), height (2 and 2.5 m), and diameter of secondary branches (from 8 to 15, 22 and 32 cm).


Introduction
Non-timber forest products (NTFPs), also known as non-timber forest benefits, are plant and animal products and services, excluding timber, that are derived from forests harvested by human beings [1]. NTFPs can be classified into a variety of categories, including fibers, gums, waxes, rhizomes, bush soil and resins [2]. NTFPs found in tropical deciduous forests have a great deal of importance for the development of rural communities living in or near the forests, which are continually harvested [2,3]. Tadesse et al. [4] reported that gum Arabic, frankincense, and myrrh are produced from gum and resin from the Leguminoseae and Burseraceae families. Gums such as chicle, latex, rubber, conifer resins for making turpentine, bursera resins for making copal, and linaloe essential oils are the most important NTFPs in Mexico [5]. Copal resin, derived from species of the Bursera genus, is one of the most important NTFPs in this type of vegetation. Among the most important species of this genus in Mexico's tropical deciduous forests, Bursera bipinnata (Sessé & Moc.) Engl. and Bursera copallifera (Sessé & Moc. Ex DC.) Bullock. are the species that produce the most valuable resin for Copal producers [6,7]. Copal is a resin with economic and religious significance. In temples, Copal is burned as a means of purification [6] and is thought of as a Sustainability 2022, 14, 8047 2 of 13 sacred food present in offering rituals [8]. Numerous factors, however, threaten the survival of Copal-producing species within the plant community. For instance, overgrazing reduces the number of seedlings that can replace older trees [9], and the demand for product leads to ignoring the extraction and rest times necessary for trees to recover after harvest [6]. The extraction method is another possible cause of the decline in natural populations, which can have ecological and economic consequences. The process of cutting involves making a series of V-shaped incisions in the bark of the trunk and in the thick branches of the tree, which creates an environment conducive to the development of pests and diseases [10]. Mexican Official Standards for Copal resin harvesting call for management plans that include an inventory and yield calculation [11]. The trees are usually found on communal properties where many extractors gather; thus, there is no formal organization, and producers do not follow sustainable practices. No studies specify how much resin can be extracted from a tree based on its morphological and physiological characteristics [12]. In the region of Izúcar de Matamoros, Puebla, as well as in many other parts of Mexico and the world, Copal continues to be a significant element of religious ceremonies, especially for the Day of the Dead, when people who have died are prayed for and honored, but its excessive demand has caused serious degradation to its natural environment [6].
Non-parametric methods are attractive for classification, since they allow one to use arbitrary data distributions without assuming that the shapes of the underlying densities are known [13]. Random forest (RF) is a nonparametric supervised machine learning classification algorithm [14] that is highly accurate and robust to noise [14][15][16]. The RF method is capable of evaluating complicated interactions between variables because it automatically selects the most significant variables, regardless of the number of variables initially used [14,17,18]. The configuration of an RF is straightforward in comparison to other non-parametric methods, as no complex parameter settings are required.
This study aimed at: (1) proposing factors and indicators to estimate Copal tree condition and resin production guiding by the structure of Webster's Guide for the evaluation of urban trees [19]; and (2) identifying which morphological and condition variables of Copal trees affect resin production via random forest classification.

Study Area
The study area was the Agua Escondida watershed located in the municipalities of Izúcar de Matamoros, Epatlán, Ahuatlán and Xochiltepec in the Puebla state ( Figure 1), with elevations from 1100 to 1600 m and an area of 6492 ha as well as a temperature range from 19 to 25 • C and a precipitation range from 700 to 900 mm per year. The predominant soils within the watershed are leptosol, vertisol and regosol, with two main climates: warm subhumid with summer rains and semi-warm sub-humid with summer rains. Primary land use is agriculture, and the predominant vegetation type is low deciduous forest [20][21][22].
Experimental site. The experimental site was the Ejido San Juan Raboso, municipality of Izucar de Matamoros, located in the region of Cerro de la Virgen, within the Agua Escondida watershed (Figure 1), with an elevation 1282 m and an area of 30.8 ha. The predominant soils are leptosol and vertisol. The main climate is warm sub-humid with summer rains and a temperature range of 18 to 26 • C and precipitation range of 700 to 900 mm per year [22]. Tropical deciduous forest is predominant in this area, typical of an area with two distinct seasons of moisture availability, rainy and dry [23], as well as species that lose their leaves in the dry season, and others with thorny and corky protuberances with sparse and open crowns [24].

Response Variable
Resin production. For the selection of Copal trees for resin extraction, the amount of resin produced is the primary criterion; however, there is no established method to estimate this quantity. According to Copal growers in this region and other regions [8,25], morphological characteristics affect resin production. Copal resin production was evaluated using the factors and indicators listed in Table 1. Based on the experimental knowledge of Copal producers as well as the scientific knowledge of the researchers, factors and indicators were developed guiding by the structure of Webster's Guide for the evaluation of urban trees [19]. Resin production was categorized as high, medium, or low according to the sum of the main diameter, height, diameter of secondary branches 1 and 2, and condition of each Copal tree ( Table 1). Each of the three production categories, high, medium, and low, was coded as a dummy variable of 2, 1, or 0.

Response Variable
Resin production. For the selection of Copal trees for resin extraction, the amount of resin produced is the primary criterion; however, there is no established method to estimate this quantity. According to Copal growers in this region and other regions [8,25], morphological characteristics affect resin production. Copal resin production was evaluated using the factors and indicators listed in Table 1. Based on the experimental knowledge of Copal producers as well as the scientific knowledge of the researchers, factors and indicators were developed guiding by the structure of Webster's Guide for the evaluation of urban trees [19]. Resin production was categorized as high, medium, or low according to the sum of the main diameter, height, diameter of secondary branches 1 and 2, and condition of each Copal tree ( Table 1). Each of the three production categories, high, medium, and low, was coded as a dummy variable of 2, 1, or 0.

Explanatory Variables
Condition. One of the most important criteria used by coppice growers to select trees for harvest is their condition. The condition of the tree is defined by many structural factors, such as its appearance, age and condition of the stem [25]. Based on the factors and indicators proposed in Table 1, the condition of Copal trees was evaluated. Factors and indicators were developed based on the experimental knowledge of Copal producers and the scientific knowledge of the researchers guiding by the structure of Webster's Guide for the evaluation of urban trees [19]. The condition of the trees was assessed in terms of good, regular, or unsatisfactory based on the sum of the indicators, such as the condition of the stembark, branch elongation, branching structure, and damage caused by pests, diseases, and epiphytes, as well as by the percentage of canopy cover ( Table 2). Each of the three condition categories, good, regular, and unsatisfactory, was coded as a dummy variable as 2, 1, or 0. Main stem diameter. One of the most important morphological characteristics of Copal trees affecting the amount of resin extracted is the diameter of the main stem. Smaller diameters than 10 cm cannot be used for extraction due to their low yield, and their exploitation would reduce their production capacity in the future [25,26]. Its code in the model was diameter.
Diameter of secondary branches 1 and 2. Another important morphological characteristic for determining Copal resin content is the diameter of the secondary branches [25,27]. Their codes in the model were dia1 and dia2, respectively.
Total tree height. Copal resin production may also be influenced by its height [25]. Its code in the model was alt.

Measurement of Variables and Database
Data on the vegetation in the study area were collected using simple random sampling. The sampling unit consisted of a quadrant of 10 square meters [28]. A total of 3097 sampling units (SU) was estimated in the area of 30.8 ha. Within the study area ( Figure 2), 100 Copal sampling units were located where their geographic coordinates were recorded, their current condition was evaluated, and their morphological variables of total height, stem diameter, and diameter of two secondary branches were measured [25,29].
Sustainability 2022, 14, x FOR PEER REVIEW 5 of 13 current condition was evaluated, and their morphological variables of total height, stem diameter, and diameter of two secondary branches were measured [25,29]. Geographic coordinates for each tree were determined by using a global positioning device. Each tree was examined by experienced Copal growers based on factors such as bark condition, branch structure, damage caused by pests, diseases, and epiphytes, percent canopy cover, and branch elongation as measured with a diametric tape ( Table 2). Due to their irregular height, the diameter of the stem was measured below the secondary branches using a diametric tape. In many instances, the diameter was the same as the diameter at breast height at 1.30 m above ground level. The diameter of the secondary branches was measured using a diametric tape (diameter of branch 1 and diameter of branch 2). Although some trees had more than two secondary branches, only two were measured, which are most commonly used by Copal producers for extraction of resin. With a Haga-type hypsometer, the height of the tree was measured from ground level to the apex of the canopy. For estimating resin production by each tree, morphology and condition were observed in the field (Table 1).

Variable Selection
Multicollinearity occurs when two or more predictor variables are closely correlated, which results in less precise estimates of the dependent variable due to the effect of an independent variable on the dependent variable, as opposed to when the independent variables are uncorrelated [30]. Multicollinearity between Copal explanatory variables was assessed with Pearson's linear correlation coefficient to evaluate the association between variables [31] and with the variance inflation factor (VIF) to estimate how the variance increases when predictors are correlated [32]. With the Corrplot [33] and Usdm [34] packages of the R statistical language [35], Pearson's linear correlation and VIF between predictor variables were verified with the objective of avoiding multicollinearity effects and maximizing the contributions of variables [36]. Pearson's correlation coefficients, | |, > ~0.70 [37] and variables with a VIF greater than four were removed from the model because they represent a significant correlation between them [38].

Decision Tree
This paper presents the development and validation of a random forest classification model to explain Copal resin production as a function of both morphological and environmental variables, in Ejido San Juan Raboso, located in the Agua Escondida watershed. Geographic coordinates for each tree were determined by using a global positioning device. Each tree was examined by experienced Copal growers based on factors such as bark condition, branch structure, damage caused by pests, diseases, and epiphytes, percent canopy cover, and branch elongation as measured with a diametric tape ( Table 2). Due to their irregular height, the diameter of the stem was measured below the secondary branches using a diametric tape. In many instances, the diameter was the same as the diameter at breast height at 1.30 m above ground level. The diameter of the secondary branches was measured using a diametric tape (diameter of branch 1 and diameter of branch 2). Although some trees had more than two secondary branches, only two were measured, which are most commonly used by Copal producers for extraction of resin. With a Haga-type hypsometer, the height of the tree was measured from ground level to the apex of the canopy. For estimating resin production by each tree, morphology and condition were observed in the field (Table 1).

Variable Selection
Multicollinearity occurs when two or more predictor variables are closely correlated, which results in less precise estimates of the dependent variable due to the effect of an independent variable on the dependent variable, as opposed to when the independent variables are uncorrelated [30]. Multicollinearity between Copal explanatory variables was assessed with Pearson's linear correlation coefficient r to evaluate the association between variables [31] and with the variance inflation factor (VIF) to estimate how the variance increases when predictors are correlated [32]. With the Corrplot [33] and Usdm [34] packages of the R statistical language [35], Pearson's linear correlation and VIF between predictor variables were verified with the objective of avoiding multicollinearity effects and maximizing the contributions of variables [36]. Pearson's correlation coefficients, |r|, > ∼ 0.70 [37] and variables with a VIF greater than four were removed from the model because they represent a significant correlation between them [38].

Decision Tree
This paper presents the development and validation of a random forest classification model to explain Copal resin production as a function of both morphological and environmental variables, in Ejido San Juan Raboso, located in the Agua Escondida watershed.

Random Forest
RF is a nonparametric supervised machine learning classification algorithm that uses decision trees to identify predictive variables from a portion of the data used to train the model and the remainder to estimate the model's performance [14]. A predictor variable is considered significant in a classification model if, by omitting it, the out of bag error (OOB) is increased. The model creates many classification trees, each based on bootstrap samples to improve the predictive ability of the model [39]. For RFs, each split of a tree is determined by using a random subset of normalized input factors at each node. Model output is the average of all trees [40]. The RF was run with 75% of the data for training, and the number of trees in the forest (ntree) and number of randomly sampled variables as candidates for each split (mtry) were determined a priori. These parameters were optimized to minimize the model generalization error (OOB) in order to obtain the best predictive power. Based on a set of starting data, the algorithm combines many decision trees by choosing explanatory variables ( Table 2) at each node of the tree [41]. Twenty-five percent of the remaining data was used for model validation, which is the most important component of any modeling process to ensure that the results are suitable for scientific studies [42]. Model prediction was performed using the random forest package [43] in R software [35].

Accuracy Assessment
Based on the observational errors of the training dataset that are left out during the process of building each tree, the performance of each model was estimated through the OOB method that is used to evaluate the accuracy of a random forest. The estimate is based on the prediction that you obtain for each data point by averaging only those trees for which the record was not included in the training set [15]. The predictions were validated using the confusion matrix [44] and the Kappa concordance index [45,46]. The overall prediction accuracy is the proportion of correct predictions made by the model for a given class and is calculated as the sum of the total number of correct predictions divided by the total number of predictions made. Kappa accuracy, also known as the Kappa concordance coefficient, is a metric that is used to assess the expected error rate, and it is calculated by subtracting the expected accuracy (E) from the observed accuracy (O) and dividing that by 1 minus the expected accuracy,

Identification of Significant Variables
The mean decrease Gini coefficient (MDG) and mean decrease accuracy (MDA) were used to determine the relative importance of the variables [48]. The first measure is the degree of homogeneity between the nodes and leaves of the classification tree, which means that variables with higher values are more useful in identifying well-differentiated classes (measures the variance between classes of the response variable). The second measure measures the loss of accuracy of the RF as a result of removing a certain variable from the classification. Thus, variables with high values are more important for the classification of the data [49]. For each tree, the two measures of importance are computed separately and are averaged over the entire forest [16,43,49].

Partial Dependence Plot (PDP)
PDP indicates the marginal influence of one or two features on the predicted outcome of a machine learning model [50] and can be used to determine whether the relationship between a target and a feature is linear, monotonic, or more complex.

Database and Variable Selection
The present study examined a total of 100 Copal trees, recorded their geographic coordinates, analyzed their current status, and measured three morphological variables: total height, stem diameter, and diameter of two secondary branches. Four continuous variables (stamen diameter, diameter of secondary branch 1, diameter of secondary branch 2, and tree height) and one categorical variable (condition, Table 2) were used as predictors. Production was considered as a response variable with three levels (Table 1). Based on Sustainability 2022, 14, 8047 7 of 13 linear correlation, r, and VIF, the variable secondary branch diameter 1 was excluded from the subsequent modeling process due to its high correlation.

Random Forest
The two key hyperparameters, the number of trees (ntree) and the number of variables in each split (mtry) were determined using the exhaustive method. The initial set of data consisted of 500 trees (ntree) and two variables at every node (mtry), which were determined using the square root of the total number of predictor variables √ 4 variables = 2 . RF model parameters mtry and ntree were optimized in order to maximize the predictive accuracy of the estimation. Seventy-six percent of the field data (n = 76) was used to evaluate model performance. The OOB estimate for the RF model error rate was 30.67 percent. Using the exhaustive method, the number of variables (mtry) in the default RF model were assigned from one to four, and the other parameters were fixed to obtain the minimum error rate. Using the same method, ntree was taken to the transversal algorithm by setting the value of mtry. Figure 3 illustrates the results and processes.
variables (stamen diameter, diameter of secondary branch 1, diameter of secondary branch 2, and tree height) and one categorical variable (condition, Table 2) were used as predictors. Production was considered as a response variable with three levels ( Table 1). Based on linear correlation, r, and VIF, the variable secondary branch diameter 1 was excluded from the subsequent modeling process due to its high correlation.

Random Forest
The two key hyperparameters, the number of trees (ntree) and the number of variables in each split (mtry) were determined using the exhaustive method. The initial set of data consisted of 500 trees (ntree) and two variables at every node (mtry), which were determined using the square root of the total number of predictor variables √4 variables 2 . RF model parameters mtry and ntree were optimized in order to maximize the predictive accuracy of the estimation. Seventy-six percent of the field data (n = 76) was used to evaluate model performance. The OOB estimate for the RF model error rate was 30.67 percent. Using the exhaustive method, the number of variables (mtry) in the default RF model were assigned from one to four, and the other parameters were fixed to obtain the minimum error rate. Using the same method, ntree was taken to the transversal algorithm by setting the value of mtry. Figures 3 illustrates the results and processes. A set of parameter combinations was tested across five ntree levels (50, 100, 400, 500, and 1000) and four mtry levels (1 to 4). For stable results, the maximum value of ntree was 1000 [14,51]. The optimum setting of the parameters was mtry = 1 and ntree = 400, resulting in a model error rate of 28.05 percent (Figure 3).

Accuracy Assessment
The chosen model adequately classified 99% of the three resin production classes of Copal trees and achieved a Kappa index of 98% with the training data, 71% and 47% with the validation data, as well as the lowest OOB with 28% with the lowest number of classification trees at 400. The confusion matrices of the final model are shown in Table 3 for the training and validation sets.
From the training data, 100%, 100% and 97% of the total number of trees with high, low and medium resin production were correctly classified. Likewise, 100%, 99% and 100% of trees that did not correspond to high, low and medium resin production were correctly classified. From the validation data, 90%, 0%, and 62% of the total number of predicted trees with high, low and medium resin production were correctly classified by A set of parameter combinations was tested across five ntree levels (50, 100, 400, 500, and 1000) and four mtry levels (1 to 4). For stable results, the maximum value of ntree was 1000 [14,51]. The optimum setting of the parameters was mtry = 1 and ntree = 400, resulting in a model error rate of 28.05 percent (Figure 3).

Accuracy Assessment
The chosen model adequately classified 99% of the three resin production classes of Copal trees and achieved a Kappa index of 98% with the training data, 71% and 47% with the validation data, as well as the lowest OOB with 28% with the lowest number of classification trees at 400. The confusion matrices of the final model are shown in Table 3 for the training and validation sets. From the training data, 100%, 100% and 97% of the total number of trees with high, low and medium resin production were correctly classified. Likewise, 100%, 99% and 100% of trees that did not correspond to high, low and medium resin production were correctly classified. From the validation data, 90%, 0%, and 62% of the total number of predicted trees with high, low and medium resin production were correctly classified by the model, while 71%, 96%, and 80% of trees that did not correspond with high, medium, and low resin production were correctly classified by the model. Figure 4 illustrates the relative contribution of the four most important predictor variables according to the mean decrease in precision and the mean decrease in the Gini coefficient, obtained by following the OOB minimum error principle.  Figure 4 illustrates the relative contribution of the four most important predictor variables according to the mean decrease in precision and the mean decrease in the Gini coefficient, obtained by following the OOB minimum error principle.

Identification of Significant Variables
Considering the mean decrease in precision, the condition variable is the most important variable within the model (the selection of this variable is indicative of the largest change in model precision), followed by tree diameter, tree height, and secondary branch diameter 2. MDG indicated that stem diameter was the variable with the highest homogeneity of data, followed by branch diameter 2, tree height, and condition.

Partial Dependence Plot
Using the RF machine learning model, the partial dependencies of resin production on Copal tree morphological and condition variables were analyzed. Figure 5 demonstrates that the condition of the Copal trees determined the maximum yield of resin in the low category, while the maximum yield in the medium category was obtained when the trees were in good or regular condition and, most of the time, had a stem diameter of 33 cm, a total height between 2 and 2.50 m, and secondary branch diameters between 8 and 15 cm. The same graph shows that the highest production of high grade Copal resin occurs when the Copal tree is in good condition and reaches a diameter of 38 cm at the stem, a total height of 2 and 2.50 m, and a diameter of 22 and 32 cm at its secondary branches. Considering the mean decrease in precision, the condition variable is the most important variable within the model (the selection of this variable is indicative of the largest change in model precision), followed by tree diameter, tree height, and secondary branch diameter 2. MDG indicated that stem diameter was the variable with the highest homogeneity of data, followed by branch diameter 2, tree height, and condition.

Partial Dependence Plot
Using the RF machine learning model, the partial dependencies of resin production on Copal tree morphological and condition variables were analyzed. Figure 5 demonstrates that the condition of the Copal trees determined the maximum yield of resin in the low category, while the maximum yield in the medium category was obtained when the trees were in good or regular condition and, most of the time, had a stem diameter of 33 cm, a total height between 2 and 2.50 m, and secondary branch diameters between 8 and 15 cm. The same graph shows that the highest production of high grade Copal resin occurs when the Copal tree is in good condition and reaches a diameter of 38 cm at the stem, a total height of 2 and 2.50 m, and a diameter of 22 and 32 cm at its secondary branches.

Discussion
Accuracy assessment. The RF model with training data (99%) had a higher prediction accuracy than the RF model with validation data (71%), but both models correctly predicted resin production in Copal trees. Based on the Kappa index, the model produced an excellent classification of resin production classes in Copal trees using training data and a good classification using validation data [47].
Identification of significant variables. One of the variables that affected resin production in the model was condition, which was defined by morphological characteristics of the trees such as robustness and health [6,25]. In the study, one of the indicators of the tree's condition was the amount of bark present in the stem and the resin cavities. These indicators are affected by the age of the tree, resulting in a reduction in the number and diameter of resin ducts, thus affecting resin production [52]. A second indicator is the damage caused by pests and diseases, which reduce photosynthesis and cause water stress [53]. This leads to a decrease in resin production and reduces the useful life and quality of the trees [54]. Copal growers possess a high level of expertise in identifying the state of the tree and selecting those with resin extraction potential by observing the bark, branching, leaf cover, and even the aroma of the leaves, as well as by identifying the resin ducts present through past cuts [25]. The diameter of the main stem was the second most important variable. Larger diameter stems provide a greater harvest area [25] but produce less resin because it accumulates mainly in the periderm, secondary phloem, and bark, the latter only when the stem is young. During tree growth, the accumulation of resin in the bark decreases, resulting in a reduction of resin-producing sources, which can reduce overall production [55]. Bursera bipinnata (Sessé & Moc.) Engl., Bursera jorullensis (HBK) Engl., and Bursera glabrifolia (HBK) Engl. produce little resin when their diameters are smaller than 20 cm or larger than 50 cm. Exploiting trees with diameters less than 20 cm may reduce their future productivity [9,56,57]. Furthermore, there is evidence that the resin yield in Pinus species is higher in trees with larger diameters as compared to those with smaller diameters [29,58]. The third most important variable was height. Abad-Fitz et al. [25] found that Bursera bipinnata (Sessé & Moc.) Engl. produced little resin in trees up to 6 m tall compared with other Burseraceae species. In Pinus oocarpa Schiede ex Schltdl.

Discussion
Accuracy assessment. The RF model with training data (99%) had a higher prediction accuracy than the RF model with validation data (71%), but both models correctly predicted resin production in Copal trees. Based on the Kappa index, the model produced an excellent classification of resin production classes in Copal trees using training data and a good classification using validation data [47].
Identification of significant variables. One of the variables that affected resin production in the model was condition, which was defined by morphological characteristics of the trees such as robustness and health [6,25]. In the study, one of the indicators of the tree's condition was the amount of bark present in the stem and the resin cavities. These indicators are affected by the age of the tree, resulting in a reduction in the number and diameter of resin ducts, thus affecting resin production [52]. A second indicator is the damage caused by pests and diseases, which reduce photosynthesis and cause water stress [53]. This leads to a decrease in resin production and reduces the useful life and quality of the trees [54]. Copal growers possess a high level of expertise in identifying the state of the tree and selecting those with resin extraction potential by observing the bark, branching, leaf cover, and even the aroma of the leaves, as well as by identifying the resin ducts present through past cuts [25]. The diameter of the main stem was the second most important variable. Larger diameter stems provide a greater harvest area [25] but produce less resin because it accumulates mainly in the periderm, secondary phloem, and bark, the latter only when the stem is young. During tree growth, the accumulation of resin in the bark decreases, resulting in a reduction of resin-producing sources, which can reduce overall production [55]. Bursera bipinnata (Sessé & Moc.) Engl., Bursera jorullensis (HBK) Engl., and Bursera glabrifolia (HBK) Engl. produce little resin when their diameters are smaller than 20 cm or larger than 50 cm. Exploiting trees with diameters less than 20 cm may reduce their future productivity [9,56,57]. Furthermore, there is evidence that the resin yield in Pinus species is higher in trees with larger diameters as compared to those with smaller diameters [29,58]. The third most important variable was height. Abad-Fitz et al. [25] found that Bursera bipinnata (Sessé & Moc.) Engl. produced little resin in trees up to 6 m tall compared with other Burseraceae species. In Pinus oocarpa Schiede ex Schltdl. species, total tree height is not a determining factor in resin production [29], but it influences latex production in Hevea brasiliensis Müll. Arg. [59], although the variable of greatest interest is the diameter of the main stem. Finally, the diameter of secondary branches was the fourth most important variable.
Montúfar [8] reports that resin is extracted from the thickest branches of Copal trees, whereas in Pinus oocarpa Schiede ex Schltdl., the diameter of secondary branches does not matter, but trees with a high number of secondary branches do since they may be the most productive trees [29]. Concerning the use of incense (Clusia vel. sp. nov.), there is no information available on the influence of the diameter of secondary branches on resin production, but there is information on their importance, primarily in those located closer to the stem [27].
Partial Dependence Plot. The low resin production of Copal trees was determined solely by their unsatisfactory condition [25]. The condition of the trees was also a determining factor in resin production in the medium category, but it had a positive impact [25]. The regular condition of the trees resulted in a regular resin production in trees with stem diameters of 33 cm and heights of 2 and 2.50 m. This means that tree growth was controlled by tree diameter and height [55,57], since trees initially consumed more energy and nutrients, but as they matured, demands decreased; thus, the trees had more energy and nutrients to produce resin [55] to cope with different biotic and abiotic disturbances [60]. The mean resin production of secondary branches with diameters between 8 and 15 cm also increased, but not in other branches, as the resin concentration in the main stem was higher [55]. High category resin production was also affected by the condition of the trees [25]. As a result of the good condition of the trees, high resin production was observed in trees with stem diameters of approximately 38 cm and heights of between 2 and 2.5 m, diameters and heights that are within the optimal ranges for maximum resin production [25,29,55,57,58]. The resin production in the high category also showed an increased production rate when the diameter of the secondary branch was larger, specifically 22 and 32 cm. A similar high degree of influence was reported by Montúfar [8], who reported that Copal extraction is most successful when it is performed on the thickest branches.
As a result, Copal harvesting should move from its natural system to a sustainable system, such as agroforestry, in order to maximize resin production through management practices such as selection, propagation, and constant monitoring in order to preserve tropical deciduous forests [9]. Management practices for Copal populations are key to ensuring maximum resin production; thus, trees that do not meet the ideal characteristics should be removed. Furthermore, it is important to renew the Copal population by transplanting or seeding, as well as to maintain it by eliminating neighboring plants to increase canopy space and by suppressing epiphytes in order to accelerate the growth of the plants [25].

Conclusions
The present research revealed the most important morphological and physiological parameters that influence resin production in Copal trees, Bursera bipinnata (Sessé & Moc.) Engl. and Bursera copallifera (Sessé & Moc. Ex DC.) Bullock. As a result of the construction and evaluation of the random forest (RF) model, a new method has been developed to explain and predict Copal resin production in terms of their condition and morphological variables. The random forest classification algorithm achieved good overall accuracy and Kappa concordance index in both the training and validation data. Different levels of Copal resin production were determined primarily by the condition variable, followed by stem diameter, height, and diameter of secondary branches. This study determined optimal stem diameters of 33 and 38 cm, heights of 2 and 2.5 m, and secondary branch diameters of 8 to 15, 22, and 32 cm to achieve optimal resin production at different levels. This research helps to document the traditional management techniques of resin species in Mexico to contribute to forest conservation and to maintain the livelihoods of the people. However, Copal harvesting should be moved from its natural environment to a sustainable system, such as agroforestry, in order to maximize resin production by using management practices and in order to preserve tropical deciduous forests.