Genetic Variability of Oil Palm in Mexico: An Assessment Based on Microsatellite Markers

Abstract: Oil palm (Elaeis guineensis Jacq.) has become the largest source of vegetable oil in the world. It is known that all existing genotypes of this species are related, so their genetic variability is considered to be low. In Mexico, all oil palm plantations are located in the southeast of the country, and they have been established with seeds of different origins, which has caused poor yields and resulted in the need to establish a genetic improvement program. Therefore, in this study, the extent of genetic diversity among 151 oil palm accessions from all producing regions of Mexico was assessed with twenty simple sequence repeat (SSR) and seven random amplified microsatellite (RAM) markers. The markers proved useful in revealing high genetic variability, with a total of 1218 and 708 alleles detected and polymorphic information content (PIC) values of 0.96 and 0.91 for RAM and SSR, respectively. Genetic similarity among all the oil palm accessions collected ranged between 31% and 82%. Accessions from Tabasco and Veracruz presented the greatest and smallest genetic diversity, respectively. These results can allow breeding strategies to be established for the genetic improvement of this crop in Mexico.


Introduction
Oil palm (Elaeis guineensis Jacq.) is a tropical perennial plant that belongs to the Arecaceae family [1] and is native to the coasts of the Gulf of Guinea in West and Central Africa [2]. Although this is a monoecious species, it has been described as "temporally dioecious" because it produces functionally unisexual male and female inflorescences in an alternating cycle on the same plant, resulting in allogamous reproduction [3].
The discovery of palm oil in an Egyptian tomb (West Africa), dating back to 3000 BC, has led to the consideration that palm oil has been used for 5000 years [4]. It is believed that this species passed to the American continent in the 16th century as a result of the slave trade. Despite this, it was not until 1940, when the first plantations were established in Honduras and Costa Rica, that E. guineensis was introduced to Ecuador, Guatemala, Venezuela, Peru and Mexico [5,6].
This perennial crop produces edible oil from fruit mesocarp and kernels [7,8], which has become an increasingly important dietary component for millions of people worldwide. Other by-products have also been derived from this oil and are used mainly in the food, cosmetic, chemical and biofuel industries, among others, which makes it highly attractive and useful for markets [9,10].
This species has mostly been cultivated in Southeast Asia, in Indonesia, Malaysia and Thailand, which account for 59%, 25% and 4% of world production, respectively. These are followed in the ranking by Colombia and Nigeria, each with 2% of worldwide production [11].
In Mexico, the first oil palm plantations were established in 1948 by small producers in the coastal zone of Chiapas. A second introductory stage took place in 1982, when the first 287 hectares were established with seeds originating in Costa Rica, the Ivory Coast and Indonesia. At the beginning of the 1990s, the area reached 2800 hectares. A third stage has been defined since 1996, with the Mexican government establishing a plantation program for the South and Southeastern regions of the country, in the states of Chiapas and Campeche and later in the states of Tabasco and Veracruz [12]. Initially, this crop was promoted to reduce the vegetable oil deficit and to offer a profitable option to Mexican producers [13]. Nowadays, there are large plantations in these four states and a total of 18 extraction plants [10,14]. Consequently, oil palm production has increased significantly, reaching 225,000 MT (metric tons) and placing Mexico 17th in worldwide production in 2021 [10,11]. The whole process in this country focuses mostly on palm oil extraction, not on solid by-products [14].
Worldwide, the development of oil palm cultivation is closely linked to research groups that established, at the beginning of the 20th century, the germplasm collections that formed the genetic base for the improvement of oil palm [2,8]. Crosses between plant materials with different fruit structures, such as pisifera (P) (shell-less: embryo rarely formed) and dura (D) (thick shell: less mesocarp), are used to produce tenera: the D × P hybrid (thin shell: more mesocarp). These tenera planting materials, on which the global oil palm seed industry largely relies, have led to a limited genetic base [15,16].
A determining factor in the significant increase in the yield of this crop was the discovery of the production potential of crosses between the African "origins" (where they were collected or originated) and Deli dura palms, both introduced and improved in Southeast Asia [2,8,17]. Since only local genetic resources were used in these selection programs, the genetic gain achieved did not increase further, and current commercial material has remained mostly of that type [2]. The effective genetic base is, therefore, very narrow. This is a drawback not only for improving yields but especially for developing material that is tolerant to certain important endemic diseases [8].
The area where oil palm production has been established in Mexico presents great contrasts in terms of edapho-climatic conditions and agronomic management. To establish plantations, the most frequently planted varieties were Deli (D) × Avros (P), Deli × Ghana, Deli × Nigeria and Deli × Ekona [18,19], although seeds of Colombian and Costa Rican origin, in addition to some Guatemalan varieties, have also been used [10] in an attempt to find varieties that adapt and exhibit good levels of productivity.
Using conventional breeding strategies in oil palm to improve characteristics such as yield, oil quality, or resistance to biotic and abiotic stresses is slow due to this species' long life cycle and the lack of genetic homozygosity in parental materials used in current breeding [20]. Therefore, the use of molecular markers associated with agronomic characteristics of importance for the selection and improvement of oil palm could be very useful [21]. At the same time, it has been proved that the proper identification and characterization of plant materials is essential for the successful conservation of plant genetic resources and ensuring their sustainable use. Molecular techniques such as DNA barcoding (genetic fingerprinting), random amplified polymorphic DNA (RAPD), microsatellites (SSR) and single nucleotide polymorphisms (SNP) have been used for several years in many studies to reveal the genetic diversity of different plant species [22].
Polymorphisms in the genomes of oil palm (Elaeis guineensis Jacq.) and American oil palm (E. oleifera [H.B.K.] Cortés) have been detected and analyzed in several studies, which have allowed molecular markers to be used, among other purposes, to analyze genetic variability and establish phylogenetic relationships in these species [23–26], to construct genetic linkage maps [27,28], and to localize and track valuable genes [26,28].
To develop a breeding program for oil palm (or for any other crop), it is crucial to have well-characterized germplasm. In the case of this species, it is necessary not only to precisely distinguish the accessions of E. guineensis from those of E. oleifera but also to find the interspecific differences, for which DNA markers are the most accurate method [24,26].
For all the reasons stated above, the objective of this study was to molecularly characterize the genetic variability present in oil palm cultivation in Mexico using microsatellites (SSR) and random amplified microsatellites (RAM). These markers allow the genetic similarity of different palms to be evaluated and analyzed, and could establish the bases for, among other things, monitoring traits of interest in oil palm in the future and implementing a genetic improvement program for this crop.

Plant Material
The collection of oil palm samples was conducted between October and November 2019 in different places in the four states that produce this species in Mexico (Campeche, Chiapas, Tabasco and Veracruz). These locations were selected according to a previous analysis regarding the genetic variation observed in producers for the plant material used.
To carry out the sampling, a young leaf, preferably leaf 17, was taken from adult plants at full production. The material had to be healthy to avoid interference from pathogen DNA in the results.

DNA Isolation
Genomic DNA was extracted from the samples collected following the plant DNA extraction protocol developed at GeMBio [30].
The concentration and purity of the genomic DNA were determined by reading absorbance at 260 and 280 nm in a spectrophotometer (Nanodrop 2000, Madrid, Spain), and DNA was diluted to 20 ng/µL. The quality and integrity of DNA were determined by electrophoresis in a 0.8% (w/v) agarose gel, staining with ethidium bromide and visualization in a UV light transilluminator (Bio-Rad Gel Doc Ez Imager, Hercules, CA, USA).
Additionally, to verify that the oil palm DNA samples were free of inhibitors, PCR reactions were performed with the 16S1/16S2 primers, specific for chloroplasts (endogenous gene), which amplify a region of the 16S ribosomal gene, generating a product of 330 bp [31].

SSR and RAM Markers Analysis
Twenty-one SSR markers reported by Billote et al. [29] were used to determine genetic diversity in oil palm (Table 1). Seven RAM markers previously used in a genetic diversity study on oil palm [32] were also utilized to test the oil palm samples in this work (Table 2).
Table 2. RAM markers [32] used in this study.
All amplifications were performed in a final reaction volume of 25 µL containing 50 ng of template genomic DNA; 1X PCR buffer (20 mM Tris-HCl, pH 8.4 and 50 mM KCl) (Invitrogen, Carlsbad, CA, USA); 1.5 mM MgCl2 (Invitrogen, Carlsbad, CA, USA); 200 µM dNTPs mix (Invitrogen, Carlsbad, CA, USA); 1 U of Taq DNA polymerase (recombinant) (Invitrogen, Carlsbad, CA, USA) and 0.2 µM of each of the primers used (Tables 1 and 2). The amplification conditions included initial denaturation for 1 min at 95 °C, followed by 35 cycles of amplification, each at 94 °C for 30 s, 52 °C for 1 min and 72 °C for 2 min, and a final extension step for 8 min at 72 °C.
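The reaction setup described above implies some simple arithmetic, sketched here in Python. The sketch assumes (this is not stated explicitly in the text) that the 20 ng/µL working dilution served as the template stock; the function and variable names are illustrative only, and the cycling time ignores thermocycler ramping.

```python
# Values taken from the Methods text; names are hypothetical.
WORKING_STOCK_NG_PER_UL = 20.0   # DNA was diluted to 20 ng/µL
TEMPLATE_NG = 50.0               # template DNA per 25 µL reaction


def template_volume_ul(template_ng, stock_ng_per_ul):
    """Volume of DNA stock (µL) that delivers the requested mass of template."""
    return template_ng / stock_ng_per_ul


def programmed_time_min(initial_min, cycles, step_minutes, final_min):
    """Sum of programmed hold times in minutes (ramping between steps excluded)."""
    return initial_min + cycles * sum(step_minutes) + final_min


# Assumption: template is pipetted from the 20 ng/µL dilution.
template_ul = template_volume_ul(TEMPLATE_NG, WORKING_STOCK_NG_PER_UL)

# 1 min at 95 °C; 35 x (30 s at 94 °C, 1 min at 52 °C, 2 min at 72 °C); 8 min at 72 °C
hold_time_min = programmed_time_min(1.0, 35, (0.5, 1.0, 2.0), 8.0)

print(f"Template volume per reaction: {template_ul} µL")
print(f"Programmed hold time: {hold_time_min} min")
```

Under these assumptions, each reaction takes 2.5 µL of template and the program holds for 131.5 min in total.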

All the fragments obtained with the SSR and RAM markers were separated on 6% polyacrylamide gels with a 29:1 acrylamide:bisacrylamide ratio (MilliporeSigma, St. Louis, MO, USA). Gels were run at a constant voltage of 200 V for 2.5 h in 1X Tris-Glycine (TG) buffer (MilliporeSigma, St. Louis, MO, USA). Band sizes were determined by comparison with the 1 Kb Plus DNA molecular size marker (Invitrogen, Carlsbad, CA, USA). Gel staining was carried out using the silver staining technique [33].

Data Analysis
PCR fingerprinting data were scored as discrete variables, using "1" to indicate the presence of a fragment and "0" to indicate its absence. Binary data obtained by scoring the SSR and RAM profiles generated with the different primers, both individually and cumulatively, were employed to construct a similarity matrix using Jaccard's coefficient [34]. The unweighted pair group method with arithmetic mean (UPGMA) was used to generate a dendrogram based on Jaccard's similarity coefficient with the sequential agglomerative hierarchical and nested (SAHN) clustering module of NTSYSpc software, version 2.02e [35]. Principal coordinate analysis (PCoA) was also performed to separate the oil palm accessions. The data were further employed to calculate polymorphism information content (PIC) values according to the equation developed by Anderson et al. [36].
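As an illustration of the scoring and similarity steps described above, the following Python sketch computes Jaccard's coefficient for binary band profiles and a PIC value of the form 1 − Σp², the expression commonly attributed to Anderson et al. The band profiles and frequencies are invented toy values, not data from this study, and the function names are hypothetical; the study itself used NTSYSpc, and how allele or band frequencies were derived for the PIC calculation is not specified in the text.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Jaccard coefficient for two 0/1 band profiles:
    bands present in both / bands present in at least one."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0


def similarity_matrix(profiles):
    """Pairwise Jaccard similarities, keyed by (name_a, name_b) in both orders."""
    names = list(profiles)
    sim = {(n, n): 1.0 for n in names}
    for p, q in combinations(names, 2):
        sim[(p, q)] = sim[(q, p)] = jaccard_similarity(profiles[p], profiles[q])
    return sim


def pic(frequencies):
    """PIC = 1 - sum of squared frequencies of the patterns seen at one marker."""
    return 1.0 - sum(p * p for p in frequencies)


# Toy presence/absence scores for five bands in three hypothetical accessions.
toy_profiles = {
    "P-1": [1, 1, 0, 1, 0],
    "P-2": [1, 0, 0, 1, 1],
    "P-3": [0, 1, 1, 0, 0],
}
toy_sim = similarity_matrix(toy_profiles)
toy_pic = pic([0.5, 0.3, 0.2])  # invented pattern frequencies for one marker

print(toy_sim[("P-1", "P-2")], toy_sim[("P-1", "P-3")], toy_sim[("P-2", "P-3")])
print(round(toy_pic, 2))
```

In this toy case, P-1 and P-2 share two of the four bands present in either profile (similarity 0.5), while P-2 and P-3 share none (similarity 0).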

Elaeis guineensis Plant and DNA Samples
There were 151 collected samples of E. guineensis plants from 19 different localities in the four oil palm producing states in Mexico (Table 3, Figure 1, Table S1). All samples were used for this study. The genomic DNA of each sample was isolated, and their concentrations were in the range of 115.4 to 1036.8 ng/µL, with a mean of 538.96 ng/µL. The purity value of DNAs (A260/A280 ratio) ranged from 1.15 to 2.09.
Additionally, a PCR product of 330 bp was obtained for all samples when they were amplified for the chloroplast 16S ribosomal region, which indicated that there were no inhibitors for PCR reactions in these DNA samples.

SSR and RAM Markers Analysis
Table 4 presents the allelic status and PIC obtained with the molecular markers used. When PCR tests with the 21 microsatellites were conducted to obtain genetic fingerprints of the oil palm samples, the marker mEgCIR0046 was eliminated from the study because some samples did not present amplification. With the remaining 20 SSRs, a PIC value of 0.91 was revealed. The mEgCIR67 marker was the one that generated the most polymorphic information, while the mEgCIR304 and mEgCIR326 primers were monomorphic for all the samples under study. Figures S1 and S2 present the amplification products of the 151 samples obtained with two of the SSR markers with the most polymorphic bands (mEgCIR0476 and mEgCIR0221, respectively). Tests with RAM primers showed an average polymorphism value of 0.96, which was higher than that revealed with the SSR primers. The CA and CT markers were the ones that generated the most polymorphic information in this case.
A dendrogram of the 151 samples under study was generated (Figure 2) with data from the absence-presence matrices of the 27 processed primers (20 SSR + 7 RAM). Dendrogram analysis showed two large clades, where the 151 accessions were split almost in half, with 78 accessions in clade A and 73 in clade B. The most genetically similar accessions (0.82 similarity coefficient) were P-150 and P-151 from the state of Veracruz in clade A (Figure 2A) and P-31 and P-32 (0.81 similarity coefficient) from the state of Campeche in clade B (Figure 2B). Subclade G showed the narrowest range of similarity coefficients (0.58-0.79) and thus grouped the most genetically similar accessions, while subclade F showed the widest range (0.42-0.82) and contained the most different accessions.
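To make the clustering behind such a dendrogram concrete, the sketch below implements a minimal UPGMA on a similarity matrix in plain Python. It is only an illustration: the study used the SAHN module of NTSYSpc, the four accession labels and their similarities are invented, and the function name `upgma` is hypothetical. At each step the two most similar clusters are merged, and the similarity of the new cluster to every other cluster is the size-weighted mean of the two merged clusters' similarities.

```python
def upgma(names, sim):
    """Minimal UPGMA on similarities.

    names: list of labels; sim: dict with sim[(a, b)] for every a listed before b.
    Returns the merge history as (cluster_a, cluster_b, similarity) tuples.
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of members
    link = {
        frozenset((a, b)): sim[(a, b)]
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    merges = []
    while len(sizes) > 1:
        pair = max(link, key=link.get)       # most similar pair of clusters
        a, b = sorted(pair)
        level = link.pop(pair)
        merged = f"({a},{b})"
        for other in sizes:
            if other in (a, b):
                continue
            s_a = link.pop(frozenset((a, other)))
            s_b = link.pop(frozenset((b, other)))
            # Arithmetic mean weighted by cluster size (the "UPGMA" average).
            link[frozenset((merged, other))] = (
                sizes[a] * s_a + sizes[b] * s_b
            ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, level))
    return merges


# Invented Jaccard similarities for four hypothetical accessions.
toy_names = ["A", "B", "C", "D"]
toy_similarities = {
    ("A", "B"): 0.8, ("A", "C"): 0.4, ("A", "D"): 0.3,
    ("B", "C"): 0.5, ("B", "D"): 0.2, ("C", "D"): 0.7,
}
toy_merges = upgma(toy_names, toy_similarities)
for left, right, level in toy_merges:
    print(f"{left} + {right} joined at similarity {level:.2f}")
```

Here A and B join first (0.80), then C and D (0.70), and the two pairs join last at 0.35, the mean of the four between-pair similarities; the join levels correspond to the branch heights read off a dendrogram.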
Most of the accessions grouped in clade A corresponded geographically to plantations established in the northern zone of the states of Veracruz, Tabasco and Campeche. By contrast, those of clade B were more geographically dispersed, although the highest percentage was located in the southern zone, in the state of Chiapas.
Figure 3 shows the principal coordinate analysis (PCoA) of the 151 oil palm accessions based on their molecular marker profiles, which distinguished between two groups, A and B. All accessions were scattered according to their similarity. The first three components explain 44.2% of the variation. In the two-dimensional representation, all subgroups within cluster A appeared very close, but on the three-dimensional plot, subgroups K and J appeared separated from the rest of the subgroups of A. Regarding clade B, the groupings were very similar in both representations. In both representations, the H and M subgroups were far apart, indicating that they were genetically very different.

Discussion
In oil palm-producing countries, the evaluation of all genetic material is an activity that takes place in the field. Considering that this is a perennial crop with a very long cultivation cycle, carrying out genetic improvement in a conventional manner entails studying a large number of plants, generally in extensive areas, which translates into high costs for any program over a considerable period of time. All of this conspires against progress and improvement. However, for several decades now, molecular biology techniques have demonstrated their usefulness in detecting genetic variability between individuals and populations, which makes their application vital in breeding programs for any crop, but above all, for one with the aforementioned characteristics of oil palm [31,36].
Parent selection is one of the most important criteria, if not the most important, when implementing a breeding program, since the efficiency of these programs relies on the selection of contrasting genotypes with high values of genetic distance, which makes it possible to obtain hybrids with high values of heterosis. For this reason, knowledge about the genetic diversity and genetic relationships between the materials to be improved constitutes very valuable information to implement the best improvement strategies in oil palm cultivation [37–39]. Different molecular markers have been used for different purposes in several studies carried out since the 1990s on Elaeis guineensis, many of which have revealed important findings regarding its genetic variability. This has led to the recognition that, with the genetic distance measured by molecular data, promising crosses between oil palm materials can be predicted while also reducing the total number of experimental progenies and related costs [37].
The adequate selection of genetic components associated with the most important traits for oil palm productivity can undoubtedly advance any breeding program for this crop. However, it depends on the genetic background of the existing populations; therefore, information on genetic variability in these populations allows breeders to make important decisions regarding which materials to use as parents [39].
In a study assessing genetic diversity among oil palm parental lines in Nigeria using ten SSR markers, a high score (0.70) was found for genetic diversity [40]. Also in Nigeria, 107 SSR markers were used to examine the genetic diversity and relatedness of 186 palms from an oil palm germplasm, and on average, 8.67 alleles per SSR locus were scored [41]. In another work [42], where genetic diversity among 49 populations from ten African countries, three breeding materials and one semi-wild material was assessed with 16 SSR markers, the average genetic distance among accessions was 0.769, and 209 alleles were detected, accounting for an average of 13.1 alleles per locus. In our work, the SSRs yielded a greater average number of alleles per locus (37), and the range of genetic diversity was wider than those reported in the other studies cited here.
In Colombia [32], a study was carried out to understand the genetic diversity of 51 oil palm genotypes from the Congo; seven RAM markers were used (the same RAM markers used in our work), and a genetic similarity of 0.52 was found. Also, 241 alleles were generated, and the number of polymorphic loci ranged from 14 to 46 for CGA and ACA primers, respectively. The results obtained with these markers in Mexico were somewhat different since the markers that in our case showed higher polymorphisms were CA and CT, with 218 and 203 polymorphic loci, respectively, although GCA and ACA also had higher values (see Table 4).
All these results can be explained by the heterogeneous origin of the plantations in Mexico and the greater number of accessions analyzed in this study.
The principal coordinate analysis showed two groups, A and B, that correspond to the grouping observed in the dendrogram. The H and M subgroups, which were found to be the most genetically different, are represented by accessions from the states of Tabasco and Campeche (H) and Chiapas (M), which might explain this considerable difference.
In general, our study found higher values of genetic variability among accessions, even among several that came from the same locality. A greater number of alleles were also found with the markers used. As previously mentioned, the oil palm populations sampled in Mexico come from very diverse origins; in some cases, they have been planted randomly without taking care to spatially separate those that come from different seeds. All of this could explain the unexpected genetic diversity that has been found in the germplasm currently used in production, hence the yield and quality problems faced by palm oil producers in Mexico.

Conclusions
The oil palm populations sampled in the four producing states of Mexico had a higher-than-expected genetic diversity. The greatest genetic variability was found among plantations of Tabasco state. The SSR and RAM molecular markers were very useful in revealing polymorphisms between accessions. All this information could allow breeders to make important decisions regarding which materials to use for the genetic improvement of this crop in Mexico, as well as, based on the genetic fingerprints obtained, to develop STS-type markers specific for accessions of interest in order to establish a marker-assisted breeding strategy.

Supplementary Materials:
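The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agriculture13091772/s1, Figure S1: Amplification products of 151 samples of oil palm obtained in PCR test with SSR marker mEgCIR0476. MM: molecular marker (1 Kb plus); Figure S2: Amplification products of 151 samples of oil palm obtained in PCR test with SSR marker mEgCIR0221. MM: molecular marker (1 Kb plus); Table S1: Data from location (municipality and state) of 151 oil palm samples collected in Mexico.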

These designations were used for the degenerate sites: H (A or T or C); B (G or T or C); V (G or A or C); Y (C or T) and D (G or A or T).

Figure 1. Geographical location of collection sites of the oil palm samples in the south and southeastern area of Mexico (each ellipse shows the location key used).


Figure 2. SSR and RAM consensus unweighted pair-group arithmetic average (UPGMA) dendrogram based on Jaccard's coefficient showing the relationships between 151 oil palm accessions determined using cumulative data. (A) Clade A; (B) Clade B.


Figure S1. Amplification products of 151 samples of oil palm obtained in PCR test with SSR marker mEgCIR0476. MM: molecular marker (1 Kb plus).
Figure S2. Amplification products of 151 samples of oil palm obtained in PCR test with SSR marker mEgCIR0221. MM: molecular marker (1 Kb plus).

Table 3. Data from the collection of oil palm samples in Mexico: state, municipality and the number of samples obtained.

Table 4. Allelic status and polymorphic information content (PIC) obtained with molecular markers used in this work.

PIC: polymorphic information content; bp: base pair. The average values, in the case of the amplified bands, are rounded to the nearest whole number.