Biomorphological Characterization of Brazilian Capsicum chinense Jacq. Germplasm

Loss of biodiversity and ecosystem degradation have become major concerns worldwide, making conservation an important strategy for maintaining biodiversity. Capsicum chinense Jacq. is the most Brazilian species of the genus, with representatives in different biomes. Anthropic pressure, such as burns, real estate speculation, and changing cultivation habits, has led to risks of genetic erosion. Conservation and characterization of conserved accessions are paramount to ensure genetic diversity, useful for the bioeconomy and for genetic improvement. We report the characterization of 55 C. chinense accessions from four different regions of Brazil and one accession from Peru. The accessions were characterized based on 37 morpho-agronomic variables and on Inter Simple Sequence Repeat (ISSR) and Simple Sequence Repeat (SSR) markers. Qualitative descriptors were analyzed using descriptive statistics, while the quantitative descriptors were analyzed via the F test, and significant differences in mean values were separated using the Scott-Knott test. The relative contribution of each quantitative trait was determined. A correlation between morphological and molecular distances was calculated. Color of ripe fruit and fruit shape had the largest number of observed classes. Six distinct groups were formed, and the joint analysis presented an entanglement of 0.58, evidencing the divergence of accessions between the groups of both dendrograms. Bayesian analysis allowed the distinction of two clusters for SSR. Significant variability was observed among the accessions, which have the potential to integrate several breeding programs.


Phenotyping
The 55 C. chinense accessions from four Brazilian regions and one accession from Peru (Supplementary Table S1) are part of the Capsicum spp. gene bank of the Universidade Estadual do Norte Fluminense Darcy Ribeiro (UENF), located in Campos dos Goytacazes, RJ, Brazil. Seeds from each accession were sown in 128-cell polystyrene trays containing commercial substrate. Three seeds were placed per cell and, after germination and seedling growth, two seedlings from each repetition were individually transferred to five-liter pots containing a mixture of soil, sand, and manure (1:1:1 ratio), with subsequent thinning. The seeds used had been stored for an average of 4 years and were all from cultivated varieties. The plants were kept in a greenhouse located in the experimental area of the same institution. There was no humidity or temperature control inside the greenhouse, and irrigation was done once a day. The experiment was conducted in a completely randomized design with five repetitions, each consisting of one plant, and the plants were grown following the practices recommended for chili pepper cultivation. A digital thermometer recorded the temperature and humidity conditions of the greenhouse. The irrigation shifts varied according to the plants' needs at different stages of development.
The accessions were characterized based on 37 morpho-agronomic variables (Supplementary Tables S2 and S3), sixteen quantitative and twenty-one qualitative, following the descriptors of Bioversity International (formerly International Plant Genetic Resources Institute, IPGRI, 1995) [38]. Using this approach, other researchers and gene bank curators can compare and easily identify traits of common interest in our collection. All phenotypic information generated in this study is being used to populate a platform that shares gene bank information with other researchers.

Genotyping
Genomic DNA was isolated from pools of fresh young leaves from five individuals per accession according to the Doyle and Doyle (1990) [39] protocol with modifications [40]. The quality and integrity of the genomic DNA were checked by electrophoresis on a 1% agarose gel before ISSR and SSR analysis.
For ISSR analysis, 35 ISSR primers were preliminarily evaluated to determine optimal amplification conditions (annealing temperature and cycling time). Eighteen of the thirty-five primers produced profiles of well-separated fragments and were selected for molecular characterization of the C. chinense accessions. The primer sequences and melting temperatures (Tm) of the ISSR markers are listed in Supplementary Table S4. PCR amplification of 10 ng DNA (2 µL) was performed in a 13 µL reaction containing 1.5 µL dNTP mix (0.2 mM), 1.0 µL MgCl2 (1.9 mM), 1.3 µL 1X PCR buffer, 0.12 µL Taq DNA polymerase (0.6 U), and 1 µL of each primer (5 µM), forward and reverse. The cycling conditions were 5 min at 94 °C, followed by 35 cycles of 1 min at 94 °C, 1 min at 48–52 °C (temperature previously determined for each primer), and 3 min at 72 °C, with a final extension of 7 min at 72 °C. PCR products were separated by electrophoresis in a 2% agarose gel, stained with 6 µL of a 1:1 mix of GelRed and Blue Juice, visualized on a transilluminator under UV illumination, and photographed using the MiniBis Pro documentation system (Bio-Imaging Systems). The ISSR results were interpreted as presence or absence of bands and expressed in a binary matrix.
For SSR analysis, 17 SSR primers were selected from a set of 47 SSR primers previously developed and optimized for C. annuum [41]. The screening was based on the amplification of well-separated fragments at different annealing temperatures (56–66 °C). The primer sequences and melting temperatures (Tm) of the SSR markers are listed in Supplementary Table S5. PCR was performed as described above. The amplified products were separated by electrophoresis in a 4% MetaPhor agarose gel, stained with 6 µL of a 1:1 mix of GelRed and Blue Juice, visualized on a transilluminator under UV illumination, and photographed using the MiniBis Pro documentation system (Bio-Imaging Systems). MetaPhor agarose produces a high-resolution gel that allows the visualization of fragments between 20 and 800 bp. According to the manufacturer, "MetaPhor agarose (FMC or Cambrex Corporation, East Rutherford, NJ, USA) is an intermediate melting temperature agarose (75 °C) that provides twice the resolution capabilities of the finest sieving agarose products. Using submarine gel electrophoresis, MetaPhor agarose gives high resolution separation of 20 to 800 bp DNA fragments that differ in size by 2%, which approximates the resolution of polyacrylamide gels". The SSR results were converted into a numeric code per locus for each allele. With the polymorphic loci, an array of numerical data was constructed, assigning values from 1 to the maximum number of alleles per locus.

Data Analysis
Qualitative descriptors were analyzed using descriptive statistics, while the quantitative descriptors were analyzed via the F test in the analysis of variance (ANOVA), and significant differences in mean values were separated using the Scott-Knott test at α = 0.05. The relative contribution of each quantitative trait to the phenotypic divergence was calculated using the method proposed by Singh [42]. The estimates of Pearson's correlation coefficients were obtained from the means across repetitions, with the characters combined two by two. Statistical analyses were performed with GENES software [43]. The values of the 37 morpho-agronomic variables were used for Ward clustering analysis based on the Gower distance [44]. For this analysis, we used R software [45].
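As a rough illustration of the distance underlying the clustering step, the sketch below computes Gower's coefficient for mixed (quantitative and qualitative) descriptors in plain Python. The accession names, descriptor names, and values are hypothetical, and the function is a simplified stand-in (no weights, no missing-value handling), not the R routine used in the study.

```python
def gower_distance(a, b, ranges):
    """Gower distance between two accessions described by mixed descriptors.

    a, b   : dicts mapping descriptor name -> value
             (numbers for quantitative, strings for qualitative descriptors)
    ranges : dict mapping each quantitative descriptor to its range
             (max - min) over the whole collection
    """
    total = 0.0
    for key in a:
        if key in ranges:
            # Quantitative descriptor: absolute difference scaled by the range
            total += abs(a[key] - b[key]) / ranges[key]
        else:
            # Qualitative descriptor: 0 if the classes match, 1 otherwise
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


# Hypothetical accessions and values, for illustration only
accessions = {
    "acc_A": {"plant_height_cm": 60.0, "fruit_weight_g": 2.0, "growth_habit": "erect"},
    "acc_B": {"plant_height_cm": 100.0, "fruit_weight_g": 12.0, "growth_habit": "compact"},
    "acc_C": {"plant_height_cm": 140.0, "fruit_weight_g": 2.0, "growth_habit": "erect"},
}

quantitative = ["plant_height_cm", "fruit_weight_g"]
ranges = {
    q: max(v[q] for v in accessions.values()) - min(v[q] for v in accessions.values())
    for q in quantitative
}

d_ab = gower_distance(accessions["acc_A"], accessions["acc_B"], ranges)
d_ac = gower_distance(accessions["acc_A"], accessions["acc_C"], ranges)
print(round(d_ab, 3), round(d_ac, 3))
```

The pairwise values obtained this way form the distance matrix that a hierarchical clustering routine, such as Ward's method, would take as input.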
The DNA samples of C. chinense accessions were amplified with ISSR primers and analyzed by visual assessment of the most consistent bands (Supplementary Figure S1). The matrix of genetic dissimilarities was obtained by arithmetic complement of the Jaccard index.
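A minimal sketch of the arithmetic complement of the Jaccard index, computed from binary band profiles, is given below. The accession names and band scores are hypothetical. Treating two profiles with no bands at all as identical is a convention assumed here, not something stated in the text.

```python
def jaccard_dissimilarity(x, y):
    """Return 1 - Jaccard index for two binary band profiles (1 = band present)."""
    shared = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    if either == 0:
        # Assumed convention: two empty profiles are treated as identical
        return 0.0
    return 1.0 - shared / either


# Hypothetical ISSR band scores (rows = accessions, columns = fragments)
bands = {
    "acc_A": [1, 1, 0, 1, 0],
    "acc_B": [1, 0, 0, 1, 1],
    "acc_C": [1, 1, 0, 1, 0],
}

names = sorted(bands)
matrix = [[jaccard_dissimilarity(bands[i], bands[j]) for j in names] for i in names]
for name, row in zip(names, matrix):
    print(name, row)
```

Shared absences (0/0) do not enter the index, which is why Jaccard is commonly preferred for dominant markers such as ISSR.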
Amplification data from the SSR primers were converted to a numeric code per locus for each allele (Supplementary Figure S2). The polymorphic loci were used to generate a numerical matrix. Genetic distances among C. chinense accessions were estimated using the weighted index implemented in GENES software [43], and the accessions were clustered with a hierarchical algorithm, the unweighted pair-group method with arithmetic averages (UPGMA), in R [45]. The number of alleles (Na), effective number of alleles (Ne), expected heterozygosity (He), observed heterozygosity (Ho), diversity index (I), and fixation index (F) were calculated with GENES [43].
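For orientation, the per-locus statistics named above can be sketched with their standard textbook definitions: Ne = 1/Σp², He = 1 − Σp², and F = 1 − Ho/He. The genotypes below are hypothetical, and the estimators implemented in GENES may differ, for example through sample-size corrections. The diversity index (I) is left out because its exact definition is not given in the text.

```python
from collections import Counter


def locus_statistics(genotypes):
    """Per-locus diversity statistics from diploid genotypes.

    genotypes: list of (allele1, allele2) tuples, one per sampled unit.
    Uses uncorrected textbook formulas (an assumption, see lead-in).
    """
    alleles = [a for g in genotypes for a in g]
    counts = Counter(alleles)
    freqs = [c / len(alleles) for c in counts.values()]
    sum_sq = sum(p * p for p in freqs)

    he = 1.0 - sum_sq                      # expected heterozygosity
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return {
        "Na": len(counts),                 # number of alleles
        "Ne": 1.0 / sum_sq,                # effective number of alleles
        "He": he,
        "Ho": ho,                          # observed heterozygosity
        "F": 1.0 - ho / he if he > 0 else float("nan"),  # fixation index
        "MAF": max(freqs),                 # major allele frequency
    }


# Hypothetical locus: six 1/1 homozygotes, three 2/2 homozygotes, one heterozygote
genotypes = [(1, 1)] * 6 + [(2, 2)] * 3 + [(1, 2)]
stats = locus_statistics(genotypes)
for key, value in stats.items():
    print(key, round(value, 3))
```

A fixation index near 1 arises when heterozygotes are far rarer than expected from the allele frequencies, which is the pattern usually read as inbreeding.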
A Bayesian-based cluster analysis was performed to determine the optimal number of genetic clusters using Structure 2.3.4 software [46] according to the method described by Evanno et al. (2005) [47], with 10,000 repetitions. K values between 1 and 10 were tested, with 10 independent runs for each K.
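The ∆K criterion can be sketched as follows. This version takes the absolute second-order difference of the mean log-likelihoods across runs and divides it by the standard deviation of the runs at that K, which is one common way of implementing the Evanno statistic. The log-likelihood values are invented for illustration and are not results from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for each K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of ln P(D) values from independent runs.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            # |L(K+1) - 2 L(K) + L(K-1)| on the mean log-likelihoods
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical ln P(D) values for three runs per K
runs = {
    1: [-1000.0, -1001.0, -999.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-785.0, -780.0, -790.0],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ∆K needs both neighbours, it cannot be evaluated at the smallest or largest K tested, so K = 1 can never be selected by this criterion alone.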
Joint analyses of the ISSR and SSR data were performed using the Gower distance matrix and Ward clustering analysis. The correlation between molecular and morphological distances was determined using the distance matrices obtained from the 37 morpho-agronomic descriptors and from the ISSR and SSR markers. For these analyses, we used the dendextend package in R [45].
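To make the entanglement measure concrete, the sketch below follows the definition as we read it in the dendextend documentation: each leaf's position in one dendrogram is compared with its position in the other, the absolute differences are raised to a power L and summed, and the sum is divided by the value obtained when one ordering is the complete reverse of the other. The default L = 1.5 and the leaf labels are assumptions made for illustration. This is not a reimplementation of the package.

```python
def entanglement(order_a, order_b, L=1.5):
    """Normalized entanglement between two leaf orderings (0 = identical order).

    order_a, order_b: lists with the same labels, in the left-to-right
    leaf order of each dendrogram. Requires at least two leaves.
    """
    n = len(order_a)
    position_b = {label: i for i, label in enumerate(order_b)}
    # Distance between the position vectors of matching leaves
    observed = sum(abs(i - position_b[label]) ** L for i, label in enumerate(order_a))
    # Worst case: the second ordering is the exact reverse of the first
    worst = sum(abs(i - (n - 1 - i)) ** L for i in range(n))
    return observed / worst


# Hypothetical leaf orders from a morpho-agronomic and a molecular dendrogram
morpho_order = ["acc_A", "acc_B", "acc_C", "acc_D"]
molecular_order = ["acc_B", "acc_A", "acc_C", "acc_D"]

value = entanglement(morpho_order, molecular_order)
print(round(value, 3))
```

Values close to 0 indicate that the two trees place the accessions in nearly the same order, and values close to 1 indicate strongly conflicting orders.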

Morpho-Agronomic Characterization Data
The qualitative traits stem shape (SS), stem pubescence (SP), leaf shape (LS), corolla shape (CS), cotyledon leaf color (CLC), and cotyledon leaf shape (CLS) were monomorphic, described as angular, sparse, oval, round, green, and lanceolate, respectively. The hypocotyl color was green in 78.2% of the accessions and purple in the remaining 21.8%. For stem color (SC), three classes were observed: green (85.45%), green with purple stripes (12.7%), and purple (1.85%). Anthocyanin color of the node (ACN) was absent in 73.5% of the accessions; where present, it was purple (10.1%), light purple (9.1%), or dark purple (7.3%).
Most accessions exhibited an erect plant growth habit (PGH) (80%), and the remainder (20%) were compact. An intermediate flower position (FP) was observed in 50.9% of the accessions, while the standing and pendent positions were observed in 38.2% and 10.9% of the accessions, respectively. For corolla color (CC), only one accession presented a pale-yellow corolla and the others were white-green. For anther color (AC), purple was predominant (56.36%), followed by blue (38.2%) and yellow (5.5%). Regarding the number of flowers per axil (NFA), 74.54% of the accessions presented two flowers per node and 25.45% three flowers per node.
Wide variability was observed in the mean values of the quantitative descriptors. These data are described in Supplementary Table S6, in which the Scott-Knott (1974) analysis revealed the arrangement of distinct groups for each attribute. The plant height (PH) ranged from 49 to 156 cm and the accessions were placed in seven distinct groups. The canopy diameter (CD) and stem diameter (SD) allowed the establishment of four groups each and ranged from 44.18 to 123.25 cm and from 0.96 to 2.08 cm, respectively.
The timing of the plant development stages differed among the accessions. Days for germination (DG) ranged from 5 to 11 days, days for flowering (DFL) from 62 to 92 days, days for fruiting (DFR) from 62 to 117 days, and days for fruit maturation (DM) from 135 to 217 days. Based on these four attributes, accessions were classified into seven, four, five, and six distinct groups, respectively.
Among the fruit attributes, fruit weight (FW) ranged from 1.04 to 18.61 g, fruit length (FL) from 7.85 to 84.93 mm, and fruit diameter (FD) from 8.99 to 34.44 mm; according to these attributes, the accessions were classified into nine, eight, and five distinct groups, respectively. Cotyledon leaf length (CLL) ranged from 12.81 to 25.96 mm and cotyledon leaf diameter (CLD) from 6.04 to 8.05 mm, and the accessions were arranged in seven and four distinct groups, respectively. The leaf length (LL) and leaf width (LW) traits also varied among the accessions, with values from 6.5 to 9.5 cm and from 3.8 to 5.9 cm, respectively, allowing the accessions to be classified into four and five groups. Peduncle length (PL) ranged from 19 to 43 mm and the accessions were separated into seven groups. The pericarp thickness (PT) varied from 1.38 to 3.08 mm and four groups were formed.
The relative contribution of each of these quantitative traits to the phenotypic divergence of the C. chinense accessions was estimated using the method proposed by Singh (1981) [42] (Table 1). Fruit diameter (FD) (9.83%) was the attribute that contributed most to the discrimination of the genotypes, followed by days for germination (DG) (9.1%) and fruit weight (FW) (8.84%). In contrast, cotyledon leaf length (CLL) was the characteristic with the least relative importance for the phenotypic divergence (2.85%).
Correlations between variables were analyzed using Pearson's correlation coefficient (Supplementary Table S7). High correlations were found between plant height and plant canopy width (75%), fruit diameter and pericarp thickness (71%), and days for fruiting and maturation (70%), and moderate correlations between fruit diameter and fruit weight (69%), pericarp thickness and fruit weight (66%), and leaf length and leaf width (64%). Moderate negative correlations were found between days for germination and fruit diameter (−39%), days for germination and cotyledon leaf diameter (−38%), and leaf width and canopy diameter (−31%).
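As a reminder of what these coefficients measure, the following sketch computes Pearson's r for two traits from accession means. The trait values are hypothetical and are not taken from Supplementary Table S7. The percentages reported in the text correspond to r multiplied by 100.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical accession means for two fruit traits
fruit_diameter_mm = [10.0, 15.0, 20.0, 25.0, 30.0]
pericarp_thickness_mm = [1.5, 1.9, 2.1, 2.6, 3.0]

r = pearson_r(fruit_diameter_mm, pericarp_thickness_mm)
print(f"r = {r:.2f} ({100 * r:.0f}%)")
```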

Morpho-Agronomic Diversity
Homogeneous groups among the 55 accessions of C. chinense were identified by Ward clustering analysis. The dendrogram obtained shows the distinction of six groups (Figures 1a and 2). Group I was composed of four accessions, which presented fruits with four locules, a semi-rough surface, and orange color when immature. Group II was composed of six accessions, which presented flowers in the upright position and purple-colored cotyledons and stems. Group III was composed of 21 accessions, all with purple anthers and pungent fruits with a smooth surface. Group IV was composed of nine accessions in which the following descriptors were predominant: days for germination (DG) ranging from 7 to 9 days; stem diameter (SD) varying from 0.96 to 1.33 cm; compact growth habit (PGH); green fruit color at the intermediate stage; and orange fruit color at the mature stage, except for UENF 1730, the only Peruvian accession, whose fruit color at the mature stage was brown. Group V comprised seven accessions. Red fruit color at the mature stage, triangular and elongated shape, and the presence of a neck at the base of the fruit are some of the characteristics of this group. These accessions also presented fruits with a small variation in length (35.53 to 44.46 mm). Group VI is characterized by plants with compact growth habit, lack of a neck at the base of the fruit, fruits with three locules, and plant height (PH) ranging from 58.62 to 86.22 cm.

Molecular Characterization Data
For ISSR analysis, 15 of the 18 primers detected polymorphisms among accessions. These primers produced 97 fragments, of which 46 were polymorphic, representing 47% polymorphism.
For SSR analysis, of the 47 pairs of microsatellite primers previously developed and optimized for C. annuum [41], only 17 resulted in amplification, and nine of these 17 detected polymorphisms among C. chinense accessions.
The values for the major allele frequency (MAF) ranged from 0.5 to 0.94, with an average of 0.72 (Table 2). The diversity index (I) ranged from 0.10 to 0.50, with an average of 0.38. The observed heterozygosity (Ho) ranged from zero to 0.44, with a mean of 0.10, and the fixation index ranged from −0.14 to 1.00, with a mean of 0.77.
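The marker counts given above translate into percentages as sketched below. The counts come from the text. Reading the SSR "transfer rate" mentioned in the Discussion as amplified primer pairs over tested primer pairs is our assumption, although it reproduces the reported 36.17%.

```python
def percentage(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


issr_polymorphism = percentage(46, 97)  # polymorphic / total ISSR fragments
ssr_transfer_rate = percentage(17, 47)  # amplified / tested SSR primer pairs (assumed definition)
ssr_polymorphic = percentage(9, 17)     # polymorphic / amplified SSR primers

print(f"ISSR polymorphism: {issr_polymorphism:.1f}%")
print(f"SSR transfer rate: {ssr_transfer_rate:.2f}%")
print(f"Polymorphic SSR primers among those amplified: {ssr_polymorphic:.1f}%")
```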
Following Bayesian analysis, the highest ∆K value was obtained when two clusters were formed, obtaining groups I and II (Figure 3). Group I gathered 41 accessions, while group II gathered 14 accessions.

Molecular Diversity
A combined grouping was obtained for the ISSR and SSR markers by Ward clustering analysis. The dendrogram obtained shows the distinction of six groups (Figure 1b). Group I was composed of 11 accessions, with a predominance of red fruit color at the mature stage. Group II was composed of seventeen accessions that were similar in growth habit. Group III was composed of nine accessions, with a predominance of flowers in the intermediate position. Group IV was composed of six accessions in which the presence of capsaicin was predominant. Group V also comprised six accessions, with triangular fruit shape as a common characteristic. Group VI clustered six accessions with two flowers per node.
The entanglement obtained was 0.58, evidencing the divergence in the distribution of the accessions in molecular and morpho-agronomic dendrograms. Most accessions were allocated in different groups in both dendrograms, with the exception of UENF 1736, UENF 1722, UENF 1753, UENF 1791, and UENF 2044 accessions, which were allocated to group VI in both molecular and morpho-agronomic dendrograms.

Discussion
The characterization of genetic diversity within the genus Capsicum has become essential for the breeding programs of pepper species, which are important vegetable and spice crops worldwide. Investment in screening this diversity has the potential to reveal many traditional varieties with distinctive value. To this end, we have described in this study the characterization of 55 C. chinense accessions from different regions of Brazil. The characterization was performed using qualitative and quantitative descriptors.
The consumer market for fresh peppers has good acceptance for different pepper sizes [48]. In agreement with our description, similar fruit weight was found by Castro and Dávila (2008) [49] (1. A thicker pericarp is an important aspect of fruit quality, as it may increase resistance to pathogens and parasites during post-harvest and give a better appearance for consumers than fruits with a thinner pericarp [53]. In addition, fruit diameter (FD) (9.83%) and fruit weight (FW) (8.84%) are among the attributes that contributed most to the discrimination of the genotypes. This abundant phenotypic variation, including traits such as fruit shape, color, and size, is of great interest for current pepper breeding programs, which focus on meeting consumer preferences and on product differentiation. The characterization of variability in fruit-related characteristics to select promising accessions has also been shown in other Capsicum sp. studies [10,12–14,24,29,35–37].
Pearson's correlation coefficient is a measure of linear association between two quantitative variables, which allows one characteristic to be selected indirectly based on another. Pessoa et al. (2019) [54], evaluating the inheritance of seedling and plant traits in ornamental pepper, identified positive correlations for the traits plant height, first bifurcation height, and canopy width of 96%, 88%, and 65%, respectively. They also affirmed that the positive correlation for most of the traits indicates that recessive alleles were generally responsible for the increase in these traits. Agyare et al. (2016) [55], assessing the genetic diversity of agro-morphological characters of pepper (Capsicum sp.) and evaluating characteristics similar to those evaluated in this work, identified a positive correlation between cotyledon leaf length and fruiting (30.50%) and a negative correlation with cotyledon leaf width (−37.50%). The cotyledon leaf width had a negative association with most morpho-agronomic characteristics in Capsicum and tomato [55,56]. Andrade Junior et al. (2018) [57], evaluating C. annuum and C. chinense, also found a positive correlation of 83% between pericarp thickness and fruit length and a negative correlation between pericarp thickness and plant height (−85%). Peña-Yam et al. (2019) [58] identified a correlation of 49.2% between pericarp thickness and fruit weight in C. chinense. The fruit shape and the pericarp thickness are important traits for the classification of accessions, varieties, and cultivars of peppers. The positive correlation between these characteristics can directly affect the flavoring compounds essential to Capsicum fruits [59].
Accession sampling was designed to cover the diversity of Brazilian biomes. All regions of Brazil, except the South, and four biomes were sampled (Pantanal, Cerrado, Amazon, and Atlantic Forest). Accessions from Caceres, MT, corresponded to 42% of the total analyzed, owing to the high number of accessions found per property at the time of collection, on average seven types of pepper per property. The municipality of Caceres-MT shares a border with Bolivia, the center of origin of the genus Capsicum. This municipality has an area of 29,031 km² and three biomes are present in its territory: Pantanal, Cerrado, and Amazon. Capsicum accessions collected in this region are expected to have a high adaptive capacity, since they occur in places that annually register natural floods and fires, in addition to anthropic actions (including fires and deforestation). The collections were carried out after a very strong rainy period, and bush-sized plants more than two meters in height, with stems up to 10 cm in diameter, were observed. At the time of collection, locals explained that C. chinense plants lose their leaves while they are flooded and regrow when the soil dries, characterizing them as perennials, which differentiates them from peppers from other regions of Brazil. These features raise the hypothesis that these accessions have great potential for mitigating the effects of ongoing climate change, whether in breeding programs or for immediate use.
The use of molecular markers is an effective strategy for complementing phenotypic characterization by detecting additional sources of genetic diversity present within the gene pool. In our study, the ISSR markers detected 47% polymorphism among accessions of C. chinense, or 46 polymorphic loci. In contrast to our results, Moulin et al. (2015) [60], using 35 ISSR primers in C. baccatum var. pendulum, observed a total of 201 polymorphic loci. The characterization of 81 accessions of Capsicum spp. with 13 ISSR markers resulted in a total of 88 amplified loci, of which 80 were polymorphic [61]. ISSR markers provide a large amount of information and enable the identification of polymorphic loci, allowing the accessions to be correctly differentiated. The SSR markers are highly polymorphic and widely distributed in the pepper genome [62]. In addition, they have been widely used for genetic diversity assessment of germplasm because of their ability to detect multi-allelic forms of variation and, being co-dominant, to distinguish genetic relationships between genotypes. The SSR analysis showed a transfer rate of approximately 36.17%.
The values for the major allele frequency (MAF) ranged from 0.5 to 0.94, with an average of 0.72. The effective number of alleles per locus was two alleles for all genotypes, which is in agreement with other pepper studies [63]. The diversity index (I) indicates the presence of diversity among accessions. Very close to our results, Meng et al. (2017) [64], studying the genetic diversity in the genus Capsicum, found values between 0.384 and 1.379, with an average of 0.508. The observed heterozygosity (Ho) suggests the presence of homozygotes among individuals, and the fixation index indicates the presence of inbreeding in the samples. Although pepper, with its bisexual flowers, is traditionally treated as a cross-pollinated crop, domesticated peppers are diploid and predominantly self-pollinating, which contributes to the high homozygosity of the accessions.
The groups obtained by Bayesian analysis were not formed exclusively by accessions from the same geographical origin. This can be attributed to seed exchange between farmers and free fruit transport between the different regions of Brazil, also indicated by Cardoso et al. (2018) [12]. In contrast, Moses et al. (2014) [11], studying the genetic diversity of C. chinense accessions using microsatellite molecular markers, observed the formation of two distinct genetic clusters, corresponding to the upper and lower Amazon regions, suggesting two independent domestication events or two centers of diversity in these regions.
The diversity in phenotypic traits was described using Ward hierarchical clustering analysis, which allowed the identification of six groups. In general, the accessions could not be grouped based on geographic origin, since accessions from the same geographic areas were classified in different groups. The same observation was presented by Baba et al. (2016) [13] and Moreira et al. (2018) [14]. The Ward hierarchical clustering analysis of molecular data also allowed the identification of six groups. However, when comparing clusters formed by morpho-agronomic and molecular data, an agreement of 42%, considered low, was observed. Low or even no correlation between morpho-agronomic and molecular data has been reported in other studies [13,65–67]. The use of molecular markers not linked to the morpho-agronomic traits evaluated provides a plausible explanation for the absent or low correlation between the clusters formed. Microsatellites are present in both coding and non-coding regions and are therefore not necessarily linked to the expression of morphological traits. We can assume that both characterization stages are important for understanding the genetic variability of a population and for developing effective strategies for germplasm conservation and breeding. As in the phenotypic characterization, no association with geographical origin was observed.

Conclusion
A high level of variation was found among most of the qualitative and quantitative traits evaluated. Pearson's correlation showed a high correlation between a few pairs of phenotypic traits: plant height and plant canopy width, fruit diameter and pericarp thickness, and days for fruiting and maturation. Based on this genetic diversity, the clustering analyses formed dendrograms with six distinct groups among the 55 accessions of C. chinense and showed the divergence between the molecular and morpho-agronomic analyses of the accessions.
In the molecular characterization, 15 of the 18 ISSR primers detected polymorphism and, of the 47 SSR primers evaluated, only 9 were polymorphic among accessions. There was significant variation in the major allele frequency, number of alleles, diversity index, observed heterozygosity, and fixation index.
There was divergence among accessions considering morpho-agronomic and molecular analyses, with no duplicates detected. The variability observed reflects the adaptation of the accessions to the different ecogeographic and cultural conditions in which they were collected. These accessions have the potential to be used in several breeding programs, including those targeting tolerance to abiotic stresses.