A Morphometric Analysis of the Santolina chamaecyparissus Complex (Asteraceae)

The genus Santolina (Asteraceae, Anthemideae) includes 26 species of aromatic evergreen shrubs endemic to the western Mediterranean Basin. Santolina is widely used as an ornamental plant, in xerigardening, and in ethnobotany. The Santolina chamaecyparissus complex, including about half of the known species diversity, has been properly investigated on systematic and taxonomic grounds only recently, and a complete morphometric study is still missing. Here we provide a morphometric characterization and comparison of all the 14 species of this complex, using both univariate and multivariate analyses. Our results suggest that species of this complex can be distinguished using combinations of quantitative and qualitative character-states, mostly related to leaf morphology. The analysis of S. villosa, a tetraploid/hexaploid Spanish endemic, showed that the two cytotypes cannot be safely identified based on morphology. Coupling this evidence with available phylogenetic information, we conclude that there is no reason to split the two cytotypes of S. villosa into two distinct taxa. An identification key for all the species of the complex is presented.


Introduction
Santolina L. (Anthemideae) is a genus of evergreen shrubs endemic to the western portion of the Mediterranean Basin [1]. Most species occur under Mediterranean bioclimate, usually on calcareous substrates [2][3][4]. Due to their ability to tolerate periods of strong drought, some species, and in particular S. chamaecyparissus L., are cultivated as ornamental plants and in xerigardening [5]. In addition, most species are known for their traditional ethnobotanical uses. For instance, the inflorescences of S. chamaecyparissus, S. oblongifolia Boiss., and S. rosmarinifolia L. were used for their anti-inflammatory effects [6], whereas aerial parts of S. corsica Jord. & Fourr., S. ericoides Poir., and S. etrusca (Lacaita) Marchi & D'Amato were used as vermifuge and antiparasitic [7,8]. In recent decades, the ethnobotanical importance of Santolina has stimulated research on its biological properties and phytochemical composition. Indeed, phytochemical studies discovered the presence of several compounds, such as terpenoids, chrysanthemane monoterpenoids, flavonoids, and coumarins, that are known for their effects on human health [9][10][11][12][13][14]. However, while the literature concerning the phytochemistry of Santolina was proliferating [10][15][16][17][18][19][20], the systematic knowledge of this genus remained fragmentary and incomplete until recent years. Important contributions to the systematics and taxonomy of Santolina were provided by Carbajal and collaborators [4,21] for the S. rosmarinifolia complex, whose species mostly occur in the Iberian Peninsula, and by Giacò and collaborators for the S. chamaecyparissus complex, more widely distributed across the western Mediterranean Basin. As regards the latter, a nomenclatural revision [1] and a karyomorphological study [22] raised several taxonomic issues that were later clarified using integrated taxonomic approaches. In particular, De Giorgi and collaborators [23] focused on polyploid Santolina populations from Corsica and Sardinia, Giacò and collaborators [24] on diploid continental Italian species, while Giacò and collaborators [25] untangled the systematic relationships of diploid populations occurring in southern France and north-eastern Spain. Santolina insularis (Gennari ex Fiori) Arrigoni has been synonymized with S. corsica [23], whereas new taxa have been recognized in France and Spain [25]: S. intricata Jord. & Fourr. and three allopatric subspecies within S. decumbens Mill. However, several taxa of the complex have not yet been properly studied, and an overall quantitative morphological analysis is still lacking. In addition, an important gap of knowledge concerns the evaluation of a possible taxonomic distinction between the two cytotypes of S. villosa Mill., a tetraploid (2n = 4x = 36) and hexaploid (2n = 6x = 54) species that is endemic to central-eastern and southern Spain [22].
Accordingly, the aims of this study are (a) to quantitatively assess whether the two cytotypes of S. villosa can be distinguished on morphometric grounds, (b) to carry out an exhaustive univariate and multivariate morphometric analysis of the complex including all the 14 recognized species, and (c) to build an identification key.

Morphometrics of the Two Cytotypes of S. villosa
In Figure 1, a PCoA showing the two cytotypes of S. villosa is reported. The first two axes explain 33.21% of the morphological variability. The tetraploid population shows a wide morphological variability on the first axis, and partially overlaps with the hexaploid population on the left side of the graph. The two populations significantly differ for eight quantitative character-states (Table 1). However, their Cohen's d values are always <1.2, showing remarkable overlaps. In Table S1, the mean values ± standard deviation of each quantitative character are reported for each population, including the two studied populations/cytotypes of S. villosa. Conversely, no qualitative character shows significant differences. Assuming the two cytotypes as a priori groups, Random Forest returned a low value of overall correct classification (68.4%), further confirming the high morphological overlap.
Figure 1. PCoA based on Gower distance showing the morphological relationships between the two cytotypes of Santolina villosa, a polyploid species endemic to central-eastern and southern Spain.
Table 1. The results of univariate analyses contrasting the two cytotypes of the polyploid Spanish endemic Santolina villosa. Character abbreviations: fs_len = length of the flowering stem (cm), ss_len = length of the non-flowering stem (cm), sq_if_len = length of the inter-floral bract (mm), ssl_seg_dist = distance between the segments of the non-flowering stem leaf (mm), fs_n_nodes = number of nodes of the flowering stem, ss_n_nodes = number of nodes of the non-flowering stem, ssl_hair = degree of tomentosity of the non-flowering stem leaf segment (%), and fs_hair = degree of tomentosity of the flowering stem (%).
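The ordination in Figure 1 relies on Gower distance, which can combine quantitative and qualitative characters in a single dissimilarity. The following Python sketch illustrates the basic form of that distance for two individuals. It is not the code used in the study, which was run in R. The function name, the example values, and the treatment of every qualitative character as a simple match/mismatch are assumptions made for illustration.

```python
def gower_distance(x, y, ranges):
    """Basic Gower distance between two individuals.

    x, y   : character values for the two individuals, in the same order.
    ranges : for each character, the observed range (max - min) across the
             whole dataset if it is quantitative, or None if qualitative.
    """
    total = 0.0
    for xk, yk, rk in zip(x, y, ranges):
        if rk is None:
            # Qualitative character: 0 if the states match, 1 otherwise.
            total += 0.0 if xk == yk else 1.0
        elif rk > 0:
            # Quantitative character: absolute difference scaled by the range.
            total += abs(xk - yk) / rk
        # A quantitative character with zero range contributes nothing.
    return total / len(x)


# Hypothetical individuals: one quantitative and one qualitative character.
d = gower_distance([10.0, "yellow"], [15.0, "white"], [20.0, None])
print(d)
```

The resulting pairwise distances form the matrix on which a PCoA operates.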

Morphometrics of the Whole S. chamaecyparissus Complex
Random Forest returned a value of overall mean correct classification of 89.2% (Table 2), considering all the 14 species as a priori groups. Santolina ericoides and S. pinnata are correctly classified in 100% of cases. Conversely, S. vedranensis shows the lowest value of mean correct classification (59.9%), since it is confused mostly with S. corsica (22.9%) and S. decumbens (7.3%). Except for S. intricata (68.7%), S. virens (69.5%), and S. decumbens (81.8%), other species are well classified (>90%) by the algorithm. By plotting the first two axes of a PCA based on the mean values of eight non-correlated characters (65.3% of the variance explained), the overall morphological relationships among species are highlighted (Figure 2).
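The percentages above are means over repeated random half-splits of the dataset, as described in Materials and Methods. The Python sketch below shows how such a per-group mean correct classification could be computed. It is an illustration only: the study used the R package "randomForest", whereas here a simple nearest-centroid classifier on a single character stands in for Random Forest, and all function names and data are hypothetical.

```python
import random
from collections import defaultdict


def nearest_centroid(train_x, train_y, test_x):
    """Stand-in classifier (NOT Random Forest): assign each test value
    to the group whose training mean is closest."""
    groups = defaultdict(list)
    for value, label in zip(train_x, train_y):
        groups[label].append(value)
    centroids = {label: sum(v) / len(v) for label, v in groups.items()}
    return [min(centroids, key=lambda lab: abs(centroids[lab] - x))
            for x in test_x]


def mean_correct_classification(samples, labels, fit_predict,
                                n_iter=100, seed=0):
    """Percentage of correctly classified test individuals per group,
    pooled over n_iter random half-splits into training and test sets."""
    rng = random.Random(seed)
    correct = defaultdict(int)
    total = defaultdict(int)
    indices = list(range(len(samples)))
    for _ in range(n_iter):
        rng.shuffle(indices)
        half = len(indices) // 2
        train, test = indices[:half], indices[half:]
        predicted = fit_predict([samples[i] for i in train],
                                [labels[i] for i in train],
                                [samples[i] for i in test])
        for i, guess in zip(test, predicted):
            total[labels[i]] += 1
            correct[labels[i]] += int(guess == labels[i])
    return {lab: 100.0 * correct[lab] / total[lab] for lab in total}


# Hypothetical, well-separated measurements for two groups.
xs = [1.0 + 0.1 * i for i in range(10)] + [9.0 + 0.1 * i for i in range(10)]
ys = ["A"] * 10 + ["B"] * 10
print(mean_correct_classification(xs, ys, nearest_centroid, n_iter=20, seed=1))
```

Any classifier with the same call shape could be passed as fit_predict in place of the stand-in.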
In Table 3, the mean values ± standard deviation for each species and for each quantitative character are reported, whereas the same information is reported at population level in Table S1. In Table S2, the number of significantly different quantitative character-states showing Cohen's d > 1.2 and the number of significantly different qualitative character-states are reported for each pair of species. The pair S. chamaecyparissus vs. S. etrusca shows the highest number of significantly different character-states (26 quantitative + 5 qualitative), whereas the pairs S. benthamiana vs. S. intricata, S. decumbens vs. S. villosa, and S. ericoides vs. S. virens show the lowest number (5 + 2, 2 + 5, and 3 + 4, respectively).
[Table 3: mean values ± standard deviation for each species and for each quantitative character.]
In Table S3, the quantitative characters that significantly differ with Cohen's d > 1.2 and the significantly different qualitative character-states for each pair of species are reported. The quantitative character occurring with the highest frequency in the pairwise comparisons (69 times in Table S3) is the tomentosity of the flowering stems (fs_hair). The following nine characters showing high frequency (63 to 49 times) are all related to leaf morphology. The character with the lowest frequency (four times) is the length of the external involucral bract (sq_ext_len). Overall, quantitative characters related to capitulum morphology are less frequently represented than characters related to leaf morphology. The qualitative character occurring with the highest frequency (70 times in Table S3) is the tomentosity of the internal involucral bract (sq_int_hair). Conversely, the qualitative character occurring with the lowest frequency (Table S3) is the colour of the flowers (fl_col).

Discussion
Our analyses showed that it is almost impossible to distinguish the two cytotypes of S. villosa. Although the tetraploids exhibit wider morphological variability than the hexaploids (Figure 1), a remarkable number of individuals morphologically overlap with the hexaploid cytotype. Univariate analyses suggest that there is no quantitative or qualitative character allowing an unambiguous identification of cytotypes (Table 1). Based on this result, it is not possible to assign a putative ploidy level to the lectotype of S. villosa [26] on morphological grounds, and more generally it is not possible to study the distribution of the two cytotypes using the morphology of herbarium specimens. Therefore, although the tetraploid populations were detected in central-eastern Spain and the only known hexaploid population was detected in south-eastern Spain [22,27], we deem that the current shortage of chromosome data, in proportion to the wide distribution range, does not allow for speculation about a possible allopatric distribution of the two cytotypes. The absence of morphological distinctiveness between the two cytotypes agrees with their sister relationship observed in the phylogenetic tree provided by Giacò and collaborators [25]. Based on current knowledge, the case of S. villosa does not fit any of the cases presented by Soltis and collaborators [28], in which chromosome races may be worthy of taxonomic distinction. Therefore, on taxonomic grounds, we deem that the two cytotypes of S. villosa should not be recognized as distinct taxa. In other species of Santolina, too, the co-occurrence of more than one cytotype did not lead to the recognition of separate taxa. For instance, S. corsica (S. chamaecyparissus complex) is both tetraploid and hexaploid [23], whereas S. montiberica (Riv.-Guerra) R.Carbajal, L.Sáez, M.Serrano & S.Ortiz, S. pectinata Lag., and S. rosmarinifolia s.str. (S. rosmarinifolia complex) are both diploid and tetraploid [4,29].
The morphometric analyses carried out on all the species of the S. chamaecyparissus complex show that the most important overall discriminant characters are those related to leaf morphology. The length and tomentosity of leaves, as well as the number of leaf segments, their length, and how widely they are spaced are all good discriminant characters, especially if used in combination. Conversely, the characters related to capitulum morphology show less discriminant power. Moreover, characters such as the width of the peduncle of capitula, the shape of capitula (globose or goblet-shaped), the apex of the inter-floral bracts (rounded or truncated), and the shape of additional morphological structures on the inter-floral bracts, although considered important characters by some authors [3,30,31], were preliminarily discarded from our analyses since they were extremely variable within the same individual.
Most species show high values of correct a priori classification (Table 2). The exceptions are S. benthamiana, S. decumbens, and S. intricata, the morphological variation of which was already discussed in detail by Giacò and collaborators [24], also in the light of their phylogenetic relations. Santolina virens and S. ericoides are morphologically close (Table 2, Figure 2), and this affinity further supports the hypothesis that S. virens is a homoploid hybrid species having S. ericoides and S. rosmarinifolia as parents [22,32]. In addition, these two putative parental taxa are sympatric in central and northern Spain, where S. virens is native [4]. Although similar, S. ericoides and S. virens can be easily distinguished by the shape of the leaf segment apex, which is rounded in the former and acute in the latter. A remarkable number of species are partially misclassified by Random Forest as S. corsica (Table 2). A possible explanation of this result lies in the high intra- and inter-populational variability documented for this species [23]. However, univariate analyses detected those morphological characters allowing an unambiguous distinction between S. corsica and all the other partially misclassified species. For instance, S. vedranensis, a narrow endemic to the islet of Es Vedrà (Balearic Islands, Spain), although partially misclassified as S. corsica (22.9%), can be easily distinguished by the degree of tomentosity of the leaves of non-flowering stems, almost glabrous in S. vedranensis and densely tomentose in S. corsica. According to Carbajal et al. [21], the taxonomic distinction of S. vedranensis and S. corsica is supported also on molecular grounds. More details regarding the characters allowing a distinction between species are provided in the identification key.
A phylogenomic analysis of the whole genus Santolina is currently ongoing in order to better understand the evolutionary history of species. The preliminary results [33] suggest that all the species studied here represent distinct evolutionary lineages.
In conclusion, our study filled a gap of knowledge concerning the lack of morphological diagnosability of the two cytotypes of S. villosa and the morphometric relations of all the species currently recognized within the S. chamaecyparissus complex.

Identification Key
For a reliable identification, complete portions composed of both flowering and non-flowering stems must be sampled. In the sampling, fragments with branched flowering stems should be preferred to fragments without branched stems. Identification must be carried out on flowering or fruiting specimens, either fresh or dry, although in dry specimens the colour of the flowers is usually lost. In the identification process, only the longest stems, leaves, and leaf segments, and the widest capitula must be considered. It is recommended to measure the same character multiple times on distinct portions of the fragment and then to compare the mean value obtained with the variation ranges reported in the key (Table 4), instead of using a single measurement. Some parts of the identification key were taken and integrated from [24,25]. In Figure 3, photos in nature of all species, except S. villosa and S. virens, are reported.
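The recommendation to compare a mean of repeated measurements with the ranges in the key can be sketched in Python as follows. The function name, the measurements, and the range limits are all made up for illustration and are not taken from Table 4.

```python
from statistics import mean


def mean_in_range(measurements, low, high):
    """Average repeated measurements of one character and report whether
    the mean falls within a variation range [low, high] from a key."""
    avg = mean(measurements)
    return low <= avg <= high, avg


# Hypothetical example: three measurements (mm) of the same character taken
# on distinct portions of one specimen, checked against a made-up range.
fits, avg = mean_in_range([2.1, 2.4, 2.7], low=2.0, high=3.0)
print(fits, round(avg, 2))
```

A single outlying measurement (for example 3.2 mm on one leaf) would fall outside this hypothetical range, while the mean of several measurements may still fall within it, which is the reason for averaging.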

Materials and Methods
A total of 27 populations were sampled in the field during the summers of 2019, 2020, and 2021. For each population, 20 flowering individuals were collected (except for S. virens, S. chamaecyparissus, and S. vedranensis, for which four, nine, and 13 individuals were sampled, respectively). Concerning Corsica and Sardinia, continental Italy, and populations from southern France and north-eastern Spain, the same individuals studied by De Giorgi and collaborators [23] and Giacò and collaborators [24,25] were analyzed. A total of 506 specimens were analyzed. In Table 5, information concerning all the studied populations is reported. All the studied specimens are conserved in the herbarium of Pisa (PI) (acronym follows Thiers [34]) and HD images of all of them are available at https://www.jacq.org/ (accessed on 9 November 2022).


Species N Population Vouchers
S. corsica 20 Italy, Sardinia, Monte Spada [WGS84: 40.058586 N, 9.293333 E]

For each individual, 31 quantitative and nine qualitative characters were measured (Table 6). All of the measurements were taken on dried material with a ruler/digital caliper or with ImageJ v.1.52b (http://rsb.info.nih.gov/ij, accessed on 30 August 2022). In this latter case, a 1200 dpi scan of the portion to be measured was obtained. Tomentosity of leaves and stems was measured according to the following procedure: a portion of leaf/stem was photographed with a digital camera mounted on a stereomicroscope. Next, the area covered by tomentum was measured with ImageJ. Finally, the percentage of area covered by tomentum was calculated by dividing the area covered by tomentum by the total area. The tomentosity of the non-flowering stems (ss_hair in Table 6) was transformed into an ordered factor using the following classes: 0-5% (hairless or almost hairless), 6-30% (slightly pubescent), 31-60% (pubescent), 61-90% (tomentose), and 91-100% (densely tomentose). The tomentosity of the inter-floral bracts (sq_if_hair in Table 6) was categorized based on the number of hairs: 0-3 (glabrous), 4-10 (slightly pubescent), 11-25 (pubescent), 26-50 (tomentose), and 51 or more (densely tomentose).
The morphological variation of the two cytotypes of S. villosa was graphically visualized with a PCoA based on Gower distance. Next, univariate analyses were conducted to check for possible morphological characters discriminating between the two cytotypes. For characters showing equal variance (Bartlett's test with p > 0.05), a t-test was conducted. Instead, for those characters showing unequal variance (Bartlett's test with p < 0.05), a Welch t-test was conducted. After that, for each significant result (Tukey-Kramer or Welch t-test with p < 0.05), the Cohen's d index was calculated [35,36]. As in Giacò and collaborators [25], significant results were considered relevant only when Cohen's d > 1.2, i.e., when the two means are at least 1.2 standard deviations apart. Qualitative characters were analyzed with Fisher's exact test. The differences were considered significant when p < 0.05.
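Two of the calculations described above, the conversion of a tomentosity percentage into an ordered class and the Cohen's d threshold, can be sketched in Python as follows. This is an illustration, not the code used in the study, which was run in R. The function names are hypothetical, the rounding of percentages before binning is an assumption, and the pooled-standard-deviation form of Cohen's d is one common definition that may differ in detail from the one adopted in [35,36].

```python
from math import sqrt
from statistics import mean, variance

# Class limits for ss_hair as given in the text: inclusive upper bound (%).
SS_HAIR_CLASSES = [
    (5, "hairless or almost hairless"),
    (30, "slightly pubescent"),
    (60, "pubescent"),
    (90, "tomentose"),
    (100, "densely tomentose"),
]


def tomentosity_percent(covered_area, total_area):
    """Percentage of the photographed area that is covered by tomentum."""
    return 100.0 * covered_area / total_area


def ss_hair_class(percent):
    """Map a tomentosity percentage to one of the five ordered classes."""
    if not 0 <= percent <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    # Assumption: values are rounded to whole percentages before binning,
    # because the text gives integer class limits (0-5, 6-30, ...).
    value = round(percent)
    return next(label for upper, label in SS_HAIR_CLASSES if value <= upper)


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of the two groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical values, for illustration only.
print(ss_hair_class(tomentosity_percent(12.0, 48.0)))
d = cohens_d([1, 2, 3], [4, 5, 6])
print(abs(d) > 1.2)  # relevance threshold used in the text
```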
The analyses concerning the whole complex were carried out by employing a PCA based on mean values for each species. For a better visualization of the biplot, a Pearson correlation test was carried out between all pairs of variables, and highly correlated (r > 0.85) variables were discarded. Next, to check for the robustness of the morphological diagnosability of the two cytotypes of S. villosa and of all the species currently recognized in the S. chamaecyparissus complex, the Random Forest classification method (RF) was applied using the R package "randomForest", considering all species as a priori groups. RF was reiterated 100 times, each time randomly splitting the dataset in half into training and test subsets. Next, univariate analyses were carried out as described above, using Hochberg's method to adjust p-values and reduce the family-wise error rate. All statistical analyses were conducted in the R environment [37].
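The adjustment of p-values with Hochberg's method mentioned above can be illustrated with a short Python sketch of the step-up procedure. It is not the code used in the study, which was run in R. The function name and the example p-values are hypothetical.

```python
def hochberg_adjust(p_values):
    """Hochberg step-up adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices ordered from the largest to the smallest p-value.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for rank, idx in enumerate(order, start=1):
        # The largest p-value is multiplied by 1, the next by 2, and so on;
        # the running minimum keeps the adjusted values monotone.
        running_min = min(running_min, rank * p_values[idx])
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
print(hochberg_adjust([0.01, 0.04, 0.03]))
```

A comparison is then treated as significant when its adjusted p-value is below 0.05.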

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants11243458/s1. Table S1: Mean values ± standard deviation for each character and for each studied population in the Santolina chamaecyparissus complex. Table S2: Significantly different morphological character-states in the Santolina chamaecyparissus complex. Table S3: Significantly different morphological character-states in the Santolina chamaecyparissus complex.