The Role of Botanical Families in Medicinal Ethnobotany: A Phylogenetic Perspective

Studies suggesting that medicinal plants are not chosen at random are becoming more common. The goal of this work is to shed light on the role of botanical families in ethnobotany, depicting in a molecular phylogenetic frame the relationships between families and medicinal uses of vascular plants in several Catalan-speaking territories. Simple quantitative analyses of ailment categories were carried out, and a matrix of families and disorders was constructed. A Bayesian approach was used to estimate the over- and underused families in the medicinal flora. Phylogenetically informed analyses were carried out to identify lineages in which there is an overrepresentation of families in a given category of use, i.e., hot nodes. The ethnobotanicity index was calculated at the species level and also adapted to the family level. Two diversity indices were calculated to measure the richness of reported taxa within each family. A total of 47,630 use reports were analysed. These uses are grouped in 120 botanical families. The ethnobotanicity index for this area is 14.44% and the ethnobotanicity index at the family level is 68.21%. The most reported families are Lamiaceae and Asteraceae, and the most reported troubles are disorders of the digestive and nutritional system. Based on the meta-analytic results, which indicate hot nodes of useful plants at the phylogenetic level, specific ethnopharmacological research may be suggested, including a phytochemical approach to particularly interesting taxa.


Introduction
Ethnobotany is a relatively recent denomination for a discipline that studies plant names, uses and management by human societies from ancient to current times, aiming at their projection to the future [1]. Although a precursor of this term, botanical ethnography, was coined to name the investigation of any plant materials in archaeology in order to unveil their uses and symbolisms [2], Harshberger [1] himself emphasised that ethnobotanical findings should not only constitute an inventory of old knowledge, but should also be relevant for current productive activities. Since then, ethnobotany has undergone methodological innovations, but has maintained the double approach of recording and preserving the ancient uses of plants by people (which contributes to describing human lifestyles) and aiming to improve human living conditions [3]. This is why the collection of plant uses related to health, mostly medicinal and food ones, is predominant in ethnobotanical research, although other uses are relevant as well [4].
In keeping with the importance of using local folk knowledge to preserve and improve health, a considerable number of drugs have been developed from an ethnobotanical background, such as, to quote just two famous and recent ones, oseltamivir, used against avian flu [5], and the antimalarial artemisinin [6]. Accordingly, medical or pharmaceutical ethnobotany, the botanical side of ethnopharmacology, is one of the main pillars of the discipline, particularly in industrialised countries [7,8].
In Europe, the Catalan linguistic domain, the framework of the present research, is among the better-known Iberian areas from an ethnobotanical viewpoint [9]. The amount of information collected to date allows us to start conducting research comparing several territories [4,10], in order to establish general patterns in ethnobotanical knowledge.
In this geographical area, as in Europe in general and worldwide, the predominant ethnobotanical research has been ethnofloristic [11,12]. Nevertheless, efforts are being devoted to finding complementary approaches, such as studies focused on plants used for ailments related to a particular system, and on the validation of the ethnobotanical evidence with chemical or pharmacological data [13]. Moreover, the potentially predictive role of molecular phylogeny in bioprospection and in phytochemistry [14,15] and the concept of ethnobotanical convergence [16,17] have opened the way to integrating ethnobotany with genetic (including molecular phylogenetic), genomic (and other "omic" disciplines) and phytochemical approaches [18,19].
One of the aspects always addressed in ethnobotanical investigation is the distribution of the recorded plant taxa among botanical families, since, after the number of plant species known in a given territory, the families in which they are included are among the most evident pieces of information. Even if some deeper analyses of the causes for the predominance of some families have been undertaken [18,20–23], work in this field is still lacking, particularly work covering territories other than those already considered and taking into account, among others, phylogenetic issues.
Several statistical methods have been used to test whether a specific taxonomic group is over- or underrepresented in an ethnobotanical flora in comparison to the overall local flora. Although linear regression analysis [21] and binomial analysis [24] have been widely used, more recent studies have pointed out some limitations of these methods and proposed a Bayesian approach to analyse over- and underused plant groups [22,23]. This method takes into account the uncertainty in the proportion of medicinal plant species in the overall flora and is robust for small datasets [22].
In this context, the aim of this work is to shed light on the role of botanical families in ethnobotany, depicting, in a molecular phylogenetic frame, the relationships between botanical families and medicinal uses of vascular plants in several Catalan-speaking territories (Formentera, Mallorca, and Catalonia, Northern Catalonia included). This will allow us to ascertain the most used families in pharmaceutical ethnobotany in this area and the possible phylogenetic reasons accounting for this, and to find out whether some families are more focused on some particular health conditions than others.

Results and Discussion
For the areas under consideration, we analysed a total of 47,630 use reports corresponding to medicinal uses of vascular plants in human medicine registered in our database, including data from ethnobotanical research performed from 1991 until today, in order to study different aspects related to the botanical families to which these uses belong, and the disorders they refer to.
Based on these data, we can state that the medical ethnobotany of the Catalan-speaking territories scrutinised is distributed in 120 botanical families of vascular plants (seven to pteridophytes, five to gymnosperms and 108 to angiosperms). These families include 894 taxa, with medicinal uses, of which 41 have only been determined at generic level (the remaining majority, at specific and infraspecific levels).
The ethnobotanicity index (EI) for the studied area is 14.44%. The total flora considered ([25]; see Materials and Methods for details) is at least slightly larger than that of the territories covered by the present research, since some plants present in places not covered here (i.e., the Valencian area and some of the Balearic Islands) do not grow in the areas concerned. This means that the EI in the studied area has in fact been slightly underestimated. The result, as calculated, is higher than in other Mediterranean areas such as Arrábida Natural Park, Portugal (12.1%; [26]), Monti Sicani Regional Park, Italy (12.7%; [27]) or the northwest Basque Country (12%; [28]), and lower than that of Serra de São Mamede, Portugal (23.1%; [29]), in the same biogeographical region, or that of Keelakodankulam, India (20.17%; [30]), in a quite distant and floristically different area.
The EI was conceived [31], and is usually calculated, for species and infraspecific taxa. Nevertheless, as the present paper also focuses on families, we calculated the EI for this taxonomic category (excluding the 15 non-native families, not present in the flora used as a basis). We found a family-level EI of 68.21%, indicating that more than two-thirds of the families in the area considered contain plants used in pharmaceutical ethnobotany. We believe that it would be of interest to calculate this parameter for other ethnofloras, in order to compare the rates of families hosting useful plants.

Most Reported Families
Among the 10 most cited families (Table 1), which represent 57.34% of total use reports, we find some of the most relevant in ethnobotanical studies in the Mediterranean, namely Lamiaceae (15.40%), Asteraceae (11.90%) and Rosaceae (5.57%). These are large in terms of number of taxa, Asteraceae being the largest [32]. These families are cosmopolitan and well represented in our territories; Lamiaceae and Rosaceae in particular are also economically very significant, thanks to aromatic plants in Lamiaceae [33] and to edible fruits and ornamental uses in Rosaceae [34]. In ethnofloristic works conducted in Mediterranean areas, these three families almost always predominate at the top of the list, together with some others such as Fabaceae and Apiaceae [11,12,26,35–45]. From this, a simple but clear idea can be deduced: people preferentially (or at least importantly) use the plants that they easily find close to where they live, as Johns et al. [46] and Bonet et al. [47] stated. In addition, partly due to the restructuring of families following the latest APG IV update [48], in which Sterculiaceae and Tiliaceae have become part of the family Malvaceae, this latter family also appears among the top 10 most reported families, with 4.82%.
The remaining most quoted families are Adoxaceae (4.14%), Apiaceae (3.49%), Amaryllidaceae (3.24%), Oleaceae (3.08%), Pinaceae (2.90%) and Rutaceae (2.79%). Four of these families (Apiaceae, Oleaceae, Pinaceae, Rutaceae) have not been the object of any recent systematic restructuring, and they are classically important in terms of medicinal taxa. Conversely, the Adoxaceae, with only ca. 225 species worldwide [49], and just six of them in the studied area [25], now host Sambucus nigra, one of the most used plants in the Mediterranean region and, in particular, in the Catalan-speaking territories [50,51], which has recently been transferred from the Caprifoliaceae. Similarly, the Amaryllidaceae exhibit a significant rate owing to the fragmentation of the Liliaceae lato sensu in several families and the attribution of genus Allium to this family, which is also very relevant in Mediterranean pharmaceutical ethnobotany [42].
Moerman [20] concluded that although in a random universe the size of a family would be the best predictor of its medicinal potential in number of taxa, the Asteraceae contain more medicinal plants than chance would indicate, so that size is not the only condition for this success. Moerman et al. [21] found, through a comparative analysis of several geographically distant medicinal floras, that the five most important medicinal plant families in four very different regions (North America, Korea, Kashmir, Chiapas highlands) were delineated by only nine plant families (Araceae, Bignoniaceae, Ericaceae, Euphorbiaceae, Fabaceae, Loganiaceae, Malvaceae, Rosaceae, Solanaceae), pointing to the existence of a global pattern of human knowledge. Indeed, to include a fifth area (Ecuador), only three more families were necessary (Apiaceae, Asteraceae, Lamiaceae). Along the same lines, six out of the top ten families in the present study are among the 14 most quoted ones in the five North American, Mesoamerican, South American and Asian territories investigated by Moerman et al. [21], i.e., those reported above plus Liliaceae and Ranunculaceae. The only top families in this paper not appearing in Moerman et al. [21] are Adoxaceae, Oleaceae, Pinaceae and Rutaceae (considering the Amaryllidaceae as included in the Liliaceae lato sensu, as this family was referred to in the aforementioned work).

Over- and Underuse of Plant Groups and Plant Families
Results for the over- and underused high taxonomic groups are shown in Table 2. Gymnosperms and monocots are the only two groups that differ from the common proportion. The observed proportion was 0.1709 and ranges from 16% to 18% with 95% probability and, consequently, we can reject the null hypothesis for these groups. While monocots are underused, gymnosperms are overused. The proportion of used monocots is very low, and the 95% posterior credible interval very narrow. In contrast, gymnosperms show the highest proportion of used plants and a wider interval, possibly related, according to Weckerle et al. [22], to their small number of species. The families whose 95% posterior credible interval lies above the interval of the overall proportion of flora (0.160, 0.182) are listed in Table 3. These overused families are represented by a small number of genera, most of which have medicinal uses, which yields high proportions. The preponderance of woody species over herbaceous ones among the most used has been discussed by several authors [20,52–54]. Fagaceae, Rutaceae and Cannabaceae are the three most overused families. Woody families such as Fagaceae, Pinaceae and Cupressaceae (the last two gymnosperms), together with some shrubby families such as Buxaceae, Rhamnaceae and Ericaceae, are overrepresented in the medicinal Catalan flora, but in approximately the same proportion as weedy plant families, such as Equisetaceae, Asphodelaceae or Urticaceae. The families whose 95% posterior credible interval lies below the interval of the overall proportion of flora (0.160, 0.182) are listed in Table 4. Usually, these are families comprising a large number of species in the local flora and with little representation in the medicinal one. In the present study, only nine families are underused, with Cyperaceae, Plumbaginaceae and Poaceae being the most underused.
Three families, Poaceae, Juncaceae and Cyperaceae, belong to the underrepresented high-level group of monocots. In the present study, most members of the underused families are herbaceous plants.
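The interval comparison described above can be illustrated with a minimal sketch. It assumes a uniform Beta(1, 1) prior (the prior actually used is not stated in this section) and approximates the posterior quantiles by sampling. The overall counts (794 medicinal taxa out of 4,647) are those given in the Bayesian methods subsection; the three family counts, the function names and the number of draws are hypothetical and serve only to show the three possible outcomes.

```python
import random


def beta_credible_interval(used, total, level=0.95, draws=20000, seed=0):
    """Approximate equal-tailed credible interval for a proportion.

    Assumption: uniform Beta(1, 1) prior, so the posterior is
    Beta(used + 1, total - used + 1). Quantiles are estimated by sampling.
    """
    rng = random.Random(seed)
    samples = sorted(
        rng.betavariate(used + 1, total - used + 1) for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


def classify(family_interval, flora_interval):
    """Compare a family's interval with the interval of the overall flora."""
    if family_interval[0] > flora_interval[1]:
        return "overused"
    if family_interval[1] < flora_interval[0]:
        return "underused"
    return "no clear difference"


# Overall flora: counts taken from the text.
flora_interval = beta_credible_interval(794, 4647)

# Hypothetical families: (medicinal taxa, taxa in the local flora).
hypothetical_families = {"family_A": (9, 12), "family_B": (5, 300), "family_C": (10, 60)}
labels = {
    name: classify(beta_credible_interval(used, total), flora_interval)
    for name, (used, total) in hypothetical_families.items()
}
```

Under these assumptions, the interval for the overall flora comes out close to the (0.160, 0.182) range reported above, and a family is flagged only when its own interval does not overlap that range.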

Genera with Folk Medicinal Uses
The 120 families recorded are represented herein by 432 genera. If we analyse the number of genera per family, the results vary slightly, and the families with the most genera are Asteraceae (52), Fabaceae (30) and Lamiaceae (25). Besides the two families with the most use reports, the Fabaceae (765 UR, 1.61%) appear here, although they are not among the top ten. Despite not having a high percentage, Fabaceae is a relevant family in the Mediterranean flora (even being indicative of the Mediterranean character of a territory; [55]) and ethnoflora [42]. Conversely, the Adoxaceae, with a large number of use reports (1,971 UR, 4.14%), are very asymmetrically distributed in only two genera, Sambucus (99.9%) and Viburnum (0.1%). This is explained by the change of family: both genera previously belonged to the Caprifoliaceae, but under the APG system a new delimitation of the Adoxaceae was created for these two genera and for three more not present in our territory. On the other hand, there are 60 botanical families in which all the reports are grouped in a single genus. Some examples, representing pteridophytes, gymnosperms and angiosperms, are the Equisetaceae, a family with 578 UR exclusive to the genus Equisetum, the only one in the family present in the studied area, and the Taxaceae and the Juglandaceae, with 17 and 570 UR respectively, exclusively represented by their corresponding single species growing in the area, Taxus baccata L. and Juglans regia L. There are also families that concentrate all the medicinal records in one genus although they have other representatives in the territories studied, such as Liliaceae stricto sensu, with six genera in the area concerned, but with all UR from only one, Lilium.

Plants not Appearing in the Flora of the Studied Area
Concerning plants not appearing in the flora of the territories studied ([25]; see Materials and Methods for details), there are 15 families not present in the flora of our territory, representing 12.5% of the total number of families. Examples are Actinidiaceae, Myristicaceae and Zingiberaceae. These families contain very renowned and widely used medicinal plants, some of them used primarily for food purposes, such as Actinidia chinensis Planch., Myristica fragrans Houtt. and Zingiber officinale Roscoe. In addition, there are 62 taxa not present in our territory, yet belonging to families that are nonetheless present thanks to other genera; this is the case of Cinnamomum verum J.Presl (Lauraceae), Cocos nucifera L. [25], and yet they are important in local ethnobotany. Indeed, in a work in progress regarding the Catalan linguistic area we are recording a non-negligible number and percentage of UR attributed to non-native plants [56].

Most Reported Troubles
We grouped the troubles or systems addressed in 15 categories (Table 5). The four most addressed troubles, representing 61.01% of all use reports, are disorders of the digestive and nutritional system (11,754 UR, 24.68%), followed at a considerable distance (with approximately half as many use reports) by respiratory system disorders (6,418 UR, 13.47%), skin or subcutaneous tissue disorders (5,588 UR, 11.73%) and circulatory system and blood disorders (5,299 UR, 11.13%). Most uses address mild and chronic illnesses, which agrees with the most widespread idea of the main focuses of pharmaceutical ethnobotany and phytotherapy in general [57], but in some cases they also point to acute and more severe health troubles, such as cardiovascular and pulmonary ones, and even cancer.

Relationship between Families and Uses
One of the most important aims of this work is to study the possible relationships between plant families and categories of medicinal uses, i.e., troubles or systems addressed. We analysed the correspondences between families and health disorders, and will now comment on the most relevant findings.
A general consideration of the relationship between families and UR in a phylogenetic frame (Figure 1) shows, within the angiosperms, that the superasterids clearly host the largest number of uses, as well as the largest number of families with an important number of UR, in comparison with the remaining large groups. In each of these, only one or a few families play a leading role, such as Malvaceae, Rosaceae and Rutaceae in the superrosids, Ranunculaceae in the eudicots, and Amaryllidaceae and Poaceae in the monocots. Magnoliids and basal angiosperms are not of much significance in terms of UR. It is worth mentioning that, in the asterids, most UR are concentrated in the most evolved clade, formed by the campanulids plus the lamiids. As for the gymnosperms, the Pinaceae accumulate most UR.
The analysis of the percentages of UR related to the different troubles/systems within each family (Figure 2) shows that the highest rates are rather widespread at the phylogenetic scale. In any case, it clearly appears that a large number of families have digestive and nutritional problems as the most treated ones (mean percentage: 24.56), which is in agreement with the above-mentioned idea that ethnobotany and phytotherapy largely address mild, daily health constraints. Nevertheless, in the vast majority of families, there are also high rates of uses focused on circulatory and blood, and respiratory ailments (mean percentages: 14.30 and 11.03, respectively), most of which are not so mild. Finally, some disorders are very scarcely addressed in the pharmaceutical ethnoflora under consideration, such as those linked to endocrine and metabolic, and immune systems, as well as to neoplasia (mean percentages: 0.35 and 0.16, respectively).
If we analyse the percentage of use reports of families for each disorder (Figure 2), 10 out of the 15 trouble/system categories established are dominated by the two families with most UR in general, Lamiaceae (six categories: circulatory system and blood disorders; pain and inflammation; digestive system and nutritional disorders; skin and subcutaneous tissue disorders; respiratory system disorders; tonic and restorative) and Asteraceae (four categories: endocrine system and metabolic disorders; infections and infestations; musculoskeletal system disorders and traumas; sensory system disorders). The well-known presence and diversity of essential oil compounds in the Lamiaceae [58] account, together with the size of the family, as already noted, for its relevance in many medicinal fields related to antiseptic properties, which could explain its prevalence in digestive, dermal and respiratory disorders. Similarly, the abundance of terpene compounds (including sesquiterpene lactones), among others, in the Asteraceae [59] plausibly underlies their uses for different ailments, again considering the size and diversity of the family.
Figure 1. Heatmap depicting the distribution of use records among plant families and the addressed troubles/systems. Abbreviations, as quoted in the figure, are as follows. CB, circulatory system and blood disorders; PI, pain and inflammations; DN, digestive system and nutritional disorders; P, poisoning; EM, endocrine system and metabolic disorders; PBP, pregnancy, birth and puerperal disorders; G, genitourinary system disorders; II, infections and infestations; IN, immune system disorders and neoplasia; NM, nervous system and mental disorders; MT, musculoskeletal system disorders and traumas; SS, skin and subcutaneous tissue disorders; R, respiratory system disorders; S, sensory system disorders; TR, tonic and restorative.
Concerning Asteraceae, we want to underline two troubles in particular. First, the uses for musculoskeletal system disorders and traumas are explained by Arnica montana L. (335 UR, 52.59%) and other species of this family (Arctium minus (Hill) Bernh., Doronicum grandiflorum Lam., all species of the genus Inula, Jasonia saxatilis (Lam.) Guss., Pallenis spinosa (L.) Cass. and Pulicaria dysenterica (L.) Gaertn.) referred to by the popular name "àrnica" (189 UR, 29.67%). In total, the ethnotaxon constituted by Arnica montana and the aforementioned related taxa accumulates 82.26% of the Asteraceae UR employed for musculoskeletal system disorders and traumas, especially bruises. This medicinal plant complex has been well studied in the Iberian Peninsula and Balearic Islands from a botanical and ethnopharmacological point of view [60]. Secondly, the uses for endocrine system and metabolic disorders are due to hypoglycaemic activity (259 UR, 98.48%), with several species of the genus Centaurea representing, with 180 UR, 68.44% of this property, which is abundantly recorded in this genus [61].
Figure 2. Heatmaps depicting percentages of use reports related to the different troubles/systems within each family, and percentages of use reports of families for each trouble/system. Abbreviations, as quoted in the figure, are as follows. CB, circulatory system and blood disorders; PI, pain and inflammations; DN, digestive system and nutritional disorders; P, poisoning; EM, endocrine system and metabolic disorders; PBP, pregnancy, birth and puerperal disorders; G, genitourinary system disorders; II, infections and infestations; IN, immune system disorders and neoplasia; NM, nervous system and mental disorders; MT, musculoskeletal system disorders and traumas; SS, skin and subcutaneous tissue disorders; R, respiratory system disorders; S, sensory system disorders; TR, tonic and restorative.
In the other five categories, the prominence of a family in the treatment of a problem is basically due to one or a very few taxa. Amaryllidaceae are the most quoted in fighting against poisoning cases (21.33%), with all the reports concentrated on a few species of the genus Allium, which has been reported with this function, and is, for instance, used worldwide against snakebites [62,63]. The dominance of Rutaceae in the pregnancy/childbirth/puerperal treatment is basically explained (27.44%) by the genus Ruta, with three species. This family is closely followed by the Saxifragaceae (22.56%), because of a few species of Saxifraga. Irrespective of the fact of containing plants used for food purposes in ethnobotany, both genera mentioned are among those most famous abortifacients recorded in folk medicine [64,65], this proving their relationship with the life period concerned (e.g., labour inducing, post-labour coadjuvant, dangerous in pregnancy).
Poaceae are the most relevant regarding the genitourinary system (17.66%), mostly because of Zea mays L., followed at a considerable distance by Cynodon dactylon (L.) Pers.; both (especially the first one) are highly reputed as diuretics [66]. Ranunculaceae are the top family in addressing neoplasia (21.51%), because of Ranunculus parnassifolius L., a high mountain plant much appreciated popularly for this purpose in a Pyrenean region [45,67]. Finally, Malvaceae lead the ranking in troubles related to the nervous system and mental disorders (29.58%). The success in this use is basically explained by the genus Tilia (733 UR, 97.34%), recently incorporated into the Malvaceae, into which the Tiliaceae have been merged, and very popular and widely studied as a hypnotic, sedative and tranquilizer [68,69].
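The two kinds of percentages discussed in this subsection (the share of a family's UR devoted to a category, and the share of a category's UR contributed by a family) can both be derived from one family-by-category count matrix. The following is a minimal sketch; the use reports listed and the function names are hypothetical, and only the category abbreviations follow the figure captions.

```python
from collections import Counter

# Hypothetical use reports, each reduced to a (family, category) pair.
use_reports = [
    ("Lamiaceae", "DN"), ("Lamiaceae", "DN"), ("Lamiaceae", "R"),
    ("Asteraceae", "MT"), ("Asteraceae", "DN"),
    ("Rosaceae", "DN"), ("Malvaceae", "NM"),
]

# Sparse family-by-category matrix: (family, category) -> number of UR.
counts = Counter(use_reports)


def within_family_pct(counts, family, category):
    """Percentage of a family's UR that address the given category."""
    family_total = sum(n for (f, _), n in counts.items() if f == family)
    if family_total == 0:
        return 0.0
    return 100.0 * counts[(family, category)] / family_total


def within_category_pct(counts, family, category):
    """Percentage of a category's UR that come from the given family."""
    category_total = sum(n for (_, c), n in counts.items() if c == category)
    if category_total == 0:
        return 0.0
    return 100.0 * counts[(family, category)] / category_total


lamiaceae_dn_in_family = within_family_pct(counts, "Lamiaceae", "DN")
lamiaceae_dn_in_category = within_category_pct(counts, "Lamiaceae", "DN")
```

With these toy data, two of the three Lamiaceae reports concern DN, while Lamiaceae supply two of the four DN reports, so the two percentages differ even though they come from the same cell of the matrix.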

Phylogenetic Distribution of Families with Medicinal Use
To investigate the degree of phylogenetic clustering of families for each trouble or system, and to detect hot nodes for further studies, we mapped the reported medicinal uses grouped in the 15 troubles or systems addressed on the phylogeny of the families.
A robust hot node appears in three medicinal groups: immune system disorders and neoplasia; pain and inflammation; and pregnancy, birth and puerperal disorders. The hot node is constituted by the Iridaceae and three families classically included in the Liliaceae l.s., Amaryllidaceae, Asparagaceae, and Asphodelaceae. The last three families also constitute a hot node clade for tonic and restorative. In addition, only Amaryllidaceae and Asparagaceae are hot nodes for the endocrine system and metabolic disorders.
Tonic and restorative activities also have another robust hot node, constituted by Betulaceae, Juglandaceae and Fagaceae. For the endocrine system and metabolic disorders, the clade of Apiaceae and Araliaceae was detected as relevant.
Finally, two robust hot nodes appear for poisoning, on the one hand, Cucurbitaceae and Coriariaceae, and on the other hand Malvaceae, Cistaceae and Thymelaeaceae.
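The idea behind a hot node can be sketched as a randomization test on a single clade: count how many of the clade's families are used for a category, then compare that count with random draws of the same number of families from all tips of the tree. This is only an illustration of the principle and not the nodesig implementation; the tip labels, the clade, the sample, the function name and the number of randomizations are all hypothetical.

```python
import random


def node_overrepresentation_p(all_tips, clade_tips, sampled_tips,
                              n_random=9999, seed=0):
    """One-sided randomization p-value for overrepresentation below a node.

    all_tips: every family in the phylogeny.
    clade_tips: families descending from the node being tested.
    sampled_tips: families reported for the use category of interest.
    """
    rng = random.Random(seed)
    clade = set(clade_tips)
    sample = set(sampled_tips)
    tips = sorted(set(all_tips))
    observed = len(clade & sample)
    at_least = 0
    for _ in range(n_random):
        # Draw as many families as were sampled, ignoring the tree shape.
        draw = rng.sample(tips, len(sample))
        if sum(1 for tip in draw if tip in clade) >= observed:
            at_least += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (at_least + 1) / (n_random + 1)


# Hypothetical phylogeny with 20 families and a four-family clade.
all_tips = [f"fam_{i}" for i in range(20)]
clade_tips = all_tips[:4]

# Hypothetical category in which all four clade families are used.
clustered_sample = clade_tips + ["fam_10"]
observed, p_value = node_overrepresentation_p(all_tips, clade_tips, clustered_sample)
```

In a full analysis this comparison would be repeated for every internal node and every category, which raises multiple-testing questions that the sketch does not address.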

Diversity Indices
The results of the Shannon and Margalef indices for each family are shown in the Supplementary Materials. Although the values of the two indices are zero for several families and some of them present low values, such as the Vitaceae (H = 0.01 and k = 0.11), other families show a moderate diversity. The family Asteraceae presents the highest diversity according to the Shannon index (H = 1.45), while, according to the Margalef index, the Campanulaceae is the family with the greatest diversity (k = 0.70). These differences are due to the fact that the Margalef index is higher when the number of taxa and the number of use reports are equal or close within a family, while if there are many use reports for a few taxa, the diversity decreases. Despite showing a different sensitivity to the variation in the number of taxa and use reports, both indices are well correlated (r = 0.841, p < 0.001 for the whole dataset). For this reason, we believe the two indices are robust and can be used to measure ethnobotanical diversity, even taking into account their limitations.
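For readers unfamiliar with these indices, the sketch below computes them from the number of use reports per taxon within one family. It uses the textbook forms, H = -Σ p_i ln p_i and k = (S - 1) / ln N, with natural logarithms; the logarithm base and any scaling applied in this study are not stated in this section, so the values it produces are not expected to match those reported above. The counts and function names are hypothetical.

```python
import math


def shannon_index(ur_per_taxon):
    """Textbook Shannon index over the UR shares of the taxa in a family."""
    total = sum(ur_per_taxon)
    if total == 0:
        return 0.0
    h = -sum((n / total) * math.log(n / total) for n in ur_per_taxon if n > 0)
    return h if h > 0 else 0.0  # avoid returning -0.0 for a single taxon


def margalef_index(ur_per_taxon):
    """Textbook Margalef index: (number of taxa - 1) / ln(number of UR)."""
    richness = sum(1 for n in ur_per_taxon if n > 0)
    total = sum(ur_per_taxon)
    if total <= 1:
        return 0.0
    return (richness - 1) / math.log(total)


# Hypothetical families: use reports per reported taxon.
single_taxon_family = [50]
even_family = [10, 10]
skewed_family = [195, 5]

h_even = shannon_index(even_family)
h_skewed = shannon_index(skewed_family)
```

Both functions return zero for a family with a single reported taxon, and the Shannon value drops when most use reports fall on one taxon, which is the qualitative behaviour described above.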

Needs for Further Research
This study draws our attention to the relevance of the family taxonomic level in ethnobotany. A few points in which further research is needed have arisen from our analyses.
The taxonomy of specific and infraspecific taxa in ethnobotanical works is usually given, as is logical, according to local floras but, since the consolidation of the APG family updates, almost all papers use the APG system for families, which does not coincide with the systems used in floras published before the APG rearrangements. This creates a difficulty in the comparison of data related to families in ethnobotanical research, either in one area or in different ones: the number of families and, more importantly, their rates of presence in each ethnoflora vary depending on whether the last classical system [70] or the APG system is applied. An international effort should be made to implement the APG family system in ethnobotanical databases in order to facilitate suitable meta-analytic work. Although new surveys will always contribute to a broader and better knowledge of plant medicinal uses, a considerable amount of ethnobotanical information has already been accumulated in many parts of the world, so this is the appropriate moment to undertake comparative analyses between close and not so close ethnofloras, following the initiative, here seconded, pioneered by Moerman [20] and followed by Weckerle et al. [22] or Dal Cero et al. [23], for which adopting the APG family treatment is important.
Given their relevance in most ethnofloristic surveys, the role of commercially acquired plants in the ethnoflora, and the comparison of their uses with those of autochthonous or allochthonous plants currently present in the flora (and, in turn, the comparison between these two categories), is a subject that should be addressed in detail in the different major geographical areas of the world. At the family level, as treated in this work, 15 families not present in the local flora (apart from some others that are present but host non-native taxa) have been recorded in a relatively small territory. This is a consequence of cultural exchanges through time, recently accelerated by the globalisation process.
Finally, a relevant aspect is the relationship between ethnobotany and phytochemistry (the latter leading to pharmacology and linked to phylogeny). Although there is no obvious and unidirectional relationship, and little is known about phytochemical composition as compared with evolutionary aspects [71], ethnobotanical knowledge has systematic and evolutionary significance [16,17,71], and thus can help to advance the necessary multidisciplinary approach to ethnopharmacology and ethnomedicine. Projecting data such as those treated here in a phylogenetic framework has allowed us to detect hot nodes, richer in families useful in pharmaceutical ethnobotany. A deeper combination of ethnobotanical and phylogenetic information within one family or a group of a few related families could lead to detecting and predicting taxa useful for particular troubles. Several ethnobotanical works include the phytochemical and/or pharmacological validation of the folk uses reported or discuss ethnobotanical knowledge in the light of chemical plant composition [13,72–75], and this will probably be more common in the near future. The family level is particularly suitable for establishing relationships between chemical composition, phylogenetic aspects and ethnobotanical knowledge, with the three fields compared either two by two or all together.

Data Sources and Field Work Methods
In this study, we used data on plants and their folk medicinal uses obtained from 44 ethnobotanical prospections (Supplementary Material S1) carried out in the Catalan linguistic domain (Figure 3). We used the use report (hereinafter, UR), i.e., the report of the use of one taxon by one informant, as the unit of measurement [76]. Veterinary uses were excluded, and human medicinal uses were classified into 15 trouble or system categories, according to Cook [77] with minor modifications. Because some categories had very few reports, we grouped some of them by affinity to make the analyses more robust. The names of the trouble or system categories were kept as in Cook's classification (Table 5), and we added only one new category: tonic and restorative.
The fieldwork method used in these studies was the semistructured interview [9,78], always following the code of ethics of the International Society of Ethnobiology [79], complemented by the collection of plant specimens to be identified and deposited in public herbaria. All interviews were digitised, transcribed and entered into our database, which contains all the ethnobotanical data collected (on medicinal, food and other uses).

Data Analysis
Simple descriptive statistics for the categories (species, families, and medicinal uses) and the construction of the families-by-disorders matrix (Supplementary Material S2) were carried out with Excel software (Microsoft Office, 2010). Results were summarized as heatmaps using the R package phytools [80] (R version 3.6.0 [81]) and the family-level phylogeny of land plants from Zanne et al. [82], pruned to our taxonomic dataset. For each node in the phylogeny, we tested whether the families used in a given category of trouble/system were significantly more represented among its descendants than would be expected by chance alone (i.e., hot nodes; [83]), using the nodesig test originally implemented in the PHYLOCOM software [84] and adapted for R by Abellán et al. [85]. With the aim of assessing the general state of pharmaceutical ethnobotanical knowledge in the studied area, we calculated the ethnobotanicity index (EI; [31]), which is the quotient between the number of plants used and the total number of plants that constitute the flora of the territory, expressed as a percentage. For this purpose, only the plants present in the flora of the Catalan linguistic area [25] were considered, and a flora size of 5,500 taxa at specific and infraspecific levels (Sáez, 2019, pers. comm.) was adopted. Furthermore, we adapted this index to calculate it for families as well.
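As a worked check of the ethnobotanicity index defined above, the following minimal Python sketch computes EI as a percentage. The function name is hypothetical, and the example assumes that the numerator is the 794 medicinal taxa present in the flora of the studied area and the denominator the 5,500-taxon flora size, both figures reported in this paper.

```python
def ethnobotanicity_index(n_used: int, n_flora: int) -> float:
    """Return the ethnobotanicity index (EI) as a percentage.

    EI = 100 * (number of taxa used) / (number of taxa in the flora).
    """
    if n_flora <= 0:
        raise ValueError("n_flora must be positive")
    if not 0 <= n_used <= n_flora:
        raise ValueError("n_used must lie between 0 and n_flora")
    return 100.0 * n_used / n_flora


# Assumed inputs: 794 medicinal taxa in the flora, 5,500 taxa in the flora.
ei_species = ethnobotanicity_index(794, 5500)
print(f"EI = {ei_species:.2f}%")  # EI = 14.44%
```

The same function could be applied at the family level by passing the number of families with reported uses and the total number of families in the flora.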

Bayesian Method to Evaluate Over- and Underused Taxonomic Groups
To evaluate the over- and underused flora of the Catalan linguistic domain, 794 medicinal plant taxa at specific and infraspecific levels with reported uses and included in the flora of the studied area [25], belonging to 103 families, were retrieved from the ethnobotanical database. The total flora of the studied area for the same families is 4,647 taxa [25]. These families were assigned to eight taxonomic groups: ANA-grade, early-diverging eudicots, eudicots-superasterids, eudicots-superrosids, gymnosperms, magnoliids, monocots and pteridophytes, following Chase et al. [48].
In this scenario, our null hypothesis (H0) is the following: for a taxonomic group (j), the proportion of medicinal taxa (θj) is equal to the overall proportion (θ), where θj is a random variable uniformly distributed between 0 and 1 (prior probability). The posterior probability can then be estimated, and its distribution is conditioned by the observed data (see [22] for more details). Differences between the posterior distributions and the common proportion were assessed for all families.
The intervals of the most probable values of θ and θj were calculated with the function that returns the inverse of the cumulative beta probability density function (INV.BETA.N) implemented in Excel software (Microsoft Excel 2011).
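The interval calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the spreadsheet procedure used in this study: with a uniform prior and binomial data, the posterior of a proportion is Beta(x + 1, n − x + 1), and the sketch approximates its quantiles by sampling from the standard library instead of calling an inverse beta function. The function names, the 95% level, the non-overlap rule used to label a group, and the two example groups (30 of 60 and 2 of 200 taxa) are all hypothetical; only the overall counts (794 of 4,647 taxa) come from the text.

```python
import random


def beta_credible_interval(n_medicinal, n_total, level=0.95, draws=20000, seed=0):
    """Approximate an equal-tailed credible interval for a proportion.

    Uniform Beta(1, 1) prior + binomial data give the posterior
    Beta(n_medicinal + 1, n_total - n_medicinal + 1); its quantiles are
    approximated here by Monte Carlo sampling (an assumption of this sketch).
    """
    if not 0 <= n_medicinal <= n_total:
        raise ValueError("n_medicinal must lie between 0 and n_total")
    rng = random.Random(seed)
    a = n_medicinal + 1
    b = n_total - n_medicinal + 1
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * (draws - 1))]
    upper = samples[int((1.0 - tail) * (draws - 1))]
    return lower, upper


def classify_group(group_interval, overall_interval):
    """Label a group by comparing its interval with the overall one.

    Assumed rule: non-overlapping intervals indicate over- or underuse.
    """
    low, high = group_interval
    ref_low, ref_high = overall_interval
    if low > ref_high:
        return "overused"
    if high < ref_low:
        return "underused"
    return "not distinguishable"


# Overall proportion: 794 medicinal taxa out of 4,647 (figures from the text).
overall = beta_credible_interval(794, 4647)

# Hypothetical groups, for illustration only.
print(classify_group(beta_credible_interval(30, 60), overall))
print(classify_group(beta_credible_interval(2, 200), overall))
```

Sampling keeps the sketch dependency-free; an exact inverse beta function, where available, would give the same intervals without Monte Carlo error.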

Diversity Indices
In order to quantify the diversity of taxa within the families referred to by the informants, we calculated two diversity indices (Supplementary Material S2). The first one, the Shannon diversity index [86], which comes from the theory of communication and is widely used in ecology, has also been calculated in some previous ethnobotanical studies [87,88]. This index is calculated according to the formula $H_{fam} = -\sum p_{tax} \log_2 p_{tax}$, where $p_{tax}$ represents the citation frequency of each taxon, and it assesses the ethnobotanical taxa diversity within each family, i.e., the family richness from the ethnobotanical point of view. The second index, $k = \log S / \log N$, where S is the number of species and N the number of individuals, proposed by Margalef [89], was adapted and used for the first time in an ethnobotanical study to calculate the diversity within the botanical families following the formula $k = \log T / \log UR$, where T represents the number of taxa and UR the number of use reports. Pearson's correlation coefficient (r) was calculated between the two indices.
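The two indices can be illustrated with a short Python sketch. It assumes that the citation frequency of a taxon is its share of the use reports of its family; the function names and the example family (three taxa with 4, 2 and 2 use reports) are hypothetical and not taken from the dataset.

```python
import math


def shannon_index(use_reports_per_taxon):
    """Shannon index H = -sum(p * log2(p)), p being each taxon's share of URs."""
    total = sum(use_reports_per_taxon)
    if total <= 0:
        raise ValueError("at least one use report is required")
    h = 0.0
    for count in use_reports_per_taxon:
        if count > 0:  # taxa without reports contribute nothing
            p = count / total
            h -= p * math.log2(p)
    return h


def margalef_k(use_reports_per_taxon):
    """Adapted Margalef index k = log(T) / log(UR)."""
    n_taxa = sum(1 for count in use_reports_per_taxon if count > 0)
    total = sum(use_reports_per_taxon)
    if total <= 1:
        raise ValueError("k is undefined when UR <= 1 (log UR would be 0)")
    return math.log(n_taxa) / math.log(total)


# Hypothetical family: three taxa with 4, 2 and 2 use reports.
family = [4, 2, 2]
print(f"H = {shannon_index(family):.2f}")  # H = 1.50
print(f"k = {margalef_k(family):.3f}")     # k = 0.528
```

Because k is a ratio of logarithms, the base of the logarithm cancels out, whereas H depends on the base (base 2 here, as in the formula above).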

Conclusions
The medicinal ethnoflora of the Catalan-speaking territories includes 894 taxa belonging to 120 botanical families of vascular plants. The ethnobotanicity index (EI) is 14.44% and the familial EI is 68.21% for the studied area. This parameter allows us to compare the present data with other ethnofloras.
The most common families in the Mediterranean area, such as Lamiaceae (14.40%), Asteraceae (11.90%) or Rosaceae (5.57%), are among the most cited families, which represent 57.34% of total use reports. Fagaceae, Rutaceae and Cannabaceae are the three most overused families, and Cyperaceae, Plumbaginaceae and Poaceae the most underused.
To investigate the degree of phylogenetic clustering of families for each trouble or system and to detect hot nodes for further studies, we mapped the reported medicinal uses, grouped in the 15 troubles or systems addressed, on the phylogeny of the families.
In the phylogenetic reconstruction, a robust hot node constituted by Iridaceae, Amaryllidaceae, Asparagaceae and Asphodelaceae appears in three medicinal groups: immune system disorders and neoplasia; pain and inflammation; and pregnancy, birth and puerperal disorders. The last three families also constitute a hot node clade for tonic and restorative. Tonic and restorative activities also have another robust hot node, constituted by Betulaceae, Juglandaceae and Fagaceae. For the endocrine system and metabolic disorders, the clade of Apiaceae and Araliaceae was detected as relevant. Two robust hot nodes appear for poisoning: on the one hand Cucurbitaceae and Coriariaceae, and on the other hand Malvaceae, Cistaceae and Thymelaeaceae. These results, centred on the family level, are appropriate for establishing relationships between chemical composition, phylogenetic aspects and ethnobotanical knowledge.
Author Contributions: The subject and its reach have been designed by A.G., J.V. and T.G., with the assistance of the remaining authors in different points. A.G., M.P. and U.D. performed the database work to select and treat the ethnobotanical information of the areas chosen. O.H. carried out the phylogenetically informed analyses. A.G. and T.G. carried out the statistical analyses. A.G., J.V. and T.G. wrote a first version of the manuscript, which was read and discussed by the remaining authors. Finally, A.G. and J.V. prepared the final version of the manuscript, which was read and approved by all the authors. All authors have read and agreed to the published version of the manuscript.
Funding: This research was supported by projects 2017SGR001116 from the Generalitat de Catalunya (Catalan Government), PRO2017-S02VALLES and PRO2020-S02VALLES from the Institut d'Estudis Catalans (IEC, Catalan Academy of Sciences and Humanities), and CSO2014-59704-P and CGL2017-84297-R from the Spanish Government. AG received a predoctoral grant of the Universitat de Barcelona (APIF 2015-2018).

Institutional Review Board Statement: This meta-analytic study, not implying any clinical or similar experiment, was conducted according to the guidelines of the International Society of Ethnobiology.
Informed Consent Statement: Not applicable.

Data Availability Statement: The data presented in this study are available in Supplementary Material S2, and further information can be obtained on request from the corresponding authors.