Partitioning Pattern of Natural Products Based on Molecular Properties Descriptors Representing Drug-Likeness

Abstract: A cheminformatics procedure was developed for a partitioning model based on 135 natural compounds with drug-like features, including flavonoids, saponins, alkaloids, terpenes and triterpenes, using a pool of descriptors. Understanding the applicability of natural products as a unique source of new candidates against deadly infectious diseases is a contemporary challenge for drug discovery. We propose a partitioning scheme for unveiling drug-like candidates with properties that are important for a prompt and efficient drug discovery process. In the present study, the focus is on matching descriptors to build a partitioning model applied to natural compounds that are diverse in structure and in their complexity of action against severe diseases, such as the current SARS-CoV-2 virus. In times of de novo design techniques, such tools, based on the chemometric and symmetry effects of the descriptors involved, are a further sign of the power and applicability of descriptors in drug discovery for establishing activity and target prediction pipelines for drugs with unknown properties.


Introduction
Chemometric methods applied to new drug design and discovery can support computer-aided drug design by representing a variety of natural compounds from diverse classes through descriptors that capture their structural similarity and drug-like properties [1][2][3][4]. Many molecular descriptors, produced and assessed by different methods and approaches [5][6][7][8], have been described in the literature for the druggability and drug-likeness of small molecules [9][10][11][12][13][14][15][16]. Targeting drug-like properties, as proposed by Lipinski [17], relies on the well-known rule of five, which is described by simple physicochemical parameters (molecular weight ≤ 500, log P ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10); in addition, a topological polar surface area < 140 Å² is associated with good intestinal absorption.
Progress in multivariate statistics has provided new answers to questions about the interpretation, classification and proper choice of the cross-cutting descriptors that are a main part of computer-aided molecular design. Applying these exploratory data methods to a range of natural compound candidates offers a robust and coherent approach for selecting not only suitable natural compounds but also accessible descriptors, which account for their pharmacophores and drug-like properties. Some physicochemical parameters become the meaningful criteria that take the object of study from candidate to drug. In silico methods offer a comprehensive way to pursue and evaluate new targets against SARS-CoV-2 or any new threat.
Sesquiterpenes, alkaloids, curcuminoids, phenolics and terpenoids are natural-product drugs of plant origin: pharmacologically active compounds that have been approved by the US Food and Drug Administration (FDA) for curing various illnesses. Drug discovery from medicinal plants continues to provide a source of new drug leads and has been on the rise in recent years.
Symmetry 2021, 13, 546
The deliberate targeting of natural products has long been fruitful for developing new clinical drugs. In many studies [17][18][19][20][21][22][23], different natural products have been explored as the most important compounds for natural-product drugs, and more data are expected on frameworks for new clinical trends and on the challenge of recommending new drugs from medicinal plants. Additionally, the effectiveness of these candidates against coronaviruses during a global pandemic has been reported. For example, Rivero-Segura et al. [24] share knowledge on the applicability of Mexican natural products against the SARS-CoV-2 virus within an in silico screening. The authors identified ten compounds that fully match a drug-likeness criterion. The specific pattern and understanding of the therapeutic effect of plant flavonoids have been well documented for decades [25][26][27][28]. Their main health benefits are related to relieving pain and inflammatory conditions and, according to clinical research, these benefits are functions of the chemical structure of the flavonoid compounds (presence of three rings and the specific position of the hydroxyl group −OH in one of the rings). It is worth noting that flavonoids are carefully studied with respect to possible antimicrobial, anti-inflammatory and antiviral effects. There is evidence that more than 80% of the world population relies on medicinal plants as therapeutic prescriptions [18][19][20].
Since the group of chemically identified flavonoids exceeds several thousand compounds and, additionally, renewed attention to them was triggered by the coronavirus pandemic, any effort at specific partitioning or classification of flavonoids is worthwhile (for instance, finding discriminant pharmacophoric or molecular indices that reveal patterns of similarity between flavonoids and other medicinal plant remedies, or between the different classes of flavonoids).
The same objectives hold for the largest group of natural compounds known as safe and effective therapeutic agents: the terpenoids. Studies on mono-, di- and triterpenes indicate that these compounds possess flavonoid-like properties, which could seriously support efforts to treat not only cancer but also the symptoms of COVID-19 infection. Again, all attempts to find a proper partitioning between different medicinal plants by the use of suitable descriptors are notable.
The opportunities offered by chemometric and machine learning procedures make it possible to solve many of the problems of reliable, simple and effective partitioning of medicinal plants with respect to their specific fingerprint descriptors (related either to their medical effects, druggability and drug-likeness or to structural specificity). The aim of the present study is to offer a simple option for such partitioning.
To facilitate the discovery process and to overcome the issues in novel drug development, rational methods in drug design, in combination with the plethora of in silico methods, play a pivotal role. The intention of the present study is the discovery and screening of natural drugs with potential as new treating agents. It combines chemometric methods (multivariate statistics) for partitioning a group of 135 natural compounds by the use of descriptors related to pharmacophores and drug-likeness. This makes it possible to design an accountable partitioning network for the next level of treating and exploring the data. If a suitable partitioning with respect to the different chemical classes can be achieved, a next step might allow a more specific separation of some natural compounds using the available descriptors. Finally, the specific partitioning could be used to predict important properties of the natural compounds studied.

Natural Molecules Dataset
A pool of 135 natural compounds divided into chemical classes is used for the partitioning procedure (Supporting Information, Figure S1). For the aims of the partitioning procedure, the sets of descriptors (i.e., drug-like indices, molecular properties and pharmacophore descriptors) were calculated with the AlvaDesc v.2 software (Milano, Italy) (https://www.alvascience.com/alvadesc/, accessed on 15 October 2020) [29]. All descriptor values obtained (after a variable reduction procedure using principal components analysis) were included in the next step of the partitioning with two sets of variables: 45 for the first run (all described in Table 1) and 17 for the second one (all CATS2D, all SHED, all TPSA, all MLOGP, and Ro5 and cRo5). Table 1 presents the whole list of molecular descriptors used in this study. The pool of molecular descriptors was extended with a pharmacophore descriptor block, which includes two types of descriptors: CATS2D and SHED. SHED (Shannon Entropy Descriptors) is a set of molecular descriptors introduced in [30]. SHED are derived from the distributions of potential pharmacophore points (PPP) in the molecular structure; the Shannon entropy is then applied to quantify the variability in a feature-pair distribution. The CATS2D (Chemically Advanced Template Search) descriptors are a particular case of autocorrelation descriptors in which the atom-type definition is related to the concept of potential pharmacophore points (PPP). CATS2D descriptors have been widely used for similarity search [31].
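As a rough illustration of the entropy step behind SHED, the following sketch computes the Shannon entropy of a feature-pair distribution. The histograms are invented for illustration and the function name `shannon_entropy` is hypothetical; the exact binning and any further transformation used in [30] or in AlvaDesc are not reproduced here.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a histogram of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probabilities = [c / total for c in counts if c > 0]
    return sum(-p * math.log(p) for p in probabilities)


# Hypothetical histograms: number of donor-acceptor atom pairs found at
# topological distances of 1, 2, 3, ... bonds (values invented).
concentrated = [0, 6, 0, 0, 0]   # all pairs at the same distance
spread_out = [2, 1, 1, 1, 1]     # pairs spread over many distances

for name, histogram in (("concentrated", concentrated), ("spread out", spread_out)):
    print(f"{name}: entropy = {shannon_entropy(histogram):.3f}")
```

A distribution concentrated in one bin gives zero entropy, while a distribution spread over several distances gives a larger value, which is the sense in which the entropy quantifies variability.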
Both CATS2D and SHED descriptors were used along with the drug-like indices (Table 1) for clustering, performed separately with 45 and with 17 variables. The additional partitioning within the obtained similarity groups checks for possible differences in clustering patterns when different sets of descriptors are used. We obtained the same groups of similarity for both sets of descriptors. Evaluating which combinations or symmetry effects are best gives the maximum covariance between the related objects/features.
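Whether two clustering runs yield the same groups of similarity can be checked independently of how the clusters happen to be numbered. The sketch below is one simple way to do this; the function name `same_partition` and the example label vectors are hypothetical and are not taken from the study.

```python
def same_partition(labels_a, labels_b):
    """Return True if two label vectors define identical groups,
    regardless of the cluster numbers used in each run."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both runs must label the same objects")
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # Each cluster of run A must map to exactly one cluster of run B
        # and vice versa; any conflict means the memberships differ.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True


# Hypothetical labels for six objects from two descriptor sets.
run_large_set = [1, 1, 2, 2, 2, 1]
run_small_set = [2, 2, 1, 1, 1, 2]   # same groups, different numbering
print(same_partition(run_large_set, run_small_set))
```

For partitions that agree only partly, an agreement index such as the adjusted Rand index would be more informative than this yes/no check.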

Multivariate Statistical Methods
The multivariate statistical methods applied are frequently used in chemometrics: cluster analysis, namely hierarchical and nonhierarchical (K-means) clustering. Hierarchical clustering examines the data for groups of similarity (clusters) between the objects (natural compounds) or between the variables (descriptors). The method is an unsupervised pattern recognition algorithm (the patterns of similarity form spontaneously, by calculating similarity distances and applying linkage options to identify the clusters). It is important to mention that the initial data are normalized in order to avoid the impact of the different scales of the variables on the clustering procedure. The clusters obtained are usually represented on a planar plot called a dendrogram. The statistical significance of the clusters formed depends on a preliminarily chosen cut-off distance, which makes the interpretation of the clusters dependent to some extent on the selection of the cut-off. In the present study, the goal of the hierarchical clustering was to identify patterns of similarity between the natural compounds (objects). Similarity patterns between variables (descriptors) are not the subject of this study.
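The workflow described here (normalization, distance and linkage calculation, cut-off selection) can be sketched with SciPy as follows. This is a minimal sketch under assumptions: the text does not state the normalization, distance measure or linkage actually used, so z-score autoscaling, Euclidean distance and Ward linkage are chosen for illustration, and the data are synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def autoscale(X):
    """Z-score each descriptor column so that all have comparable scale."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled instead of dividing by 0
    return (X - X.mean(axis=0)) / std


def hierarchical_labels(X, cutoff, method="ward"):
    """Cluster the objects (rows) and cut the dendrogram at a chosen distance."""
    Z = linkage(autoscale(X), method=method, metric="euclidean")
    labels = fcluster(Z, t=cutoff, criterion="distance")
    return labels, Z


# Synthetic stand-in for an objects x descriptors matrix: two groups of five.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.1, size=(5, 3))
group_b = rng.normal(loc=10.0, scale=0.1, size=(5, 3))
X_demo = np.vstack([group_a, group_b])

labels, Z = hierarchical_labels(X_demo, cutoff=2.0)
print(labels)  # one cluster label per object
```

The linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the planar plot; changing `cutoff` changes how many clusters are read from it.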
K-means clustering is a nonhierarchical partitioning method. Unlike in hierarchical clustering, cluster formation is not spontaneous: the number of clusters is predetermined according to a preliminary hypothesis. This hypothesis corresponds to theoretical or experimental evidence and is very often related to the validation of previously obtained results (e.g., from hierarchical clustering). The algorithm is based on calculating the distances between each object and preliminarily formed group centroids.
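A minimal sketch of this idea, assuming Euclidean distances and starting centroids taken from a preliminary grouping (for example, the groups read from a dendrogram), is given below. The function name `kmeans_from_labels` and the toy data are hypothetical; this is a plain NumPy illustration, not the software used in the study.

```python
import numpy as np


def kmeans_from_labels(X, initial_labels, max_iter=100):
    """K-means started from the centroids of a preliminary grouping."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(initial_labels)
    groups = np.unique(labels)
    centroids = np.array([X[labels == g].mean(axis=0) for g in groups])
    for _ in range(max_iter):
        # Distance of every object to every centroid, shape (n_objects, k).
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = groups[dist.argmin(axis=1)]
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for i, g in enumerate(groups):
            members = X[labels == g]
            if len(members) > 0:  # keep the old centroid if a group empties
                centroids[i] = members.mean(axis=0)
    return labels, centroids


# Toy data: the last object is deliberately placed in the wrong starting group.
X_demo = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
start = [1, 1, 1, 2, 2, 1]
final_labels, final_centroids = kmeans_from_labels(X_demo, start)
print(final_labels)
```

If the final labels match the preliminary grouping, the K-means run supports it; objects that change group point to members whose assignment is uncertain.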
The combination of descriptors based on the pharmacokinetic and physicochemical properties of compounds, used to measure or judge drug-likeness, arises in the initial phases of drug discovery. The physicochemical properties that are the pillars of drug-likeness assessment include molecular mass, polarity (polar surface area (PSA) and topological polar surface area (TPSA)), number of aromatic rings, number of heavy atoms, logP, logS, number of hydrogen bond donors and acceptors, and number of rotatable bonds. The plethora of these properties is remarkable and allows blind scanning for completely new drug candidates. Our concept of the partitioning was based on these primary molecular properties for predicting drug-likeness. In the dataset of descriptors obtained, Lipinski's rule of five for the prediction of drug-likeness was tracked as well, alongside the obtained partitioning pattern.
The main points of Lipinski's rule of five can be stated briefly: molecular weight not more than 500 Da, logP up to 5, not more than 5 hydrogen-bond donors and up to 10 hydrogen-bond acceptors.
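The four criteria can be written as a small check that counts violations. This is only a sketch: the class and function names are hypothetical, the two example compounds use invented values, and how the Ro5 and cRo5 descriptors in AlvaDesc encode the result is not specified in the text.

```python
from dataclasses import dataclass


@dataclass
class RuleOfFiveInput:
    mol_weight: float   # molecular weight in Da
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def ro5_violations(c: RuleOfFiveInput) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ]
    return sum(checks)


# Invented values for two hypothetical compounds.
small_polar = RuleOfFiveInput(mol_weight=300.0, log_p=2.1, h_donors=3, h_acceptors=6)
large_glycoside = RuleOfFiveInput(mol_weight=820.0, log_p=6.2, h_donors=8, h_acceptors=16)

print(ro5_violations(small_polar), ro5_violations(large_glycoside))
```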

Hierarchical Cluster Analysis
As already mentioned, the chemometric analysis was initially performed on two dataset matrices with dimensions [135 × 45] and [135 × 17]. In both cases, two major clusters of natural compounds were identified. The members belonging to each cluster are the same for both matrices. The hierarchical dendrogram is presented below, together with the members of each cluster as confirmed by K-means clustering (Figures 1-3). Next, Table 2 summarizes the contribution of each descriptor to the partitioning (simply assigned as "high" and "low" levels of the descriptors with respect to "0"). The Supplementary Information gives the complete list of all members of both clusters, ordered by the distance between objects in the clusters that drives the separation pattern inside the formed groups (Supplementary Information, Tables S1 and S2).
Table 2. Descriptor levels for the partitioning. Descriptor groups: all CATS2D (7), all SHED (4), Ro5, MLOGP; all TPSA (2), LOGPGP99, cRo5.
Table 3. Small cluster: flavonoids, triterpenes, saponins.
In Table 3, the partitioning of the natural compounds into the two identified clusters is summarized. The chemical structures of the 135 natural molecules are presented in the Supplementary Information. The small cluster consists mainly (90%) of three classes of natural compounds (flavonoids, triterpenes and saponins), with isolated alkaloids and tannic acid.
Big cluster (94 members) for both 45 and 17 descriptors: the big cluster has many more members and a greater variety of ligand classes. However, over 50% of all objects are flavonoids, monoterpenes and sesquiterpenes, a few percent are alkaloids and curcuminoids, and there are single cases of many other natural compounds such as esters, lecithin, etc.
The general conclusion from this simple but effective partitioning procedure is that both groups of descriptors act very similarly and achieve a partitioning of the natural compounds, except for the class of flavonoids, which is distributed between both clusters. Thus, while specific descriptors could be found for most of the natural compounds included in the study on the basis of their values, the large and very important group of flavonoids is not entirely partitioned.
In the next step of the study, an effort is made to assess by a chemometric procedure the option of partitioning the flavonoids. For this purpose, only the group of flavonoids was subjected to partitioning. The procedure includes cluster analysis of all flavonoids, of the flavonoids in the big cluster only and of the flavonoids in the small cluster only. Both sets of descriptors (45 and 17) were used.
In Figure 4, the hierarchical dendrogram for the clustering of all flavonoids (38 objects) by 17 descriptors is presented. Three clusters (Sneath's cluster significance test, 1/3 Dmax) could be identified based on the upper clustering. C1, the smallest cluster, consists of four members that belong to the flavonoids of the small cluster from the overall partitioning procedure. The rest of the flavonoids of the small cluster are included in C3 (another 12 members). The biggest cluster, C2 (22 members), corresponds entirely to the flavonoids included in the big cluster of the overall partitioning procedure.
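How a cut-off expressed as a fraction of Dmax translates into a number of clusters can be sketched from the merge heights of a dendrogram alone. The sketch assumes a linkage whose merge heights never decrease from one merge to the next (as with Ward, average or complete linkage); the function name `clusters_at_fraction` and the heights are invented for illustration and are not read from Figure 4.

```python
def clusters_at_fraction(merge_heights, fraction):
    """Number of clusters left when the dendrogram is cut at fraction * Dmax.

    merge_heights: the n - 1 linkage distances of a dendrogram for n objects.
    """
    d_max = max(merge_heights)
    threshold = fraction * d_max
    n_objects = len(merge_heights) + 1
    # Every merge at or below the threshold joins two groups into one.
    merges_kept = sum(1 for h in merge_heights if h <= threshold)
    return n_objects - merges_kept


# Invented merge heights for eight objects (Dmax = 12.0).
heights = [0.5, 0.7, 1.0, 1.2, 3.5, 6.0, 12.0]
print(clusters_at_fraction(heights, 1 / 3))  # stricter cut, more clusters
print(clusters_at_fraction(heights, 2 / 3))  # looser cut, fewer clusters
```

With these invented heights the 1/3 Dmax cut leaves three clusters and the 2/3 Dmax cut leaves two, which shows why the cluster count depends on the chosen level.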
The present plot (Figure 5) illustrates the relationship between the descriptors and the clusters formed. The four members of C1 (celastrol, epitaraxerol, glycyrrhizin and pristimerin) form a specific subgroup of flavonoids which differs significantly from all other flavonoids (highest levels for all CATS2D and SHED descriptors as well as for the MLOGP2 descriptor). The other two patterns of flavonoids are very similar with respect to almost all descriptor values except for Ro5 (high for C3) and cRo5 (high for C2).
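A plot of means of this kind rests on the average value of each descriptor within each cluster. The sketch below computes such a profile and labels each mean as "high" or "low" relative to zero, assuming the descriptors are autoscaled (z-scored) first; that assumption, the function names and the toy data are illustrative and not taken from the study.

```python
import numpy as np


def cluster_mean_profile(X, labels, names):
    """Mean of each autoscaled descriptor within each cluster."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled
    scaled = (X - X.mean(axis=0)) / std
    profile = {}
    for cluster in np.unique(labels):
        means = scaled[labels == cluster].mean(axis=0)
        profile[int(cluster)] = dict(zip(names, means.tolist()))
    return profile


def level(mean_value):
    """Label a cluster mean relative to the overall (zero) average."""
    return "high" if mean_value > 0 else "low"


# Toy data: four objects, two hypothetical descriptors, two clusters.
names = ["descriptor_1", "descriptor_2"]
X_demo = [[1.0, 10.0], [2.0, 9.0], [8.0, 1.0], [9.0, 2.0]]
labels_demo = [1, 1, 2, 2]

profile = cluster_mean_profile(X_demo, labels_demo, names)
for cluster, means in profile.items():
    print(cluster, {name: level(value) for name, value in means.items()})
```

Plotting the values in `profile` against the descriptor names, one line per cluster, gives a plot of means in which descriptors with well-separated lines are the ones that distinguish the clusters.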
If 45 descriptors are used to partition the same group of flavonoids, the following separation is obtained (Figure 6).
The use of molecular property descriptors leads to another partitioning scheme: again, three clusters are identified, but the members of each cluster are mixed (with respect to the partitioning achieved with all classes of natural compounds). It could be said that C1 is the biggest cluster (dominantly flavonoid members of the big cluster of the overall partitioning of 135 objects), while C2 and C3 are the intermediate and the smallest clusters (dominantly flavonoid members of the small cluster of the overall partitioning of 135 objects) (Figure 7).
In the next step of the partitioning scheme, an effort was made to understand if the two separate clusters of flavonoids (big one and small) could be additionally partitioned to deliver more specific information.Both groups were partitioned with 45 and 17 descriptors.
The results of this clustering (Figure 8) indicate the formation of two clusters that are well separated.
In Figure 9, the plot of average values of each descriptor for each identified cluster is presented.
The partitioning in the group of 17 flavonoids is based on significantly different levels of the following descriptors: SHED_DL, SHED_AA, SHED_AL, Hy, MR99, MRcons, SAacc, SAtot, SAdon, SAscore, Vx, VvdwMG, Ro5, cRo5, DLS_01, DLS_02, DLS_06, DLS_07, DLS_cons, QEDu, QED (the groups of descriptors are well defined). If the same group of 17 flavonoids is clustered by 17 descriptors, the compounds are again partitioned into two major clusters. However, the membership differs: from the big group (obviously having equal levels of descriptors), a small group of four members (celastrol, pristimerin, epitaraxerol and glycyrrhizin) is split off.
As seen in Figure 10, the difference between the two patterns of similarity is due to the difference in the levels of all CATS2D descriptors, the SHED_AL descriptor, the MLOGP2 descriptor and the Ro5 and cRo5 descriptors. Therefore, reducing the number of descriptors could lead to minor changes in the partitioning scheme on the one hand but, on the other, could underline additional options to distinguish different flavonoids by differences in their structural and druggability properties.
In this case, 22 flavonoids belonging to the bigger group of the flavonoid partitioning are clustered with respect to 45 descriptors. Figure 11 shows the hierarchical dendrogram for their separation. Three major clusters are formed with, respectively, 6, 4 and 12 members. The three clusters are very similar with respect to the mean values of the 45 descriptors. The significant differences are in the CATS2D descriptors, the LOGPcons descriptor (a difference from the case of the smaller cluster of flavonoids), the DLS_06 descriptor and the QEDu and QED descriptors. It might be concluded that the additional flavonoids-only partitioning reveals some specific properties of the compounds related to molecular differences, as depicted in Figure 12 and confirmed by the plot of means in Figure 13.
The hierarchical clustering revealed two significant clusters and two separate flavonoids (forming one small cluster or two outliers, papyriflavonol and psoralidin). The cluster significance is determined by Sneath's index, 1/3 Dmax or 2/3 Dmax (Figure 14).
The differences in the average values of the descriptors for the partitioned groups are negligible (Figure 15). Some slight differences in CATS2D are observed, and the only significant difference is in the descriptor SHED_DL, which distinguishes the outliers from the rest of the flavonoids.
It could be concluded that the treatment of the flavonoids as a separate group of objects indicates that:

• The application of a larger number of descriptors gives more opportunities to explain the partitioning;
• The use of a smaller number of descriptors elucidates some more specific properties of some flavonoids.

Conclusions
When discussing the drug-likeness of a particular class of natural compounds with an expected capability as anti-inflammatory or antiviral drugs, the problem requires a combination of prediction methods and approaches of different levels of complexity.
The simple partitioning approach towards drug-likeness, applied to the drug discovery process for different natural medicine compounds, seems effective in separating the selected large group of natural compounds into specific patterns depending on the descriptors used: one pattern with dominantly sesquiterpenes, monoterpenes and curcuminoids, as well as flavonoids, and a second pattern with dominantly flavonoids, triterpenes, saponins and alkaloids.
One important conclusion of the study is that no specificity of the descriptors is found, since testing the larger group of 45 descriptors and a selection of only 17 of them lead to one and the same partitioning model.
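The comparison of the partitioning obtained with all descriptors and with a subset can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are synthetic, the 17 columns are picked at random rather than being the paper's selection, Ward linkage is assumed, and `two_cluster_partition` and `rand_index` are hypothetical helper names.

```python
from itertools import combinations

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import zscore


def two_cluster_partition(X, method="ward"):
    """Split the objects into two clusters on autoscaled descriptors."""
    Z = linkage(zscore(X, axis=0, ddof=1), method=method)
    return fcluster(Z, t=2, criterion="maxclust")


def rand_index(a, b):
    """Fraction of object pairs on which two partitions agree (1.0 = identical)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Synthetic stand-in data (NOT the paper's data): 135 objects, 45 descriptors
rng = np.random.default_rng(1)
X_full = np.vstack([rng.normal(0, 1, (40, 45)), rng.normal(6, 1, (95, 45))])

# Hypothetical subset: 17 of the 45 descriptor columns, chosen at random
subset = rng.choice(45, size=17, replace=False)

labels_45 = two_cluster_partition(X_full)
labels_17 = two_cluster_partition(X_full[:, subset])
ri = rand_index(labels_45, labels_17)
print(f"Rand index between the 45- and 17-descriptor partitions: {ri:.3f}")
```

The Rand index ignores how the clusters are numbered, so it compares the partitions themselves. A value of 1.0 means that both descriptor sets group the objects identically.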
Another interesting conclusion concerns the partitioning into two groups. The big group of flavonoids does not belong selectively to one of these patterns but is mingled with the members of both the first and the second pattern. We are aware that the group of flavonoids is too large and complex, in chemical structures as well as in chemical properties and medical impacts, to be specifically partitioned. Our additional chemometric analysis of the flavonoids alone confirms this complexity.

Figure 1. Hierarchical dendrogram for clustering of 135 natural compounds by 45 or 17 descriptors.

Figure 3. Plot of mean values (normalized) of 17 descriptors for each identified cluster of objects (blue: small cluster; red: big cluster).

Figure 9. Plot of means of 45 descriptors (standardized values) for each of the two identified clusters of flavonoids.

Figure 10. Plot of means of 17 descriptors (standardized values) for each of the two identified clusters of flavonoids.

Figure 13. Plot of means of 45 descriptors (standardized values) for each of the three identified clusters of flavonoids.

Figure 15. Plot of means of 17 descriptors (standardized values) for each of the three identified clusters of flavonoids.

Table 2. Contribution Descriptor Distribution for Each Cluster.

Table 3. Partitioning of the Natural Compounds in the Formed Clusters. 1 Is for the Bigger Cluster (More Members); 2 Is for the Smaller Cluster (Fewer Members).