Int. J. Mol. Sci. 2009, 10(3), 1193-1214; doi:10.3390/ijms10031193
Published: 16 March 2009
Abstract: Aiming to assess the role of individual molecular structures in the molecular mechanism of ligand-receptor interaction correlation analysis, the recent Spectral-SAR approach is employed to introduce the Quantum-SAR (QuaSAR) “wave” and “conversion factor” in terms of difference between inter-endpoint inter-molecular activities for a given set of compounds; this may account for inter-conversion (metabolization) of molecular (concentration) effects while indicating the structural (quantum) based influential/detrimental role on bio-/eco- effect in a causal manner rather than by simple inspection of measured values; the introduced QuaSAR method is then illustrated for a study of the activity of a series of flavonoids on breast cancer resistance protein.
1. Introduction
Used in chemistry during the second half of the 20th century as an extended statistical analysis [1–8], the quantitative structure-activity relationship (QSAR) method has attained in recent years a special status, officially certified by the European Union as the main computational tool (within the so-called “in silico” approach) for the regulatory assessment of chemicals by means of non-testing methods [9–15].
However, while QSAR primarily uses multiple regression analysis [6–8], alternative approaches such as neural networks (NN) or genetic algorithms (GA) have been advanced to generalize the QSAR performance in delivering a classification of the variables used, in the sense of the principal component analysis (PCA) and partial least squares (PLS) methodologies. Still, the claimed advantage of NN over QSAR techniques is limited by the fact that their underlying physical-mathematical philosophies differ: a highly non-linear picture is being compared with a basic multi-linear one [16–23].
Actually, the chemical-physical advantage of QSAR lies in its multi-linear correlation, which resembles the superposition principle of quantum mechanics. This allows a meaningful interpretation of how the structural (inherently quantum) causes, associated with the latent or unobserved variables (sometimes called common factors), translate into the observed effects (activity), usually measured in terms of the 50%-effect concentration (EC50) and associated with various types of bioaccumulation and toxicity.
Nevertheless, many efforts have focused on applying QSAR methods to non-linear features, from which the “expert systems” emerged as formalized computer-based environments involving knowledge-based, rule-based or hybrid automata able to provide rational predictions about properties of biological activity of chemicals or of their fragments. This has resulted in various QSAR-based databases: the model database (QMDB), which inventories the robust summaries of QSARs that can be queried by envisaged endpoint or chemical; the prediction database (QPDB), which stores further predictions made with data from the QMDB; and, building on both, the chemical category database (CCD) documentation [25–31].
Therefore, although undoubtedly useful, the “official” trend in employing QSAR methods is to classify, over-classify and validate through prediction (on external or molecular test sets). A gap between the computed molecular orderings and the associated mechanistic role in bio-/eco-activity assessment remains as long as the QSAR strategy has not turned into a versatile tool for identifying the inter-molecular role in receptor binding sites through recorded activities by means of structurally selected common variables; that is, for using QSAR information for internal mechanistic predictions among the training molecules, to see how they inter-relate with respect to the whole class of observed activities employed for a specific correlation. Such an approach will also help in checking the chemical domain spanned by the training molecules, a feature of paramount importance for further external tests as well.
The present communication aims to start filling this gap by deepening the modeling of inter-molecular activity. It extends the main concepts of the recently developed Spectral-SAR [32–40], the fully algebraic version of the traditional, statistically optimized QSAR picture, and targets the quantification of the competition between molecular inter-activity and inter-endpoint records.
2. QuaSAR Methodology
Paradoxically, the main problem for QSAR resides not in performing the correlation itself but in selecting the variables for it; the mathematical counterpart of this problem is known as “factor indeterminacy” [41–45] and affirms that the same degree of correlation may in principle be reached with an infinity of latent variable combinations. Fortunately, in chemical physics there is a limited (although large enough) number of indicators with a clear-cut meaning in molecular structure, which allow the rationalization of reactivity and binding [46,47]. However, the main point is that, given a set of N molecules, one can choose to correlate their observed activities with M selected structural indicators in a number C of combinations fixed by Equation (1a); the resulting models are linked by different endpoint paths, whose number K indexes the paths built from connected distinct models with orders (dimensions of correlation) from k = 1 to k = M.
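The counting of models and paths can be illustrated with a minimal sketch. It assumes, since the counting formulas themselves did not survive in this text, that one endpoint model is built per non-empty subset of the M descriptors and that a path picks exactly one model of each order k = 1, ..., M; the variable names are illustrative only.

```python
from itertools import combinations, product

# The M = 3 descriptors used later in the application section.
descriptors = ["LogP", "POL", "ETOT"]
M = len(descriptors)

# Models grouped by order k (number of descriptors in the correlation).
models_by_order = {
    k: list(combinations(descriptors, k)) for k in range(1, M + 1)
}

# Assumed counting: one endpoint model per non-empty descriptor subset.
n_models = sum(len(models) for models in models_by_order.values())

# Assumed path construction: one model of each order, k = 1..M, in sequence.
paths = list(product(*(models_by_order[k] for k in range(1, M + 1))))

print(n_models)    # 7
print(len(paths))  # 9
```

Under these assumptions, three descriptors give seven models, matching the labels Ia, Ib, Ic, IIa, IIb, IIc and III used in the application, and nine paths, matching the (K = 9, M = 3) ergodic paths quoted there.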
Basically, for each of the C combinations a correlation (endpoint) QSAR equation is determined, containing all computed activities for all N considered molecules within the l-selected correlation. The Spectral-SAR version of QSAR analysis computes these activities in a completely non-statistical way, i.e. by treating both the observed quantities (activities) and the unobserved ones (latent variables) as vectors, and by furnishing their correlation through a specific S-SAR determinant obtained from the transformation matrix between the orthogonal (desirable) and oblique (input) correlations. Yet, besides producing essentially the same results as the statistical least-squares fit of residuals, the S-SAR method introduces new concepts such as:
▪ endpoint spectral norm;
▪ algebraic correlation factor;
▪ spectral path, with the distance defined in the Euclidean sense, Equation (4);
▪ least spectral path principle, Equation (5).
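As a hedged illustration of these concepts, the sketch below computes them for synthetic data. The defining equations are not reproduced in this text, so the forms used are assumptions: the spectral norm is taken as the Euclidean length of an activity vector, the algebraic correlation factor as the ratio of the computed to the observed vector length (as described later in the discussion of Table 3), and the spectral distance as the Euclidean distance between two endpoints placed in the (norm, correlation) plane. The ordinary least-squares fit stands in for the S-SAR determinant, which the text says gives essentially the same activities; all function names and data are hypothetical.

```python
import numpy as np

def fit_endpoint(X, y):
    """Least-squares computed activities for one endpoint model (with intercept)."""
    design = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coeffs

def spectral_norm(activities):
    """Euclidean length of an activity vector over the N molecules."""
    return float(np.linalg.norm(activities))

def algebraic_correlation(y_computed, y_observed):
    """Assumed form: ratio of computed to observed vector lengths."""
    return spectral_norm(y_computed) / spectral_norm(y_observed)

def spectral_distance(point_a, point_b):
    """Assumed form: Euclidean distance between (norm, correlation) endpoints."""
    return float(np.hypot(point_b[0] - point_a[0], point_b[1] - point_a[1]))

# Synthetic stand-in data: 24 molecules, 3 descriptors (not the paper's values).
rng = np.random.default_rng(0)
X = rng.normal(size=(24, 3))
y = 1.0 + X @ np.array([0.2, 0.1, 0.8]) + 0.3 * rng.normal(size=24)

y_one = fit_endpoint(X[:, 2:3], y)  # single-descriptor model
y_all = fit_endpoint(X, y)          # three-descriptor model

endpoint_one = (spectral_norm(y_one), algebraic_correlation(y_one, y))
endpoint_all = (spectral_norm(y_all), algebraic_correlation(y_all, y))
step_length = spectral_distance(endpoint_one, endpoint_all)
print(endpoint_one, endpoint_all, step_length)
```

Because a least-squares fit is an orthogonal projection of the observed vector, the ratio defined this way cannot exceed one and cannot decrease when descriptors are added to a nested model.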
Nevertheless, for the present purpose another two quantities are introduced here, namely:
▪ inter-endpoint norm difference (IEND), Equation (6);
▪ inter-endpoint molecular activity difference (IEMAD), Equation (7).
This way, we can interpret the two fittest molecules (i, j) as reciprocally activated by the models (l, l′) through the spectral path to which they belong. Put in analytical terms, the difference between the quantities of Equations (6) and (7) may provide the “jump” or transition activity that turns the effect of molecule i into that of molecule j across the least spectral (here revealed as metabolization) path connecting the models l and l′. In doing so, the substitution $\Delta Y_{l|l'} \to i\,\Delta Y_{l|l'}$ was assumed for the IEND of Equation (6), outside the factor. Note that although the differences in Equations (6) and (7) were considered mathematically along the “arrow” i-to-j, the “quantum transformation” of Equation (9) suggests that the bio-chemical-physical equivalence (metabolization) of the concentration effects evolves from j to i, revealing a typical quantum behavior in which the factor plays the propagator role, like the quantum kernels in the path integral formulation of quantum mechanics.
Equation (9) stands as the present “quantum”-SAR equation since:
○ it involves the wave-type expression of the molecular effect of concentration, though only for specially selected molecules (the fittest out of the C models) and for specially selected paths (the least ones of the M-ergodic assembly), with M and C related by Equation (1a);
○ it provides the specific transition or transformation of the effect of a certain molecule into the effect of another special molecule out of the N trained molecules, paralleling the phenomenology of established quantum transitions;
○ it has the amplitude of transformation driven by the so-called quantum-SAR factor of exponential form, Equation (10), which defines the specific quantum-SAR wave;
○ it reduces to an identity when the reverse effect is considered and substituted into the direct one, Equation (9), just as absorption and emission stand as reciprocal quantum effects;
○ it has a “phase” with unit norm, in the same manner as ordinary quantum wave functions, allowing the inter-molecular “real” quantum-SAR transformation to be regulated exclusively by the quantum-SAR factor of Equation (10), in the same fashion as quantum tunneling is characterized by the transmission coefficient;
○ when multiple transformations take place across paths with multiple linked models, say (l, l′, l″), the inter-molecular transformation i→j→t is characterized by an overall quantum-SAR factor (10) written as the product of the intermediary ones, because the effect may equivalently be described directly from t or as intermediated by the molecular effect transformations of j, in the same way as quantum propagators behave along quantum paths; such a contraction scheme may certainly be generalized to the least paths connecting the k endpoints contained in M, giving an overall quantum-SAR (“metabolization power”) factor per path;
○ Equation (9) supports self-transformation as well, with its own driving Qua-SAR factor acting during the evolution along the least paths, when the same molecule (i = j) is metabolized by activating certain structural features (l ≠ l′) through specific indicators (variables) in correlation (bindings with the receptor site); this case resembles the stationary quantum case, according to which even isolated (or freely moving) molecular structures undergo dynamical wave-corpuscular or fluctuating transformations along their quantum paths.
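The multiplicative composition of factors along a path can be sketched as follows. Only the product rule stated in the list above is illustrated; the explicit exponential form of the individual factors is not reproduced in this text, so the step values below are hypothetical placeholders rather than computed quantum-SAR factors.

```python
from math import isclose, prod

def path_factor(step_factors):
    """Overall factor of a path, taken as the product of its step factors."""
    return prod(step_factors)

# Hypothetical step factors for a path l -> l' -> l'' (illustrative values only).
q_first, q_second = 1.8, 0.6
q_path = path_factor([q_first, q_second])

# Contracting the first two steps before adding a third gives the same result
# as multiplying all three at once.
q_third = 1.5
assert isclose(
    path_factor([q_path, q_third]),
    path_factor([q_first, q_second, q_third]),
)
print(q_path)
```

The final check mirrors the two equivalent descriptions mentioned above: a direct transformation and one intermediated by an in-between molecule carry the same overall factor.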
With the present Qua-SAR methodology one can appropriately identify the molecular pairs that drive certain bio-/eco-activities against a given receptor by means of selected descriptors, in a formal “wave” or “quantum” mechanistic way. The ultimate goal is the computation of the quantum-SAR factors along the least paths of action, which convey the conversion power of the fittest molecules in their specific bindings.
However, to make the Qua-SAR approach practical, all the steps above are next specialized through an application that identifies the polyphenolic molecules most involved in activity related to mammalian breast cancer.
3. Application to Flavonoids’ Anticancer Bioactivity
Although generally considered beneficial for their protective role in many age-related diseases, flavonoids (see Figure 1, with the general scheme shown as no. 0) should be studied more carefully since their pharmacokinetics are not entirely elucidated [49–54].
For instance, it was recently inferred that for certain flavonoids such as chrysin, biochanin A and apigenin a very low micromolar concentration is capable of producing 50% (EC50) of the maximum increase in the accumulation (interaction) of the mitoxantrone (MX) inhibitor substrate with the breast cancer resistance protein (BCRP), helping to reverse the multidrug resistance (MDR) mechanism of the overexpressing MCF-7 MX100 cancer cells [51–54].
Therefore, in order to assess the molecular role and the structure-related mechanisms of potential lead compounds in drug design for anti-cancer treatment, a series of representative classes of flavonoids has been employed, see Figure 1. Their recorded biological activities (A) are listed in Table 1 along with the computed Hansch correlation variables: transport (hydrophobicity, LogP), electrostatic (polarizability, POL), and steric (total energy at the optimized 3D configuration, ETOT). These data successively provide the QSAR and S-SAR analyses and finally unfold the Qua-SAR one.
Note that in Table 1 the molecules are displayed in ascending order of their recorded activities, from no. 1 to no. 24, so that it is always evident which of two molecules is the more active whenever they are compared. Such an arrangement allows the construction of an activity differences chart, see Table 2, of great utility in establishing the inter-endpoint molecular activity differences of Equation (7) that enter the quantum-SAR factor of Equation (10).
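A chart of pairwise activity differences of this kind can be built in a few lines. The sketch below uses four hypothetical activity values rather than the entries of Table 1, and the sign convention (column activity minus row activity) is an assumption made for illustration.

```python
import numpy as np

# Hypothetical observed activities, already sorted in ascending order.
activities = np.array([0.2, 0.5, 0.9, 1.4])

# chart[i, j] = A_j - A_i (sign convention assumed for illustration).
chart = activities[np.newaxis, :] - activities[:, np.newaxis]

print(chart)
```

With the molecules sorted by ascending activity, every entry above the diagonal is non-negative, so the chart shows at a glance which molecule of a pair is the more active one.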
Next, for computing the other influential difference in Qua-SAR, namely the inter-endpoint norm difference of Equation (6), the C = 7 possible endpoint models built with the data of Table 1 are presented in Table 3. It is worth remarking that the traditional hydrophobicity factor LogP seems to have quite little or even no influence in terms of the traditional statistical correlation (model Ia).
The first conclusion is that flavonoids have practically no exclusive or primary role in drug transport to the BCRP site; likewise, the electrostatic influence through POL is practically missing (model Ib), while stericity through ETOT shows some statistically sensitive role in ligand (MX)-receptor (BCRP) binding (model Ic). The last assertion is also supported by the two-parameter endpoint models, which confirm the steric role of the ETOT variable: while the combination LogP∧POL of model IIa does not significantly improve the statistical correlation over the single-parameter LogP∨POL correlations, the presence of the total energy provides increasingly better correlation as it is combined with LogP (model IIb) and with POL (model IIc), respectively. Finally, when all the Hansch structural variables are taken into account, model III is generated, with appreciable statistical correlation relative to the other computed combinations.
Overall, it cannot be inferred that LogP and POL have no influence on the correlation merely because, taken alone, they do not correlate at all with the flavonoids’ bioactivity: their cumulative presence in model III greatly improves on the single ETOT correlation of model Ic as well as on the mixed correlations of the bi-variable models IIb and IIc. Therefore, the mechanistic “alchemy” of structural features acting on molecular activity appears rather complex: when the hydrophobic, electrostatic and steric influences all combine, they reciprocally activate one another, with a superior resultant in modeling ligand-receptor binding.
Yet, the algebraic correlation factors in Table 3 deserve special discussion. Because they do not measure the dispersion of the locally computed (molecular) points about the average recorded activity, as statistical metrics do, their values are all close to unity and close to each other as well. They model another reality of computation, closer to the path integral approach than to differential analysis, by indexing the global behavior, i.e. the total length of the computed vector relative to the recorded one. Still, while only an indirect connection exists between the algebraic and statistical correlations, the same hierarchical ordering of models is always recorded, which supports the use of the algebraic scale when a narrower spread of correlation factors is preferable. For instance, in the present case the statistical analysis suggests that LogP (Ia) and POL (Ib) have no influence on the correlation, while when combined with ETOT in model III they considerably enrich the correlation power of ETOT alone in model Ic. Such behavior shows that orthogonal, i.e. independent, descriptors may provide better results when combined than when considered apart, due to the increase of the (inter)correlation space.
Having performed the QSAR analysis, the specific Spectral-SAR stage can be unfolded by means of the (K = 9, M = 3) ergodic paths, with the spectral Euclidean lengths given by Equation (4) in both the statistical and algebraic frameworks, as shown in Table 4. Next, the least M = 3 paths carrying the dominant influence of the M factors are selected by applying the recursive rule of the least path principle summarized by Equation (5). Remarkably, the resulting alpha (α), beta (β), and gamma (γ) most influential paths are identically shaped whether the statistical or the algebraic scheme is used. This result, although not necessarily a general rule, shows that in this specific case the algebraic analysis yields systematically the same mechanistic results as those obtained with statistical tools. However, we stress once more that the algebraic measure may give more realistic insight into the Q(Spectral)-SAR phenomenology, since its inner vectorial and norm-based algorithm accounts for each individual molecular contribution to the whole activity “basin” rather than referring to the average activity.
Turning now to the analysis at the individual molecular level, Table 5 lists the residuals between computed and observed activities for each of the considered models, distributed along the already identified least paths. At this stage, the best fitted molecule is singled out for each endpoint; most impressively, the same molecule is selected as the best fitted one along each of the α and β paths, namely molecule no. 12 (4′-5,7-trimethoxyflavanone) and molecule no. 13 (flavone), respectively. Moreover, these molecules are not among the most potent ones with respect to the observed activity of Table 1, being situated in the middle to second half of the 24 molecules considered.
Such a result tells us that the maximum recorded activity is not necessarily the one induced by the specifically chosen structural variables (here LogP, POL, and ETOT). This is the case for the best fitted molecule on the most correlated endpoint (III), which turned out to be no. 3 (naringenin), with low activity within the observed range compared with no. 24 (7,8-benzoflavone) in Table 1. Consequently, one can say that the first half of the observed activities in Table 1 may be attributed to certain physicochemical indicators with clear mechanistic roles, while the rest of the observed activities may be due to other, unidentified specific structural descriptors or even to non-specific ones (rooted in the sub-quantum nature of the particular observer-observed system). Nevertheless, this lower activity indicated by the computational results accords with the so-called “homeopathic principle”, which prescribes cure by moderately-to-weakly active drugs while better monitoring their effects through controlled physico-chemical descriptors.
For the sake of comparison, the present Spectral(Qua)-SAR results are set against the established Principal Component Analysis (PCA). Figure 2 illustrates the graphical 3D correlations among the descriptors LogP, POL and ETOT used in this study; it offers a visual way of assessing the almost complete lack of correlation of LogP with the other variables, POL and ETOT. This leads to the conclusion that LogP is almost orthogonal to (independent of) the other two Hansch variables. When the factor analysis is further performed, Table 6 is obtained, clearly revealing the scarce correlation carried by the LogP variable alone. This is in close agreement with the Spectral-SAR results above. In any case, hydrophobicity and its descriptor cannot be rejected on the basis of factor analysis alone, since it drives (sooner or later) the inter-membrane interaction that is essential for drug-cell binding. Spectral- and Qua-SAR clearly demonstrated the important role hydrophobicity plays in combination with the electrostatic (POL) and steric (ETOT) interactions. Moreover, PCA shows the influence of the POL factor as equal to that of ETOT, whereas their roles in correlation are appreciably different in the Spectral-SAR analysis (compare model Ib in the last column of Table 3 with POL in the last column of Table 6). Again, this discrepancy is in favor of S-SAR: the PCA results stem from the appreciable degree of POL-ETOT correlation (see Figure 2), whence PCA yields similar correlation power for POL and ETOT, while S-SAR also includes the orthogonalization of the POL and ETOT variables before the correlation is performed and thus better discriminates between their influences in binding.
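The kind of descriptor check described here can be sketched with a correlation matrix and a PCA obtained from its eigen-decomposition. The data below are synthetic and purely hypothetical: they are generated so that the stand-in for LogP is independent of the other two columns while the stand-ins for POL and ETOT are strongly correlated, mimicking the qualitative situation reported for Figure 2 and not the actual descriptor values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200  # synthetic sample size, chosen only to keep the estimates stable

# Hypothetical descriptors: LogP independent, ETOT built from POL plus noise.
logp = rng.normal(size=n)
pol = rng.normal(size=n)
etot = pol + 0.1 * rng.normal(size=n)

data = np.column_stack([logp, pol, etot])
corr = np.corrcoef(data, rowvar=False)  # 3 x 3 descriptor correlation matrix

# PCA on standardized variables: eigen-decomposition of the correlation matrix.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = eigvals / eigvals.sum()

print(np.round(corr, 2))
print(np.round(explained, 2))
```

In such a setting the first principal component is shared almost equally by the two correlated descriptors, which is why a PCA alone cannot tell their individual influences apart.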
Nevertheless, proceeding with the Spectral-SAR results, the Qua-SAR factors can be immediately recovered by employing the molecular activity differences from Table 2 for the best fitted molecules of Table 5 along the models of the most influential paths in the molecular mechanism towards MX-BCRP binding. The resulting IEND and IEMAD of Table 7 are combined to produce the quantum-SAR factors of the Equation (10) type for each two-molecules-two-models pair on the specific paths, while the “metabolization power” per path is finally obtained by coupling them according to the multiplicative quantum rule of Equation (16). It is worth noting that the overall quantum-SAR factors of the paths are in total agreement with the previously selected spectral-SAR path hierarchy, i.e. the α path is associated with the highest q-SAR factor, followed by the β path and then by the γ path, in the last column of Table 7.
This result may be quite important if such behavior can be proven to hold in general, since it would allow the effective quantification of paths according to their metabolization power. However, such an endeavor exceeds the purpose of the present communication and remains a future challenge for Qua-SAR studies.
Finally, all QSAR, Spectral-SAR and Qua-SAR computational results may be collected and summarized by the associated “spectral” scheme for the evolution of the fittest molecular structures along the endpoint models for the (M =) 3 selected mechanistic paths of action, see Figure 3. Note that the algebraic correlation was chosen as the “vertical” indicator of the degree to which a certain model reaches the observed activity in the vectorial norm sense (equally, the norms themselves could be used for the ordinate axis).
Commenting now on the “metabolization power” indicated by the quantum-SAR factors in Figure 3, one can first observe that for the α path the “first movement”, from Ic (ETOT) to IIb (LogP∧ETOT), corresponds to quantum free motion, so that a null IEMAD is carried for molecule no. 12 (4′-5,7-trimethoxyflavanone); here, the quantum metabolization factor is consumed only for strongly activating the membrane transporter feature (LogP) of the same molecule. Instead, on the last passage of the α path the factor is responsible for converting the electrostatic (POL) influence of flavonoid no. 12 towards the activity of no. 3 (naringenin), as well as for the reverse O-methylation (methoxylation) of the oxygens in positions 5 and 7 (on ring A) and 4′ (on ring B) with respect to the molecular pattern no. 0 in Figure 1. Such a result is in full accordance with the reverse quantum influence that underlies the quantum-SAR factor conversion prescribed by Equation (9), i.e. quantifying the power of back transformation of the molecular EC50s with respect to the “arrows” of IEND and IEMAD in Equation (10). Moreover, the fact that such a transformation is the first one acting at the molecular level is also supported by the optimized 3D configurations of the involved molecules no. 12 and no. 3, both having rings A and B spatially bent in Figure 3 with respect to ring C of the planar pattern no. 0 of Figure 1.
A somewhat different situation is met for the β path in Figure 3. In its first part a higher q-SAR factor is needed for activating the transporter hydrophobicity feature in model IIa (LogP∧POL) starting from model Ib (POL), while in its second part molecule no. 13 (flavone) is shown to be metabolized into molecule no. 3 (naringenin) by direct hydroxylation in positions 5 and 7 (on ring A) and 4′ (on ring B), the same positions as before with respect to the molecular pattern no. 0 in Figure 1, with a smaller q-SAR factor than that involved in the previous α path. Nevertheless, the overall quantum factor of the β path is lower than that of the α path, meaning a decreased capacity of metabolization, since direct addition is involved, contrary to the ordinary “inverse” Quantum-SAR transformation of Equation (9), while stericity (found here to be the most influential QSAR variable) is triggered by a larger steric energy difference consumed between the planar optimized configuration of molecule no. 13 and the spatially bent one of molecule no. 3 in Figure 3.
Even more metabolization “operations” take place along the γ path of Figure 3. It starts from the same planar configuration of molecule no. 13 (flavone); then the q-SAR factor turns it into molecule no. 8 (6,2′,3′-7-hydroxyflavanone) by hydroxylation at the indicated positions (6 and 7 on ring A, 2′ and 3′ on ring B, with respect to the pattern molecule no. 0 of Figure 1) while activating the electrostatic and steric factors in model IIc (POL∧ETOT) from the independent hydrophobicity factor of model Ia (LogP). This is a complex movement, which explains why this molecular path comes last, with less probability and potency. The last passage of this path involves a no less complex transformation, i.e. turning molecule no. 8 into no. 3 by combining reverse hydroxylation in positions 2′ and 3′ with direct hydroxylation of position 4′ on ring B and with the movement of the hydroxyl group on ring A from ortho (6) to para (5), with respect to the pattern molecule no. 0 of Figure 1. The transformation efficiency is slightly higher than on the first part of the path, since less steric energy is required to bend ring C with respect to rings A and B, while accounting for the electronic delocalization density (orbitals) over them, until the configuration of molecule no. 3 is reached in Figure 3.
Overall, it is clear that the Qua-SAR scheme offers a quantification recipe along the most effective spectral paths, combined with the best fitted molecules, for a trial basin of analogue compounds and structural variables. In the present case it was revealed that the energetic steric factor ETOT seems to mainly drive the mechanistic molecular transformation in the MX-BCRP binding phenomenology, while molecule no. 3 (naringenin) appears as the best fitted molecule belonging to the most relevant endpoint, in clear contrast with a rough molecular selection based on the initial observed activity data. That is, naringenin (no. 3) is shown to be the molecule best adapted to the LogP∧POL∧ETOT structural (independent) factors, being metabolized from molecules such as 6,2′,3′-7-hydroxyflavanone (no. 8), 4′-5,7-trimethoxyflavanone (no. 12), and flavone (no. 13) along specific molecular mechanistic paths. However, these molecules are not linked, even through the paths, with the most active compounds of Table 1. Statistically, this can be explained by the so-called “regression towards the mean” effect, in the sense that the best correlations fall on the compounds found in the middle of the sorted Table 1; from the structural point of view, such behavior may be attributed to the specific parameters used for the correlations, which best describe molecules with specific groups most favorable to the nature of the descriptors.
On the other hand, the present study identifies position 7 of ring A and position 4′ of ring B, with respect to the pattern molecule no. 0 of Figure 1, as the most suitable ones for producing an increase in BCRP inhibition activity, given that these positions belong to the α and β paths and are common to the rest of the spectral paths as well. Instead, the position that does not appear in any of the α, β, or γ paths, namely position 8 on ring A, may present adverse drug interactions.
Further Qua-SAR studies are necessary and will be developed for exploring other bio- and eco-active compounds for their interactions with organs and organisms; they may hopefully lead to a coherent analytical picture of chemical-biological bonding focused on selecting the most adapted molecules and of the most privileged molecular positions for delivering controlled structural based chemical reactivity and biological activity.
4. Conclusions
Modern in silico (computational) chemical analysis of the bioactivity and availability of analogue substances, potentially beneficial or detrimental for specific interactions in organs and organisms, faces a paradoxical dichotomy. On the one hand, when searching for the best correlation useful for predicting a specific molecular bio- or eco-activity, QSAR models involving uninterpretably many latent variables may be produced, while the question of correlation factor indeterminacy always remains (i.e. the assumed descriptors can at any time be replaced with others producing at least the same correlation performance). On the other hand, when the analysis is restricted to the search for molecular design and mechanisms, by performing SARs with special structural indicators for a given class of relevant molecules, the price is a limited use of the generated models for further prediction. The present communication is mainly devoted to developing the second (Q)SAR facet, by extending the recently introduced notion of spectral-path-linking-endpoints and the associated least action principle to spectral path quantification, in terms of the best fitted molecules along the computed models contained, by means of the q(uantum)-SAR factor introduced within the generally called Quantum-SAR (QuaSAR) methodology.
As an application, for the inhibiting activities of representative flavonoids on the breast cancer resistance protein, it was clearly shown that the newly introduced q-SAR factor offers a relevant analytical characterization of the previously, conceptually introduced spectral path hierarchy. Moreover, the present QuaSAR may allow the interpretation of the inter-conversion of the concerned molecules towards receptor binding, since they belong to the same class of analogs and certainly undergo such transformations during their interaction with macromolecules, proteins and enzymes present on cellular walls or in the in vivo environment.
Basically, QuaSAR stands as the first step in assigning the quantum mechanical equivalent of a wave function to the sample of molecules interacting with a specific organism site; it will eventually lead to the hyper-wave function, with the help of which the associated hyper-density probability of binding (metabolization) is to be computed. The latter information may provide the probability density map of the ligand-receptor interaction abstracted from the structural Spectral-Qua-SAR correlations; with this tool the molecular design of new chemical structures may be appropriately undertaken.
However, the actual QuaSAR scheme and quantum factor carry the main features of quantum dynamical systems and may stimulate future computational and conceptual developments in molecular design for structurally controlled activity. Further generalization of the present QuaSAR method to modeling all potential inter-conversions of employed molecules involved in correlation as well as for establishing their quantum metabolization complete map (through, for instance, hydrophobic, electrostatic and steric barrier tunneling) is actually in progress and will be reported in subsequent communications.
Acknowledgements
The authors are sincerely grateful to the anonymous referees for their useful and constructive comments.
References
- Anderson, TW. An Introduction to Multivariate Statistical Methods; Wiley: New York, USA, 1958.
- Draper, NR; Smith, H. Applied Regression Analysis; Wiley: New York, USA, 1966.
- Shorter, J. Correlation Analysis in Organic Chemistry: An Introduction to Linear Free Energy Relationships; Oxford Univ. Press: London, UK, 1973.
- Box, GEP; Hunter, WG; Hunter, JS. Statistics for Experimenters; John-Wiley: New York, USA, 1978.
- Green, JR; Margerison, D. Statistical Treatment of Experimental Data; Elsevier: New York, USA, 1978.
- Topliss, J. Quantitative Structure-Activity Relationships of Drugs; Academic Press: New York, USA, 1983.
- Seydel, JK. QSAR and Strategies in the Design of Bioactive Compounds; VCH Weinheim: New York, USA, 1985.
- Chatterjee, S; Hadi, AS; Price, B. Regression Analysis by Examples, 3rd ed.; John-Wiley: New York, USA, 2000.
- European Commission. Off. J. Eur. Union, L 396/1 of 30.12.2006; Office for Official Publication of the European Communities (OPOCE): Luxembourg, 2006.
- European Commission. Off. J. Eur. Union, L 396/850 of 30.12.2006; Office for Official Publication of the European Communities (OPOCE): Luxembourg, 2006.
- Worth, AP; Bassan, A; Gallegos Saliner, A; Netzeva, TI; Patlewicz, G; Pavan, M; Tsakovska, I; Vracko, M. The characterization of quantitative structure-activity relationships: Preliminary guidance; European Commission - Joint Research Centre: Ispra, Italy, 2005.
- Worth, AP; Bassan, A; Fabjan, E; Gallegos Saliner, A; Netzeva, TI; Patlewicz, G; Pavan, M; Tsakovska, I. The characterization of quantitative structure-activity relationships: Preliminary guidance; European Commission - Joint Research Centre: Ispra, Italy, 2005.
- Benigni, R; Bossa, C; Netzeva, TI; Worth, AP. Collection and evaluation of [(Q)SAR] models for mutagenicity and carcinogenicity; European Commission - Joint Research Centre: Ispra, Italy, 2007.
- So, SS; Karplus, M. Evolutionary optimisation in quantitative structure-activity relationship: An application of genetic neural network. J. Med. Chem 1996, 39, 1521–1530, doi:10.1021/jm9507035. 8691483
- Kubinyi, H. Evolutionary variable selection in regression and PLS analysis. J. Chemometr 1996, 10, 119–133, doi:10.1002/(SICI)1099-128X(199603)10:2<119::AID-CEM409>3.0.CO;2-4.
- Tetko, IV; Villa, AEP; Livingstone, DJ. Neural network studies. 2. Variable selection. J. Chem. Inf. Comput. Sci 1996, 36, 794–803, doi:10.1021/ci950204c. 8768768
- Kubinyi, H. Variable selection in QSAR studies. 1. An evolutionary algorithm. Quant. Struct.-Act. Relat 1994, 13, 285–294.
- Hasegawa, K; Kimura, T; Funatsu, K. GA strategy for variable selection in QSAR studies: Enhancement of comparative molecular binding energy analysis by GA-based PLS method. Quant. Struct.-Act. Relat 1999, 18, 262–272, doi:10.1002/(SICI)1521-3838(199907)18:3<262::AID-QSAR262>3.0.CO;2-S.
- Zheng, W; Tropsha, A. Novel variable selection quantitative structure-property relationship approach based on the k-nearest neighbour principle. J. Chem. Inf. Comput. Sci 2000, 40, 185–194, doi:10.1021/ci980033m. 10661566
- Lucic, B; Trinajstic, N. Multivariate regression outperforms several robust architectures of neural networks in QSAR modelling. J. Chem. Inf. Comput. Sci 1999, 39, 121–132, doi:10.1021/ci980090f.
- Duchowicz, PR; Castro, EA. The Order Theory in QSPR-QSAR Studies; Mathematical Chemistry Monographs, University of Kragujevac: Kragujevac, Serbia, 2008.
- Zhao, YH; Cronin, MTD; Dearden, JC. Quantitative structure-activity relationships of chemicals acting by non-polar narcosis - theoretical considerations. Quant. Struct.-Act. Relat 1998, 17, 131–138, doi:10.1002/(SICI)1521-3838(199804)17:02<131::AID-QSAR131>3.3.CO;2-C.
- Pavan, M; Netzeva, T; Worth, AP. Review of literature based quantitative structure-activity relationship models for bioconcentration. QSAR Comb. Sci 2008, 27, 21–31, doi:10.1002/qsar.200710102.
- Pavan, M; Worth, AP. Review of estimation models for biodegradation. QSAR Comb. Sci 2008, 27, 32–40, doi:10.1002/qsar.200710117.
- Tsakovska, I; Lessigiarska, I; Netzeva, T; Worth, AP. A mini review of mammalian toxicity (Q)SAR models. QSAR Comb. Sci 2008, 27, 41–48, doi:10.1002/qsar.200710107.
- Gallegos Saliner, A; Patlewicz, G; Worth, AP. A review of (Q)SAR models for skin and eye irritation and corrosion. QSAR Comb. Sci 2008, 27, 49–59, doi:10.1002/qsar.200710103.
- Patlewicz, G; Aptula, A; Roberts, DW; Uriarte, E. A mini-review of available skin sensitization (Q)SARs/Expert systems. QSAR Comb. Sci 2008, 27, 60–76, doi:10.1002/qsar.200710067.
- Netzeva, T; Pavan, M; Worth, AP. Review of (quantitative) structure-activity relationship for acute aquatic toxicity. QSAR Comb. Sci 2008, 27, 77–90, doi:10.1002/qsar.200710099.
- Cronin, MTD; Worth, AP. (Q)SARs for predicting effects relating to reproductive toxicity. QSAR Comb. Sci 2008, 27, 91–100, doi:10.1002/qsar.200710118.
- Putz, MV. A spectral approach of the molecular structure – biological activity relationship part I. The general algorithm. Ann. West Univ. Timişoara Ser. Chem 2006, 15, 159–166.
- Putz, MV; Lacrămă, A-M. A spectral approach of the molecular structure – biological activity relationship part II. The enzymatic activity. Ann. West Univ. Timişoara Ser. Chem 2006, 15, 167–176.
- Putz, MV; Lacrămă, A-M. Introducing spectral structure activity relationship (S-SAR) analysis. Application to ecotoxicology. Int. J. Mol. Sci 2007, 8, 363–391, doi:10.3390/i8050363.
- Lacrămă, A-M; Putz, MV; Ostafe, V. A Spectral-SAR model for the anionic-cationic interaction in ionic liquids: Application to Vibrio fischeri ecotoxicity. Int. J. Mol. Sci 2007, 8, 842–863, doi:10.3390/i8080842.
- Putz, MV; Lacrămă, A-M; Ostafe, V. Spectral-SAR ecotoxicology of ionic liquids. The Daphnia magna case. Res. Lett. Ecol 2007, 1–5.
- Putz, MV; Duda-Seiman, C; Duda-Seiman, DM; Putz, A-M. Turning SPECTRAL-SAR into 3D-QSAR analysis. application on H+K+-ATPase inhibitory activity. Int. J. Chem. Model 2008, 1, 45–62.
- Lacrămă, A-M; Putz, MV; Ostafe, V. Designing a spectral structure-activity ecotoxico-logistical battery. In Advances in Quantum Chemical Bonding Structures; Putz, MV, Ed.; Transworld Research Network: Kerala, India, 2008; pp. 389–419.
- Putz, MV; Putz (Lacrămă), A-M. Spectral-SAR: Old wine in new bottle. Studia Universitatis Babeş-Bolyai Chemia 2008, 53, 73–81.
- Putz, MV; Putz, A-M; Ostafe, V; Chiriac, A. Application of spectral-structure activity relationship (S-SAR) method to ecotoxicology of some ionic liquids at the molecular level using acetylcholinesterase. Int. J. Chem. Model 2009, 2.
- Steiger, JH; Schonemann, PH. A history of factor indeterminacy. In Theory Construction and Data Analysis in the Behavioural Science; Shye, S, Ed.; Jossey-Bass Publishers: San Francisco, CA, USA, 1978.
- Spearman, C. The Abilities of Man; MacMillan: London, UK, 1927.
- Wilson, EB. Review of the abilities of man, their nature and measurement, by Spearman, C. Science 1928, 67, 244–248, doi:10.1126/science.67.1731.244.
- Wilson, EB; Hilferty, MM. The distribution of chi-square. Proc. Nat. Acad. Sci. USA 1931, 17, 684, doi:10.1073/pnas.17.12.684. 16577411
- Wilson, EB; Worcester, J. A note on factor analysis. Psychometrika 1939, 4, 133–148, doi:10.1007/BF02288492.
- Topliss, JG; Costello, RJ. Chance correlation in structure-activity studies using multiple regression analysis. J. Med. Chem 1972, 15, 1066–1068, doi:10.1021/jm00280a017. 5069775
- Topliss, JG; Edwards, RP. Chance factors in studies of quantitative structure-activity relationships. J. Med. Chem 1979, 22, 1238–1244, doi:10.1021/jm00196a017. 513071
- Dittrich, W; Reuter, M. Classical and Quantum Dynamics From Classical Paths to Path Integrals; Springer-Verlag: Berlin, Germany, 1992.
- Havsteen, BH. The biochemistry and medical significance of the flavonoids. Pharmacol. Ther 2002, 96, 67–202, doi:10.1016/S0163-7258(02)00298-X. 12453566
- Middleton, E, Jr; Kandaswami, C; Theoharides, TC. The effects of plant flavonoids on mammalian cells: implications for inflammation, heart disease, and cancer. Pharmacol. Rev 2000, 52, 673–751. 11121513
- Zhang, S; Yang, X; Coburn, RA; Morris, ME. Structure activity relationships and quantitative structure activity relationships for the flavonoid-mediated inhibition of breast cancer resistance protein. Biochem. Pharmacol 2005, 70, 627–639, doi:10.1016/j.bcp.2005.05.017. 15979586
- Zhang, S; Yang, X; Morris, ME. Combined effects of multiple flavonoids on breast cancer resistance protein (ABCG2)-mediated transport. Pharm. Res 2004, 21, 1263–1273, doi:10.1023/B:PHAM.0000033015.84146.4c. 15290869
- Zhang, S; Yang, X; Morris, ME. Flavonoids are inhibitors of breast cancer resistance protein (ABCG2)-mediated transport. Mol. Pharmacol 2004, 65, 1208–1216, doi:10.1124/mol.65.5.1208. 15102949
- Sargent, JM; Williamson, CJ; Maliepaard, M; Elgie, AW; Scheper, RJ; Taylor, CG. Breast cancer resistance protein expression and resistance to daunorubicin in blast cells from patients with acute myeloid leukaemia. Br. J. Haematol 2001, 115, 257–262, doi:10.1046/j.1365-2141.2001.03122.x. 11703319
- Hypercube, Inc. HyperChem 7.01, Program package, Semiempirical, AM1, Polak-Ribière optimization procedure 2002.
- Hansch, C. A quantitative approach to biochemical structure-activity relationships. Acc. Chem. Res 1969, 2, 232–239, doi:10.1021/ar50020a002.
- Miller, JN; Miller, JC. Statistics and Chemometrics for Analytical Chemistry, 4th ed.; Prentice Hall: Harlow, England, 2000.
- StatSoft, Inc. STATISTICA for Windows, Computer program and manual 1995.
|Table 1. The flavonoids of Figure 1 arranged by ascending observed activity, defined as A = -log10(EC50[μM]), along with the associated computed structural parameters: hydrophobicity (LogP), electronic cloud polarizability (POL), and the ground-state configurationally optimized total energy (ETOT).|
|No.||Molecular Name||Activity A||LogP||POL||ETOT|
|(15)||Biochanin A||5.79||1.53||29.10||−87961.2812|
|(24)||7,8-Benzoflavone||7.14||3.35||32.63||−74634.5234|
|Table 2. The anti-symmetric matrix of the inter-molecular activity differences for the working flavonoids of Table 1.|
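As a minimal sketch of how an anti-symmetric matrix of inter-molecular activity differences can be assembled, the snippet below uses only the two Table 1 rows reproduced in this text. The sign convention (column activity minus row activity) and the function name are assumptions made for illustration; the original table may use the opposite sign.

```python
def activity_difference_matrix(activities):
    """Return D with D[i][j] = activities[j] - activities[i].

    The sign convention is an assumption; either choice gives an
    anti-symmetric matrix with a zero diagonal.
    """
    return [[a_j - a_i for a_j in activities] for a_i in activities]


# Activities A of the two Table 1 rows available in this text.
names = ["Biochanin A", "7,8-Benzoflavone"]
activities = [5.79, 7.14]

D = activity_difference_matrix(activities)
for name, row in zip(names, D):
    print(f"{name:18s}", "  ".join(f"{value:+.2f}" for value in row))
```

Anti-symmetry means that only the upper (or lower) triangle carries independent information, so for N compounds there are N(N-1)/2 distinct activity differences.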
|Table 3. QSAR equations obtained through the Spectral-SAR multi-linear procedure [32–34] for all possible correlation models considered from the data of Table 1; here |X0〉 is the unity vector |1₁...1₂₄〉, while the structural variables are set as |X1〉 = LogP, |X2〉 = POL, and |X3〉 = ETOT; the norms of the predicted activities were calculated with Equation (2), while the algebraic correlation factor of Equation (3) uses the measured activity norm ‖|A〉‖ = 26.9357, computed with Equation (2) from the data of Table 1; RStatistic is the traditional Pearson correlation factor [1–8].|
|Model||Variables||(Q/S-)SAR Equation||‖|Y〉PREDICTED‖||RAlgebraic||RStatistic|
|Ia|||X0〉, |X1〉|||Y〉Ia = 5.39837|X0〉 + 0.0179106|X1〉||26.6138||0.988049||0.0175601|
|Ib|||X0〉, |X2〉|||Y〉Ib = 5.67735|X0〉 − 0.00834411|X2〉||26.61425||0.988065||0.0409922|
|Ic|||X0〉, |X3〉|||Y〉Ic = 6.48303|X0〉 + 0.0000124625|X3〉||26.6344||0.988812||0.252513|
|IIa|||X0〉, |X1〉, |X2〉|||Y〉IIa = 5.64318|X0〉 + 0.0178242|X1〉 − 0.00833676|X2〉||26.614349||0.988069||0.0445618|
|IIb|||X0〉, |X1〉, |X3〉|||Y〉IIb = 6.93331|X0〉 − 0.120924|X1〉 + 0.0000150708|X3〉||26.638||0.988947||0.273909|
|IIc|||X0〉, |X2〉, |X3〉|||Y〉IIc = 4.99884|X0〉 + 0.122989|X2〉 + 0.0000376701|X3〉||26.6681||0.990063||0.409837|
|III|||X0〉, |X1〉, |X2〉, |X3〉|||Y〉III = 5.59424|X0〉 − 1.05993|X1〉 + 0.400704|X2〉 + 0.000117452|X3〉||26.7758||0.994064||0.708509|
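The norm and algebraic-correlation columns of Table 3 can be cross-checked with a short sketch. It assumes that Equation (2) is the Euclidean norm of an activity vector and that Equation (3) is the ratio of the predicted norm to the measured norm; neither equation is reproduced in this part of the text, so both readings are assumptions, although the ratio agrees with the tabulated values to the printed precision. Function and variable names are illustrative only.

```python
import math

# Norm of the measured activities, as quoted in the Table 3 caption.
MEASURED_NORM = 26.9357

# (predicted norm, tabulated algebraic correlation) copied from Table 3.
TABLE3 = {
    "Ia": (26.6138, 0.988049),
    "Ib": (26.61425, 0.988065),
    "Ic": (26.6344, 0.988812),
    "IIa": (26.614349, 0.988069),
    "IIb": (26.638, 0.988947),
    "IIc": (26.6681, 0.990063),
    "III": (26.7758, 0.994064),
}


def euclidean_norm(values):
    """Assumed form of Equation (2): square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in values))


def algebraic_correlation(predicted_norm, measured_norm=MEASURED_NORM):
    """Assumed form of Equation (3): ratio of predicted to measured norm."""
    return predicted_norm / measured_norm


for model, (norm, tabulated) in TABLE3.items():
    ratio = algebraic_correlation(norm)
    print(f"{model:4s} ratio = {ratio:.6f}  tabulated = {tabulated:.6f}")
```

Under this reading the algebraic factor stays close to 1 for every model, whereas the Pearson factor in the last column varies widely, which is why the two columns are reported side by side.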
|Table 4. Synopsis of paths connecting the endpoints of Table 3 in the norm-correlation spectral-space.|
|Table 5. Residual activities Ai – YiModel of the compounds of Table 1 for the Spectral-SAR models of Table 3, ordered according to the alpha, beta and gamma paths of Table 4; the residue closest to zero at each considered endpoint is marked by a line border.|
|Table 6. Principal Component Analysis (PCA) for the data of Table 1, with unrotated (unnormalized) factor score coefficients.|
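A minimal sketch of the residual computation follows, using the coefficients of Table 3 and only the two Table 1 rows reproduced in this text. Because the other compounds are not available here, the selection it prints illustrates the minimum-residue idea and does not reproduce the entries of Table 5. The dictionary layout and function names are assumptions made for illustration.

```python
# Coefficients (X0, LogP, POL, ETOT) copied from Table 3;
# a variable absent from a model is given a zero coefficient.
MODELS = {
    "Ia": (5.39837, 0.0179106, 0.0, 0.0),
    "Ib": (5.67735, 0.0, -0.00834411, 0.0),
    "Ic": (6.48303, 0.0, 0.0, 0.0000124625),
    "IIa": (5.64318, 0.0178242, -0.00833676, 0.0),
    "IIb": (6.93331, -0.120924, 0.0, 0.0000150708),
    "IIc": (4.99884, 0.0, 0.122989, 0.0000376701),
    "III": (5.59424, -1.05993, 0.400704, 0.000117452),
}

# Rows of Table 1 available in this text: (A, LogP, POL, ETOT).
COMPOUNDS = {
    "Biochanin A": (5.79, 1.53, 29.10, -87961.2812),
    "7,8-Benzoflavone": (7.14, 3.35, 32.63, -74634.5234),
}


def predict(model, logp, pol, etot):
    """Evaluate the multi-linear equation of the given model."""
    b0, b1, b2, b3 = MODELS[model]
    return b0 + b1 * logp + b2 * pol + b3 * etot


def residual(model, name):
    """Observed minus predicted activity, A_i - Y_i(model)."""
    activity, logp, pol, etot = COMPOUNDS[name]
    return activity - predict(model, logp, pol, etot)


def closest_to_zero(model):
    """Compound whose residual has the smallest absolute value."""
    return min(COMPOUNDS, key=lambda name: abs(residual(model, name)))


for model in MODELS:
    parts = ", ".join(f"{n}: {residual(model, n):+.3f}" for n in COMPOUNDS)
    print(f"{model:4s} {parts} -> {closest_to_zero(model)}")
```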
|% total variance:||65.27195||29.73757||4.99049||factors’|
|Table 7. Determination of the quantum-SAR, see Equation (10) with Equations (6) and (7), associated with the pairs of molecules that activate specific structural quantum indices (or their combinations) driving the spectral paths of Table 4; the pairs are selected by applying the minimum-residue recipe throughout Table 5 for each considered endpoint, and are listed together with the corresponding recorded bioactivity differences of Table 2.|
# Inter-Endpoint Norm Difference, Equation (6); ♣ Inter-Endpoint Molecular Activity Difference, Equation (7); * Note that here the basic relation of Equation (10) was considered in decimal base, since the activities in Table 1 were originally defined in that base.
© 2009 by the authors; licensee Molecular Diversity Preservation International, Basel, Switzerland. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).