Quantitative Structure-Antioxidant Activity Models of Isoflavonoids: A Theoretical Study

Seventeen isoflavonoids from the isoflavone, isoflavanone and isoflavan classes, isolated from Dalbergia parviflora, are selected. These molecules are contained in the ChEMBL database, and most of them turn out to be highly drug-like. Binary rules appear risky for the selection of compounds with high antioxidant capacity in the complementary xanthine/xanthine oxidase, ORAC and DPPH model assays. Isoflavonoid structure-activity analysis shows the most important properties (log P, log D, pKa, QED, PSA, NH + OH ≈ HBD, N + O ≈ HBA). Some descriptors (PSA, HBD) are detected as more important than others (the size measure Mw, HBA). Linear and nonlinear models of antioxidant potency are obtained. Weak nonlinear relationships appear between log P, etc. and antioxidant activity. The different capacity trends for the three complementary assays are explained. Isoflavonoid potency depends on the chemical form, which determines solubility. The results of the isoflavonoid analysis will be useful for predicting the activity of new sets of flavones and for designing drugs with antioxidant capacity, which will prove beneficial for health, with implications for antiageing therapy.


Introduction
Flavonoids and isoflavonoids influence intercellular redox status to interact with specific proteins in intracellular signaling pathways and present antioxidant properties [1]. Antioxidants are chemical entities that function by breaking free-radical chain reactions and by chelating metal ions, which would otherwise catalyze free-radical-induced systemic damage. The molecules are polyphenolic and electron-rich, potentially acting as substrate inhibitors for the cytochrome P450 (CYP) enzymes and inducing detoxification enzymes, e.g., CYP-dependent monooxygenases (MOs) [2]. Some polyphenols penetrate the blood-brain barrier (BBB) into regions mediating cognitive behavior [3]. Because of the structural diversity of flavonoids, structure-activity relationships (SARs) and their quantitative counterparts (QSARs) were studied via antioxidant capacity assays [4]. Flavonoid potency depends on chemical structure, in particular on the number and position of hydroxyl groups (OH) attached to both aromatic rings [5]. QSAR studies of isoflavonoids are scarce [6][7][8][9]. Isoflavonoid antioxidant activity depends on the redox properties of the hydroxyphenolic groups and on the structural relationship among the different moieties of the chemical structure, which allows many substitution patterns and variations on ring C (Table 1). Promden et al. evaluated the antioxidant activities of 24 isoflavonoids from Dalbergia parviflora via three complementary in vitro antioxidant-based assay systems [10]: xanthine/xanthine oxidase (X/XO) [11], oxygen radical absorbance capacity (ORAC) [12] and 2,2-diphenyl-1-picrylhydrazyl (DPPH) [13]. The isoflavonoids consist of three subgroups. The isoflavones exhibited the highest antioxidant potency in all three assays. An additional OH in ring B at either R3′ or R5′, on top of the basic structure with R7-OH in ring A and R4′-OH or -OMe in ring B, increased the antioxidant activities of all isoflavonoid subgroups.
QSAR modeling became important in drug candidate (new chemical entity, NCE) design, environmental fate modeling, and toxicity and property prediction of chemicals, since it offers an economical and time-effective alternative to medium-throughput in vitro and low-throughput in vivo assays [14,15]. A QSAR model is a simple mathematical equation, which is derived via computational approaches from a set of molecules with known activities, properties and toxicities. The QSAR approach supports the replacement, refinement and reduction (3Rs) of animal use in research, as an alternative for untested NCEs [16]. Tropsha and co-workers reviewed QSAR [17]. A QSAR model is limited to query chemicals that are structurally similar to the training compounds, i.e., that lie in the applicability domain (AD). Robust validation of QSAR relationships is key for a predictive model, which may be used for forecasting molecules via interpolation (true prediction) inside the AD or extrapolation (less reliable guess) outside the AD. A test molecule similar to those in the training set is well predicted by the QSAR model developed on that training set. On the contrary, a molecule quite dissimilar to the training ones will never be predicted with the same efficacy, since it is impossible for a single QSAR model to capture the properties of the entire universe of chemicals. QSARs find applications in drug discovery, environmental fate modeling, risk assessment and property prediction of chemicals. Adding descriptors to a model raises the correlation coefficient, but this does not always indicate an improvement in predictability. QSAR models have been used for developing drugs. An objective of QSAR modeling is to predict the absorption, distribution, metabolism, excretion (ADME) and toxicity (ADMET), as well as the activity and properties, of NCEs falling within the AD of the developed models.
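The distinction between interpolation inside the AD and extrapolation outside it can be illustrated with a minimal sketch. The text does not specify how the AD is defined, so the bounding-box criterion used here (every descriptor of the query must lie within the range spanned by the training set) is an assumption chosen for simplicity. The function names and the descriptor values are hypothetical.

```python
def descriptor_ranges(training):
    """Return the (min, max) of every descriptor column over the training set."""
    columns = list(zip(*training))
    return [(min(c), max(c)) for c in columns]


def inside_domain(query, ranges):
    """True if every descriptor of the query lies within the training range."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(query, ranges))


# Hypothetical training descriptors (log D, PSA in square angstroms); not from Table 2.
training = [(1.2, 60.0), (2.5, 90.0), (3.1, 75.0)]
ranges = descriptor_ranges(training)

print(inside_domain((2.0, 80.0), ranges))  # query within both ranges: interpolation
print(inside_domain((4.0, 80.0), ranges))  # log D beyond the training range: extrapolation
```

A bounding box is a coarse criterion because it ignores correlations among descriptors; distance-based or leverage-based definitions are stricter alternatives.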
Chemical qualification (QSAR) programs depend on quantification of physicochemical and physiochemical properties, which facilitate selectivity towards antioxidant capacity.
In earlier publications, quantitative structure-property relationships (QSPRs) allowed prediction of the chromatographic retention times of phenylurea herbicides [18] and pesticides [19]. This study aimed to investigate isoflavonoid QSARs via the X/XO (pH 9.4), ORAC (blood-serum physiological pH 7.4) and DPPH (methanol, MeOH) assays, which use different solvents: they measure the inhibition of the formation of the water-soluble superoxide radical O2•−, the inhibition of the oxidation induced by the peroxyl radical HO2•, and the scavenging of the water-insoluble DPPH, respectively. Antioxidant capacities were derived from Promden et al. [10]. The improvements with regard to that qualitative work are illustrated and discussed. In our QSARs, the different activity trends for the three complementary assays are explained.

Results and Discussion
The molecular structures of 17 isoflavonoids, viz. eight isoflavones, six isoflavanones and three isoflavans, from the heartwood (duramen) of D. parviflora are displayed in Table 1. However, the obtained results are limited to the 17 substances contained in the ChEMBL database.
Isoflavonoid antioxidant activities in the ORAC, X/XO and DPPH model assays were derived from Promden et al. [10], who provided no QSAR analysis. For the inactive entries in Table 2 (Entries 12-14 in ORAC, Entry 14 in X/XO, and Entries 3, 7, 8 and 12-14 in DPPH), the ORAC Trolox™ (a water-soluble vitamin-E analogue) equivalent antioxidant capacity (TEAC) was taken as the minimum (minimum log ORAC), while the X/XO and DPPH concentrations for 50% radical trapping (scavenging, SC50) were taken as the maximum. Notice the opposite trends of the ORAC and X/XO-DPPH results.
Table 2 footnotes (partially recovered): d ALog P: logarithm of the 1-octanol-water partition coefficient (log P) calculated by the method ALog P; e ACD Log P: log P calculated by ACD/Log P; f ACD Log D: decimal logarithm of the 1-octanol-water distribution coefficient (log D) calculated by ACD/Log D at pH 7.4; g ACD Acidic pKa: pKa calculated by ACD/pKa; h RBN: rotatable bonds; i QEDw: weighted quantitative estimate of drug-likeness; j PSA: topological polar surface area; k HBD: hydrogen-bond donor; l HBA: hydrogen-bond acceptor.
Isoflavonoids (IfOH) scavenge free radicals R• according to three possible reducing pathways. (i) H-atom transfer (HAT) from the molecule to the radical (direct O-H bond breaking):
$$\mathrm{IfOH} + \mathrm{R}^{\bullet} \rightarrow \mathrm{IfO}^{\bullet} + \mathrm{RH} \qquad (1)$$
A high HAT rate is expected for a low O-H bond dissociation enthalpy (BDE). (ii) Electron transfer (ET) from the molecule to the radical, leading to indirect H-abstraction by proton transfer (PT) (ET-PT):
$$\mathrm{IfOH} + \mathrm{R}^{\bullet} \rightarrow \mathrm{IfOH}^{\bullet +} + \mathrm{R}^{-} \rightarrow \mathrm{IfO}^{\bullet} + \mathrm{RH} \qquad (2)$$
(iii) Sequential proton-loss-electron-transfer (SPLET). Since antioxidants primarily function by HAT, which involves formation of an H-bond with the harmful free radicals [20], a rise in the count of OH substituents facilitates interaction with the toxic radicals (Fujita-Ban analysis) [21].

Correlations between the Different Methods, and Physicochemical and Physiochemical Properties
Physicochemical and physiochemical properties of the isoflavonoids were calculated (NH + OH, N + O) or taken from the ChEMBL database: steric (molecular weight, Mw), lipophilic (log P/D, topological polar surface area, PSA), acid (pKa), flexibility (rotatable bonds, RBN), drug-likeness (weighted quantitative estimate of drug-likeness, QEDw, QED) and H-bond donor/acceptor (HBD/A) [22]. All compounds show Mw < 400 Da, in agreement with the rule of five (RO5). Cajanin (Entry 5 in Table 2) shows Mw = 369 Da, and its log P/PSA could be decreased. All compounds show ACD log P < 5, in accordance with RO5, with the exception of Entries 8, 9 and 11. However, these results should be taken with care because the atom-type-summation log P (ALog P) < 3 and log D < 4. All compounds show log D = 0-3, which predicts high oral bioavailability (OB), except Entries 5, 11-13 and 15. All pKa values lie in the range 2-10: the isoflavonoids are weak acids in water, where most are anionic, while they are neutral, without separation of charges, in organic solvents (MeOH).
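The RO5 screening mentioned above can be written as a small filter. The sketch below uses the standard Lipinski thresholds (Mw ≤ 500 Da, log P ≤ 5, HBD ≤ 5, HBA ≤ 10) and approximates HBD and HBA by the NH + OH and N + O counts, as the text does. The class name, function name and example values are hypothetical and are not taken from Table 2.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float     # molecular weight, Da
    log_p: float  # decimal logarithm of the 1-octanol-water partition coefficient
    hbd: int      # H-bond donors, approximated by the NH + OH count
    hba: int      # H-bond acceptors, approximated by the N + O count


def ro5_violations(d: Descriptors) -> list[str]:
    """Return the rule-of-five criteria that the descriptor set breaks."""
    checks = {
        "Mw > 500 Da": d.mw > 500,
        "log P > 5": d.log_p > 5,
        "HBD > 5": d.hbd > 5,
        "HBA > 10": d.hba > 10,
    }
    return [name for name, broken in checks.items() if broken]


# Hypothetical descriptor values, for illustration only.
example = Descriptors(mw=300.3, log_p=5.4, hbd=3, hba=6)
print(ro5_violations(example))
```

Such a hard cut-off is a binary rule of the kind that this study considers risky for compound selection, so the list of violations is better read as a flag for closer inspection than as a rejection criterion.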

Xanthine/Xanthine Oxidase Assay
Most isoflavonoids exhibited high antioxidant activity in the X/XO assay. The role of ring C is confirmed by the presence of the 2,3-double bond. The environment of the =O fragment primarily dictates its contribution to the antioxidant capacity profile of isoflavonoids. The class of planar isoflavones showed the highest potency. The activity of the different classes was confirmed by comparing the capacity of compounds with the same substitution pattern: the planar, ring-C-unsaturated isoflavone khrinone C was found to be much more potent than the nonplanar, ring-C-saturated isoflavan 3(S)-8-demethylduartin and isoflavanone 3(S)-secundiflorol H (Entries 1, 17 and 10, respectively). The X/XO results correlated with the PSA and HBD properties. Converting X/XO to its logarithm gave a better relationship, with the log D and pKa descriptors. The best linear fit is Equation (3), where n is the number of points, s the standard deviation, F the Fisher ratio, MAPE the mean absolute percentage error, AEV the approximation error variance and q the leave-1-out cross-validated (CV) correlation coefficient. The pKa correlates positively, while log D associates negatively, with −log X/XO. The positive coefficient for pKa implies that activity rises for isoflavonoids that are weaker acids, in agreement with the fact that the assay prefers isoflavans (pKa ≈ 10) to isoflavanones (pKa ≈ 6). The negative coefficient for log D signifies that capacity rises for isoflavonoids that are more stable in the aqueous than in the organic phase. If a quadratic term is included in the fit, the model is improved (Equation (4)) and AEV decays by 42%. Log D correlates negatively with −log X/XO, in agreement with Equation (3). However, (log D)² correlates positively with −log X/XO in a model passing through a minimum, in agreement with the log P parabolic models of in vitro penetration of xenobiotics across artificial lipoidal membranes and biomembranes [23]. Its small absolute coefficient indicates a weak nonlinear relationship.
Linear Equation (3) has only two variables and is better suited for extrapolation than nonlinear Equation (4).
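The goodness-of-fit statistics quoted with the models (n, r, s, F and MAPE) can be computed from observed and fitted values as sketched below. The conventions are assumptions, since the text does not spell them out: s is taken as the standard error of estimate with n − k − 1 degrees of freedom, F as the regression F ratio for k descriptors, and the fitted values are assumed to come from a least-squares model with an intercept. AEV is specific to the software used and is not reproduced. The data are hypothetical.

```python
import math


def fit_statistics(y_obs, y_fit, k):
    """Summary statistics of a least-squares fit with k descriptors plus intercept."""
    n = len(y_obs)
    mean = sum(y_obs) / n
    ss_tot = sum((y - mean) ** 2 for y in y_obs)            # total sum of squares
    ss_res = sum((y - f) ** 2 for y, f in zip(y_obs, y_fit))  # residual sum of squares
    r2 = 1 - ss_res / ss_tot
    dof = n - k - 1                                          # residual degrees of freedom
    return {
        "n": n,
        "r": math.sqrt(r2),
        "s": math.sqrt(ss_res / dof),
        "F": (r2 / k) / ((1 - r2) / dof),
        "MAPE": 100 / n * sum(abs((y - f) / y) for y, f in zip(y_obs, y_fit)),
    }


# Hypothetical observed and fitted activities for a one-descriptor model.
y_obs = [2.0, 1.5, 2.5, 1.0]
y_fit = [1.9, 1.6, 2.4, 1.1]
stats = fit_statistics(y_obs, y_fit, k=1)
print(stats)
```

Because r can only rise when descriptors are added, s and F, which penalize the loss of degrees of freedom, are more informative when comparing a two-variable linear model with a model that contains extra quadratic terms.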

Oxygen Radical Absorbance Capacity Assay
Most isoflavones showed high antioxidant activity in the ORAC assay, which correlated with the PSA and HBD properties. Converting ORAC to its logarithm gave a better relationship with the same descriptors. The best linear fit is Equation (5). The HBD ≈ NH + OH count correlates positively with log ORAC, in agreement with Fujita-Ban analysis. However, PSA associates negatively with log ORAC. Adding two quadratic terms improves the fit (Equation (6)) and AEV decays by 79%. The NH + OH ≈ HBD count correlates positively with log ORAC, in agreement with Fujita-Ban analysis and Equation (5). However, log D and N + O associate negatively with log ORAC. The quadratic term (ACD log P)² correlates positively with log ORAC in a parabola with a minimum, while (NH + OH)² associates negatively in a parabola with a maximum. Linear Equation (5), with only two variables, is better suited for extrapolation than nonlinear Equation (6).
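The statements that a quadratic term gives a parabola with a minimum or with a maximum follow from the sign of its coefficient. As a minimal sketch, for a model y = a + b·x + c·x² in a single descriptor x, the extremum lies at x* = −b/(2c) and is a minimum when c > 0 and a maximum when c < 0. The function name and the coefficients below are hypothetical; they are not the fitted values of the equations in this study.

```python
def parabola_extremum(b: float, c: float) -> tuple[float, str]:
    """Locate and classify the extremum of y = a + b*x + c*x**2."""
    if c == 0:
        raise ValueError("no quadratic term: the model is linear in this descriptor")
    kind = "minimum" if c > 0 else "maximum"
    return -b / (2 * c), kind


# Hypothetical coefficients: negative linear term, positive quadratic term.
print(parabola_extremum(-1.0, 0.25))
# Hypothetical coefficients: positive linear term, negative quadratic term.
print(parabola_extremum(1.2, -0.3))
```

When the extremum falls outside the descriptor range of the training compounds, the quadratic model is monotonic over the data and the curvature only bends the trend, which is consistent with a weak nonlinear relationship.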

2,2-Diphenyl-1-picrylhydrazyl Assay
Most isoflavones displayed high antioxidant activity in the DPPH assay, which correlated with the properties ALog P, QED, N + O and HBD. The best linear fit is Equation (7). All descriptors correlate positively with −DPPH, and HBD ≈ NH + OH is in agreement with Fujita-Ban analysis and Equations (5) and (6). A positive coefficient for log P implies that antioxidant activity in the assay rises for isoflavonoids that are more soluble in the organic than in the aqueous phase. As the DPPH assay is carried out in MeOH (not water), the corresponding interpretation is that water, compared with MeOH, is able to form a number of H-bonds (networks), while the affinity of MeOH for creating H-bonds is smaller because of the steric interference of the CH3 group and its inability to accept or donate further H atoms. This is in concordance with the positive sign of the log P and N + O terms. If the quadratic term pKa² is included in the fit, the correlation is improved (Equation (8)) and AEV decays by 32%. All linear descriptors correlate positively with −DPPH, in agreement with Equation (7), and HBD ≈ NH + OH is in concordance with Fujita-Ban analysis. The quadratic term pKa² associates negatively with −DPPH in a parabolic model with a maximum. Linear Equation (7), with only four variables, is better suited for extrapolation than nonlinear Equation (8). The use of log DPPH as the dependent variable does not improve the models.

Comparison between the Three Methods
The log X/XO can be estimated from log ORAC (Equation (9)) and approximated from DPPH (Equation (10)). The DPPH can be calculated from log ORAC (Equation (11)), in agreement with the opposite trends of X/XO-DPPH and ORAC. The correlation of Equation (11) is poor. However, when a correction is made for the fact that the ORAC assay is in water while the DPPH assay is in MeOH, by adding a term in log (N + O), a better fit is obtained (Equation (12)), where the term in log (N + O) ≈ log HBA corrects for the fact that water, in the ORAC assay, shows a greater H-bond transfer ability than MeOH does in the DPPH test. The physicochemical and physiochemical properties used in Table 2 are simple to calculate and their use has gained widespread acceptance, but the bulk physical properties of molecules are correlated [24]. One issue in using these properties is their potential redundancy, which is illustrated simply among the isoflavonoids, where all four RO5 parameters are clearly linked (Equation (13)). The standard errors of the coefficients show that all coefficients in Equations (3)-(13) are acceptable.
Drug design, discovery and development are complex and difficult because drug action is much more than binding affinity. A successful, efficacious and safe drug must present a balance of properties, e.g., activity against its intended target, appropriate ADME and acceptable safety profile. Based on the obtained results, new definitions of (stringent) drug-likeness, tractability and central nervous system (CNS)-active are proposed. Drug-likeness evaluates the suitability of the molecule under RO5, etc. The CNS-active is stricter. However, tractability is under more relaxed conditions. A summary of physicochemical and physiochemical descriptors was selected for every property (cf. Table 4

Discussion
This study is in agreement with Promden et al. [10], providing an extension and further discussion. The results of the present work would not be expected to change if the larger set of 24 compounds were considered. However, the obtained results are limited to the 17 substances contained in the ChEMBL database. The novel findings in comparison with Promden et al. [10] are described in the following paragraphs; essentially, the present study illustrates and analyzes a comparison of the three assays in different solvents and at different pH values. The main difference is that the work of Promden et al. [10] is a qualitative SAR while this study is a QSAR. The parameter sets could be integrated, but the structural data of Promden et al. would only be indicators of the absence or presence of functional groups. The predictability of the approach would be improved qualitatively but not quantitatively.
There are two main types of empirical QSAR models: linear and nonlinear. Linear models provide an appropriate representation of the activity in a small neighborhood of a set of molecular properties. However, when molecules are tried outside this constrained region, the model predictions will not be accurate. On the other hand, quadratic models tend to capture the capacity behavior more precisely, which makes them adequate for predicting real potency over a wide region of properties. Weak nonlinear relationships were detected between some physicochemical and physiochemical properties, especially log P, and isoflavonoid antioxidant activity in the X/XO, ORAC and DPPH assays. The key strengths of the obtained descriptors are as follows: (1) they are easy to understand and apply; (2) compounds with non-drug-like properties lie in regions of property space with poor precedence; and (3) they are a good guide to avoid potential pitfalls.
Considering the structure of isoflavonoids, some parameters {log P, log D, PSA, HBD} are used. A simple linear correlation proves to be a good model for the antioxidant activity of the molecules; the other properties carry redundant information. The leave-m-out CV procedure shows that {PSA, HBD} and {(ACD log P)², ACD log D, N + O, NH + OH, (NH + OH)²} are the most predictive sets of descriptors for linear and nonlinear modeling of isoflavonoid antioxidant capacity, respectively, according to the criterion of maximization of the CV correlation coefficient. Both sets contain the essential characters of the antioxidant potency of isoflavonoid structures. The proposed method allows rapid estimation of the antioxidant activity of these molecules. The linear methods require fewer parameters to be estimated and may, therefore, be more parsimonious (Occam's razor). Linear and nonlinear correlation models were obtained for isoflavonoid antioxidant capacity, pointing not only to a homogeneous molecular structure of these molecules but also to the ability to predict and tailor drug properties. The latter is nontrivial in pharmacology.

Experimental Section
The 1-octanol-water partition coefficient P is the ratio of the concentrations of a compound S in the two phases:
$$P = \frac{[\mathrm{S}]_{\text{1-octanol}}}{[\mathrm{S}]_{\text{water}}}$$
Its decimal logarithm log P measures lipophilicity. The ALog P is calculated from a regression based on the hydrophobicity contributions of 115 atom types covering the elements H, B-F, Si-Cl, Se-Br and I [26]. Every atom in every structure is classified into one of the 115 types. Log P results:
$$\mathrm{ALog}\,P = \sum_i n_i a_i$$
where n_i is the number of atoms of type i and a_i is the hydrophobicity constant of that type. The codes ACD/Log P and calculated log P (CLog P) [27] predict it from structure. The distribution coefficient D is the ratio of the sums of the concentrations of all forms of the compound (unionized and ionized) in each phase; e.g., for a weak acid HA:
$$D = \frac{[\mathrm{HA}]_{\text{1-octanol}} + [\mathrm{A}^{-}]_{\text{1-octanol}}}{[\mathrm{HA}]_{\text{water}} + [\mathrm{A}^{-}]_{\text{water}}}$$
As log D is pH dependent, the pH of the aqueous phase is buffered, e.g., at the blood-serum physiological pH 7.4 in the ORAC assay. For unionizable compounds, log P = log D. A value 0 < log D < 3 enhances OB [28]. The code ACD/Log D predicts log D from structure, accounting for the lipophilicity of ionizable molecules. The programs ACD/Log P and ACD/Log D are modules of ACD/Percepta (ACD/Labs).
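The pH dependence of log D for a weak acid can be made concrete with a short sketch. It assumes a monoprotic acid HA and, as a common simplification that is not stated in the text, that only the neutral form partitions into 1-octanol, which gives log D = log P − log(1 + 10^(pH − pKa)). The function names are hypothetical; the pKa and pH values in the example are the approximate ones quoted in this study for isoflavans, isoflavanones and the X/XO assay.

```python
import math


def log_d_weak_acid(log_p: float, pka: float, ph: float) -> float:
    """log D of a monoprotic acid, assuming only neutral HA enters 1-octanol."""
    return log_p - math.log10(1 + 10 ** (ph - pka))


def fraction_ionized(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid present as the anion A- at the given pH."""
    return 1 / (1 + 10 ** (pka - ph))


# X/XO assay at pH 9.4: an isoflavan-like acid (pKa ~ 10) stays mostly neutral,
# while an isoflavanone-like acid (pKa ~ 6) is almost completely anionic.
print(round(fraction_ionized(10.0, 9.4), 3))
print(round(fraction_ionized(6.0, 9.4), 4))
# Hypothetical log P of 3.0: log D drops below log P once the acid ionizes.
print(round(log_d_weak_acid(3.0, 6.0, 7.4), 2))
```

Under this simplification log D approaches log P well below the pKa and falls by about one unit per pH unit above it, which is one way to see why the chemical form, and hence the solubility, changes between the assays.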
An acid dissociation constant Ka measures the strength of an acid in solution. It is the equilibrium constant for the acid-base dissociation reaction. The larger Ka, the greater the dissociation of the molecules in solution. Acids and neutrals present decreased toxicity risks relative to bases [29]. The code ACD/pKa predicts dissociation constants from structure.
An RBN is any single non-ring bond bonded to a nonterminal heavy (non-H) atom. Amide C-N bonds are not considered because of their rotational energy barrier. The count of RBNs measures the molecular flexibility.
An H atom attached to a relatively electronegative (EN) atom is an HBD [30]. The EN atom usually ranges from N to F atoms. The count NH + OH ≈ HBD. An EN atom, e.g., N to F atoms, is an HBA, whether it is bonded to an H atom or not (e.g., HBD ethanol presents an H atom bonded to an O atom, HBA O atom in diethyl ether does not show an H atom bonded to it). The count N + O ≈ HBA. The solvatochromic parameters are: dipolarity-polarizability π*, HBD acidity α and HBA basicity β [31].
The PSA of an organic molecule is calculated by the method of Ertl et al. as a sum of fragment contributions [32]. The N/O-centered polar fragments are considered [33]. The PSAs follow trends similar to those of HBA. The PSA describes drug absorption (e.g., OB, human carcinoma of colon cell line type-2 (Caco-2) permeability, BBB penetration). In order to cross the BBB, most CNS drugs show PSA ≤ 70 Å², but PSA ≤ 75 Å² when Clog P > 3 carries toxicity and promiscuity risks [34]. When Mw > 400 Da, Clog P > 4 presents some toxicity risk [35].
The correlation coefficient between CV representatives and the property values rcv has been calculated with the leave-m-out procedure [42]. The process furnishes a new method for selecting the best set of descriptors: leave-m-out selects the best set of descriptors according to the criterion of maximization of the value of rcv.
The statistics r, s and F were calculated with Microsoft Excel (Microsoft Office 2015); MAPE and AEV were computed with Knowledge Miner Insights for Excel; CV correlation coefficients (q, etc.) were evaluated with leave-m-out [42].
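The leave-m-out descriptor selection described above can be sketched as follows. The details of the cited procedure [42] are not given in the text, so this is one plausible reading under stated assumptions: every combination of m compounds is left out in turn, a least-squares model with intercept is refitted on the remainder, and rcv is the Pearson correlation between the predictions for the left-out compounds and their observed values. All function names are hypothetical, and the data are synthetic: y is constructed from PSA and HBD so that the expected winner is known.

```python
import itertools
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))


def r_cv(rows, y, m):
    """Correlation between leave-m-out predictions and observed values."""
    pred, obs = [], []
    idx = range(len(y))
    for left_out in itertools.combinations(idx, m):
        keep = [i for i in idx if i not in left_out]
        coef = ols([rows[i] for i in keep], [y[i] for i in keep])
        for i in left_out:
            pred.append(predict(coef, rows[i]))
            obs.append(y[i])
    return pearson(pred, obs)


def best_subset(table, y, size, m):
    """Descriptor subset of the given size that maximizes r_cv."""
    def score(subset):
        rows = [tuple(table[name][i] for name in subset) for i in range(len(y))]
        return r_cv(rows, y, m)
    return max(itertools.combinations(list(table), size), key=score)


# Synthetic descriptors (not from Table 2); y depends on PSA and HBD only.
table = {
    "PSA": [60.0, 70.0, 80.0, 90.0, 100.0, 110.0, 75.0, 95.0],
    "HBD": [1.0, 3.0, 2.0, 4.0, 2.0, 5.0, 4.0, 1.0],
    "Mw": [270.0, 300.0, 284.0, 316.0, 330.0, 286.0, 302.0, 298.0],
}
y = [3.0 - 0.02 * p + 0.4 * h for p, h in zip(table["PSA"], table["HBD"])]
best = best_subset(table, y, size=2, m=2)
print(best)
```

Exhaustive enumeration is affordable for 17 compounds and small m; for larger sets the combinations would normally be sampled rather than enumerated.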

Conclusions
From the present results and discussion, the following conclusions can be drawn.
1. Seventeen isoflavonoids from Dalbergia were selected from the ChEMBL database, which represents medicinal chemistry compounds. Most turn out to be highly drug-like. Binary rules for compound selection prove risky: filters neglect valuable opportunities. Structure-antioxidant activity analyses indicate the most important properties: log D-pKa, PSA-HBD and log P-QED-N + O-HBD for the X/XO, ORAC and DPPH assays, respectively. Capacity in X/XO favors isoflavonoids that are weaker acids and more soluble in water than in 1-octanol, in agreement with X/XO (pH 9.4) favoring neutral isoflavans (pKa ≈ 10) rather than anionic isoflavanones (pKa ≈ 6). However, DPPH favors isoflavonoids that are more soluble in 1-octanol and have a greater N + O count, because this test is carried out in methanol, whose H-bond transfer ability is smaller than that of water. QSAR models provide quantitative information that filters drugs based on log D, etc., suggesting strategies for prioritization. Some descriptors (PSA, HBD) are more important than others (size, HBA). An advantage of our QSARs is that they detect weak nonlinear relationships between log P, etc. and potency. Simple, consistent analyses are described, improving our general understanding of activity. The rules are consistent with the literature.
2. The role of isoflavonoid ring C was confirmed by the presence of the 2,3-double bond in isoflavones, which explains their greatest activity. Capacity follows the order planar unsaturated isoflavones > nonplanar saturated isoflavans and isoflavanones, because unsaturation and planarity stabilize the phenoxyl radical. On comparing isoflavanones with isoflavans, this study demonstrates different preferences of X/XO, ORAC and DPPH: X/XO (pH 9.4) prefers neutral isoflavans (pKa ≈ 10), which favor phenoxyl-radical stabilization, unlike anionic isoflavanones (pKa ≈ 6); in DPPH (methanol), an intramolecular H-bond R4 = O…HO-R5 can be formed in isoflavanones, but not in isoflavans, which lack this moiety; and the preference of ORAC (pH 7.4) is intermediate. Isoflavonoid potency depends on the chemical form, which determines solubility and is modified by changing the pH or the solvent. QSAR models may predict the activity of new series of isoflavonoids and help design potent drugs.