Antioxidant Activity of Pharmaceuticals: Predictive QSAR Modeling for Potential Therapeutic Strategy

Since oxidative stress has been linked to several pathological conditions and diseases, drugs with additional antioxidant activity can be beneficial in the treatment of these diseases. Therefore, this study takes a new look at the antioxidant activity of frequently prescribed drugs using the HPLC-DPPH method. The antioxidative activity expressed as the TEAC value of 82 drugs was successfully determined and is discussed in this work. Using the obtained values, the QSAR model was developed to predict the TEAC based on the selected molecular descriptors. The results of QSAR modeling showed that four- and seven-variable models had the best potential for TEAC prediction. Looking at the statistical parameters of each model, the four-variable model was superior to seven-variable. The final model showed good predicting power (r = 0.927) considering the selected descriptors, implying that it can be used as a fast and economically acceptable evaluation of antioxidative activity. The advantage of such model is its ability to predict the antioxidative activity of a drug regardless of its structural diversity or therapeutic classification.


Introduction
Oxidative stress can be defined as an imbalance between the production of reactive oxygen radicals and, on the other side, the cell's antioxidant capacity and exogenous antioxidant intake. Free radicals are generated from endogenous and exogenous sources and can cause significant chain chemical reactions in the body due to a quick reaction with other molecules. Oxidative stress has been linked to several pathological conditions and diseases, such as atherosclerosis and cardiovascular disease, cancer, neurological diseases (i.e., Alzheimer's disease, amyotrophic lateral sclerosis, Parkinson's disease, multiple sclerosis, depression, and memory loss), rheumatoid arthritis and respiratory diseases (asthma and chronic obstructive pulmonary disease). The basis of effective pharmacological treatment relies on both understanding the disease pathogenesis and the pharmacological drug effects. Although drugs belonging to the same therapeutic group have a common primary mechanism of action, pharmacokinetic and pharmacodynamic properties, as well as additional mechanisms of action, may differ among them. These diversities can single out one or more drugs as drugs of choice compared to other group members, especially in patients with comorbidities. According to the available literature data, antioxidant activity is especially underlined as one of the additional mechanisms of action that some drugs showed [1,2].
Quantitative structure-activity relation (QSAR) is a powerful tool for high-throughput virtual screening of various molecules based on the set of calculated descriptors representing each molecule [3]. Through the years, QSAR had a remarkable role in accelerating the drug development process. It was implemented in the various stages of this complex process, from the initial stages of drug design to pharmacokinetic and pharmacodynamic modeling [4][5][6]. The combination of a fast DPPH (2,2-diphenyl-1-picrylhydrazyl)-based method for antioxidant activity determination and QSAR is set to become a creating model for the prediction of the antioxidative activities of various compounds [7][8][9][10][11].
Since it is well recognized that reactive oxygen species are responsible for numerous types of cell damage, the objective of this study was to evaluate the antioxidant activity in vitro of frequently prescribed drugs in clinical practice. Accordingly, the antioxidative activity of 82 prescribing therapeutic agents from different Anatomical Therapeutic Chemical Classification (ATC) groups, such as antibiotics, antihistamines, beta-blockers, immunosuppressants and others, was determined using our previously developed DPPH-HPLC method [12]. Another objective was to develop a model for predicting antioxidant activity as an additional mechanism of action. Considering the rather large pool of structurally various drugs used in this high-throughput screening, obtained TEAC (TROLOX (6-hydroxy-2,5,7,8-tetramethylchroman-2-carboxylic acid) equivalent antioxidant capacity) values were used for QSAR model development to establish the unique model for the prediction of TEAC values. It must be pointed out that the advantage of such model is its ability to predict the antioxidative activity of a drug regardless of its structural diversity and therapeutic classification.

Method Verification
For DPPH assay monitoring, the applicability of our previously published method had to be verified following method transfer requirements. The method showed satisfactory linearity in the range from 0 to 0.3 mM of TROLOX (at seven concentration levels) with the following equation, y = −3.0101 x + 0.9997, and correlation coefficient, r = 0.9998. The accuracy of the method was examined by analyzing samples in triplicate on three different concentration levels, covering the whole range of the calibration curve, with the recoveries of 96.9% up to 104.1% accompanied by relative standard deviation values (RSD) no higher than 1.9%. The precision of the method was also examined by analyzing six samples at 0.15 mM of TROLOX with the RSD values not exceeding 5.8%.

The Antioxidative Activity of Selected Pharmaceuticals
The antioxidant activity of 82 pharmaceuticals was assessed by the HPLC-DPPH method (Table 1). According to obtained result, all analyzed drugs can be divided into three groups: (i) below 0.100 mM TEAC (63 pharmaceuticals; 77% of all), (ii) 0.100 to 0.200 mM TEAC (eight pharmaceuticals; 10% of all), and (iii) above 0.200 mM TEAC (11 pharmaceuticals; 13% of all).  The significant DPPH free-radical scavenging activity (0.059-0.141 mM TEAC) observed for all investigated statin (ATC class C10) drugs (atorvastatin, fluvastatin, pravastatin and simvastatin) bears a close resemblance to the one presented by previous research [13,14], although these authors used other experimental approaches. Our findings confirm that next to their already well-recognized antihyperlipemic and immunomodulatory effects, statins have a positive impact against oxidative stress levels.
Antibiotics (ATC class J) belong to one of the largest and structurally diverse group of pharmaceuticals. This study systematically improved our knowledge on the antioxidative activity of these widely prescribed medicines. Obtained values were in the range from 0.037 mM (azithromycin) to 0.302 mM (doxycycline) TEAC. Furthermore, the reaction of DPPH radicals and investigational antibiotics had not shown complete discoloration of the solution, as is the case with amoxicillin, erythromycin, sulfamethoxazole and sulphamic acid, but the applied HPLC method was sensitive enough to indicate small changes compared to the standard. It is interesting to note that cefradine, cephalexin and ciprofloxacin showed noticeable antioxidant activity, while oxytetracycline, doxycycline and rifampicin showed complete decolorization of DPPH solution, and have strong antioxidant activity. Despite antibiotics being frequently used pharmaceuticals, only a few researchers have addressed the question of their antioxidative activity. Kladna et al. [2] examined the antiradical activity of tetracyclines, including oxytetracycline and doxycycline, using DPPH, while Karunakar et al. [1] used the HPLC-DPPH method to determine the antioxidant activity of drugs, including rifampicin. The results showed that in the presence of rifampicin, the concentration of DPPH peaks decreased depending on the concentration. Furthermore, the free radical scavenging activity of rifampicin is measured by Kalpana et al. [15] and ascorbic acid was used for comparison. Although the above-mentioned researchers used different procedures, the results of this work are consistent with the available literature.
Analgesics (ATC class N02) are one of the most commonly self-prescribed medicines. The benefit of the combined use of analgesics and antioxidants in the treatment of chronic pain is attracting considerable interest [16,17]. It is well known that paracetamol is often used in combination with vitamin C. In this study, the ability to capture the DPPH radical Pharmaceuticals 2022, 15, 791 5 of 13 of the above analgesic was assessed (0.236 mM TEAC). Borges et al. [18] evaluated the antioxidant activity of paracetamol and salicylic acid in experimental and theoretical studies. In conclusion, it is stated that although both compounds are phenolic derivatives, paracetamol showed more pronounced antioxidant properties than salicylic acid in several models that caused oxidative stress. One of the theoretical mechanisms has shown that hydrogen transfer is responsible for a more pronounced antioxidant effect in paracetamol. On the other hand, there are numerous examples in the available literature of the toxicity of paracetamol, which is widely used as an analgesic and antipyretic. Although seemingly safe if used at the recommended therapeutic doses, higher doses of paracetamol can cause severe liver and kidney damage in humans and experimental animals.
Based on the obtained results (Table 1), aminosalicylates (ATC class A07) stand out as an interesting group of investigated drugs. Sulfasalazine (0.076 mM TEAC) and mesalazine (5-ASA; 0.296 mM TEAC) showed measurable antioxidant activities, while the antioxidant activities of olsalazine and balsalazide were not measurable by the proposed HPLC-DPPH method. Rafael et al. [19] used the DPPH radical in the analysis of mesalazine to quantify the content of the drug in finished, commercially available pharmaceutical forms using the spectrophotometric method. Additionally, a literature survey reveals that mesalazine had a potent radical scavenger activity compared to paracetamol and salicylic acid, as antipyretic and anti-inflammatory drugs [20]. Interestingly, the intensity of the antioxidant effect of mesalazine was similar to ascorbate, in contrast to salicylate, which did not react with the DPPH radical. Furthermore, these results suggest that most of the antioxidant activity of aminosalicylates is derived from 5-ASA. By being released from the prodrug structure due to the enzymatic degradation of the azo bond, the antioxidant activity could be part of the therapeutic effect of sulfasalazine, olsalazine and balsalazide, even if their prodrug activity was much weaker than 5-ASA alone. The highest antioxidative activity was found for 0.1 mM mesalazine (up to 310 times stronger than others), followed by aminosalicylates, sulfasalazine and balsalazide. On the other hand, olsalazine has shown no antioxidant activity.
According to the results of this study, the antioxidative activity of immunosuppressant drugs (ATC class L04) was observed: 6-mercaptopurine and 6-thioguanine showed twice as much antioxidative power compared to prodrug azathioprine.
As already mentioned, oxidative stress plays an important role in the degeneration of dopaminergic neurons in Parkinson's disease. Ropinirole is a non-ergoline D2/D3 dopamine agonist (ATC class N04) used to treat symptoms of Parkinson's disease and showed increasing free radical scavenging activity with increasing concentrations. Our results share a number of similarities with Selva et al.'s findings, as this group showed that ropinirole had good free radical scavenging activity and could play a role of a neoadjuvant antioxidant in a wide variety of neurodegenerative disorders [21].
Vitamins have been recognized as one of the essential antioxidants. Folic acid (vitamin B9) had significant antioxidant activity (0.230 mM TEAC), and other B vitamins, such as thiamine (vitamin B1) and nicotinamide, a form of vitamin B3, showed measurable antioxidant activity (0.115 mM TEAC and 0.048 mM TEAC, respectively). The obtained results extended our knowledge of them and demonstrated the highest antioxidant activity of L-ascorbic acid sodium salt (0.267 mM TEAC) compared to other investigated vitamins.

Development of QSAR Model for the Prediction of Antioxidative Activity of Drugs
The following uses the procedure described in Section 3.3. In the statistical procedure, we approached the development of the QSAR model using the obtained TEAC values and calculated descriptors. Data splitting to create training and the test set was performed using the rational splitting method based on activity value, to cover the whole range of TEAC values [22]. As a result, one-to eight-variable models were obtained. Equations of the developed models are presented in Table 2.  The values of the statistical parameters of selected one-, two-, three-, four-, five-, six-, seven-and eight-variable QSAR models for the training and the test set are presented in Table 3. To satisfy the fitting criteria of the model, parameters should act as follows: R 2 values should tend to be as high as 1, implying that calculated values are similar to the observed ones. The acceptable R 2 value of a model should be ≥0.6 (equivalent to r ≥ 0.774). The minimum acceptable R 2 ext is ≥0.6 as well; CCC ≥ 0.85; RMSE and MAE as close to zero as possible and RMSE tr should be of smaller value than RMSE cv . Robust QSAR models should have R 2 yscr > Q 2 yscr [23].  In addition, plots of obtained correlation coefficients, r, in relation to the number of variables in the obtained QSAR models are shown in Figure 1. In addition, plots of obtained correlation coefficients, r, in relation to the number of variables in the obtained QSAR models are shown in Figure 1. The r values for the training set models increase with the increment of variable number in the model (from 0.562 for the one-variable model, up to 0.914 for the eight-variable model); however, r values of obtained test set models have shown that four-and sevenvariable models have highest values (0.927 and 0.916, respectively) compared to other models (from 0.778 to 0.907). Although the seven-variable model exhibits a higher r value of the training set (0.891) and a somewhat similar r value for the test set, the four-variable model showed better predictive power. Both have a similar performance of external and internal validation parameters; however, a crucial part in deciding which model is more suitable for the prediction of TEAC played was the cross-correlation matrix of the sevenvariable model. Table 4 shows that several descriptors are highly correlated (H7s vs. CATS2D_04_AL, r = 0.759 and C-018 vs. F03[N-F], r = 0.766), meaning that both descriptors have the same contribution to the model outcome, thus one of them is unnecessary in the model, which is not the case in the four-variable model, in which the highest correlation of descriptors is 0.586, implying that every descriptor contributes to calculation of the TEAC value. On the other hand, generating models with more than eight variables would increase r of the training set; however, in the case of test sets, lower r values would be obtained, as is the case in the eight-variable model, due to the overtraining of the model. Therefore, building higher models with more than eight variables would not be efficient. The r values for the training set models increase with the increment of variable number in the model (from 0.562 for the one-variable model, up to 0.914 for the eight-variable model); however, r values of obtained test set models have shown that four-and sevenvariable models have highest values (0.927 and 0.916, respectively) compared to other models (from 0.778 to 0.907). Although the seven-variable model exhibits a higher r value of the training set (0.891) and a somewhat similar r value for the test set, the four-variable model showed better predictive power. Both have a similar performance of external and internal validation parameters; however, a crucial part in deciding which model is more suitable for the prediction of TEAC played was the cross-correlation matrix of the seven-variable model. Table 4 shows that several descriptors are highly correlated (H7s vs. CATS2D_04_AL, r = 0.759 and C-018 vs. F03[N-F], r = 0.766), meaning that both descriptors have the same contribution to the model outcome, thus one of them is unnecessary in the model, which is not the case in the four-variable model, in which the highest correlation of descriptors is 0.586, implying that every descriptor contributes to calculation of the TEAC value. On the other hand, generating models with more than eight variables would increase r of the training set; however, in the case of test sets, lower r values would be obtained, as is the case in the eight-variable model, due to the overtraining of the model. Therefore, building higher models with more than eight variables would not be efficient. The list of all descriptors included in one-to eight-variable models are presented and will be described in detail in the following section. Table 4. Correlation matrix of included descriptors in best four-and seven-variable QSAR models predicting TEAC for the entire set of compounds (72).

Mor16e
RDF145p Based on the facts stated above, the four-variable model was chosen as the best one for the prediction of TEAC. Model accuracy, as well as the domain of applicability, is presented in Figure 2. Figure 2A shows a plot of predicted TEAC values versus the measured ones, resulting in a distribution of values that show a linear trend. The points are distributed along the diagonal line, representing the accuracy of the model as well as the fact that the model was able to predict similar TEAC values, especially for the molecules that have shown strong antioxidative power. One cluster comprised molecules that exhibit low antioxidative power. On the other hand, the Williams plot detects both the response outliers (Y-outliers) and molecular structure outliers (X-outliers) [24]. As can be seen in Figure 2B, there are no response outliers since none of the values exceed ± 3δ. On the other hand, structurally influential molecules were observed (three in the test set and four in the training set), which is acceptable considering the structural diversity in the pool of examined molecules.
The model was further validated using the "Y-scrambling" method ( Figure 2C). The "Y-scrambling" test showed that R 2 and Q 2 LOO values of selected models are different than those calculated. The observed R 2 Y-SCRAMBLING and Q 2 Y-SCRAMBLING are of low value, indicating the validity of the calculated model. Additionally, the R 2 Y-SCRAMBLING > Q 2 Y-SCRAMBLING criteria are satisfied. The model was further validated using the "Y-scrambling" method ( Figure 2C). The "Y-scrambling" test showed that R 2 and Q 2 LOO values of selected models are different than those calculated. The observed R 2 Y-SCRAMBLING and Q 2 Y-SCRAMBLING are of low value, indicating the validity of the calculated model. Additionally, the R 2 Y-SCRAMBLING > Q 2 Y-SCRAMBLING criteria are satisfied.

Structural Characteristics Determining the Antioxidative Value of Selected Compounds
The list of descriptors in the one-to eight-variable models is presented in Table 5. In this section, the focus will be primarily on the descriptors found in the four-variable model, which was chosen as the best one for the prediction of TEAC.
Considering the observed equation describing the four-variable model:  (Table 5).

Structural Characteristics Determining the Antioxidative Value of Selected Compounds
The list of descriptors in the one-to eight-variable models is presented in Table 5. In this section, the focus will be primarily on the descriptors found in the four-variable model, which was chosen as the best one for the prediction of TEAC.  Table 5). The first descriptor found in our four-variable model is the Mor16e descriptor, belonging to the group of 3D-MoRSE descriptors, which stands for 3D-Molecule Representation of Structures based on electron diffraction. These descriptors were developed based on the idea of obtaining information from the 3D atomic coordinates by use of the transform used in electron diffraction studies for preparing theoretical scattering curves. The MoRSE descriptors have been shown to have good modeling power for different biological and physicochemical properties and can be used even for the simulation of infrared spectra [25]. In our model, we can see the negative contribution of the Mor16e descriptor, which represents signal16/weighted by Sanderson electronegativity. According to Sanderson's theory, a compound that exhibits high electronegativity, due to the equalization of electronegativity between two atoms, is associated with low reactivity; therefore, the lower the Mor16e value associated with the compound, the more pronounced its antioxidative potential is [26]. 3D-MoRSE type descriptors were also part of models obtained in the previously published studies for the prediction of antioxidative activity [23].
The second descriptor found in the obtained model is RDF145p. Radial Distribution Function (RDF) descriptors are based on the geometrical interatomic distance and constitute a radial distribution function code, and they show some characteristics in common with the 3D-MoRSe descriptors [25]. They contain information about the interatomic distances in a molecule, unweighted or weighted by different atomic properties, such as atomic mass, electronegativity, van der Waals volume and atomic polarizability [27]. We can see that increase in those properties of the molecule can negatively affect the antioxidative power of a compound due to the negative value in the obtained model.
The third descriptor in our model, C-018, belongs to the atom-centered fragment type of descriptors. Each atom type is an atom in the molecule described by its neighboring atoms. Hydrogen and halogen atoms are classified by the hybridization and oxidation state of the carbon atom to which they are bonded. Carbon atoms are classified by their hybridization state and depending on whether their neighbors are carbon or heteroatoms. C-018 is defined as =CHX atom-centered fragment, where X can represent any electronegative atom (O, N, S, P, Se, halogens) as well as an aromatic bond, as in benzene [28]. This descriptor strongly contributes to the TEAC values calculated by our model.
The last descriptor found in our model is from the group of CATS 2D descriptors, named CATS_2D_AL. The CATS 2D (Chemically Advanced Template Search) descriptors are a type of autocorrelation descriptors, where the atom-type definition is related to potential pharmacophore points (PPP). CATS2D descriptors are widely used for similarity search 17. The names of the CATS2D descriptors are coded as follows: "CATS2D_", "distance2D_", and "type atom pair". Thus, "CATS2D_06_AL" means the count of all molecular graph distances (6) between atom pairs are acceptor lipophilic (AL) [29]. It is visible that this descriptor has a positive impact on predicted TEAC values.

Chemicals and Reagents
Standards of pharmaceuticals selected for this study (Table 1)

HPLC-DPPH Method
The antioxidative activity of selected drugs was determined using the DPPH method followed by chromatographic analysis. Briefly, after 250 µL of 2.5 mM DPPH methanolic solution was added to 1 mL of 1 mM drug solution, the reaction mixture was stirred and stored in the dark for 30 min at room temperature. A chromatographic analysis was conducted on Agilent 1100 HPLC system (Agilent Technologies, Santa Clara, CA, USA) with diode array detector using our previously published method adopted for these specific samples [12]. For this reason, method verification was performed to confirm the applicability of the method to the different chromatographic systems. Determination of the antioxidant activity of the drug was performed in triplicate.
The ability of each drug sample to scavenge the 'stable' free DPPH radical was determined from the difference in the peak area of the initial solution of the radical itself and the solution of the radical after reaction with the sample. TROLOX was used as a standard antioxidant and the results were expressed as TEAC values determined from a standard calibration curve.

Data Set and Calculation of Descriptors
A set of 82 pharmaceuticals with accompanying TEAC values was used in the QSAR study. The set was divided into the training set, comprising 68 structurally various drugs, and the prediction set, which also included 14 structurally various drugs.
For calculation of descriptors, 3D structures of selected compounds were created in Chem3D Pro software (ChemOffice v15.0, Perkin Elmer, Waltham, MA, USA). Molecular conformations were optimised by the AM1 method using the MOPAC2012 interface. A total of 3242 molecular descriptors were calculated using the DRAGON v6.0 software (Milano chemometrics & QSAR research group, Milano, Italy), describing the chemical diversity of the studied set as well as capturing the relevant structural characteristics.

Statistical Correlation
To calculate the correlation between TEAC (response values used in QSAR) and DRAGON-generated structurally related descriptors, Genetic Algorithm (GA) and Multiple Linear Regression Analysis (MLRA) methods were used. The combination of the GA-MLRA methods was applied for the selection of descriptors and construction of one-, two-, three-, four-, five-, six-, seven-and eight-variable models using QSARINS v2.2. (QSAR Group, University of Insubria, Varese, Italy). GA variable selection technique started with a population of 200 random models and 2000 iterations to the evolution, with the mutation probability specified as 20%. All descriptors used in calculations of models were expressed as their normalized values for easier comparison of their contribution to QSAR responses. To exclude the models with highly cross-correlated descriptors as well as models with low correlation, filtering through the QUIK rule built-in QSARINS software was performed. Additionally, models with over-correlated descriptors as well as non-significant correlation coefficients were excluded from further study.
The best models were selected according to their correlation coefficients (r), model variance (R 2 ), F-ratio, leave-one-out cross-validation (Q 2 ), standard error (s) and standard error of the predictive residue of sum of squares (S PRESS ). The validation of the model was performed using leave-many-out (LMO) and "Y-scrambling" tests. A Williams plot was used to visualize the applicability domain (AD) of developed models. Such a plot of standardized cross-validated residuals (RES) vs. leverage (Hat diagonal) values (HAT) depicts both the response outliers (Y outliers) and structurally influential compounds (X outliers) in a model.

Conclusions
In this study, the applied HPLC-DPPH method has proven to be rapid and effective for determining the antioxidant activity of eighty-two pharmaceuticals in order to investigate additional mechanisms of their action that are widely used in clinical practice. In total, 23% of the analyzed pharmaceuticals had good free radical scavenging activity (above 0.100 mM TEAC) and could play the role of a neoadjuvant antioxidant.
Results of QSAR modelling showed that four-and seven-variable models had the best potential for prediction of TEAC. Looking into statistical parameters of each model, four-variable model showed to be superior to seven-variable model, due to the fact that seven-variable model showed highly cross-correlated descriptors, meaning that there is a lack of information variability. The chosen model showed good linearity and was successfully validated and tested. With the correlation coefficient of 0.927 for predicted TEAC values, it is shown that this model is adequate for the prediction of TEAC values considering the descriptors included in the model.
The conducted study systematically improved the knowledge of the antioxidative activity of these widely prescribed medicines and provide a scientific basis for the subsequent elucidation of the pharmaceuticals and their additional mechanism activities. Overall, the results lay the foundation for in-depth research on the antioxidative activity of drugs with high TEAC values. Furthermore, this research strategy can be used for the drug characterization according to various antioxidant tests and for enabling the further development and utilization of these pharmaceuticals.