In Silico Virtual Screening of Marine Aldehyde Derivatives from Seaweeds against SARS-CoV-2

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is an ongoing global pandemic that poses an unprecedented threat to the global economy and human health. Several potent inhibitors targeting SARS-CoV-2 have been reported; however, most of them have failed in clinical trials. This study aimed to assess aldehyde derivatives from seaweeds as potential SARS-CoV-2 inhibitors using a computer simulation protocol. The absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties of the compounds were analyzed using a machine learning algorithm, and the docking of these compounds to the 3C-like protease (Protein Data Bank (PDB) ID: 6LU7) was analyzed using a molecular docking protocol based on the CHARMm algorithm. These compounds exhibited good drug-like properties according to the Lipinski and Veber rules. Among the marine aldehyde derivatives, 4-hydroxybenzaldehyde, 3-hydroxybenzaldehyde, 3,4-dihydroxybenzaldehyde, and 5-bromoprotocatechualdehyde were predicted to have good absorption and solubility levels and to be non-hepatotoxic in the ADME/Tox prediction. 3-hydroxybenzaldehyde and 3,4-dihydroxybenzaldehyde were predicted to be non-toxic in the TOPKAT prediction. In addition, 3,4-dihydroxybenzaldehyde was predicted to interact with the 3C-like protease, with a binding energy of −71.9725 kcal/mol. The computational analyses indicated that 3,4-dihydroxybenzaldehyde could be regarded as a potential SARS-CoV-2 inhibitor.


Introduction
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1,2]. The disease has spread worldwide, leading to the ongoing COVID-19 pandemic, which poses an unprecedented threat to the global economy and human health [1,2]. In addition, the World Health Organization suggested that the outbreak of an unknown pathogen may lead to further public health emergencies following the pandemics caused by Ebola, SARS, and Zika viruses [3,4]. Therefore, efforts to identify potent antiviral materials are crucial.
SARS-CoV-2 primarily spreads between people through aerosols and respiratory droplets exhaled when talking, breathing, coughing, or sneezing [5]. SARS-CoV-2 is an enveloped, single-stranded RNA virus that enters human cells when the viral spike protein binds to the angiotensin-converting enzyme 2 receptor. After virus entry, the incoming genomic RNA is released, and two large open reading frames (ORFs), ORF1a and ORF1b, are immediately translated from the positive-strand genomic RNA. The resulting polyproteins, pp1a and pp1ab, are processed into individual nonstructural proteins that form the viral replication and transcription complexes [5,6]. The 3C-like protease is responsible for proteolytic processing at the majority of polyprotein cleavage sites [1,7]. Thus, the 3C-like protease is a potential target for SARS-CoV-2 treatment.
The global pandemic has led to the rapid growth of artificial intelligence technology in various industries [8,9]. In the pharmaceutical industry, virtual screening technology can minimize the cost and time required for novel drug development [10]. Virtual screening technology, including the screening of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties and molecular docking, is crucial for identifying promising compounds for in vitro and in vivo testing [11]. Safe drugs exhibit a fine-tuned combination of pharmacokinetic and pharmacodynamic variables such as ADME/Tox properties [12]. New drugs that interact with their target proteins to produce therapeutic effects typically fail in clinical trials because of unfavorable ADME/Tox properties [12,13]. Drugs with several side effects cannot be used commercially and cause enormous financial losses. Therefore, efforts are being made to predict ADME/Tox properties from drug structures in silico at an early development stage [14]. ADME/Tox modeling can aid in the success of drugs in clinical trials [13]. In addition, molecular-docking-based virtual screening identifies compounds with the highest binding affinities and correct binding modes [7,15].
Marine organisms have the ability to produce novel bioactive natural products with wide structural diversity, various important human health benefits, and pharmacological potential [16,17]. Seaweeds are photosynthetic organisms that are rich in bioactive materials such as polysaccharides, proteins, peptides, amino acids, and secondary metabolites, including polyphenolic compounds and natural pigments [18]. These bioactive materials have demonstrated various biological activities, including medicinal and health benefits, which has led to an increased demand for these compounds in food, nutraceutical, and cosmeceutical products [19]. Aldehydes produced by seaweeds are volatile, mostly lipophilic compounds of low molecular weight and are considered the most important determinants of food flavor and quality [20,21]. Aromatic aldehydes have also been reported to possess a wide range of potential bioactive properties, including anti-cancer, antibacterial, antioxidant, anti-inflammatory, and immunomodulatory effects [22–27].
The present study aimed to assess the therapeutic compounds among the aldehyde derivatives from seaweeds as potential SARS-CoV-2 inhibitors using a computer simulation protocol. The ADME/Tox properties of the marine aldehyde derivatives were analyzed using a machine learning algorithm and subsequently prioritized based on these results. Next, the docking simulation of the marine aldehyde derivatives to the 3C-like protease (Protein Data Bank (PDB) ID: 6LU7) was analyzed using a molecular docking protocol based on the CHARMm algorithm, and the compounds exhibiting good interactions were selected.

Drug-Likeness Analysis of the Marine Aldehyde Derivatives
Eleven marine aldehyde derivatives were tested using Lipinski and Veber rules. These aldehyde derivatives exhibited good drug-like properties, based on the number of hydrogen bond acceptors, hydrogen bond donors, molecular weights, ALogP, rotatable bonds, and polar surface areas (Tables 1 and 2). Compounds with low molecular weights have shown drug-like properties. Therefore, these marine aldehyde derivatives can be considered for the development of new drugs.

ADME/Tox Analysis of the Marine Aldehyde Derivatives
The 11 aldehyde derivatives were tested in the ADME/Tox Discovery Studio 2021 protocol. In addition, the 2D polar surface area (PSA_2D) for each marine aldehyde derivative was plotted against the corresponding calculated atom-type partition coefficient (ALogP98).
The ADME/Tox properties of the marine aldehyde derivatives are listed in Table 3. The marine aldehyde derivatives displayed good absorption, solubility, blood-brain barrier (BBB) permeability, and cytochrome P450 2D6 (CYP2D6) prediction levels. All the aldehyde derivatives in this study were located within the human intestinal absorption (HIA) 99% confidence ellipse, and the absorption grade indicated that all the compounds exhibited good absorption (Table 3, Figure 1). In addition, the solubility results indicated that all these compounds exhibited optimal or good solubility (Tables 3, S1 and S2). Drug failure is generally caused by insufficient absorption and distribution of the drug due to low solubility [15]. Thus, marine aldehyde derivatives could be selected as new drug candidates because of their favorable absorption and solubility levels. BBB grade predictions indicated that all the aldehyde derivatives had medium or low BBB permeability, except for MAD-5. These compounds were located within the 99% confidence limit ellipse corresponding to the BBB model (Tables 3, S3 and S4, Figure 1). Thus, these marine aldehyde derivatives are unlikely to be suitable for targeting the nervous system. CYP2D6 is one of the cytochrome P450 enzymes and catalyzes the metabolism of many clinically important drugs. CYP2D6 inhibitors can induce drug-drug interactions, in which one drug alters the effect of another [28]. The CYP2D6 inhibition predictions revealed that none of the marine aldehyde derivatives inhibited the enzyme; therefore, these compounds are unlikely to cause serious drug-drug interaction toxicity. The hepatotoxicity predictions revealed that among the 11 aldehyde derivatives, MAD-1, MAD-2, MAD-4, and MAD-11 were predicted to be non-hepatotoxic. Drug activity is related to the free drug concentration, which is the drug concentration available for physiological interaction; therefore, it is necessary to determine whether the drug candidates may bind to plasma proteins [29].
The four selected marine aldehyde derivatives exhibited low plasma protein binding activity, with binding rates of <90% (Table 3). Collectively, the results indicate that these four aldehyde derivatives have the potential to be used as new drugs, whereas the remaining aldehyde derivatives require structural optimization before being used in new drugs.
Table 3. In silico-based ADME/Tox analysis of the marine aldehyde derivatives.
Figure 1. ADMET_AlogP98 and ADMET_PSA_2D attribute graph for the marine aldehyde derivatives. Plot of PSA_2D versus ALogP98 for the marine aldehyde derivatives showing the 95% and 99% confidence limit ellipses corresponding to the HIA and BBB models.
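The selection logic described in this section can be sketched as a simple filter. This is only an illustration under stated assumptions: the class and field names are hypothetical, the example records are placeholders rather than values from Table 3, and the sketch does not reproduce the Discovery Studio models that generate the predictions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdmeProfile:
    """Predicted ADME/Tox levels for one compound (field names are hypothetical)."""
    name: str
    absorption: str      # e.g. "good", "moderate", "poor"
    solubility: str      # e.g. "optimal", "good", "low"
    hepatotoxic: bool    # True if predicted to be hepatotoxic
    ppb_percent: float   # predicted plasma protein binding, in percent


def passes_adme_filter(p: AdmeProfile) -> bool:
    """Keep compounds with good absorption, optimal/good solubility,
    no predicted hepatotoxicity, and plasma protein binding below 90%."""
    return (
        p.absorption == "good"
        and p.solubility in {"optimal", "good"}
        and not p.hepatotoxic
        and p.ppb_percent < 90.0
    )


# Placeholder records for illustration only; these are NOT values from Table 3.
candidates = [
    AdmeProfile("compound-A", "good", "optimal", False, 60.0),
    AdmeProfile("compound-B", "good", "good", True, 60.0),
    AdmeProfile("compound-C", "good", "good", False, 95.0),
]
selected = [p.name for p in candidates if passes_adme_filter(p)]
print(selected)
```

In this sketch only the first placeholder record passes, because the second is flagged as hepatotoxic and the third exceeds the 90% plasma protein binding threshold.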

TOPKAT Analysis of Marine Aldehyde Derivatives
The toxicological properties of the 11 marine aldehyde derivatives were predicted in silico using the TOPKAT wizard and the 2D molecular structures of the compounds. TOPKAT uses a quantitative structure-toxicity relationship model to assess specific toxicological endpoints [30]. Aspirin and curcumin were selected as positive controls in the TOPKAT analysis. Aspirin, also known as acetylsalicylic acid, is a drug used worldwide. Curcumin, a phenolic pigment, is a U.S. Food and Drug Administration-approved drug, and its effect on the 3C-like protease of SARS-CoV-2 was evaluated through molecular docking analysis in a previous study [7]. The TOPKAT analysis results for the 11 marine aldehyde derivatives are presented in Table 4. Among the aldehyde derivatives, MAD-1, MAD-2, MAD-3, MAD-4, and MAD-11 were predicted to be non-mutagenic according to the Ames test. These five aldehyde derivatives displayed rat oral median lethal doses (LD50) ranging between 1.0018 and 2.67949 g/kg body weight (BW). The lowest LD50 was exhibited by MAD-3 and the highest by MAD-4. MAD-4 and MAD-11 exhibited higher LD50 values than aspirin, a commercial drug (1.57076 g/kg BW). The rat inhalational median lethal concentration (LC50) was predicted for the five aldehyde derivatives. Similarly, MAD-3 displayed the lowest LC50 (1655.42 mg/m³/h). In addition, the LC50 values of MAD-1, MAD-2, MAD-4, and MAD-11 were predicted to be 1744.04, 2660.83, 1794.97, and 1975.31 mg/m³/h, respectively. Curcumin displayed an LC50 value of 1200.8 mg/m³/h, implying that the four aldehyde derivatives are safer than curcumin. MAD-3 was excluded from further analyses because of its low LD50 and LC50. Next, the skin irritancy evaluation indicated that irritancy was predicted to be absent for MAD-2, MAD-4, and MAD-11 but present for MAD-1.
Of the three aldehyde derivatives predicted to be non-irritant, MAD-2 and MAD-4 were predicted to be non-carcinogenic in both female and male rats according to the US National Toxicology Program (NTP) model, whereas MAD-11 showed potential carcinogenicity in the male rat NTP model. Therefore, the TOPKAT prediction results indicated that MAD-2 and MAD-4 are expected to display the best pharmacokinetic and pharmacodynamic behaviors with no mutagenic, carcinogenic, or irritant effects; thus, MAD-2 and MAD-4 are potential SARS-CoV-2 inhibitors.
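The stepwise exclusion described in this section can be written as a short cascade of set operations. The sketch below only encodes the outcomes and LC50 values quoted in the text; the variable and function names are illustrative, and nothing here re-runs or reproduces the TOPKAT models.

```python
# Predicted outcomes as stated in the text (not re-derived from Table 4).
AMES_NON_MUTAGENIC = {"MAD-1", "MAD-2", "MAD-3", "MAD-4", "MAD-11"}
EXCLUDED_LOW_LD50_LC50 = {"MAD-3"}
SKIN_IRRITANT = {"MAD-1"}
MALE_RAT_NTP_CARCINOGEN = {"MAD-11"}

# Rat inhalational LC50 values quoted in the text, in mg/m3/h.
LC50 = {
    "MAD-1": 1744.04,
    "MAD-2": 2660.83,
    "MAD-3": 1655.42,
    "MAD-4": 1794.97,
    "MAD-11": 1975.31,
}
CURCUMIN_LC50 = 1200.8


def mad_number(name: str) -> int:
    """Numeric part of a label such as 'MAD-11', used for natural ordering."""
    return int(name.split("-")[1])


def topkat_shortlist() -> list:
    """Apply the exclusion steps in the order in which the text describes them."""
    remaining = set(AMES_NON_MUTAGENIC)
    remaining -= EXCLUDED_LOW_LD50_LC50     # lowest LD50 and LC50
    remaining -= SKIN_IRRITANT              # predicted skin irritancy
    remaining -= MALE_RAT_NTP_CARCINOGEN    # male rat NTP carcinogenicity
    return sorted(remaining, key=mad_number)


shortlist = topkat_shortlist()
lowest_lc50 = min(LC50, key=LC50.get)
above_curcumin = sorted((n for n, v in LC50.items() if v > CURCUMIN_LC50), key=mad_number)
print(shortlist, lowest_lc50, above_curcumin)
```

Keeping each exclusion as its own named set makes it easy to see which criterion removed which compound.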

Molecular Docking Analysis of Marine Aldehyde Derivatives on 3C-like Protease
Several molecular docking studies targeting specific proteins, including enzymes and receptors, have recently been published [7,18,31–33]. Among the docking tools, CDOCKER, a CHARMm-based docking algorithm, finds favorable docking poses between small molecules and target proteins based on their structural characteristics such as unshared electron pairs, double bonds, hydrophobicity, and charge [34].
The inhibitory effects of the marine aldehyde derivatives on the 3C-like protease of SARS-CoV-2 were predicted by simulating the binding of the marine aldehyde derivatives to the 3C-like protease in silico. The binding energies of the compounds to the 3C-like protease were compared with that of curcumin as a positive control, following previous studies [7,35]. The crystal structure of the 3C-like protease was obtained from the Protein Data Bank (PDB ID: 6LU7 [1]), and the structure was confirmed and revised through structural optimization. The binding site was defined using the binding sphere of the inhibitor N3 already docked to the 3C-like protease. The binding patterns of the marine aldehyde derivatives to the 3C-like protease were analyzed, and the docking poses were expressed as a 3D chart using -CDOCKER interaction energy (kcal/mol), -CDOCKER energy (kcal/mol), and binding energy (kcal/mol) (Figure 2B and Table 5). Among the marine aldehyde derivatives, MAD-4 bound most stably to the 3C-like protease, with the highest -CDOCKER energy (22.4808 kcal/mol) and -CDOCKER interaction energy (23.2915 kcal/mol), and a lowest binding energy of −71.9725 kcal/mol. MAD-11, MAD-2, and MAD-1, which were predicted to exhibit favorable ADME/Tox or TOPKAT levels, followed, displaying relatively low binding energies compared with those of the other marine aldehyde derivatives (Figure 2B and Table 5). In particular, these four marine aldehyde derivatives were docked to the active site of the 3C-like protease and displayed lower binding energies than curcumin, which was used as a positive control (Table 5). This implies that these aldehyde derivatives could bind more stably to the 3C-like protease than curcumin does. The oxygen atoms of MAD-4 formed four hydrogen bonds with ASN142, SER144, CYS145, and HIS163. The phenol ring of MAD-4 formed a pi bond with CYS145 (Figure 3A).
MAD-2 bound to the 3C-like protease with a -CDOCKER energy of 17.2844 kcal/mol, a -CDOCKER interaction energy of 19.576 kcal/mol, and a lowest binding energy of −74.1383 kcal/mol. A hydrogen atom of MAD-2 formed a hydrogen bond with HIS164, and the phenol ring of MAD-2 formed a pi-pi stacked bond with HIS41 and a pi-sulfur bond with MET49 (Figure 3B). MAD-1 bound to the 3C-like protease with a -CDOCKER energy of 16.3341 kcal/mol, a -CDOCKER interaction energy of 18.8237 kcal/mol, and a lowest binding energy of −69.5871 kcal/mol. The oxygen atoms of MAD-1 formed hydrogen bonds with ASP187, and the phenol ring of MAD-1 formed a pi-pi stacked bond with HIS41 and a pi-alkyl bond with MET49 (Figure 3C). MAD-11 bound stably to the 3C-like protease with a -CDOCKER energy of 21.4484 kcal/mol, a -CDOCKER interaction energy of 21.8194 kcal/mol, and a lowest binding energy of −74.9887 kcal/mol. The oxygen atoms of MAD-11 formed hydrogen bonds with GLY143, CYS145, and HIS163, and the phenol ring of MAD-11 formed a pi-alkyl bond with CYS145. The bromine of MAD-11 formed a pi-alkyl bond with HIS41 (Figure 3D). These in silico-based results indicated that MAD-4 has the potential to be used as a SARS-CoV-2 inhibitor. Further studies are needed to confirm the inhibitory effects of these marine aldehyde derivatives on SARS-CoV-2 in vitro.
Marine aldehyde derivatives, including indole-4-carboxaldehyde (MAD-7, from Sargassum thunbergii), 3,4-dihydroxybenzaldehyde (MAD-4, from Polysiphonia morrowii), and 5-bromo-3,4-dihydroxybenzaldehyde (from P. morrowii), have been reported to possess various bioactivities, including inhibition of hepatic inflammation [22], anti-allergic effects [23,26], and hair growth effects [24]. An aromatic aldehyde, 3-chloro-4,5-dihydroxybenzaldehyde, has been shown to inhibit adipogenesis in adipocytes [25]. Additionally, 5-bromo-2-hydroxy-4-methyl-benzaldehyde has been shown to exhibit anti-inflammatory activities via the inactivation of the ERK, p38, and NF-κB pathways [27]. These studies suggest that marine aldehyde derivatives have pharmaceutical potential. According to the drug-likeness predictions for the 11 marine aldehyde derivatives, these compounds are also likely to have good drug-like properties for commercial use. In addition, the marine aldehyde derivatives tested in this study, including MAD-4, may exhibit viral inhibition potential through diverse cell signaling pathways. Future research may investigate the various antiviral activities of marine aldehyde derivatives via in vitro studies.
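The three scores quoted above for the four shortlisted compounds can be ranked side by side, which helps when comparing the criteria. The following is a minimal sketch that uses only the values stated in the text; the type and function names are illustrative and are not part of Discovery Studio.

```python
from typing import NamedTuple


class DockingScore(NamedTuple):
    neg_cdocker_energy: float              # -CDOCKER energy, kcal/mol
    neg_cdocker_interaction_energy: float  # -CDOCKER interaction energy, kcal/mol
    binding_energy: float                  # binding energy, kcal/mol


# Values quoted in the text for the four shortlisted compounds.
SCORES = {
    "MAD-1": DockingScore(16.3341, 18.8237, -69.5871),
    "MAD-2": DockingScore(17.2844, 19.576, -74.1383),
    "MAD-4": DockingScore(22.4808, 23.2915, -71.9725),
    "MAD-11": DockingScore(21.4484, 21.8194, -74.9887),
}


def rank(metric: str, descending: bool) -> list:
    """Order compound labels by one score field."""
    return sorted(SCORES, key=lambda name: getattr(SCORES[name], metric), reverse=descending)


# Higher -CDOCKER scores are better; lower (more negative) binding energy is better.
by_cdocker = rank("neg_cdocker_energy", descending=True)
by_interaction = rank("neg_cdocker_interaction_energy", descending=True)
by_binding = rank("binding_energy", descending=False)
print(by_cdocker, by_interaction, by_binding)
```

On these quoted values, MAD-4 ranks first on both CDOCKER scores, whereas MAD-11 and MAD-2 have more negative binding energies, so the ordering depends on which score is used.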

3D Structure of Proteins and the Marine Aldehyde Derivatives
For molecular docking studies, the crystal structure of the 3C-like protease (PDB ID: 6LU7) was obtained from the Protein Data Bank. The crystal structure had a resolution of 2.16 Å; therefore, it was used in this computational study. The "prepare protein" protocols of the Discovery Studio 2021 tool were applied for protein structure preparation. The binding site of the 3C-like protease was defined from the current docking site of the N3 ligand, following a previous paper [3]. The 3D structures of the 11 marine aldehyde derivatives were obtained from PubChem. The compound names and PubChem compound ID (CID) numbers are given in Table 1. The geometry of the marine aldehyde derivatives was optimized using the energy minimization protocol of BIOVIA Discovery Studio 2021, and the compounds were numbered in order from MAD-1 to MAD-11 (Table 2).

Drug-like Properties and ADME/Tox Predictions of the Marine Aldehyde Derivatives
Marine aldehyde derivatives were subjected to drug-likeness and ADME/Tox profiling using Discovery Studio 2021 (Biovia, San Diego, CA, USA). The drug-like properties of the marine aldehyde derivatives were evaluated against the following criteria: Lipinski rules (number of hydrogen bond donors < 5, number of hydrogen bond acceptors < 10, molecular weight < 500 Da, ALogP < 5, and no more than one violation of the above criteria) and Veber rules (rotatable bonds < 10, polar surface area < 140 Å², hydrogen bond donors and acceptors < 12) [45,46]. The ADME/Tox properties of the marine aldehyde derivatives were predicted using six mathematical models of the Discovery Studio 2021 Descriptor module: HIA level, aqueous solubility level, BBB level, CYP2D6 prediction, hepatotoxicity prediction, and plasma protein binding (PPB) prediction. The ADME/Tox protocol of Discovery Studio 2021 defines a perfect drug candidate as exhibiting good absorption, optimal/good/low solubility, medium/low BBB permeability, non-inhibitor/inhibitor CYP2D6, non-toxic hepatotoxicity, and <90% PPB [47]. The reliability of the predictions was analyzed via the HIA 99% confidence ellipse and BBB 99% confidence ellipse.
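The Lipinski and Veber criteria listed above reduce to a handful of threshold checks. The sketch below applies the thresholds exactly as written in this section (strict inequalities); the class and function names are hypothetical, and the example descriptor values are illustrative placeholders rather than entries from Tables 1 and 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Molecular descriptors used by the two rule sets (names are hypothetical)."""
    h_donors: int
    h_acceptors: int
    mol_weight: float      # Da
    alogp: float
    rotatable_bonds: int
    psa: float             # polar surface area, in square angstroms


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Lipinski criteria are not met."""
    checks = [
        d.h_donors < 5,
        d.h_acceptors < 10,
        d.mol_weight < 500.0,
        d.alogp < 5.0,
    ]
    return sum(1 for ok in checks if not ok)


def passes_lipinski(d: Descriptors) -> bool:
    """At most one violation is tolerated."""
    return lipinski_violations(d) <= 1


def passes_veber(d: Descriptors) -> bool:
    """Rotatable bonds, polar surface area, and total H-bond donors plus acceptors."""
    return (
        d.rotatable_bonds < 10
        and d.psa < 140.0
        and (d.h_donors + d.h_acceptors) < 12
    )


# Illustrative small phenolic aldehyde; ALogP and PSA are placeholders,
# not the values reported in Tables 1 and 2.
small_aldehyde = Descriptors(2, 3, 138.12, 1.0, 1, 60.0)
# Deliberately rule-breaking hypothetical molecule for contrast.
large_molecule = Descriptors(6, 12, 650.0, 6.0, 12, 200.0)

print(passes_lipinski(small_aldehyde), passes_veber(small_aldehyde))
print(passes_lipinski(large_molecule), passes_veber(large_molecule))
```

Counting violations separately from the pass/fail decision keeps the "no more than one violation" allowance explicit.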

TOPKAT Predictions of Marine Aldehyde Derivatives
The toxicity of the marine aldehyde derivatives was predicted using the TOPKAT module of Discovery Studio 2021 (Biovia, San Diego, CA, USA). The chosen parameters in the TOPKAT prediction were the Ames test for mutagenicity, rat oral LD50, rat inhalational LC50, skin irritation, and female and male rat NTP carcinogenicity. Results from these tests were used to select compounds with favorable pharmacokinetic behavior and low toxicity.

Molecular Docking Analysis of the Marine Aldehyde Derivatives on the 3C-like Protease
CDOCKER docking was performed to assess the binding poses of the marine aldehyde derivatives within the active site of the 3C-like protease. Molecular docking analysis was performed using the CDOCKER and Calculate Binding Energies tools in Discovery Studio 2021 (Biovia, San Diego, CA, USA). In particular, a docking mechanism based on CHARMm [34] was used to execute the docking protocol for CDOCKER. Docking of the marine aldehyde derivatives to the 3C-like protease was performed as follows: (1) the 2D structure of each compound was converted into a 3D structure; (2) the protein was prepared, and the binding site was defined; and (3) docking of the compounds was performed using the CDOCKER tool [32]. The binding pocket of the 3C-like protease was assigned as the area within a radius of 14.8 Å from the center of the active site. The binding site and ligand were allowed to move freely during docking. The water molecules were removed from the protein in the flexible docking process because fixed water molecules might alter the generation of the ligand-receptor complex. After the removal of water molecules, hydrogen atoms were added to the protein. The ligand-binding affinity was assessed for all complexes by applying the CHARMm force field to the interaction energy. Based on the CDOCKER interaction energy, distinct conformational poses for each molecule were produced and examined. The binding energies of the resulting small-molecule-protein complexes were calculated using the Calculate Binding Energies tool. Three types of energy values (-CDOCKER interaction energy, -CDOCKER energy, and binding energy) were generated for the complexes; these were used to select candidate compounds. The docking positions of the selected marine aldehyde derivatives on the 3C-like protease were expressed as 2D diagrams and 3D crystal structures.
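Two of the preparation steps above, removing water molecules and restricting the binding pocket to a 14.8 Å sphere, can be illustrated on plain PDB-format records. This is a minimal sketch and not the Discovery Studio implementation: the function names, the helper that builds example records, and the sphere center are all hypothetical, and the sketch relies only on the fixed column positions of ATOM/HETATM records in the PDB format.

```python
import math


def strip_waters(pdb_lines):
    """Drop ATOM/HETATM records whose residue name (columns 18-20) is HOH."""
    kept = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and line[17:20].strip() == "HOH":
            continue
        kept.append(line)
    return kept


def atoms_in_sphere(pdb_lines, center, radius=14.8):
    """Return atom records lying within `radius` angstroms of `center`."""
    selected = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # x, y, z occupy columns 31-38, 39-46, and 47-54 of the record.
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if math.dist(xyz, center) <= radius:
            selected.append(line)
    return selected


def make_atom_line(record, serial, atom, res, chain, resseq, x, y, z):
    """Hypothetical helper: build a column-aligned record for the example below."""
    return (
        f"{record:<6}{serial:>5} {atom:<4} {res:>3} {chain}{resseq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


# Made-up coordinates for illustration; not taken from 6LU7.
example = [
    make_atom_line("ATOM", 1, "SG", "CYS", "A", 145, 0.0, 0.0, 0.0),
    make_atom_line("HETATM", 2, "O", "HOH", "A", 401, 1.0, 1.0, 1.0),
    make_atom_line("ATOM", 3, "CA", "GLY", "A", 2, 30.0, 0.0, 0.0),
]
dry = strip_waters(example)
pocket = atoms_in_sphere(dry, center=(0.0, 0.0, 0.0))
print(len(dry), len(pocket))
```

In the example, the water record is removed first, and only the atom at the assumed center remains inside the sphere because the third atom lies 30 Å away.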

Conclusions
Marine aldehyde derivatives have good drug-like properties according to the Lipinski and Veber rules. MAD-2 and MAD-4 were selected for further study based on the in silico bioavailability and toxicity results. These compounds were predicted to interact more strongly with the 3C-like protease than curcumin, with binding energies of −74.1383 and −71.9725 kcal/mol, respectively. In addition, MAD-4 displayed the higher -CDOCKER energy and -CDOCKER interaction energy. This study suggests that the drug discovery approach used provides insight into therapeutics that might be helpful in treating COVID-19. These in silico-based results indicate that MAD-4 has the potential to be used as a SARS-CoV-2 inhibitor. Future in vitro studies can be performed to confirm the ADME/Tox levels and antiviral-activity predictions of the marine aldehyde derivatives.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/md20060399/s1, Table S1: In silico-based aqueous solubility analysis of the marine aldehyde derivatives; Table S2: Key to aqueous solubility levels indicator; Table S3: In silico-based BBB permeability analysis of the marine aldehyde derivatives; Table S4: Key to BBB permeability levels indicator.