Structure-Function Characteristics of SARS-CoV-2 Proteases and Their Potential Inhibitors from Microbial Sources

The COVID-19 pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), is considered the greatest challenge to the global health community of the century as it continues to expand. This has created an urgent need to discover promising drug targets for the treatment of COVID-19. The SARS-CoV-2 viral proteases, 3-chymotrypsin-like protease (3CLpro) and papain-like cysteine protease (PLpro), have become promising targets because of their essential functions in spreading the virus through RNA transcription, translation, protein synthesis, processing and modification, virus replication, and infection of the host. As such, understanding the structure and function of these two proteases is essential as a platform for the development of inhibitors that target these proteins and thereby arrest the infection and spread of the virus. While many reports on the screening of natural compounds as SARS-CoV-2 protease inhibitors are available, microorganism-based compounds (peptides and non-peptides) remain less studied. Indeed, microorganism-based compounds are also potent antiviral candidates against COVID-19. Microbes, especially bacteria and fungi, are further resources for producing new drugs as well as nucleosides, nucleotides, and nucleic acids. Thus, we have compiled the reported literature on the structures and functions of the SARS-CoV-2 proteases and on potential inhibitors from microbial sources to assist other researchers working on COVID-19. The compounds are also compared to HIV protease inhibitors, which suggests that the microorganism-based compounds are advantageous as SARS-CoV-2 protease inhibitors. The information should serve as a platform for further development of COVID-19 drug design strategies.


Introduction
In late December 2019, a new strain of coronavirus caused an outbreak of a pneumonia-like illness in Wuhan, China, and it has since become a life-threatening concern worldwide [1,2]. Accordingly, the World Health Organization (WHO) named the disease coronavirus disease 2019 (COVID-19) [1,2]. The new strain has been named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) by the WHO [3] since its RNA genome is about 82% identical to that of the SARS coronavirus (SARS-CoV). A person with COVID-19 will usually suffer from fever, dry cough, tiredness, and, under severe conditions, shortness of breath; others may be silent carriers of the virus [4]. Due to the high transmission efficiency of this new coronavirus, on 11 March 2020 the WHO officially declared COVID-19 a pandemic.
Coronaviruses (CoVs) are the largest group of viruses under the order Nidovirales, family Coronaviridae, and subfamily Orthocoronavirinae [5][6][7]. The Orthocoronavirinae are further classified into four genera: alphacoronavirus (α-CoV), betacoronavirus (β-CoV), gammacoronavirus (γ-CoV) and deltacoronavirus (δ-CoV), based on genetic and antigenic criteria [8]. All viruses in the Nidovirales order are spherical and enveloped with club-shaped spikes on the surface giving the appearance of a crown-like protrusion [8]. They are zoonotic viruses that infect various vertebrates (pets, bats, livestock, poultry, and humans), and in humans, CoVs are responsible for respiratory, gastrointestinal, and neurological problems [9,10]. They all contain very large RNA genomes, with some having the largest identified RNA genomes [5]. The first two decades of the 21st century have witnessed an increase in the number of CoVs that caused major outbreaks of fatal human pneumonia. In 2003, Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) broke out in five continents, and the Middle East Respiratory Syndrome Coronavirus (MERS-CoV) broke out in the Arabian Peninsula in 2012, with mortality rates of 9.6% and 35.5%, respectively [11][12][13]. To date, there is no specific therapeutic drug for the treatment of human coronavirus infections, whose outbreaks pose a huge threat to humans.
The SARS-CoV-2 belongs to clade B of the β-CoV genus and has a (+) sense ssRNA genome [14]. The (+) sense ssRNA genome of CoVs is ~30,000 nucleotides in length with a 5′-cap structure and a 3′-poly(A) tail, and it consists of at least six open reading frames (ORFs) [15,16]. It contains several essential genes that encode the viral proteins necessary for replication, transcription, and infectious virus assembly. Upon entry into the host cell, the SARS-CoV-2 genomic (+) sense ssRNA is used as mRNA to produce two large polyproteins, pp1a (4405 amino acids) and pp1ab (7096 amino acids), which ultimately yield 16 non-structural proteins (nsps; nsp1-16), numbered according to their order from the N-terminus to the C-terminus of the ORF1 polyproteins. These two large polyproteins are translated from the first ORF (ORF1a/b), which contains a ribosomal frameshift around the middle (Figure 1), and are processed by the main protease, Mpro [also known as the 3-chymotrypsin-like protease (3CLpro)], and by one or two papain-like proteases (PLpro). The ribosomal frameshift enables a change in the reading frame to form pp1ab after pp1a translation. Proper polyprotein processing is therefore essential for the release and maturation of the 16 nsps and their assembly into the cytoplasmic, ER membrane-bound multicomponent replicase-transcriptase complex (RTC), which is responsible for directing the replication, transcription, and maturation of the viral genome and subgenomic mRNA. Meanwhile, all four structural proteins and nine accessory proteins (ORF3A, ORF3B, ORF6, ORF7A, ORF7B, ORF8, ORF9B, ORF9C, and ORF10) are translated from subgenomic RNAs (sgRNAs) produced from the (−) sense ssRNA of CoVs [14]; their genes are located in the one-third of the genome near the 3′-terminus [1].

Figure 1.
Schematic representation of the single-stranded SARS-CoV-2 genome, ∼30 kb in length, with a 5′-cap structure and a 3′-poly(A) tail. The first ORF contains a −1 ribosomal frameshift between ORF1a and ORF1b, from which two polyproteins (pp1a and pp1ab) are directly translated. These polyproteins are processed into the 16 nsps (nsp1-16) by one or two papain-like proteases (PLpro), whose cleavage sites are indicated by dark blue upside-down triangles, and by the 3C-like protease (3CLpro), whose cleavage sites are indicated by light blue upside-down triangles.

The four main structural proteins are the spike (S), membrane (M), envelope (E), and nucleocapsid (N) proteins, which are involved in infectious virus assembly [17][18][19][20]. However, more recently, it has become clear that some CoVs do not require the full ensemble of structural proteins to form a complete, infectious virion, suggesting that some structural proteins might be dispensable or that these CoVs might encode additional proteins with overlapping compensatory functions [21][22][23][24][25][26]. In contrast, the vital role of the SARS-CoV-2 3CLpro (3CLpro-CoV2) and PLpro (PLpro-CoV2) in proteolytic cleavage of the large viral polyprotein orf1ab and in viral replication has established them as promising drug and vaccine targets in therapeutic research against COVID-19. Many reports on the screening of natural compounds against SARS-CoV-2 proteases are available, which indicate the promising potency of natural resources to be harnessed and developed as COVID-19 treatments. Nevertheless, most of these compounds were derived from plants. To our knowledge, microorganism-based compounds were also reported to be promising inhibitors of the SARS-CoV-2 proteases, but the reports remain scattered with no solid conclusion.
In the present review, literature reports on the structure and function of SARS-CoV-2 proteases in viral replication and infection are highlighted, as these proteases are promising targets for drug development. A more comprehensive understanding of their structural and functional determinants will clarify the infection mechanism and reveal processes that can be exploited for structure-guided drug and vaccine design. Furthermore, reports on the development of fungi, cyanobacteria, and their metabolites and peptides as potential drugs against these two viral proteases are also compiled.

Structure of 3CLpro
The 3CLpro belongs to the clan PA (family C30), a clan with a mixed catalytic type of cysteine, serine, and threonine [27]. It is a dimer consisting of two monomers that are arranged almost perpendicular to one another (Figure 2), and the individual monomer is enzymatically inactive [28][29][30][31][32]. Each monomer exhibits a three-domain structure, as shown in Figure 3 [33,34]. The N-terminal domains (domains I and II) consist of six-stranded antiparallel β-barrel structures, which together resemble the architecture of chymotrypsin and the other 3C proteases found in picornaviruses [35]. These two domains form a chymotrypsin-like fold that hosts the complete catalytic machinery [36]. Therefore, the name '3C-like protease' refers to the similarities between this protease and the other 3C proteases found in picornaviruses, namely their similar core structural homology and substrate specificities [35].
Unlike other chymotrypsin-like enzymes and many Ser (or Cys) hydrolases, 3CLpro possesses an unconventional Cys catalytic residue. Cys145 and His41 form a catalytic Cys-His dyad, instead of a canonical Ser(Cys)-His-Asp(Glu) triad, in the centre of the cleft between the N-terminal domains [36][37][38][39]. Mechanistic studies suggest that an "electrostatic" trigger initiates the acylation step, with the Cys145 residue serving as the nucleophile in the enzyme-catalyzed proteolytic reaction and the imidazole moiety of His41 serving as a general base [40]. The cleft accommodates four substrate residues in positions P1′ through P4, and it is flanked by residues from domains I (residues 10-99) and II (residues 100-182) [36]. The active site of 3CLpro is highly conserved and composed of four subsites (S1′, S1, S2, and S4), and the binding pocket of 3CLpro is mostly hydrophilic, except for the S2 subsite [40].
An extended loop region connects the catalytic domains to the C-terminal domain (domain III). This latter domain (residues 198-303) is composed of a globular cluster of five antiparallel α-helices, which is unique to the CoV 3CL proteases [41] and is responsible for the enzyme dimerization. A contact interface (∼1394 Å²) for the tight dimer is formed predominantly between domain II of monomer A and the NH2-terminal seven residues (N-finger) of monomer B in the 3CLpro-CoV-2 dimeric structure. Dimerization is essential for the enzymatic activity of 3CLpro because it assists in the correct orientation of the S1 pocket of the substrate-binding site: the N-finger of each monomer interacts with Glu166 of the other monomer [42]. To reach this interaction site, the N-finger is squeezed in between domains II and III of the parent monomer and domain II of the other monomer. Sequence alignment revealed that the 3CLpro-CoV-2 shares 96% sequence identity with its SARS-CoV counterpart, consistent with the high conservation of this protease among CoVs (Figure 4) [33]. Meanwhile, the structures of these proteins (in apo forms) were also found to be very similar, as indicated by the low R.M.S.D. value (0.459 Å) from the structural superimposition (Figure 5). Notably, the sequence alignment (Figure 4) showed that there are 12 non-conserved residues (out of 306 total residues) in 3CLpro-CoV-2. Nevertheless, mutational work on these residues indicated that none of the corresponding mutations (T35V, A46S, S65N, L86V, R88K, S94A, H134F, K180N, L202V, A267S, T285A, I286L) seriously affected the catalytic activity of 3CLpro-CoV-2. This indicated that the non-conserved residues have no major roles in the enzymatic activity of 3CLpro-CoV-2 [4]. In addition, none of the mutations affected the overall structure of 3CLpro-CoV-2. Interestingly, a polar interaction via a hydrogen bond between the Thr285 residues of the two monomers was observed in SARS-CoV, but not in SARS-CoV-2 [40]. The absence of this residue in 3CLpro-CoV-2, nevertheless, has no effect on the dimerization or the activity of this protein.
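The identity figure quoted above can be checked against the list of substitutions with simple arithmetic. The following is a minimal sketch, assuming an ungapped alignment over the 306 residues mentioned in the text; the function and variable names are illustrative and do not come from any cited study.

```python
# Substitutions between SARS-CoV and SARS-CoV-2 3CLpro, as listed in the text.
SUBSTITUTIONS = ["T35V", "A46S", "S65N", "L86V", "R88K", "S94A",
                 "H134F", "K180N", "L202V", "A267S", "T285A", "I286L"]
TOTAL_RESIDUES = 306  # length of 3CLpro quoted in the text


def percent_identity(n_differences: int, length: int) -> float:
    """Percent of aligned positions that are identical (no gaps assumed)."""
    if length <= 0 or not 0 <= n_differences <= length:
        raise ValueError("invalid counts")
    return 100.0 * (length - n_differences) / length


def parse_substitution(code: str) -> tuple[str, int, str]:
    """Split a code such as 'T285A' into (original, position, replacement)."""
    return code[0], int(code[1:-1]), code[-1]


identity = percent_identity(len(SUBSTITUTIONS), TOTAL_RESIDUES)
positions = [parse_substitution(s)[1] for s in SUBSTITUTIONS]
print(f"{len(SUBSTITUTIONS)} differences -> {identity:.1f}% identity")
```

With 12 differences in 306 positions, the result rounds to the 96% reported from the alignment.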

Functions of 3CLpro
The 3CLpro, also known as nsp5, is one of the most conserved cysteine proteases, not only in structure but also in function, in all known CoVs [52]. It serves as the main protease for proteolytic processing of the two large replicase polyproteins pp1a and pp1ab (replicase 1ab, ∼790 kDa), and is indispensable for virus replication [5,[53][54][55]. The enzyme has a recognition cleavage sequence of Leu-Gln↓(Ser, Ala, Gly) (↓ marks the cleavage site), in which the Gln forms hydrogen bonds with residues in the S1 subsite and a small amino acid (Ser, Ala, or Gly) occupies the S1′ subsite [40]. The absolute dependence of the virus on the correct function of this protease, along with the absence of a homologous human protease, makes 3CLpro one of the most pursued targets for the development of specific protease inhibitors [33,56].
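The recognition rule described above (Leu-Gln followed by a small residue) can be expressed as a simple pattern search. The sketch below is only an illustration of that rule under stated assumptions: the peptide is a made-up test sequence, the function names are hypothetical, and real substrate recognition depends on more than these three positions.

```python
import re

# Leu-Gln followed by a small residue (Ser, Ala or Gly). The lookahead keeps
# the P1' residue out of the match, so the cut index falls right after Gln.
SITE_3CLPRO = re.compile(r"LQ(?=[SAG])")


def find_cleavage_sites(sequence: str) -> list[int]:
    """Return 0-based indices at which the chain would be cut (after Gln)."""
    return [m.end() for m in SITE_3CLPRO.finditer(sequence.upper())]


def cleave(sequence: str) -> list[str]:
    """Split the sequence into fragments at every candidate site."""
    bounds = [0, *find_cleavage_sites(sequence), len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical peptide: two LQ sites followed by S or A, and one LQ followed
# by Pro, which does not satisfy the rule and is left uncut.
peptide = "AAVLQSGFRAALQAKKLQPTT"
print(find_cleavage_sites(peptide))
print(cleave(peptide))
```

A candidate site found this way is only a motif match; whether it is actually processed would have to be confirmed experimentally.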
Prior to processing, 3CLpro is found within a >800 kDa precursor, which is processed into a 150 kDa intermediate comprising nsp4 to nsp10/11 [57,58]. The 3CLpro is first cleaved autocatalytically from the polyproteins (at its own N-terminal and C-terminal autoprocessing sites). According to the current model, two 3CLpro molecules anchored to membranes by the transmembrane proteins nsp4 and nsp6 form a dimer and initiate cleavage in trans [59][60][61][62]. Following its maturation cleavage, 3CLpro is believed to target the nsp9-10 site for processing before moving on to the nsp8-9 and nsp7-8 sites, respectively [63]. Once these sites are processed, the other nsps that 3CLpro is responsible for cleaving are released individually from the nsp7-10 region. Prior to the final processing of the nsps, one of the intermediate complexes, nsp7+8, plays a significant role in catalyzing the cleavage of nsp12, a critical viral polymerase. Accordingly, in studies involving mutation of the 3CLpro cleavage sites, disruption of the nsp7-nsp8 and nsp8-nsp9 cleavage sites resulted in loss of virus viability, whereas disruption of other sites, such as the nsp9-nsp10 site, was tolerated with reduced replication [64]. Thus, the ordered processing by 3CLpro may be a unique aspect of viral replication that inhibitors can disrupt.
In addition, 3CLpro interacts with a variety of replication complex components. Several studies have found significant intramolecular and intermolecular connections between the 3CLpro and the remainder of the replicase gene, with mutations in the 3CLpro domain as well as mutations in nsp3 and nsp10 all harming 3CLpro activity [58,65,66]. These findings strongly suggest that 3CLpro and other products of the replicase gene have crucial allosteric interactions. Furthermore, multiple temperature-sensitive mutations in the 3CLpro of the mouse hepatitis virus (MHV) and HCoV-OC43 were discovered, which selected for second-site compensatory alterations that were more than 15 mutations away from the initial mutation site [54,65]. These findings suggest that complex interactions across all three domains of the protease are essential for the structure and function of the enzyme. More research is needed to fully comprehend their role, as they could represent new avenues for proteolytic inhibition.

Structure of PLpro
The PLpro belongs to the cysteine protease clan CA and is assigned to family C16, which contains polyprotein endopeptidases from coronaviruses [27]. It exists as a monomer (about 300 residues) in biological settings and has the USP fold, typical of the ubiquitin-specific protease (USP) family in humans [67]. The sequence of SARS-CoV-2 PLpro is quite similar to that of SARS-CoV PLpro, as evidenced by an 83% sequence identity (Figure 6). In addition, the superimposition of the three-dimensional structures of both enzymes yields an R.M.S.D. value of 0.792 Å, showing that the proteins are structurally similar (Figure 7).
The PLpro-CoV-2 consists of four distinct domains (Figure 8) [68]. The first 60 residues form an independent N-terminal ubiquitin-like (Ubl) domain that is well separated from the other three domains (palm, thumb, and finger domains), which together form a C-terminal ubiquitin-specific protease (Usp) domain [69]. The PLpro Ubl domain has five β-strands, one α-helix, and one 3₁₀-helix [14]. It adopts a β-grasp fold similar to that of ubiquitin and the Ubl domains of several proteins, including ISG15, yeast yukD, elongin B, tubulin-binding cofactor B, and modifier protein hub 1 [70]. The function of this domain is not well understood, and some studies suggested that it has no effect on the function of PLpro [71]. However, according to Bosken et al. [67], the transposition of Ubl towards the thumb domain resulted in hydrophobic interactions between Pro59 of the Ubl domain and Pro77 and Thr75; Thr75 then interacts with Phe69 of the "ridge" helix and thus alters the conformation of the latter residue.
Meanwhile, the C-terminal region folds into a canonical thumb-palm-fingers-like structure with the ubiquitin domain anchored to the thumb [72]. In the "open hand" architecture of PLpro, the ubiquitin sits on the "palm" domain and is held in place by the zinc-binding "fingers" domain [73]. The "thumb" domain (residues 107-113, 162-168) is formed by four prominent helices (α4-7), and the palm (residues 269-279) is made up of a six-stranded β-sheet (β8-13) that slopes into the active site, which is housed in a solvent-exposed cleft between the thumb and palm domains [70]. A four-stranded, antiparallel β-sheet makes up the "fingers" domain, and within the fingertips region, four cysteine residues coordinate a zinc ion [74].
The Zn ion is labile and tetrahedrally coordinated by conserved Cys residues (Cys189−X−X−Cys192−Xn−Cys224−X−Cys226). It is essential for catalysis because it maintains the structural integrity of PLpro-CoV-2 [40,70,75]. Hence, mutation of the zinc-coordinating cysteines caused a significant loss of enzymatic activity, suggesting that the zinc-binding ability is essential for the enzymatic function [75].
The active site of PLpro consists of Cys111, His272, and Asp286, which form the catalytic triad that catalyzes peptide bond hydrolysis. The nucleophilic cysteine (Cys111) is situated at the foot (N-terminus) of α-helix α4 in the thumb domain. The pros (π) nitrogen atom of the basic histidine (His272) is positioned 3.7 Å from the side-chain sulphur atom of Cys111, allowing facile proton transfer [40]. His272 is located at the foot of the palm domain, adjacent to the flexible β-hairpin loop called the blocking loop two or BL2 (also called the G267-G272 loop) [75,76] or β-turn [77]. The BL2 is a flexible loop that can adopt an open or closed conformation [68]. One of the side-chain oxygen atoms of the catalytic aspartic acid (Asp286) is located 2.7 Å from the tele (τ) nitrogen of the catalytic histidine at the foot of the palm domain [78]. Thus, the proposed catalytic cycle involves Cys111 as the nucleophile, His272 as a general acid-base, and Asp286 paired with His272 to align it and promote deprotonation of Cys111 [40]. The catalytic domain is also flanked by numerous catalytically active enzymes, transmembrane domains, and domains of unknown function, and the entire nsp3 is localized to the ER membranes, where most of its domains reside in the cytosol of the cell [79,80]. Accordingly, the active site residues are located at the interface between the thumb and palm subdomains [81].
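The zinc-binding motif quoted above (Cys189−X−X−Cys192−Xn−Cys224−X−Cys226) can be cross-checked against its own residue numbering. The sketch below derives the spacer lengths from the four positions and turns them into a search pattern; the helper names and the alanine-filled test segment are hypothetical and serve only to illustrate the spacing.

```python
import re

# Residue numbers of the four zinc-coordinating cysteines, as given in the text.
ZINC_CYS = (189, 192, 224, 226)


def spacers(positions):
    """Count the residues lying strictly between consecutive positions."""
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def motif_regex(gaps):
    """Build a pattern: Cys, then each spacer of arbitrary residues plus Cys."""
    return re.compile("C" + "".join(f".{{{g}}}C" for g in gaps))


gaps = spacers(ZINC_CYS)
pattern = motif_regex(gaps)

# Hypothetical test segment: alanine filler with cysteines at the same spacing.
segment = "C" + "A" * gaps[0] + "C" + "A" * gaps[1] + "C" + "A" * gaps[2] + "C"
print(gaps, bool(pattern.fullmatch(segment)))  # [2, 31, 1] True
```

The spacers of two and one residues agree with the X−X and X elements of the motif, and the numbering implies n = 31 for the central Xn stretch.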

Functions of PLpro
PLpro, also known as the protease domain of nsp3, works together with 3CLpro to generate a functional replicase complex and enable viral spread [82,83]. The enzyme is located in nsp3 between the SARS unique domain (SUD/HVR) and a nucleic acid-binding domain (NAB) [14]. It is highly conserved and found in all coronaviruses [84]. The enzyme recognizes a consensus cleavage motif, LXGG (X = any amino acid, L = leucine, and G = glycine), which corresponds to the P4-P1 substrate positions of cysteine proteases and is present in the two large polyproteins pp1a and pp1ab. It cleaves the peptide bonds between nsp1 and nsp2 (LNGG↓AYTR), nsp2 and nsp3 (LKGG↓APTK), and nsp3 and nsp4 (LKGG↓KIVN), with no preference at the P1 position [14,40]. The cleavage thus releases the nsp1, nsp2, and nsp3 proteins, which assemble into a multifunctional, membrane-associated replicase complex on host membranes that initiates replication and transcription of the viral genome [75,83,85].
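The three junctions listed above make the LXGG rule easy to illustrate. The following minimal sketch splits each junction peptide after the motif; the sequences are those quoted in the text, while the function name and dictionary layout are illustrative assumptions.

```python
import re

# Consensus P4-P1 motif: Leu, any residue, Gly, Gly.
LXGG = re.compile(r"L[A-Z]GG")

# Junction sequences quoted in the text (cleavage occurs after the motif).
JUNCTIONS = {
    "nsp1/nsp2": "LNGGAYTR",
    "nsp2/nsp3": "LKGGAPTK",
    "nsp3/nsp4": "LKGGKIVN",
}


def split_at_lxgg(peptide: str):
    """Return (N-terminal part ending in LXGG, C-terminal part), or None."""
    match = LXGG.search(peptide.upper())
    if match is None:
        return None
    return peptide[:match.end()], peptide[match.end():]


for junction, peptide in JUNCTIONS.items():
    upstream, downstream = split_at_lxgg(peptide)
    print(f"{junction}: {upstream} | {downstream}")
```

Each junction is split after the second glycine, matching the ↓ positions given in the text.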
The PLpro also recognizes and hydrolyzes the cellular proteins ubiquitin [75] and the ubiquitin-like (Ubl) protein ISG15 (interferon-stimulated gene 15) [86,87], both of which bear the LXGG recognition motif at their C-terminus, and thereby removes them from host cell proteins [77]. Ubiquitin is a 76-amino-acid protein associated with the regulation of endocytic processes, the cellular response to DNA damage, and immunologic processes involving the activation of NF-κB signalling [81]. ISG15, a modifier induced upon viral infection, comprises two fused Ubl folds structurally resembling diubiquitin [88,89].
Due to the similar recognition motif, PLpro also possesses deubiquitinating and deISGylating capabilities, which interfere with critical signaling pathways leading to the expression of type I interferons [40,90]. This interference results in an antagonistic effect on the host innate immune response by inhibiting the production of the cytokines and chemokines that are responsible for activating the innate immune response against viral infection [78,85,91]. Additionally, PLpro suppresses NF-κB activation [86] and, subsequently, interferes with interferon-β production and suppresses the immune response [92]. Both ubiquitination and ISGylation play important roles in regulating innate immune responses to viral infection. Therefore, it is not surprising that multiple viruses have evolved different strategies to antagonize these pathways [93]. In conclusion, inhibition of PLpro activity can halt viral replication and disrupt its role in host immune response evasion, making it an excellent antiviral drug target.

Microorganisms as Sources of Inhibitors Targeting SARS-CoV2 Proteases
Microorganisms are abundant on the earth's surface, where they flourish in the biosphere and interact with other organisms. Together, they create an interactive network that constitutes the basis for life on our planet [94][95][96][97]. In recent decades, natural products have been a key source of drug discovery, accounting for 60% of the total market. Interestingly, natural microbial sources account for more than 40% of new drugs discovered since 1980 [98,99]. Until now, the global pharmaceutical industry has relied heavily on natural products from microbial sources to develop novel and effective therapeutics. Natural products derived from microbial sources are considered unique in their chemical diversity in comparison with plant-derived ones. This is due to the tremendous significance of bioactive substances acquired from microorganisms with various biological activities, including antiviral, antibacterial, anticancer, antifungal, and anti-inflammatory activities. Furthermore, these new substances and novel drugs are cost-effective and highly efficient [100,101]. Since the outbreak of SARS-CoV-2, researchers have studied various microorganism-based compounds that are promising inhibitors of the SARS-CoV-2 proteases (Table 1). Notably, the potency of these compounds is based on in silico studies against either 3CLpro-CoV2 or PLpro-CoV2. Unfortunately, to our knowledge, experimental evidence confirming the potency of these compounds against SARS-CoV-2 is so far not available. As shown in Table 1, the promising candidate inhibitors of 3CLpro-CoV2 or PLpro-CoV2 come predominantly from Aspergillus species. Antiviral activity of Aspergillus species against some viruses is widely reported [102][103][104][105]; nevertheless, no experimental evidence exists for the antiviral properties of this group against live SARS-CoV-2. Interestingly, Koehler et al. [106] reported that COVID-19 patients are more susceptible to invasive pulmonary aspergillosis (IPA). Reports of COVID-19-associated pulmonary aspergillosis have raised concerns that it worsens the disease course of COVID-19 and increases mortality. This implies that Aspergillus-derived compounds used for COVID-19 treatment should be in pure form, with no contamination by the producing cells (Aspergillus), in order to avoid the possibility of an IPA event during the delivery of the compounds.
Further, Table 2 shows the shortlisted microbial natural products with the most promising binding properties against SARS-CoV-2 proteases. These compounds were selected based on the best binding energy during the in silico screening. Notably, the compounds shown in Table 2 include not only secondary metabolites but also some bacteriocins, which belong to the primary metabolite group. While antibiotics are grouped under secondary metabolites, bacteriocins are ribosomally synthesized and produced during the primary phase of growth [107].

Table 2.
Compound | Method | Binding energy (kcal/mol) | Reference
11a-dehydroxyisoterreulactone A | Molecular docking using UCSF Chimera, molecular dynamics using Amber 18, computational prediction of the absorption, distribution, metabolism, and excretion (ADME) properties using SwissADME software | −8.9 | [110]
Aspergillide B1 | Molecular docking using OpenEye's FRED | −9.473 | [111]
Fonsecin | Molecular docking using AutoDock Vina, ADMET analysis using the pkCSM-pharmacokinetics server | −7.25 | [116]
Scedapin C | Molecular docking using UCSF Chimera, molecular dynamics using Amber 18, computational prediction of the ADME properties using SwissADME software | −10.9 | [110]
Norquinadoline A | Molecular docking using UCSF Chimera, molecular dynamics using Amber 18, computational prediction of the ADME properties using SwissADME software | −10.9 | [110]
Deoxycylindrospermopsin | … | … | …
… | … | … | [117]
Note: The structures of peptide-based compounds are displayed in their three-dimensional form.
Microorganisms 2021, 9
Overall, the most promising compounds (Table 2) span a wide range of binding energies, from −6.9 to −155.3 kcal/mol. By comparison, plant compounds that were screened against 3CLpro-CoV2 or PLpro-CoV2 have binding energies ranging from −4.7 to −10.0 kcal/mol (Table 3). This indicates that some microbial natural products are predicted to serve as better inhibitors than the plant compounds. These include citriquinochroman (−14.7 kcal/mol), scedapin C (−10.9 kcal/mol), norquinadoline A (−10.9 kcal/mol), tyrocidine A (−13.1 kcal/mol), gramicidin S (−11.4 kcal/mol), and bacteriocin glycocin F (−155.3 kcal/mol).
The Natural Products Atlas (www.npatlas.org, accessed on 26 June 2021) is a database of natural products that includes >20,000 compounds from bacteria and fungi and contains referenced data for structures, compound names, source organisms, isolation references, total syntheses, and instances of structural reassignment [122]. To discover natural ligands that could block the 3CLpro active site, Sayed et al. [108] screened this database in several stages and identified six possible anti-SARS-CoV-2 candidates. The top-scoring compounds in the study were citriquinochroman, holyrine B, proximicin C, pityriacitrin B, (+)-anthrabenzoxocinone, and penimethavone A. Citriquinochroman, which is found in the endophytic fungus Penicillium citrinum [123], was the best hit against 3CLpro-CoV2, showing the lowest binding energy (∆Gaverage = −12.4 kcal/mol) and fitting closely inside the active site of the crystallized enzyme. The compound anchored itself through a network of seven H-bonds with the reported key binding residues [34]. Despite the flexibility of the enzyme active site, citriquinochroman kept its orientation during the course of the simulation, with a transient drop in its binding affinity at 3-5.5 ns (∆GVina = −8.9 kcal/mol).
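The comparison between microbial and plant binding energies above can be reproduced with a few lines of Python. This is only an illustrative sketch: the dictionary holds values quoted in this section, the function name `stronger_than` is hypothetical, and the values come from different docking programs, so the ranking is indicative rather than quantitative. Glycocin F is left out as an assumption of this sketch, because its value is described as a HADDOCK score and may not be on the same scale as the other docking energies.

```python
# Binding energies (kcal/mol) as quoted in this section.
# More negative means stronger predicted binding.
reported = {
    "citriquinochroman": -14.7,
    "tyrocidine A": -13.1,
    "gramicidin S": -11.4,
    "scedapin C": -10.9,
    "norquinadoline A": -10.9,
    "aspergillide B1": -9.473,
    "11a-dehydroxyisoterreulactone A": -8.9,
    "deoxycylindrospermopsin": -8.6,
    "fonsecin": -7.25,
}

# Most favourable plant-compound value cited from Table 3.
PLANT_BEST = -10.0


def stronger_than(threshold, energies):
    """Return (name, energy) pairs more negative than threshold, strongest first."""
    hits = [(name, e) for name, e in energies.items() if e < threshold]
    return sorted(hits, key=lambda item: item[1])


better_than_plants = stronger_than(PLANT_BEST, reported)
for name, energy in better_than_plants:
    print(f"{name}: {energy} kcal/mol")
```

Only the compounds that the text singles out as outperforming the plant range pass the filter; the remaining microbial hits fall inside the plant-compound range.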
Another study by Rao et al. [109] took a computational approach, screening 100 small-molecule fungal metabolites from PubChem (https://pubchem.ncbi.nlm.nih.gov accessed on 26 June 2021) using molecular docking and dynamics simulation. Among the selected fungal metabolites, the study proposed that pyranonigrin A, a secondary metabolite from Aspergillus niger [124], has potent inhibitory potential against 3CLpro-CoV2. This metabolite could form seven hydrogen bonds, on par with N3 (the positive control compound in the study), and is also predicted to form a covalent bond with 3CLpro-CoV2, making it a promising candidate for a 3CLpro-CoV2 inhibitor with strong, permanent (irreversible) binding.
Quimque et al. [110] also demonstrated that 14 fungal secondary metabolites displayed a relatively high affinity for 3CLpro-CoV2, with binding energies ranging from −7.9 to −8.9 kcal/mol. Of these, the most notable inhibitor is 11a-dehydroxyisoterreulactone A, an anti-HSV metabolite from the fungus Aspergillus terreus [125], with a binding energy of −8.9 kcal/mol. In their in silico study using molecular docking and molecular dynamics simulation, 11a-dehydroxyisoterreulactone A was found to bind the catalytic residue His41 through pi-pi stacking with the pyranone ring fused in its polycyclic core. The enzyme-ligand complex is further stabilized by H-bonding and pi-alkyl interactions with its methoxyphenyl substituent.
Sterenin M, an isoprenylated depside that was first isolated in 2014 from a culture of the mushroom Stereum hirsutum [126], also interacted with the 3CLpro-CoV2 catalytic dyad, as reported by Prajapati et al. [112]. By screening more than 1800 chemically diverse and therapeutically important secondary metabolites available in the Medicinal Fungi Secondary Metabolites and Therapeutics (MeFSAT) database, their study found five fungal metabolites that have this interaction. However, only sterenin M forms hydrogen bonds with the key amino acids of 3CLpro-CoV2 (Gly143, His163, Phe140, and Glu166).
Other potential inhibitors that interacted with the catalytic dyad residues of 3CLpro-CoV2 are stigmasterol, chondrillasterol, and hexadecanoic acid, compounds that were isolated from crude extracts of Bacillus species [113]. They bind in the substrate-binding pocket of 3CLpro-CoV2. Among these top three docking hits, hexadecanoic acid was found to be the most promising anti-COVID-19 lead against the main protease after further evaluation using a 50 ns molecular dynamics simulation and MMPB-GBSA.
In addition to compounds from Bacillus species, metabolites from cyanobacteria also showed potential as 3CLpro-CoV2 inhibitors, as found in the study by Naidoo et al. [114]. Using an in silico molecular interaction-based approach, the metabolites cylindrospermopsin, deoxycylindrospermopsin, carrageenan, cryptophycin 52, eucapsitrione, tjipanazole, tolyporphin, and apratoxin A exhibited promising inhibitory potential against this protease. Among the 23 chemically diverse, biologically active metabolites from cyanobacteria, the cyanotoxin deoxycylindrospermopsin, originally isolated from Cylindrospermopsis sp., displayed the most promising binding efficiency (−8.6 ± 0.02 kcal/mol) and interacted with the important residues in the active binding pocket of 3CLpro-CoV2. These include Glu166, which is very important for the dimeric structure of this protease [42].
Apart from that, antimicrobial peptides (AMPs) produced by different organisms have also been considered as potential antiviral drugs against COVID-19. Many microorganisms produce these peptides as a component of their innate immune response against invading pathogens [115,127,128]. They have also been reported to exert a powerful antimicrobial effect on the membranes of different pathogenic microorganisms, including bacteria, fungi, and viruses [127,129]. Owing to this potential, Balmeh et al. [115] used the StraPep (http://isyslab.info/StraPep/ accessed on 15 July 2021) and PhytAMP (http://phytamp.hammamilab.org/main.php accessed on 15 July 2021) databases to identify candidate AMPs. Among the 500 bio-peptides analyzed, bacteriocin glycocin F from Lactobacillus plantarum showed the highest binding affinity towards 3CLpro-CoV2, with −155.3 ± 7.5 kcal/mol. This value is regarded as a high score in the HADDOCK molecular docking scoring system, thus showing the potential of bacteriocin glycocin F as a 3CLpro-CoV2 inhibitor.

Microbial Natural Products as a Potential Inhibitor of PLpro-CoV2
For millennia, medicinal fungi have been used to treat human illnesses in traditional remedies. Fungi are abundant in secondary metabolites, which provide a valuable and diverse chemical resource of natural products with potential bioactivity [112]. Fungal metabolites have been screened as potential inhibitors not only of 3CLpro-CoV2 [109] but also of PLpro-CoV2. The study in [116] retrieved 100 naturally occurring fungal secondary metabolites with an aromatic moiety. Its aim was to identify analogues of GRL0617, a naphthalene-based compound that can effectively inhibit PLpro-CoV2 [69]. Interestingly, six hits were found that can interact with the Tyr268 residue of PLpro-CoV2, and the lead fungal metabolite identified was fonsecin. Fonsecin is a naphthopyrone pigment isolated from a mutant of Aspergillus fonsecaeus [130] and had a binding energy on par with that of GRL0617. Thus, docking and molecular dynamics suggest that this fungal metabolite can inhibit PLpro-CoV2.
Quimque et al. [110] also explored secondary metabolites from fungi as potential drug prototypes against the SARS-CoV-2 virus. All the screened fungal secondary metabolites were selected for their profound antiviral activity against a range of known pathogenic viruses, such as the human immunodeficiency virus (HIV), influenza virus, herpes simplex virus (HSV), and hepatitis C virus (HCV). The study identified two fumiquinazoline marine alkaloids, scedapin C (−10.9 kcal/mol) and norquinadoline A (−10.9 kcal/mol), that exhibit strong in silico binding activity against PLpro-CoV2. Scedapin C was isolated from the marine-derived fungus Scedosporium apiospermum F41-1 and displayed significant antiviral activity against hepatitis C [131]. In the post-dock analysis, it was found bound to the putative binding site of PLpro-CoV2 through H-bonding between two ketones of the quinazolinedione core and Arg712. Norquinadoline A, on the other hand, was isolated from the mangrove-derived fungus Cladosporium sp. PJX-41 and showed activity against influenza A (H1N1) [132]. It was tightly nestled in the PLpro-CoV2 binding site, stabilized mostly by van der Waals forces.
Apart from that, a study by Bansal et al. [117] evaluated non-ribosomal peptides (NRPs) from the PubChem database as potential inhibitors of PLpro-CoV2. In molecular docking, 21 pharmacologically active NRPs from marine microbes showed strong interaction with the protease. Of the 21 screened ligands, the two peptides produced by Bacillus brevis [134,135], tyrocidine A and gramicidin S, showed the highest binding affinities for PLpro-CoV2, with −13.1 kcal/mol and −11.4 kcal/mol, respectively. Tyrocidine A exhibited hydrogen bonding with Lys105 and established π-alkyl interactions with Leu162 and Met208. It also showed π-cation interactions with the Lys157, Asp164, and Glu167 residues. Meanwhile, gramicidin S formed four hydrogen bonds with Lys157, Glu161, Lys105, and Asp108, and π-cation and anion interactions with the Lys157, Asp164, and Glu167 residues. The ligand also formed alkyl and π-alkyl interactions with the Met208 and Pro247 residues. The docking results thereby suggest that these peptides might be used to inhibit PLpro-CoV2.
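To make the overlap between the two peptides' contact patterns easier to see, the residue contacts listed above can be written as sets and intersected. This is a minimal sketch that only restates the contacts named in this paragraph; the `contacts` layout and the `residues` helper are hypothetical names introduced for illustration and are not taken from the cited study.

```python
# Residue contacts with PLpro-CoV2 as listed in the paragraph above.
contacts = {
    "tyrocidine A": {
        "hydrogen bond": {"Lys105"},
        "pi-alkyl": {"Leu162", "Met208"},
        "pi-cation": {"Lys157", "Asp164", "Glu167"},
    },
    "gramicidin S": {
        "hydrogen bond": {"Lys157", "Glu161", "Lys105", "Asp108"},
        "pi-cation/anion": {"Lys157", "Asp164", "Glu167"},
        "alkyl/pi-alkyl": {"Met208", "Pro247"},
    },
}


def residues(ligand):
    """All residues contacted by a ligand, regardless of interaction type."""
    return set().union(*contacts[ligand].values())


# Residues contacted by both peptides.
shared = sorted(residues("tyrocidine A") & residues("gramicidin S"))
print(shared)
```

The residues contacted by both ligands are candidates for the common part of the binding site, although confirming this would require the original docking poses.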

Possibility of Inhibitors Targeting HIV Protease for SARS-CoV-2 Proteases
Protease inhibition is also a strategy used in the discovery of drugs targeting the human immunodeficiency virus (HIV), the retrovirus that causes HIV infection and acquired immunodeficiency syndrome (HIV/AIDS). This virus is equipped with an aspartic protease whose active site contains the catalytic triad Asp-Thr-Gly (Asp25, Thr26, and Gly27). This protease is essential for virus replication: it hydrolyzes peptide bonds in the Gag-Pol polyproteins at nine specific sites, and the resulting subunits are processed into mature, fully functional proteins. These cleaved proteins, which include reverse transcriptase, integrase, and RNase H, are necessary for viral replication [136].
Notably, the genomic structures and proteases of SARS-CoV-2 and HIV differ significantly. Nevertheless, recent reports on the promising use of lopinavir/ritonavir, approved anti-HIV drugs, to inhibit SARS-CoV-2 [137,138] might indicate that repurposing HIV protease inhibitors against the SARS-CoV-2 proteases is feasible. Thran et al. [139] reported that lopinavir, darunavir, atazanavir, remdesivir, and tipranavir demonstrated promising inhibitory properties against 3CLpro-CoV2, with binding energies ranging from −8.4 to −9.2 kcal/mol. With the exception of remdesivir, these compounds are FDA-approved HIV protease inhibitors. Similarly, Raphael and Shanmughan [140] examined the HIV protease inhibitors atazanavir, darunavir, fosamprenavir, saquinavir, lopinavir, ritonavir, nelfinavir, and indinavir against 3CLpro-CoV2. The binding energies ranged from −6.4 to −9.0 kcal/mol. Bolcato et al. [141] also demonstrated that lopinavir, ritonavir, and nelfinavir were able to bind and lock the catalytic site of 3CLpro-CoV2. Experimental evidence from Mahdi et al. [142] showed that lopinavir, ritonavir, darunavir, saquinavir, and atazanavir could inhibit the viral protease in cell culture, albeit at concentrations much higher than their achievable plasma levels given their current drug formulations. It is interesting to note that, according to Table 2, some microbial natural products have better binding energies to 3CLpro-CoV2 than these FDA-approved HIV drugs.
In particular, aspergillide B1 (−9.47 kcal/mol), 3α-Hydroxy-3,5-dihydromonacolin L (−9.39 kcal/mol), citriquinochroman (−14.7 kcal/mol), scedapin C (−10.9 kcal/mol), norquinadoline A (−10.9 kcal/mol), tyrocidine A (−13.1 kcal/mol), gramicidin S (−11.4 kcal/mol), and bacteriocin glycocin F (−155.3 kcal/mol) have better binding energies than the reported HIV protease inhibitors. This implies that these microbial compounds possibly bind and inhibit the SARS-CoV-2 proteases better than the HIV protease inhibitors do. Notably, the cell culture work reported by Mahdi et al. [142] indicated that complete inhibition of the virus required a high concentration of HIV protease inhibitors, which limits the clinical potential of these inhibitors as SARS-CoV-2 drugs. The high concentration requirement is likely associated with the weak binding affinity of HIV protease inhibitors toward the SARS-CoV-2 proteases, particularly 3CLpro-CoV2. As some microbial compounds displayed better binding energies to 3CLpro-CoV2 than the HIV protease inhibitors, the microbial compounds might be able to inhibit 3CLpro-CoV2 at a lower concentration. The weak binding affinity of HIV protease inhibitors for 3CLpro-CoV2 is understandable, as these compounds were initially designed to block the active site of the HIV protease. Structurally, the HIV protease and 3CLpro-CoV2 are significantly different. In addition, the HIV protease is an aspartic protease, whereas 3CLpro-CoV2 is a cysteine protease. Accordingly, microbial natural products that are screened directly against 3CLpro-CoV2 might perform better than the HIV protease inhibitors. Notably, there is no study on the binding of HIV protease inhibitors to PLpro-CoV2, owing to the extreme structural differences between the two proteases.
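The link drawn above between a more negative binding energy and a lower effective concentration can be illustrated with the standard thermodynamic relation ΔG = RT ln(Kd). The sketch below is an illustration under loud assumptions, not part of the cited studies: it treats docking scores as if they were true binding free energies (which they only approximate), assumes a temperature of 298.15 K, and uses the hypothetical helper names `kd_from_dg` and `fold_tighter`.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T = 298.15            # assumed temperature in K


def kd_from_dg(dg_kcal_per_mol, temperature=T):
    """Dissociation constant (M) implied by dG = RT ln(Kd)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature))


def fold_tighter(dg_a, dg_b, temperature=T):
    """How many times lower the implied Kd of ligand a is than that of ligand b."""
    return kd_from_dg(dg_b, temperature) / kd_from_dg(dg_a, temperature)


hiv_best = -9.2    # most favourable HIV protease inhibitor value quoted from [139]
microbial = -9.47  # aspergillide B1 value quoted above

fold = fold_tighter(microbial, hiv_best)
print(f"Implied Kd is about {fold:.2f}-fold lower for the microbial compound")
```

Because the relation is exponential, a difference of a few tenths of a kcal/mol corresponds to less than a twofold change in the implied Kd, whereas a difference of several kcal/mol corresponds to orders of magnitude. Small gaps between docking scores should therefore be read with caution.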
Altogether, the use of microbial compounds is therefore advantageous, as these compounds are predicted to inhibit both SARS-CoV-2 proteases. Nevertheless, experimental confirmation of the inhibitory activities of microbial-based compounds against the SARS-CoV-2 proteases remains to be conducted. For this purpose, our previous success in the production of recombinant 3CLpro-CoV2 and PLpro-CoV2 [143] is advantageous for further compound testing.

Conclusions
With millions of confirmed cases worldwide, the SARS-CoV-2 pandemic has caused mounting mortality and economic devastation that have driven many people to despair. This is the third pandemic caused by CoVs in recent decades, after SARS and MERS, yet there are still no confirmed drugs for treatment. It remains challenging for researchers to formulate effective therapeutics, which take years of investigation and cost billions of dollars. Targeting 3CLpro and PLpro has become the aim of many studies seeking compounds that can act as promising inhibitors to stop viral replication. A comprehensive understanding of the structure and function of the SARS-CoV-2 proteases is undoubtedly essential as a platform for the discovery of promising inhibitors. Among many natural compounds, microorganism-based compounds have also been reported to exhibit inhibitory properties toward the SARS-CoV-2 proteases. As microorganisms are relatively easy to cultivate and harness to obtain these compounds, the use of microorganism-based compounds to target the SARS-CoV-2 proteases is advantageous. In addition, comparison with the HIV protease inhibitors indicated that microbial-based compounds apparently have better inhibition properties against the SARS-CoV-2 proteases. This should contribute to a speedy discovery of potential treatments for COVID-19.