Cocaprins, β-Trefoil Fold Inhibitors of Cysteine and Aspartic Proteases from Coprinopsis cinerea

We introduce a new family of fungal protease inhibitors with β-trefoil fold from the mushroom Coprinopsis cinerea, named cocaprins, which inhibit both cysteine and aspartic proteases. Two cocaprin-encoding genes are differentially expressed in fungal tissues. One is highly transcribed in vegetative mycelium and the other in the stipes of mature fruiting bodies. Cocaprins are small proteins (15 kDa) with acidic isoelectric points that form dimers. The three-dimensional structure of cocaprin 1 showed similarity to fungal β-trefoil lectins. Cocaprins inhibit plant C1 family cysteine proteases with Ki in the micromolar range, but do not inhibit the C13 family protease legumain, which distinguishes them from mycocypins. Cocaprins also inhibit the aspartic protease pepsin with Ki in the low micromolar range. Mutagenesis revealed that the β2-β3 loop is involved in the inhibition of cysteine proteases and that the inhibitory reactive sites for aspartic and cysteine proteases are located at different positions on the protein. Their biological function is thought to be the regulation of endogenous proteolytic activities or in defense against fungal antagonists. Cocaprins are the first characterized aspartic protease inhibitors with β-trefoil fold from fungi, and demonstrate the incredible plasticity of loop functionalization in fungal proteins with β-trefoil fold.


Introduction
Protease inhibitors are important regulators of proteolytic activity, which plays an important role in many physiological and pathological processes. Protease inhibitors from fungi show great versatility, including unique types of inhibitory mechanisms. Protease inhibitors with the β-trefoil fold are a well characterized group of fungal protease inhibitors [1,2]. The β-trefoil fold consists of 12 β-strands folded into structurally similar units related by a pseudo-3-fold symmetry. β-strands are connected by 11 loops of varying length and composition that constitute approximately half of the molecule. Fungal protease inhibitors with β-trefoil fold inhibit different classes of proteases [3,4]. Mycospins that inhibit S1 family serine proteases are substrate-like canonical inhibitors that utilize different inhibitory reactive loops [5,6], whereas mycocypins that inhibit C1 family cysteine proteases occlude the protease's active site by a distinct inhibitory mechanism [7]. Mycocypins, the cysteine protease inhibitors found only in higher fungi, have been described from

Identification, Expression and Homologs of Cocaprins
Differential gene expression analysis of C. cinerea strain AmutBmut identified genes Okayama 7 CC1G_05298 and Okayama 7 CC1G_05299 as highly expressed in stage 1 primordia and vegetative mycelium, respectively [18] (Figure 1A). The latter gene was also strongly induced upon challenge of AmutBmut vegetative mycelium with the fungivorous nematode Aphelenchus avenae [19]. Sequence-based structural analysis with the SMART web resource [22] predicted that the two encoded proteins, CCP1 (protein ID 485770) and CCP2 (protein ID 441209), respectively, and their paralog CCP3 (protein ID 441207), encoded by gene Okayama 7 CC1G_05297 [23], contain an 83 ± 5 residue ricin-type β-trefoil lectin-like domain. All three proteins lack a signal peptide for classical secretion and are thus predicted to be cytoplasmic. Remarkably, all three genes are arranged in tandem within 4 kb (981721 bp-985676 bp) on scaffold 19 of the AmutBmut genome (Figure 1A). Okayama 7 CC1G_05297 is highly expressed in both vegetative mycelium and stage 1 primordia [17] (Figure 1B). Thus, the three paralogous genes share the same genomic location but differ significantly in their regulation pattern.
Figure 1. (B) Mean RPKM (Reads Per Kilobase of transcript per Million mapped reads, n = 2) corresponding to 13 developmental stages in C. cinerea strain AmutBmut for ccp1, ccp2 and ccp3 [17]. Standard deviations are shown as error bars. My: Vegetative mycelium, Knot: Hyphal knots with vegetative mycelium, sPri: Small fruiting body primordia, 0 h Pri: Fruiting body primordia at 0 h, 12 h Pri: Fruiting body primordia at 12 h after the trigger light, 24
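For orientation, the expression values in Figure 1B are reported as RPKM, which follows a standard definition: reads assigned to a gene, divided by the gene length in kilobases and by the number of mapped reads in millions. The sketch below shows that calculation together with averaging over two replicates. The read counts, gene length and library sizes are hypothetical placeholders, not values from the study, and the function names are introduced here only for illustration.

```python
from statistics import mean, stdev

def rpkm(gene_reads, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = gene_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return gene_reads / (kilobases * millions)

def summarize(replicate_rpkms):
    """Mean and sample standard deviation across replicates (needs n >= 2)."""
    return mean(replicate_rpkms), stdev(replicate_rpkms)

# Hypothetical numbers for one gene at one developmental stage (n = 2).
rep1 = rpkm(gene_reads=500, gene_length_bp=2_000, total_mapped_reads=10_000_000)
rep2 = rpkm(gene_reads=660, gene_length_bp=2_000, total_mapped_reads=12_000_000)
avg, sd = summarize([rep1, rep2])
print(f"mean RPKM = {avg:.2f}, SD = {sd:.2f}")
```

Because RPKM normalizes for both gene length and sequencing depth, values for ccp1, ccp2 and ccp3 can be compared across stages even when library sizes differ.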

Biochemical Characterization
Cocaprins were expressed in bacteria using the pET24 vector and E. coli BL21(DE3). After solubilization of inclusion bodies in 8 M urea, cocaprins were purified by size-exclusion and metal-affinity chromatography. Recombinant CCP1 and CCP2 resolved on SDS-PAGE under reducing conditions as a single 21 kDa and 20 kDa band, respectively (Figure 2A). [...] (Figure 2B), indicating dimer formation under these conditions. The experimentally determined isoelectric point of the His-tagged cocaprins (Figure 2C) (at pH 4.2 for CCP1 and at pH 5.3 for CCP2) also differed from the theoretical one (at pH 4.8 for both). The inaccurate determination of molecular weight using SDS-PAGE has also

Crystal Structure of Cocaprin 1
CCP1 crystallized in the P2₁ space group with two molecules in the asymmetric unit (Table 1). CCP1 has a β-trefoil fold consisting only of β-strands. The fold resembles a tree, with a six-stranded β-barrel as the stem and three additional pairs of β-strands with their connecting loops as the crown (Figure 3).
CCP1 is structurally very similar to MpL, the Macrolepiota procera ricin-B-like lectin (RMSD of 0.75 Å for 134 aligned residues), CNL, the Clitocybe nebularis ricin-B-like lectin (RMSD of 1.23 Å for 129 aligned residues), and designed symmetric trefoil proteins (PDB IDs 3PG0 and 4F43) with RMSD in the range of 1.3-1.5 Å. Similarity to other β-trefoil protease inhibitors from higher fungi is lower, for example macrocypin (PDB ID 3H6Q, RMSD 1.9 Å), clitocypin (PDB ID 3H6R, RMSD 2.0 Å), and cospin, a proteinase inhibitor from Coprinopsis cinerea (PDB ID 3N0K, RMSD 2.1 Å) (Figure 4).
We also attempted to crystallize CCP2, but expression levels were much lower and the protein tended to aggregate. Despite extensive crystallization attempts, we were unable to obtain useful crystals, so we calculated the AlphaFold2 model [26]. As expected, the AlphaFold2 model of CCP2 shows high similarity to the crystal structure of CCP1 (RMSD of 0.6 Å for 133 aligned Cα atoms) with only minor differences, most of which are in loop regions, despite a relatively large sequence difference (62.8% sequence identity). Overall, CCP1 and CCP2 differ in 50 residues. The differences are evenly distributed throughout the sequence (Supplementary Figure S1). The largest differences are found in the loop regions, where 29 of 74 loop residues are different. Smaller differences are observed in the β-strands, where 21 of 65 residues are different. Overall, differences in buried core residues are generally conservative, mostly between small hydrophobic residues (A, I, V, L), whereas differences in surface-exposed residues are generally much larger (charge differences, replacement of hydrophilic by charged residues, etc.).
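The loop versus strand comparison can be made explicit as fractions of differing residues. The short calculation below uses only the counts quoted in the text; the variable and function names are illustrative.

```python
# Residue counts quoted in the text for the CCP1/CCP2 comparison:
# region -> (number of differing residues, number of residues in region)
differences = {"loops": (29, 74), "beta-strands": (21, 65)}

def percent_different(n_diff, n_total):
    """Percentage of positions in a region that differ between the two proteins."""
    return 100.0 * n_diff / n_total

for region, (n_diff, n_total) in differences.items():
    pct = percent_different(n_diff, n_total)
    print(f"{region}: {n_diff}/{n_total} positions differ ({pct:.1f}%)")

# Totals across both regions, as a consistency check on the quoted 50 differences.
total_diff = sum(n_diff for n_diff, _ in differences.values())
total_residues = sum(n_total for _, n_total in differences.values())
print(f"all regions: {total_diff}/{total_residues} positions differ")
```

By this count, roughly 39% of loop positions and 32% of β-strand positions differ, so the loops are the more variable regions even though differences occur throughout the sequence.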

Cocaprins Are Cysteine Protease Inhibitors
Cocaprins inhibit plant cysteine proteases belonging to the C1 family, papain and ficain, with Ki in the low micromolar range (Table 2). They do not inhibit the cysteine protease legumain from common bean, which belongs to family C13. Cocaprins did not inhibit the human cysteine proteases cathepsins L and H. Furthermore, they showed no inhibition of serine proteases belonging to families S1 or S8.
Table 2. Inhibitory pattern of cocaprins. Equilibrium constants (Ki) for the inhibition of papain, ficain and pepsin were determined according to Henderson [27]. IC50 values are marked with asterisks on both sides and indicated for rennin and APR1. Experiments were performed at 30 °C. S.D. are given where appropriate; NI, no inhibition.

Cocaprins Are Aspartic Protease Inhibitors
Cocaprins inhibit the aspartic protease pepsin with Ki in the low micromolar range (Table 2) and the aspartic protease rennin with IC50 values of 44.5 µM (CCP1) and 20.5 µM (CCP2). Because these proteases derive from animals and their inhibition may not be biologically relevant, we also tested a fungal aspartic protease, rhizopuspepsin, which cocaprins did not inhibit. Suggestive of a function in defense, cocaprins specifically inhibited APR1, one of the digestive aspartic proteases from the parasitic nematode Haemonchus contortus, but did not inhibit PEP1, a similar digestive protease from the same organism.
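As an illustration of how a Ki can be extracted for a tight-binding inhibitor, the sketch below applies the linearized relation commonly attributed to Henderson, [I]t/(1 - a) = [E]t + Ki,app/a, where a = vi/v0 is the residual fractional activity. Plotting [I]t/(1 - a) against 1/a then gives a line with slope Ki,app and intercept [E]t. This is a minimal sketch under stated assumptions: the source says only that the constants were determined according to Henderson [27], so the fitting procedure, the synthetic data, the concentrations and all function names here are assumptions made for illustration.

```python
import math

def fractional_activity(e_total, i_total, ki_app):
    """Residual activity vi/v0 for reversible 1:1 binding (quadratic solution)."""
    s = e_total + i_total + ki_app
    bound = (s - math.sqrt(s * s - 4.0 * e_total * i_total)) / 2.0  # [EI]
    return 1.0 - bound / e_total

def henderson_fit(i_totals, activities):
    """Least-squares line through (1/a, [I]t/(1-a)); returns (Ki_app, E_total)."""
    xs = [1.0 / a for a in activities]
    ys = [i / (1.0 - a) for i, a in zip(i_totals, activities)]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic, noise-free data (all concentrations in µM; values are hypothetical).
true_e, true_ki = 0.05, 1.5
inhibitor = [0.5, 1.0, 2.0, 4.0, 8.0]
activity = [fractional_activity(true_e, i, true_ki) for i in inhibitor]

ki_app, e_est = henderson_fit(inhibitor, activity)
print(f"Ki,app = {ki_app:.3f} µM, [E]t = {e_est:.3f} µM")
```

With real measurements the points scatter around the line, and for a competitive inhibitor the apparent constant would still need correcting for substrate concentration (Ki = Ki,app / (1 + [S]/Km)).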

Aspartic and Cysteine Proteases Are Not Inhibited through the Same Inhibitory Reactive Site
Based on the crystal structure, we designed mutations in the surface-exposed loops that were of sufficient length to be able to inhibit aspartic and cysteine proteases. Namely, we produced the G13E, N22R, FH32EE, and D47R mutants of CCP1, which were expressed in the same bacterial expression system (Supplementary Figure S2) as inclusion bodies in very low yield (approximately 2 to 5 mg/L), except for CCP1 FH32EE, which was expressed as a soluble protein in a higher yield (32 mg/L). Their correct folding and functionality were confirmed by measuring the CD spectra (Supplementary Figure S3) and inhibition of the target peptidases. Equilibrium constants were determined for the inhibition of papain and pepsin (Table 3), which indicated that the β2-β3 loop containing the N22R mutation was involved in papain inhibition because this mutant had an approximately ten-fold higher Ki value. Since the same mutant had a Ki value for pepsin inhibition comparable to that of the wild type CCP1, this suggests that the inhibitory reactive sites for aspartic and cysteine proteases are located at different positions on the protein.
Table 3. Inhibition constants of papain and pepsin by CCP1 mutants. Equilibrium constants for the inhibition of papain and pepsin were determined according to Henderson [27]. Experiments were performed at 30 °C. S.D. are given where appropriate; NI, no inhibition.

Cocaprins Inhibit the Activity of Peptidases from C. cinerea Fruiting Bodies
To investigate whether cocaprins play an endogenous regulatory role, we analyzed the inhibition of proteolytic activities in C. cinerea fruiting bodies. Fruiting bodies are rich in proteolytic activity [28], and different types of proteolytic activity were also detected in gel filtration fractions of C. cinerea fruiting body extract (Supplementary Figure S4). Those cleaving the substrate Boc-Gly-Arg-Arg-MCA, indicating C1 family cysteine peptidase activity (papain-like) or S1 family serine peptidase activity (trypsin-like), were strongly inhibited by CCP2 and weakly inhibited by CCP1. In contrast, cleavage of the substrate Suc-Ala-Ala-Pro-Phe-MCA, indicative of S1 (chymotrypsin-like) or S8 (subtilisin-like) serine peptidases, was unaffected by either. The degradation of azocasein, indicative of general proteolytic activity in the same fractions, was only very weakly inhibited by both cocaprins, suggesting that cocaprins target specific proteases. In addition, strong inhibition of endogenous aspartic peptidases that cleave the substrate MocAc-Ala-Pro-Ala-Lys-Phe-Phe-Arg-Leu-Lys-DnpNH2 at pH 4 was observed for both CCP1 and CCP2 (Supplementary Figure S5).

Are Cocaprins Lectins?
Based on the sequence and structural similarity of cocaprins to fungal lectins MpL and CNL ( Figures 3B and 5), we used glycan microarray analysis to test the possibility of carbohydrate binding by cocaprins (Supplementary Table S1 and Supplementary Figure S6). For CCP1, very weak binding was observed on a mammalian glycan array to structures including LacNAc or polyLacNAc and for CCP2 the binding was even weaker. This indicates a potential for glycan-binding activity in cocaprins.

Discussion
Differential expression of genes encoding cytoplasmic lectins and protease inhibitors in vegetative mycelium and fruiting bodies is an economic strategy fungi use to defend different developmental stages from stage-specific predators and nutrient competitors [29]. The tandemly arranged genes encoding the paralogous CCP1, CCP2, and CCP3 proteins differ in their developmental regulation: ccp3 is expressed in both the vegetative mycelium and stage 1 primordia, whereas ccp1 is expressed only in vegetative mycelium and ccp2 in fruiting bodies and upon induction with fungivorous nematodes [19].
We hypothesize that their differential regulation during development may be due to differences in the specificity of the three inhibitors toward endogenous or exogenous proteases, as suggested by the data for CCP1 and CCP2.
The mode of inhibition of cysteine proteases by β-trefoil inhibitors has been structurally well characterised [4,7]. In the crystal structure of clitocypin, a cysteine protease inhibitor from C. nebularis, in complex with the papain-like cysteine protease cathepsin V, two clitocypin loops occlude the catalytic cysteine residue and prevent substrate binding to the active site. Based on this knowledge, it is reasonable to expect that other β-trefoil cysteine protease inhibitors follow the same mechanism. Because of the low sequence similarity between these inhibitors, one cannot rely solely on sequence conservation but must examine the structural motifs in the different loops and the plausibility of the interaction with the target proteases. Therefore, we engineered several mutants in the surface-exposed loops that are long enough to inhibit cysteine proteases by occluding the active site. Only the N22R mutation in CCP1 significantly reduced its ability to inhibit cysteine proteases, suggesting that the β2-β3 loop is critical for inhibition, which differs from the situation in clitocypin. In clitocypin, the β1-β2 and β3-β4 loops were shown to occlude the active site, with two Gly-Gly residues playing an important role. In contrast, the CCP1 mutation in a corresponding Gly had no effect on papain inhibition. Another difference between cocaprins and mycocypins is that, unlike mycocypins, cocaprins do not inhibit asparaginyl endopeptidase (AEP, legumain), a C13 family cysteine protease. These differences suggest that cocaprins are not mycocypin-like fungal cysteine protease inhibitors, but a distinct family of dual-headed protease inhibitors that target cysteine and aspartic proteases.
To the best of our knowledge, the structure of cocaprin 1 is the second crystal structure of an aspartic protease inhibitor with β-trefoil fold, after that of Potato Cathepsin D Inhibitor (PDI). Based on the loop length and docking experiments, the authors of the PDI structure speculated that the 17-residue-long proline-rich loop between residues 142 and 159 connecting β-strands β9 and β10 in PDI might be involved in the inhibition of cathepsin D; however, they did not confirm their hypothesis. This loop is also absent in their structure, presumably due to its flexibility and disorder [11]. The corresponding loop in CCP1 is much shorter and consists of only five residues (Val99-Ala103). It makes a rather tight turn between β-strands β9 and β10 and does not protrude from the globular core of the protein, so it is unlikely to be involved in aspartic protease inhibition. Another characterized aspartic protease inhibitor with β-trefoil fold is the β-trefoil aspartic protease inhibitor from tomato (SLAPI). Although crystal structures are not available, a model has been published, and based on this model, the corresponding β9-β10 loop is even longer than in PDI, at 20 residues [12,13].
β-Trefoil proteins are extremely versatile. It has long been known that they can serve many biological functions and act as inhibitors, lectins, growth factors, etc. [30]. Recently, our groups have characterized several new β-trefoil proteins from higher fungi that exhibit either protease inhibitory or glycan binding activities, which has expanded our understanding of β-trefoil fold plasticity in protease inhibitors and lectins [4,5,21]. Despite very low sequence homology, these proteins share a common feature, namely a very stable core decorated with versatile loops. These loops provide different surface topologies for protein-protein and protein-glycan interactions, and distinct differences in amino acid sequences provide different strengths and specificities of interactions for different targets. The positions of the reactive loops in the different β-trefoil proteins appear to be random, which suggests that the position of a loop is not important for activity. What is likely important is the sequence and length of the loops, which provide the required topology. Similarly, chemical specificity, which is determined by multivalent interactions of amino acids across an intrinsically disordered protein region, engages in essential molecular functions [31].
We show indications that cocaprins may have lectin activity in addition to protease inhibition. Glycan microarray analysis suggested a possible function of cocaprins in glycan binding, but the suggested glycan binding specificity could not be conclusively confirmed by additional methods, including affinity chromatography and lectin-blot. Further experiments are needed to confirm or reject this possibility.
The biological function of cocaprins is unknown. Based on previous research on β-trefoil protease inhibitors and lectins, we investigated the possible endogenous role in the regulation of proteolytic activities, focusing on the activities in the fruiting bodies and the possible role in the defense against antagonists. Regarding the former function, we detected partial inhibition of endogenous proteolytic activities, probably belonging to the cysteine and aspartic proteases (Supplementary Figure S5). Interestingly, CCP2, which is mainly expressed in fruiting bodies, inhibited cysteine protease-like activities more strongly, whereas CCP1, which is mainly expressed in mycelium, showed weaker inhibitory activity. However, both cocaprins inhibit aspartic-protease-like activities present in fruiting body extract to a similar extent. Regarding a potential role in defense, it is noteworthy that CCP2 expression was induced upon challenge with a fungivorous nematode [19]. However, no toxicity of the protein was detected against nematodes (Caenorhabditis elegans, Caenorhabditis tropicalis, Pristionchus pacificus) or dipteran insect larvae (mosquito Aedes aegypti) (Supplementary Figure S7), all of which have been shown to be targeted by other β-trefoil protease inhibitors and lectins [10,32]. We also tested the cytotoxicity of cocaprins against several mammalian cell lines, including CaCo2, HeLa, Jurkat, human microglia (Hµglia), and U937, and cocaprins showed no cytotoxicity except for limited cytotoxicity to the CaCo2 cell line (Supplementary Figure S8). This suggests a possible endogenous developmental role for cocaprins, but a defense function against other potential fungal antagonists remains possible.

Cloning, Heterologous Expression and Purification of Recombinant Cocaprins
Total RNA extraction and cDNA synthesis of CC1G_05299 and CC1G_05298 from C. cinerea AmutBmut stage 1 primordia and vegetative mycelium were performed as described previously [18]. In brief, cDNA was synthesized using cDNA transcriptor universal (Roche Life Science/Merck Millipore, Burlington, MA, USA) from 2 µg of total RNA following the manufacturer's instructions. The His8-tagged version of CC1G_05299 was amplified from cDNA derived from C. cinerea AmutBmut vegetative mycelium, using primers CC1G_05299 Fw8HisNdeI and CC1G_05299 RvNotI (Supplementary Table S2). The PCR product was cloned into pGEM-T-easy vector (Promega, Madison, WI, USA), which was then used to transform chemocompetent E. coli DH5α. Using the NdeI and NotI restriction sites, a mutation-free insert was cloned into the pET24b expression vector (Novagen/Merck Millipore, Burlington, MA, USA).
The His8-tagged version of CC1G_05298 was amplified from cDNA derived from C. cinerea AmutBmut stage 1 primordia, using the primer pair CC1G_05298 FwNdeI and CC1G_05298 Rv8HisBamHI (Table S1). The PCR product was cloned into the pET24b expression vector (Novagen) using the restriction sites NdeI and BamHI. Correct insertion was verified by sequencing.
For protein expression, the plasmids were transformed into E. coli BL21 (DE3). To test protein expression and solubility, transformants were cultivated in LB medium containing 50 µg/mL kanamycin to an OD600 = 0.7 and expression was induced with 1 mM isopropyl β-D-thiogalactopyranoside (IPTG) at 24 °C for 16 h. Solubility of the proteins was assayed as previously described [35].
In order to produce recombinant cocaprins, bacteria were collected by centrifugation (15 min, 6000 × g, 4 °C) and sonicated in lysis buffer (50 mM Tris-HCl, 2 mM EDTA, 0.1% Triton X-100, pH 7.5). After washing the pellet with lysis buffer, the inclusion bodies were solubilized in lysis buffer containing 8 M urea. Following gel filtration on Sephacryl S200 in 20 mM Tris-HCl, 0.3 M NaCl, pH 7.5, cocaprins were purified using metal affinity chromatography with TALON® Metal Affinity Resin following the protocols recommended by the manufacturer (Clontech/Takara Bio Inc., Kusatsu, Japan).

Mutagenesis
Mutants of CCP1 (CC1G_05299) D47R, N22R, FH32EE and G13E were produced by PCR site-directed mutagenesis (primers used are listed in Supplementary Table S2) using the appropriate pET vectors as templates followed by digestion with DpnI (Fermentas, St. Leon-Rot, Germany) and recovery of the vectors containing mutated inserts [36]. Their expression and purification were the same as for the wild type cocaprins.

SDS-PAGE, Native-PAGE, and Isoelectric Focusing
The proteins were routinely analyzed on 15% polyacrylamide gels under denaturing and reducing conditions, and visualized using Coomassie brilliant blue staining or silver staining. Low molecular weight markers of 14.4 kDa to 97 kDa (GE Healthcare Life Sciences, Buckinghamshire, England) were used for molecular mass estimations. The proteins were analyzed under non-denaturing conditions using blue native PAGE with a Novex NativePAGE Bis-Tris gel system with 4% to 16% gradient protein gels (ThermoFisher Scientific, Waltham, MA, USA), according to the manufacturer's instructions. NativeMark unstained protein standards (ThermoFisher Scientific) were used for the molecular mass estimations. Isoelectric focusing was carried out in precast Novex pH 3-10 IEF protein gels (ThermoFisher Scientific) following the manufacturer's instructions. Marker proteins with pI values from 3.5 to 9.3 were used for calibration (GE Healthcare).
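Molecular mass estimation from marker proteins is conventionally done by fitting log10(mass) against relative migration (Rf) and interpolating the sample band. A minimal sketch of that calibration follows. The Rf values are invented for illustration, and the intermediate marker masses are assumed rather than taken from the source, which gives only the 14.4 kDa to 97 kDa range; the function names are likewise hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_mass_kda(markers, sample_rf):
    """Interpolate a band's mass from (Rf, mass in kDa) marker pairs."""
    rfs = [rf for rf, _ in markers]
    log_masses = [math.log10(mass) for _, mass in markers]
    slope, intercept = fit_line(rfs, log_masses)
    return 10 ** (slope * sample_rf + intercept)

# Assumed marker masses (kDa) with made-up relative migration values.
markers = [(0.10, 97.0), (0.25, 66.0), (0.42, 45.0),
           (0.60, 30.0), (0.78, 20.1), (0.92, 14.4)]

# Hypothetical sample band position.
print(f"estimated mass: {estimate_mass_kda(markers, 0.76):.1f} kDa")
```

The log-linear relationship is only approximate, and proteins with unusual charge or composition can migrate anomalously, so such estimates may deviate from the sequence-derived mass.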

Structure Solution and Refinement
Cocaprin 1 was concentrated in 10 mM Tris-HCl, 100 mM NaCl buffer, pH 7.5, to 20 mg/mL. Crystals were grown in 0.1 M HEPES, 1.3 M trisodium citrate, pH 7.5. The data set was collected at the BM14 beamline (ESRF, Grenoble, France) to 1.7 Å resolution. The structure was solved by molecular replacement using Phaser [37] with the polyalanine chain of MOA, a lectin from Marasmius oreades (PDB ID 2IHO) [38], as a search model. The whole structure was built using ArpWarp [39], combined with manual inspection and corrections in Coot [40]. The structure was finally refined with Refmac [41] and deposited in the PDB with the PDB ID 7ZNX.

Glycan Microarray Analysis
Tests on the mammalian printed glycan array (version 5.2) (http://www.functionalglycomics.org/static/consortium/resources/resourcecoreh8.shtml, accessed 14 January 2022) with 609 glycans were conducted by the Consortium for Functional Glycomics (Protein-Glycan Interaction Core, formerly Core H) as described previously [24,42]. Recombinant cocaprins were biotinylated using No-Weight™ NHS-PEO4-Biotin (Pierce, Rockford, IL, USA) in accordance with the manufacturer's instructions. Binding of cocaprins (at 200 µg/mL) to the array was detected by streptavidin Alexa Fluor 488 conjugate. The highest and the lowest results from each set of replicates were removed to eliminate false hits, and average binding (rank) was calculated.
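The replicate handling described above amounts to a trimmed mean: the single highest and lowest readings for each glycan are dropped and the remainder averaged, after which glycans can be ranked by that average. A minimal sketch, using made-up fluorescence values and an assumed six replicate spots per glycan (the replicate count is not stated in the source; glycan labels and function names are hypothetical):

```python
def trimmed_mean(replicates):
    """Average after discarding the single highest and lowest readings."""
    if len(replicates) < 3:
        raise ValueError("need at least three replicates to trim both extremes")
    kept = sorted(replicates)[1:-1]
    return sum(kept) / len(kept)

# Made-up relative fluorescence units for three hypothetical glycans.
signals = {
    "glycan_A": [10, 200, 12, 11, 9, 13],   # one outlier spot (200)
    "glycan_B": [55, 60, 58, 57, 61, 2],    # one failed spot (2)
    "glycan_C": [5, 6, 4, 5, 7, 5],
}

averages = {name: trimmed_mean(values) for name, values in signals.items()}

# Rank glycans from strongest to weakest average binding.
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
for rank, (name, avg) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {avg:.1f}")
```

Trimming both extremes keeps a single aberrant spot from producing a false hit or masking a real one, at the cost of using fewer replicates per glycan.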

Conclusions
The characterization of cocaprins as a new family of unique protease inhibitors from higher fungi highlights the incredible reservoir of β-trefoil fold diversity in fungi. Cocaprins are the first example of β-trefoil aspartic protease inhibitors from higher fungi and provide valuable information about this elusive group of proteins. Indeed, only a handful of aspartic protease inhibitors have been characterized to date, whereas there are many aspartic proteases known to be involved in pathogenic processes [43], warranting further efforts in the search and characterization of new aspartic protease inhibitors.