Advance in Heterologous Expression of Biomass-Degrading Auxiliary Activity 10 Family of Lytic Polysaccharide Monooxygenases

Abstract: AA10 family lytic polysaccharide monooxygenases (AA10 LPMOs) are mainly distributed in bacteria. Because they oxidatively degrade crystalline polysaccharides such as cellulose and chitin, they have great application potential in industrial biomass conversion and have attracted wide attention. Efficient heterologous expression of LPMOs by recombinant engineered bacteria has become the main strategy for the industrial production of these enzymes. This paper reviews research progress on heterologous expression systems for AA10 LPMOs. Strategies for constructing diverse heterologous expression systems are introduced in terms of the design and processing of the expression host, the vector, and the LPMO gene. The effects of different expression systems on the soluble expression of LPMOs and future directions for the construction of LPMO heterologous expression systems are discussed, together with the application prospects of LPMOs in the biomass conversion and biofuel industries.


Introduction
Lytic polysaccharide monooxygenases (LPMOs) are copper-dependent monooxygenases that catalyze the oxidative degradation of polysaccharides, such as cellulose and chitin. This characteristic makes them a key tool in industrial biomass conversion processes [1]. LPMOs were initially classified as glycoside hydrolase family 61 (GH61) and carbohydrate-binding module family 33 (CBM33). However, as new LPMO crystal structures were solved and their functions verified, it was discovered that LPMOs lack the typical active site architecture found in conventional cellulases, including channels, grooves, or clefts. This indicates that LPMOs cleave substrate bonds by a different mechanism from glycoside hydrolases and therefore do not belong to the glycoside hydrolase families. The ability of LPMOs to bind to polysaccharide crystals is based on a relatively flat surface in their structure that holds the active site. This surface is used to bind cellulose molecules and metal ions. The substrate-binding surface contains conserved amino acids with hydrophilic side chains, which interact with the substrate through hydrogen bonding [2,3]. Studies have shown that LPMOs enhance the accessibility of substrates for cellulases by oxidatively cleaving recalcitrant polysaccharide structures, providing more binding sites for glycoside hydrolases. This promotes the degradation of these substrates by cellulase systems [4,5]. Therefore, in 2013, LPMOs were included in the Auxiliary Activity (AA) families of the Carbohydrate-Active enZyme (CAZy) database [6]. Consequently, both the GH61 and CBM33 families are now referred to as auxiliary activity families.
Based on their amino acid sequences and functional similarities, LPMOs are categorized into the AA9-AA11 and AA13-AA16 auxiliary activity families in the CAZy database [7]. Among these families, AA10 LPMOs are mainly found in the phyla Proteobacteria, Firmicutes, and Actinobacteria. They are also present in a small number of viruses, fungi, insects, and archaea [6]. To date, the CAZy database has recorded a total of 9672 AA10 LPMOs, with 9113 proteins originating from bacteria, 279 from viruses, 13 from fungi, 5 from archaea, and 262 remaining unidentified. AA10 LPMOs exhibit a wide range of substrate specificity, acting not only on cellulose and chitin but also oxidatively degrading some hemicelluloses, such as xylan and mannan [8-10]. Additionally, certain LPMOs show some capacity to degrade natural biomass substrates, such as corn husks and straw [9,11,12]. Extracting second-generation biofuels from non-edible biomass is considered a key step in establishing a sustainable bioeconomy, and the properties of AA10 LPMOs make them important for biorefining applications.
Members of different LPMO families exhibit low sequence homology, but their structures show significant similarity (Figure 1) [13]. LPMOs have a slightly distorted fibronectin/immunoglobulin-like β-sandwich core structure composed of two β-sheets comprising seven or eight β-strands. The extended flat surface can be expanded by α-helical loops, allowing LPMOs to bind to the surface of crystalline polysaccharides and cleave glycosidic bonds through an oxidative mechanism. The solvent-exposed active site consists of two completely conserved histidine residues, one of which is at the N-terminus. The arrangement of the two histidine side chains and the N-terminal amino group coordinates the copper ion in a T-shaped geometry known as the histidine brace [14]. Upon substrate binding, LPMOs first accept electrons from an electron donor and transfer them through an electron transfer chain to the Cu(II) in the active site, reducing it to Cu(I) [15]. Despite significant progress in LPMO research in recent years, many questions remain regarding the detailed catalytic mechanisms of LPMOs and how LPMOs can maximize their effectiveness in the sustainable utilization of biomass feedstocks. These questions include, for instance, whether individual LPMOs have multiple catalytic pathways, whether various catalytic intermediates can be observed spectroscopically, how the pathways and rates of electron transfer occur, and which electron sources are most efficient [13].

Figure 1. (A) The β-sheet fold is shown in purple, the loop 2 region in cyan, the disulfide bond in pink, and the copper atom as a sphere. (B) The active center is shown as sticks; the coordinating His residues are silver, and the copper atom is the deep blue sphere.
Furthermore, structural domain analysis of characterized LPMOs has revealed that, in the AA10 family, besides the catalytic domain, they often contain CBMs [16]. It has been reported that CBMs can bring LPMOs to the substrate, increase the concentration of LPMOs around the substrate, and support the digestion of the substrate by the catalytic module [17]. Moreover, CBMs have a wide range of substrate specificity, suggesting that the broader substrate range of AA10 LPMOs compared to other families may be attributed to their specific domain composition. However, the exploration of these questions first requires efficient heterologous expression of the target LPMOs and obtaining enzymatic proteins with high activity and purity.
To promote the industrial application of AA10 family LPMOs, extensive work has been carried out on heterologous expression and the elucidation of catalytic mechanisms. However, to date, only 45 AA10 family LPMOs have been purified and functionally characterized (Table 1), accounting for only about 0.5% of the 9672 entries recorded in the CAZy database. This represents just the tip of the iceberg, highlighting the urgent need to accelerate the characterization and research of novel LPMOs. Currently, there are still many challenges in the heterologous expression of LPMOs in recombinant engineered bacteria. Issues such as incorrect folding after transcription and translation, formation of inclusion bodies, incorrect signal peptide selection, and low expression levels in the extracellular environment persist. LPMOs may undergo different post-translational modifications depending on their source, which can impact protein function and stability. Therefore, selecting an appropriate expression platform is crucial for obtaining functional proteins. In this regard, this article summarizes the strategies for constructing heterologous expression systems for AA10 family LPMOs, discusses the options for soluble expression of LPMOs, and addresses the research challenges and prospects of AA10 family LPMOs. The aim is to provide a reference and scientific basis for fundamental and applied research on LPMOs.
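The database figures quoted in this introduction can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts stated in the text; the variable and function names are illustrative and are not taken from any CAZy tool or interface.

```python
# Counts of AA10 LPMO entries as quoted in the text (CAZy database).
aa10_by_origin = {
    "bacteria": 9113,
    "viruses": 279,
    "fungi": 13,
    "archaea": 5,
    "unidentified": 262,
}
total_recorded = 9672   # total number of AA10 entries quoted in the text
characterized = 45      # purified and functionally characterized (Table 1)


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# The per-origin counts should add up to the quoted total.
assert sum(aa10_by_origin.values()) == total_recorded

for origin, count in aa10_by_origin.items():
    print(f"{origin:>12}: {count:5d} ({percent(count, total_recorded):.2f}%)")

print(f"characterized: {characterized} of {total_recorded} "
      f"({percent(characterized, total_recorded):.2f}%)")
```

With these inputs, the characterized enzymes come to just under half a percent of the recorded entries, which underlines how small the explored fraction of the family is.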

Expression of LPMOs in Escherichia coli (E. coli)
The majority of AA10 family LPMOs are derived from bacteria and are typically expressed in E. coli as a heterologous host. A survey of the expression systems used for the 45 characterized LPMOs shows that E. coli BL21(DE3) was the most commonly used strain, accounting for 19 LPMOs. Other frequently used strains included E. coli Rosetta™ (DE3) pLysS, E. coli T7 Express, and E. coli RV308, with 4, 8, and 7 LPMOs, respectively. A few LPMOs were also expressed in E. coli JM101 and E. coli JM109 (Figure 2). Commonly used expression vectors included pRSET, pJB, and pET (Table 1). E. coli BL21(DE3) carries the T7 RNA polymerase gene on its chromosome under the control of the lacUV5 promoter. Therefore, upon induction with isopropyl β-D-1-thiogalactopyranoside (IPTG), exogenous genes driven by the T7 promoter can be efficiently expressed. Although the T7 system is repressed in the absence of inducer, some expression of T7 RNA polymerase and the target protein begins before IPTG induction, so background expression of LPMOs can be observed even without IPTG [14,50].
Because expressed proteins can be toxic to the host strain, background expression must be strictly controlled. This can be achieved by inhibiting the activity of T7 RNA polymerase. For instance, E. coli Rosetta™ (DE3) pLysS expresses T7 lysozyme, which binds to T7 RNA polymerase and thereby inhibits transcription of the target gene. Additionally, some pET vectors carry both the T7lac promoter and an additional copy of the lac repressor gene; in the absence of IPTG, the repressor inhibits transcription of T7 RNA polymerase and prevents transcription of the target gene [10]. It is worth noting that, when proteins are expressed in defined culture media with glucose as the sole carbon source, overflow metabolism occurs, leading to low yields when vectors containing the T7 and lacUV5 promoters (such as pRSET and pET) are used for the recombinant expression of LPMOs in bacteria. This is accompanied by challenges in proper protein folding and secretion into the periplasmic space. To overcome this issue, Courtade et al. developed the pJB and pJB_SP vectors (pJB_SP includes the signal peptide from SmAA10A of Serratia marcescens) [19]. They constructed the pGM29 vector, which carries the XylS/Pm regulator/promoter system and uses m-toluate as the inducer, and expressed it in the E. coli RV308 and E. coli T7 Express strains. In this expression system, LPMO genes can easily be cloned with or without signal sequences, and fully processed and folded recombinant LPMO proteins are obtained. The shake flask yield ranged from 7 to 22 mg/L, and high-density fermentation of the recombinant strains gave comparable relative yields, indicating strong potential for large-scale industrial production of LPMOs [19].

Expression of LPMOs in Other Host Microorganisms
In addition to E. coli, the expression of AA10 LPMOs has been explored in other host organisms. These investigations aim to leverage the advantages of alternative expression systems, such as post-translational modifications and protein folding capabilities. The expression of SgLPMO10F from Streptomyces griseus in a short rod-based expression system was first demonstrated by Yuko et al. in 2015; the LPMO was secreted into the culture medium, and the yield was comparable to that obtained in other expression systems [48]. The Gram-positive bacterium Bacillus subtilis has also been utilized as a host for LPMO expression. Bacillus subtilis is a good secretion host, allowing recombinant proteins to be expressed and purified from the extracellular medium. Yu et al. achieved successful expression of BatLPMO10 from Bacillus atrophaeus in Bacillus subtilis, which eliminated the need for complex periplasmic separation, simplified the purification process, and resulted in a 3.7-fold higher yield compared to E. coli BL21(DE3) [47]. In recent years, attempts have been made to express TfAA10A from Thermobifida fusca in Pichia pastoris and the cyanobacterium Synechococcus elongatus [45,46]. Compared to traditional prokaryotic expression systems, such as E. coli and Bacillus subtilis, the Pichia pastoris expression system offers higher protein expression levels, better secretion efficiency, superior post-translational processing capabilities, improved stability, and ease of protein purification [51]. Kelly et al. confirmed the successful and active expression of TfAA10A in Pichia pastoris, demonstrating the suitability of this expression system for AA10 family LPMOs and providing additional possibilities for constructing efficient expression systems for LPMOs [45]. Cyanobacteria have advantages such as rapid growth and strong genetic adaptability. Russo et al. determined that the secretion level of TfAA10A in Synechococcus elongatus UTEX 2973 was 779 ± 40 µg/L, the highest reported secretion level in cyanobacteria to date [46]. The successful expression of LPMOs in host organisms other than E. coli is important for the development of novel and sustainable heterologous expression systems in the field of biocatalysis.

Selection of Signal Peptides for Secretion
The active site of LPMOs consists of two completely conserved histidine residues, one of which is located at the N-terminus. The arrangement of the two histidine side chains and the N-terminal amino group coordinates the copper ion in a T-shaped geometry known as the histidine brace. Because the first amino acid of mature AA10 family LPMOs is part of the active site, the N-terminus must be correctly processed and kept intact during expression and purification. Proper processing of the N-terminus largely depends on the expression strategy and on the compatibility of the signal peptide (if used) with the protein secretion system of the expression host. Since most AA10 family LPMOs are derived from bacteria, they are predominantly expressed in the common prokaryotic expression host, E. coli, and directed for secretion using signal peptides. Because signal peptides vary considerably in their ability to secrete a given protein, the selection of an appropriate signal peptide is crucial for proper protein folding and secretion. Currently, commonly used signal peptides include the E. coli signal peptides PelB and OmpA, host-specific signal peptides, and native signal peptides of LPMOs (Table 2).
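The requirement that the mature protein begin with the catalytic histidine can be checked at the construct-design stage. The sketch below is illustrative only: it assumes that the signal peptidase cleaves exactly at the end of the leader, it uses the commonly cited 22-residue PelB leader sequence as an assumption, and the mature LPMO fragment is a hypothetical sequence invented for this example.

```python
# Commonly cited PelB leader (22 residues); treated as an assumption here.
PELB_LEADER = "MKYLLPTAAAGLLLLAAQPAMA"

# Hypothetical mature LPMO fragment, invented for illustration only.
TOY_MATURE_LPMO = "HGYVESPASRAYQC"


def mature_after_cleavage(precursor: str, signal_peptide: str) -> str:
    """Return the sequence that remains once the signal peptide is removed.

    Assumes cleavage occurs exactly after the last residue of the leader.
    """
    if not precursor.startswith(signal_peptide):
        raise ValueError("precursor does not begin with the given signal peptide")
    return precursor[len(signal_peptide):]


def n_terminus_is_histidine(mature: str) -> bool:
    """True if the first residue of the mature protein is histidine (H)."""
    return mature.startswith("H")


# Direct fusion: the catalytic His immediately follows the leader.
direct_fusion = PELB_LEADER + TOY_MATURE_LPMO
# Fusion that leaves two extra vector-derived residues before the His.
fusion_with_extra_residues = PELB_LEADER + "MD" + TOY_MATURE_LPMO

for name, precursor in (("direct fusion", direct_fusion),
                        ("extra residues", fusion_with_extra_residues)):
    mature = mature_after_cleavage(precursor, PELB_LEADER)
    print(f"{name}: mature N-terminus {mature[:5]}..., "
          f"His at position 1: {n_terminus_is_histidine(mature)}")
```

A check of this kind does not predict where a signal peptidase will actually cut; it only flags designs in which residues left between the leader and the catalytic histidine would block formation of the histidine brace.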

Selection of Signal Peptides in E. coli
In E. coli, secreted proteins are initially synthesized in the cytoplasm as pre-proteins and then translocated to the membrane or periplasmic space through secretion and processing. For example, the pre-protein of outer membrane protein A (OmpA) traverses the inner membrane with the assistance of a signal peptide, which is cleaved during the secretion process, resulting in the mature protein being localized to the outer membrane. Therefore, when using E. coli as the host for heterologous expression, different signal peptides can be selected to direct the target protein for secretion, either into the periplasmic space or extracellularly, to obtain functionally intact recombinant proteins. Fusing an E. coli signal peptide to the target protein enables the protein to be secreted into the periplasmic space and properly folded. Compared to extracellular expression, the main advantage of periplasmic expression is that, even at low or moderate expression levels, higher concentrations can be achieved in the periplasmic space, eliminating the need for time-consuming concentration steps, such as ultrafiltration, before protein purification. Additionally, because the periplasmic space has a lower protein content, protease activity is lower than in the cytoplasm, allowing the expressed target protein to avoid intracellular degradation [7]. Nathan [33] and Sophani [52] successfully expressed TfAA10B from Thermobifida fusca and JdLPMO10A from Jonesia denitrificans in the periplasmic space using the E. coli signal peptides PelB and OmpA, respectively, and identified and analyzed their structural domains and catalytic functions. When using E. coli as the expression host, E. coli signal peptides provide more accurate processing than native signal peptides. Yang et al. investigated the expression of SmAA10A in E. coli BL21(DE3) using 13 different signal peptides, including native signal peptides [53]. After screening, they found that PelB was the most effective signal peptide for transferring SmAA10A from the cytoplasm to the periplasmic space. Additionally, the CBHI, SacB, and XCs signal peptides also improved the yield of SmAA10A to varying degrees [53].

Native Signal Peptides of LPMOs
LPMOs typically possess native signal peptides, so the original signal peptide sequence is often retained during heterologous expression to preserve sequence integrity and shorten the experimental timeline. Characterized AA10 family LPMOs with available native signal peptides include SmLPMO10A [19], SamLPMO10B [29], SamLPMO10C [29], ScLPMO10C [31], SgLPMO10F [48], TfLPMO10A [46], and MaLPMO10B [14]. In addition, researchers have transferred native signal peptides from characterized LPMOs that function properly in E. coli to newly characterized LPMOs, enabling their successful extracellular expression. However, native signal peptides do not always give the best results. Courtade et al. employed the XylS/Pm regulator/promoter system in E. coli RV308 to express the AA10 domains of four LPMOs. They found that the SmAA10A signal peptide outperformed the native signal peptides of the other three LPMOs derived from Gram-positive bacteria and the native signal peptide of CjAA10A from a Gram-negative bacterium [19]. Substituting the native signal peptide with the signal peptide of SmAA10A from Serratia marcescens resulted in the production of more active AA10 family LPMOs [14,36].

Other Host-Specific Signal Peptides
For LPMOs expressed in other hosts, there are instances where the signal peptides from E. coli are no longer applicable. Russo et al. observed that, when utilizing the E. coli signal peptide TorA for the expression of TfAA10A from Thermobifida fusca in the cyanobacterium Synechococcus elongatus UTEX 2973, although the target protein was expressed in the periplasmic space, the yield was extremely low. However, when using the native signal peptide of TfAA10A, the protein could be successfully expressed in the extracellular space [46]. In the case of expression in Pichia pastoris, the recombinant LPMOs are often expressed in the form of fusion proteins using the pPICZα or pGAPZα vectors, with the signal peptide derived from the Saccharomyces cerevisiae α-mating factor [45].

Assistance of Chaperone Molecules and Selection of Protein Tags
The crowded environment of the cell does not provide ideal folding conditions for proteins. To prevent protein aggregation or misfolding, cells rely on a large class of specialized proteins called molecular chaperones to monitor the folding of the protein repertoire. Enzymes from extremophiles and marine organisms often misfold during synthesis, leading to protein aggregation and the formation of inclusion bodies during expression. Chaperones assist in protein folding and have been shown to aid in the overexpression of recombinant proteins in E. coli [54]. When overexpressing LPMOs identified through gene mining strategies, chaperones can also be considered to facilitate protein folding. For example, in the expression of TtAA10A, after unsuccessful attempts using various solubility and affinity tags and secretion signals, successful transformation and expression were achieved using the pGro7 chaperone plasmid [44].
Additionally, protein tags contribute to protein folding and facilitate efficient protein purification. As with other proteins, attaching protein tags to LPMOs can facilitate protein recognition, purification, or folding. Because the N-terminus of LPMOs forms part of the active site, protein tags at the N-terminus are not recommended; instead, plasmids that place a 6×His-tag at the C-terminus of the LPMO are typically chosen [9,32,38,40,42,48]. An N-terminal 6×His-tag is sometimes used, but only in combination with a cleavage site that allows the tag to be removed precisely from the N-terminus of the recombinant protein, immediately before the catalytic histidine. High-precision proteases that can remove the 6×His-tag from LPMOs include Factor Xa, SUMO protease, and EKMax enterokinase [18,30]. For instance, Puangpen et al. expressed the catalytic domain of PcAA10A from Paenibacillus curdlanolyticus in the pCold-TF vector and achieved proper cleavage using the Factor Xa protease cleavage site present in the plasmid [9]. While adding a 6×His-tag to the recombinant protein greatly simplifies its purification, it has been observed that the 6×His-tag can bind copper ions, which may interfere with substrate binding, redox stability, and characterization of LPMOs [55,56]. Therefore, for proteins containing metal ions, the Strep-Tag is recommended because, unlike the 6×His-tag, it does not chelate metal ions [57]. Strep-Tag (an eight-amino acid peptide) has been utilized for the purification of LPMOs: Fowler et al. expressed recombinant LPMOs in the periplasm of E. coli and subsequently purified them using StrepTrap HP affinity columns [44]. In addition to 6×His and Strep purification tags, the influenza hemagglutinin (HA) and c-myc epitope tags can also be used for the purification of recombinant proteins. For example, Russo et al. used the HA epitope tag to determine the localization of recombinant LPMOs expressed in the cyanobacterium Synechococcus elongatus UTEX 2973 [46]. In summary, when producing functional LPMOs, it is recommended to avoid using tags, or to use inert and/or removable tags.
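The constraint on N-terminal tags can be expressed as a simple design check: after protease treatment, the released protein must start with the catalytic histidine. The following is a minimal sketch under stated assumptions. It models Factor Xa as cutting immediately after the Arg of an Ile-(Glu/Asp)-Gly-Arg recognition motif, ignores secondary cleavage sites and structural accessibility, and uses hypothetical construct sequences invented for illustration.

```python
import re

# Factor Xa recognition motif Ile-(Glu/Asp)-Gly-Arg; cleavage is modeled
# as occurring immediately after the Arg (a simplification).
FACTOR_XA_SITE = re.compile(r"I[ED]GR")

# Hypothetical mature LPMO fragment, invented for illustration only.
TOY_MATURE_LPMO = "HGYVESPASRAYQC"


def release_by_factor_xa(construct: str):
    """Return the sequence downstream of the first Factor Xa site, or None."""
    match = FACTOR_XA_SITE.search(construct)
    if match is None:
        return None
    return construct[match.end():]


def tag_removal_exposes_histidine(construct: str) -> bool:
    """True if cleavage leaves a protein whose first residue is His."""
    released = release_by_factor_xa(construct)
    return released is not None and released.startswith("H")


# N-terminal 6xHis-tag with the cleavage site directly before the His.
precise_design = "M" + "HHHHHH" + "IEGR" + TOY_MATURE_LPMO
# Same tag, but a Gly-Ser linker is left between the site and the His.
linker_design = "M" + "HHHHHH" + "IEGR" + "GS" + TOY_MATURE_LPMO

for name, construct in (("precise design", precise_design),
                        ("linker design", linker_design)):
    print(f"{name}: His exposed after cleavage = "
          f"{tag_removal_exposes_histidine(construct)}")
```

In the second design the released protein begins with the linker residues rather than histidine, which is the situation the text warns against when an N-terminal tag cannot be removed precisely.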

The Influence of Copper Ions on the Stability of LPMOs
The copper ion in the active site of LPMOs plays a crucial role in protein thermal stability and proper folding. Removal of copper from the LPMO active site using EDTA results in a decrease in the protein's melting temperature (Tm) [58]. It has been found that copper ions do not necessarily bind correctly in the active site and can instead give rise to disordered active sites. Chaplin et al. identified two apo forms of LPMOs, both of which can bind copper at a single site but exhibit different kinetic and thermodynamic properties [13]. This indicates that, if copper binds at the wrong site, LPMOs can lose their original functional form. Zinc serves as a good redox-inactive mimic of copper in the active site of LPMOs. Frandsen et al. used zinc ions to mimic the position of copper ions and found that a lack of zinc in the solution disrupted the active site [59]. This suggests that insufficient copper in the growth medium, metal losses during purification (e.g., due to the pH dependence of metal binding), or the presence of chelators or of divalent cations other than copper under crystallization conditions could result in metal deficiency or erroneous metal binding. To ensure proper copper binding, copper ions can be supplemented in the growth medium and during enzyme purification, which enhances the stability of LPMOs.

Conclusions
AA10 family LPMOs are broadly distributed and oxidatively cleave crystalline and complex polysaccharide substrates, facilitating the enzymatic activity of glycoside hydrolases. LPMOs have attracted extensive attention due to their ability to oxidatively degrade crystalline polysaccharides, as well as their strong application potential in industrial biomass conversion. However, the exploration of AA10 family LPMOs is still limited, and many novel LPMO resources for crystalline polysaccharide degradation remain to be explored and investigated. Moreover, proteins obtained from natural strains often suffer from low yields, high costs, and poor specificity, and do not meet industrial requirements. Therefore, there is an urgent need to establish efficient and high-fidelity expression systems for AA10 family LPMOs to accelerate the characterization and research of novel LPMOs. Existing expression strategies mainly focus on optimizing the expression systems through host and vector selection, gene design, and choice of tags and signal peptides. Considering the existing technologies and the problems encountered in research, this article proposes the following recommendations for future research. Efficient heterologous expression systems should be developed specifically for large, multi-domain AA10 family LPMOs, for example, by introducing cold-shock expression vectors and optimizing the co-expression of chaperone proteins to obtain more soluble protein. LPMOs are likely to be produced using expression systems in bacteria (Escherichia coli), yeast (Pichia pastoris), cyanobacteria, or filamentous fungi, depending on their sources. Growth conditions should be optimized during heterologous expression to minimize oxidative stress and maximize the catalytic activity of LPMOs. When selecting purification tags, it is important to note that the 6×His-tag can bind copper ions, affecting enzyme activity. Therefore, alternative protein tags, such as Strep-Tag, should be explored. Copper ions play a crucial role in the activity of LPMOs, so an appropriate amount of copper should be supplied. Additionally, synthetic biology techniques can be employed to improve the heterologous expression systems of AA10 family LPMOs from multiple angles and raise their production efficiency. Since each LPMO is unique, different expression strategies are required to achieve optimal yields. This review aims to provide insights into the successful expression of LPMOs, supporting their use in expanding application areas and promoting their industrial application.