Metagenomic Discovery and Characterization of Multi-Functional and Monomodular Processive Endoglucanases as Biocatalysts

Biomass includes cellulose, hemicelluloses, pectin and lignin; constitutes the components of dietary fibre of plant and algal origins in animals and humans; and can potentially provide inexhaustible basic monomer compounds for developing sustainable biofuels and biomaterials for the world. Development of efficacious cellulases is the key to unlocking the biomass polymer and unleashing its potential applications in society. A review of the current literature of cellulase research shows that two characterized and/or engineered glycosyl hydrolase family-5 (GH5) cellulases have displayed the unique properties of processive endoglucanases: GH5-tCel5A1, which was engineered after being originally identified via targeted genome sequencing of the extremely thermophilic Thermotoga maritima, and GH5-p4818Cel5_2A, which was screened out of the porcine hindgut microbial metagenomic expression library. Both GH5-tCel5A1 and GH5-p4818Cel5_2A have been characterized as having small molecular weights with an estimated spherical diameter at or < 4.6 nm; being monomodular without a required carbohydrate-binding domain; and acting as processive β-1,4-endoglucanases. These two processive endocellulases are active in hydrolyzing natural crystalline and pre-treated cellulosic substrates and have multi-functionality towards several hemicelluloses including β-glucans, xylan, xyloglucans, mannans, galactomannans and glucomannans. Therefore, these two multifunctional and monomodular GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases already have promising structural and functional properties for further optimization and industrial applications.


Introduction
Biomass, primarily of plant cell wall and algal origins and comprising cellulose, hemicelluloses, pectin and lignin, constitutes the major components of dietary fibre [1,2]. As an important dietary polymer nutrient, dietary fibre is utilized by various species of animals [2,3]. Microbial degradation of these fibre components in the gastrointestinal tract is indispensable in maintaining healthy gut functions, gaining energy, aiding digestion of minerals and providing vitamins to the host [2,3]. Lignin, representing a large group of aromatic polymers, and native cellulose of the quasi-crystalline structure in the lignocellulose composite are mainly responsible for the recalcitrance of the plant cell wall by limiting accessibility to biomass-degrading microbial enzymes [4][5][6][7][8]. Microorganisms have evolved a multiplicity of enzymes to effectively degrade the composite structure of lignocellulose-based biomass, including laccases and peroxidases, glycoside hydrolases such as cellulases and hemicellulases, polysaccharide lyases, lytic polysaccharide mono-oxygenases (LPMO) and carbohydrate esterases, to harvest metabolic energy as ATP in adapting to their diverse environments [9][10][11][12][13][14]. Thus, microbial biomass-degrading cellulases are a group of important biocatalysts for biomass industrial applications.
The enzymatic degradation of biomass polysaccharides is initiated by the cleavage of glycosidic bonds in the backbone. Thus, the major scientific-technological hurdle for effective utilization of lignocellulose biomass is to develop efficacious industrial biomass-degrading biocatalysts for cost-effective degradation of lignocellulose polymers into their constituent simple sugars, including hexoses such as glucose, mannose and galactose and pentoses such as xylose and arabinose, as illustrated in Figure 1. The lignocellulosic matrix is covered by a protective sheath locked by heterogeneous polymers such as hemicelluloses, pectin and guar gum in cross-linkages with lignin. Among a variety of glycosidic bonds, the β-1,4-glycosidic linkage is the most widely distributed bond in plant cell wall polysaccharides such as cellulose and some hemicelluloses [1,11,15]. Collectively, these two groups of fibre components account for approximately 60% of plant cell wall materials [16][17][18], making β-1,4-glycosidic bond cleavage one of the key biochemical reactions for biomass degradation [5,12,15]. As the most abundant biosynthetic polymer on the planet, native cellulose, referred to as cellulose I, has two distinct crystallite forms, with Iα being dominant in bacterial and algal celluloses and Iβ being dominant in celluloses of higher plant origins [19]. It is believed that the two crystalline cellulose allomorphs Iα and Iβ coexist with partially amorphous cellulose in most natural lignocellulose of biomass materials; and an all-atom coarse-grained simulation model has been developed to describe the conversion process between the cellulose Iα and Iβ allomorphs, characterize the interface between the two crystalline forms and determine their amorphous transition states at room temperature [20]. It has been estimated that the yearly biomass production of cellulose in the world is about 1.5 trillion tons, making it an essentially inexhaustible resource for mankind [21].
Assuming effective pre-treatment and the use of highly efficacious cellulase enzymes and of microbes capable of fermenting and processing hexose and pentose sugars in biorefineries, an integrated modelling assessment revealed that deployment of wheat straw could be economically feasible for producing the liquid transportation fuel ethanol, the biomaterial ethyl levulinate and electricity, while gaining a positive environmental margin by mitigating a significant amount of annual CO2 emission [22,23]. Saccharification of various natural sources of cellulose is a major focus of research efforts aimed at developing renewable biorefinery-based alternatives to fossil fuels [21]. Efficient cleavage of β-1,4-glycosidic bonds in natural and pretreated lignocellulose polymer substrates by cellulases is the limiting biochemical step and the bottleneck for the bioconversion of lignocellulose to biofuels and biochemicals [5,11,24-26]; for the food processing, textile, laundry detergent, paper and wood pulp light industries and plant- and horticulture-based agriculture [27]; and for the improvement of fibre feed utilization efficiency and food animal productivity [3,28-31] and enteric health [30,32,33] in animal production industries, which are other core aspects of developing a sustainable bioeconomy on the global stage.
There has been extensive genomic and gene-cataloguing research examining the microbial anaerobic fermentation of dietary fibre in the mammalian gut, particularly in humans, rodents and ruminants [34,35]. In contrast, understanding of parallel microbial processes at functional and genomic levels in the hindgut of non-herbivore, non-human monogastric farm animals is very limited [36]. Pigs, the single largest source of red meat for human consumption worldwide, are monogastric omnivores well known for their ability to readily adapt to a variety of high-fibre feeds. Interestingly, the turnover rate of digesta throughout the porcine gut system is much faster than that of many other large-framed herbivores. For example, digesta retention time is approximately 2 to 6 h in the small bowel and 12 to 18 h in the large bowel in the pig [37], in comparison with cattle, which require approximately 50-70 h for feed to pass through the gastrointestinal (GI) tract [38][39][40]. Monogastric pigs are surprisingly efficient at degrading Solka-Floc®, a commercial crystalline cellulose product generated from wheat straw and woodchips via physicochemical pretreatment [41]. Dietary supplemental Solka-Floc® was degraded by 62% and 83% at the distal ileum and the fecal levels, respectively [41]. These would represent estimated crystalline cellulose degradation rates of 10.4 and 4.6%/h at the distal ileum and the fecal levels, respectively, in the pig. This crystalline cellulose degradation process is dependent largely on the symbiotic microbiota in the porcine intestine [42,43]. In contrast, the ruminal degradation rate of this commercial crystalline cellulose Solka-Floc® was estimated at 1.2%/h [3]. Wang et al. [44,45] identified and characterized a multifunctional processive endocellulase from the porcine hindgut microbiome through functional screening of a metagenomic expression library.
These research reports suggest that a unique and efficient microbial fibre degradation system, adapted to a faster rate of digesta passage, has evolved in the pig intestinal system, making it a promising source for the discovery of novel and efficacious fibre degradation enzymes.

Figure 1. An illustration of the major processes and biological steps in utilization and conversion of agricultural waste feedstock biomass in providing sustainable functional foods and food animal production and renewable lignocellulosic ethanol, functional soluble fiber and other biomaterials with an emphasis on the essential roles of biomass-degrading enzymes as novel industrial enzymes.
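The hourly degradation rates quoted in the Introduction for Solka-Floc® can be approximately reproduced by dividing the percentage degraded by a retention time. The sketch below is illustrative only: it assumes the upper retention times given in the text (6 h to the distal ileum, 18 h for the fecal estimate), which is our reading and is not stated explicitly as the basis of the published rates; the function and constant names are hypothetical.

```python
def mean_degradation_rate(percent_degraded: float, retention_h: float) -> float:
    """Average degradation rate in percent per hour over a retention period."""
    if retention_h <= 0:
        raise ValueError("retention time must be positive")
    return percent_degraded / retention_h


# Assumed retention times (h); taken from the upper ends of the ranges in the text.
ILEAL_RETENTION_H = 6.0
FECAL_RETENTION_H = 18.0

# Percent Solka-Floc degraded, as reported in the text.
ileal_rate = mean_degradation_rate(62.0, ILEAL_RETENTION_H)
fecal_rate = mean_degradation_rate(83.0, FECAL_RETENTION_H)

# Ruminal rate reported in the text (%/h), used only for comparison.
RUMEN_RATE = 1.2

print(f"distal ileum: {ileal_rate:.1f} %/h")
print(f"fecal level:  {fecal_rate:.1f} %/h")
print(f"fecal vs rumen: {fecal_rate / RUMEN_RATE:.1f}-fold")
```

With these assumed inputs the ileal estimate comes out slightly below the 10.4%/h quoted in the text, which presumably reflects unrounded source values; the fecal estimate agrees with the quoted 4.6%/h.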

Metagenomic Discovery of New Microbial Cellulases
Commercial cellulase products are already widely used in some traditional industrial applications such as food processing, the textile industry, detergent products, and the pulp and paper industries [4,25,27,45]. The enzyme companies Novozymes of Denmark and Monsanto of St. Louis in the US have announced a joint global venture [46] that aims to increase crop yields while reducing inputs, presumably by enhancing soil organic matter fertility via microbial enzyme products. More profound contemporary applications of new generations of highly efficient and capable cellulases lie in the saccharification of raw and pretreated lignocellulose substrates for commercial production of cellulosic ethanol and in supplementation as exogenous feed enzyme additives to enhance feed energy and carbon conversion efficiency in the agricultural food animal feed industries; these are the two key directions in developing a biorefinery-based bioeconomy in continued global economic development [11,25,26,47]. Novel and efficient cellulases as biocatalysts are the key driver for these new developments. Currently, commercial cellulase genes have been obtained through traditional cloning techniques from Trichoderma and Aspergillus fungal species and Bacillus sp., with further engineering for steadily improved enzyme functionality and protein yields and a dramatically reduced fermentation cost [4,25,48,49]. Prospectively, commercial-scale cellulosic ethanol production was projected to be launched at multiple sites [50]. There is a continued need to further improve cellulase specific activities to ensure that this newly emerging cellulosic ethanol industry becomes progressively viable and economically competitive [51]. On the other hand, there is still a lack of well-recognized cellulase enzyme products for effective applications in animal production, owing to limited enzyme activity in hydrolyzing natural raw lignocelluloses of feed sources [29][30][31][32].
Under this context, the discovery and characterization of newly emerged novel cellulases, as described herein, have been discussed for their potential commercial applications as novel and efficient cellulase products for both the cellulosic ethanol production and the animal feed enzyme industry, among others.
One newly evolved possible approach for discovering new cellulases involves metagenomic mining, including metagenomic analysis and metagenomic expression library, for microbial cellulases in random samples of DNA fragments from material extracted directly from complex environments. This metagenomic approach circumvents the highly restrictive requirement that organisms be culturable [52]. This allows enzymes from a far richer variety of microorganisms to be sampled and characterized. For example, functional screening of microbial metagenomic expression libraries [44,45] and gene-centric based metagenomic gene cataloguing have unearthed a diversity of cellulases from the bovine rumen [52][53][54][55], the rabbit cecum [56], the human gut [57], the rodent gut [58], hindgut of wood-feeding higher termites [59], the earthworm gut [60,61], decaying wood [62], soil [63][64][65] and environmental samples [66]. By comparing to the major metagenomic database, Zhou et al. (2010) have concluded that animal gut metagenomes are substantially enriched with novel glycosyl hydrolases [67]. Wang et al. (2019) identified a multifunctional processive endocellulase from the porcine hindgut microbiome [45]. Therefore, metagenomic tools are powerful for the discovery of novel cellulases and the animals' gut environment is a unique genomic resource for such discovery.
More recently, processive endocellulases have also been discovered and characterized to contain a GH5 or a GH9 catalytic domain (Table 1), and these are very rare endocellulases that can continue to hydrolyze their initially broken-apart cello-oligosaccharides into soluble cellodextrins (DP of 2-6) and glucose with one such type of cellulases playing both endo-acting and exo-acting hydrolytic roles [72,73]. β-Glucosidases (EC 3.2.1.21) are primarily responsible for the terminal hydrolysis of cellobiose and, to some extent, may also hydrolyze some soluble cellodextrins into D-glucose [5,15]. McCleary et al. (2012) further suggested that the classical β-glucosidases hydrolyze soluble cellodextrins (DP of 2-6) at similar rates whereas the rate of hydrolysis of exo-β-glucosidases would increase as the DP increases from 2 to 6 [74], further distinguishing between the classic β-glucosidase and the exo-β-glucosidase activities.
Furthermore, it is believed that endocellulase and exocellulase activities in hydrolyzing natural and pretreated cellulose substrates are the rate-limiting steps of the polymer cellulose biochemical degradation process [4,5,10,15]. This notion can be further substantiated by the following facts. For comparison, one standard enzyme activity unit (U) is defined as 1 µmol D-glucose or reducing sugar released per min. Firstly, the highest reported endocellulase activity is about 30 U/mg protein on Avicel and the exocellulase activity is about 246 U/mg protein on phosphoric acid swollen cellulose (PASC) [75]. Most of the well characterized microbial endocellulase and exocellulase activities are typically below 0.200 U/mg protein, as summarized by Lynd et al. (2002) [15] and compiled by the Braunschweig Brenda Enzyme Database (http://www.brenda-enzyme.org). By comparison, α-amylase (EC 3.2.1.1) and maltase-glucoamylase or amyloglucosidase (EC 3.2.1.3) activities in hydrolyzing α-glycosidic bond-based polysaccharides of starch and maltodextrins are very high, at about 4000 U/mg and 758 U/mg protein, respectively, as compiled by the same database. Thus, endocellulase and exocellulase activities in the hydrolysis of β-glycosidic bond-based natural raw and pretreated lignocelluloses, β-glucans and insoluble cellodextrins are considerably lower than the α-amylase and maltase-glucoamylase or amyloglucosidase activities towards α-glucosidic bond-based starch and maltodextrins. Secondly, characterized microbial β-glucosidase activities in the hydrolysis of their natural substrates, including cellobiose, are typically much higher than the endocellulase and exocellulase activities. For example, β-glucosidase activity in the hydrolysis of cellobiose reaches about 665 U/mg protein, as compiled by the same database.
This is also likely due, in part, to the fact that the natural substrates of β-glucosidases are soluble cellodextrins. Thirdly, comprehensive studies demonstrated that free multimodular processive cellulases were effective and were the limiting enzymatic factors in the hydrolysis of pure crystalline cellulose substrates, physicochemically pretreated biomass and raw biomass materials [76,77]. Thus, discovery of highly efficient and efficacious synergistic endocellulases and exocellulases, particularly the uniquely highly active processive endocellulases that hydrolyze crystalline cellulose substrates, is the key to conquering biomass recalcitrance.
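To make the scale of the activity gap described above concrete, the following minimal sketch expresses the quoted specific activities as fold differences relative to the typical cellulase ceiling of 0.200 U/mg protein. The numbers are those given in the text; the variable and function names are hypothetical, and treating 0.200 U/mg as a single baseline is a simplifying assumption, since the text describes it as an upper bound for most characterized enzymes.

```python
# Typical upper bound for well characterized endo-/exocellulases (U/mg protein),
# used here as a single baseline for illustration.
TYPICAL_CELLULASE_U_PER_MG = 0.2

# Specific activities quoted in the text (U/mg protein).
reference_activities = {
    "alpha-amylase on starch": 4000.0,
    "glucoamylase on maltodextrins": 758.0,
    "beta-glucosidase on cellobiose": 665.0,
}


def fold_over_baseline(activity: float, baseline: float) -> float:
    """How many times larger an activity is than the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return activity / baseline


fold_differences = {
    name: fold_over_baseline(value, TYPICAL_CELLULASE_U_PER_MG)
    for name, value in reference_activities.items()
}

for name, fold in fold_differences.items():
    print(f"{name}: about {fold:,.0f}-fold above typical cellulase activity")
```

Under this simplification, the starch- and cellobiose-hydrolyzing enzymes sit three to four orders of magnitude above typical cellulase activities, which is the quantitative basis for calling the cellulase step rate-limiting.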

Diversity of Microbial Cellulases
A large diversity of endocellulases and exocellulases has been characterized in aerobic and anaerobic bacteria and fungi, protozoa and some unique animal species [12,26]. Endocellulases and exocellulases are further classified into non-complex free cellulases [12,26] and the complex cellulosomes [26] based upon how these enzymes are associated with their original organisms and what biological mechanisms that these enzymes employ to carry out cellulose degradation. The highest wild-type bacterial free cellulase activity was reported at about 2.5 U/mg protein on Avicel for a GH9 processive endoglucanase originated from Clostridium phytofermentans [78] and a GH5 bacterial endoglucanase screened from the Buffalo rumen metagenomic library [55]. Directed evolution of this Clostridium phytofermentans GH9 processive endoglucanase further enhanced its activity to about 6 U/mg protein on Avicel [79]. The highest fungal free cellulase activities were reported at 9.7 and 29.8 U/mg protein for two exocellulases on Avicel and these free cellulases were respectively originated from ruminal Neocallimastix patriciarum [80] and the White Rot fungus Irpex lacteus [75]. The highest purified cellulosome activities were reported at 9.6-14.6 U/mg protein on Avicel from Clostridium thermocellum [81]. It is apparent that these abovementioned highest levels of cellulase specific activities reported so far for hydrolyzing crystalline cellulosic substrates are still far below the specific activities reported for some of the well-established industrial enzymes such as commercial phytases, hemicellulases and α-amylases, typically having specific activity ranging from several hundreds to thousands U/mg protein. Thus, it is imperative to continue the research efforts for the discovery and engineering of highly active endocellulases and exocellulases for their various industrial applications.
Free cellulases are typically secreted out of their originating organisms or cells and sloughed into the surrounding environment. These enzymes are further divided into monomodular cellulases, possessing only one catalytic domain; bi-modular cellulases, having one carbohydrate-binding domain or module (CBD or CBM) and one catalytic domain; and multi-modular cellulases, containing one or more CBD and one or more catalytic domains [12,26,27]. The classic view is that almost all cellulase enzymes that degrade insoluble substrates contain a substrate-binding domain linked to the catalytic domain via a linker peptide [12]. Thongekkaew et al. (2013) demonstrated that fusion of a cellulose-binding module enhanced cellulase binding affinity and cellulolytic activity towards insoluble cellulosic substrates [82]. However, possessing a CBD does not mean that a cellulase has high activity on insoluble, especially crystalline, cellulosic substrates, which is deemed to be an essential property for potential industrial applications. For example, two ruminal GH5 cellulases that were characterized to contain a CBD could effectively hydrolyze the soluble substrate CMC but did not have activity on the model crystalline cellulosic substrate Avicel [83,84]. A number of free cellulases that are reported to possess a CBD have little or no activity on model crystalline cellulosic substrates, as compiled in the Braunschweig Brenda Enzyme Database (http://www.brenda-enzyme.org). Alternatively, cellulose substrates as ligands can directly bind to catalytic domains to engage hydrolysis with non-processive and processive cellulases [85]. Payne et al. (2011) [86] and Taylor et al. (2013) [87] further demonstrated that cellulose bound to hydrophobic regions in catalytic domain active sites, especially to the tryptophan residues in processive cellulases.
On the other hand, cellulosomes are multi-enzyme, multi-functional complexes that are anchored onto the outer membrane of some anaerobic bacteria and fungi and can synergistically hydrolyze celluloses and/or hemicelluloses [26,88-90]. Cellulosomes are typically structured in such a way that multiple enzymes are linked by one or more scaffold proteins as scaffold subunits. In one direction, the scaffold is extended with one or more "X" modules of unknown function and cellulose-binding module(s) that are tethered to bind to cellulose or hemicellulose substrates; in the other direction, it is anchored to the bacterial outer membrane via dockerin and cohesin proteins. The related stepwise actions and mechanisms have been well documented and illustrated in the literature [26,91,92]. By using synthetic biology, designer cellulosomes are being developed to optimize their catalytic efficiency [13,90-92], ideally by combining component enzyme proximity and flexibility with a reduced complex enzyme mass.

Biomass Porosity and Cellulase Efficacy
In this context, it should be pointed out that cellulase accessibility to the hydrolyzable cellulose surface area is a major determinant of lignocellulosic biomass degradation. Biomass has both external and internal surface areas, with natural lignocellulosic materials having a much greater internal surface area [19]. It is well accepted that the primary cell wall of most untreated and natural plant cells is semi-permeable with pores, permitting the simple diffusive passage of small molecules such as CO2 and H2O, and of small proteins with size exclusion estimated at 30-60 kDa (Cell wall, Wikipedia). Assuming that free cellulases have the simplest shape, a sphere, these small proteins with size exclusion estimated at 30-60 kDa for access to plant cell wall pores are calculated to have a radius range of 2.05-2.6 nm and a diameter range of 4.1-5.2 nm, respectively, by using the formula provided by Erickson (2009) [93]. Much earlier studies by Carpita et al. (1979) reported that pore size diameters in various living plant cells ranged between 3.5 and 5.2 nm [18]. Mechanically processed (e.g., via chopping and grinding) but untreated lignocellulosic biomass materials are particulate and porous, with intraparticulate pores (a diameter range of 1-10 nm) as well as interparticulate voids (> 5 µm) [19]. Several studies documented that pretreated lignocellulosic biomass materials and crystalline cellulose model substrates are associated with much larger pore diameters, ranging between 5.1 and 11.0 nm [94][95][96][97]. Thus, it can be concluded that natural, raw, untreated biomass materials such as plant feeds have an estimated upper cut-off pore size diameter of about 5.2 nm, while physicochemically pretreated biomass materials are associated with an upper pore size diameter of about 11.0 nm.
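The radius and diameter ranges quoted above can be reproduced with a short calculation. The sketch below assumes that the formula referred to is Erickson's minimal-radius relation for a smooth spherical protein, R_min = 0.066 M^(1/3), with M in Da and R_min in nm; the text does not write the formula out, so this is our reading of the cited reference. The function names are hypothetical.

```python
# Coefficient of the assumed minimal-radius relation R_min = 0.066 * M**(1/3),
# with M in daltons and R_min in nanometres (smooth sphere, typical protein density).
ERICKSON_COEFF_NM = 0.066


def min_radius_nm(mass_kda: float) -> float:
    """Minimal radius (nm) of a spherical protein of the given mass (kDa)."""
    if mass_kda <= 0:
        raise ValueError("mass must be positive")
    mass_da = mass_kda * 1000.0
    return ERICKSON_COEFF_NM * mass_da ** (1.0 / 3.0)


def min_diameter_nm(mass_kda: float) -> float:
    """Minimal diameter (nm) of a spherical protein of the given mass (kDa)."""
    return 2.0 * min_radius_nm(mass_kda)


def mass_kda_for_diameter(diameter_nm: float) -> float:
    """Inverse relation: mass (kDa) of a sphere with the given minimal diameter."""
    if diameter_nm <= 0:
        raise ValueError("diameter must be positive")
    radius_nm = diameter_nm / 2.0
    return (radius_nm / ERICKSON_COEFF_NM) ** 3 / 1000.0


for mass in (30.0, 60.0):
    print(f"{mass:.0f} kDa: radius {min_radius_nm(mass):.2f} nm, "
          f"diameter {min_diameter_nm(mass):.1f} nm")
```

For 30 and 60 kDa this gives radii of roughly 2.05 and 2.6 nm and diameters of roughly 4.1 and 5.2 nm, matching the ranges in the text. The inverse function is included only as a convenience for relating a diameter back to an approximate mass under the same spherical assumption.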
As biocatalysts, enzymes have molecular dimensions that are generally related to their molecular weights. Enzymes with large molecular weights are likely to have large molecular sizes or diameters in aqueous solutions. Erickson (2009) noted that most proteins fold into globular domains and that polypeptides larger than 50 kDa typically form two or more domains [93]. Cowling and Kirk (1976) summarized estimated sizes of various fungal cellulases that were projected with a spherical and/or ellipsoid shape [98]. The surveyed fungal cellulases were associated with spherical diameters ranging between 3.3 and 7.7 nm, and ellipsoid shapes measuring (width × length) between 1.8 × 10.8 and 4.2 × 25.0 nm, for molecular weights ranging between 11.4 and 76 kDa [98]. Typical free cellulases, with a large catalytic domain core connected to a small carbohydrate-binding module via a short peptide linker or spacer, were shown to have tadpole shapes by small-angle X-ray scattering analysis [99,100]. For example, the Trichoderma reesei exocellulase Cel7A (CBHI) and endocellulase Cel7B (EGI) were estimated to have a tadpole shape (catalytic domain core diameter × molecule length) measuring 4.4 × 19.6 and 5.3 × 18.0 nm, respectively [99,100]. Ding et al. (2012) showed that only a few surface microfibrils were degraded by the fungal Trichoderma reesei cellulases in the untreated raw cell wall of biomass materials [7]. It is known that the microfibril surface represents a major portion of the internal surface area in biomass materials. Their results would suggest that Trichoderma reesei free cellulases with a catalytic domain core diameter ranging 4.1-5.2 nm had limited diffusion into plant cell wall pores that have an upper cut-off porosity diameter of about 5.2 nm in the untreated raw biomass materials. In the Ding et al. (2012) study, the authors further showed that the Trichoderma reesei cellulases effectively penetrated into the microfibril internal surface and engaged in degradation of these microfibrils, leaving digestion holes or pits, in the pretreated delignified biomass materials, whereas purified cellulosomes from Clostridium thermocellum were shown only to peel off individual microfibrils from the cell wall surface in these pretreated biomass materials [7]. These results by Ding et al. (2012) [7] would suggest that the dimension of purified Clostridium thermocellum cellulosomes was likely much larger than the upper pore diameter of 11 nm in the pretreated biomass materials reported by Grethlein (1985) [95] and Hui et al. (2009) [97], reflecting the fact that cellulosomes are rather large multi-enzyme complexes with molecular weights ranging from several hundred kDa to over 1 MDa. Xu et al. (2013) [101] used domain engineering to exploit the advantage of intramolecular and intermolecular proximity synergies by designing mini-cellulosomes to improve cellulase specific activity as well as to enhance their biomass-penetrating feature. Other strategies, such as glycosylation through overexpression of designer cellulosomes in other microbial platforms, could further improve the thermostability and activity of designer cellulosomes [102].
Meanwhile, Brunecky et al. (2013) demonstrated that the multimodular free cellulase CelA cloned from the thermophilic bacterium Caldicellulosiruptor bescii could effectively hydrolyze crystalline cellulose substrates via mechanisms of surface ablation and cavity excavation driven by combined endo-acting and exo-acting processivity [77]. However, the Caldicellulosiruptor bescii CelA had a large molecular weight of about 230 kDa [103] and an estimated effective size of 10-35 nm, thus limiting its diffusion into the microfibril internal surface for biomass hydrolysis [77]. Furthermore, the large Caldicellulosiruptor bescii CelA molecule was associated with multiple carbohydrate-binding modules, and its activity would be affected, to a greater extent, by nonproductive adsorption than that of much lower molecular weight cellulases [77]. Therefore, a small molecular weight may be a favorable biochemical feature in developing highly active processive free cellulases that can effectively penetrate into microfibrillar internal surfaces for biomass hydrolysis.
Cell wall biomass recalcitrance is collectively referring to the natural resistance of cell wall materials to microbial and enzymatic deconstruction [5,9]. The biomass recalcitrance is essentially composed of two layers of resistance, including the water-excluding nature of the resistance of the crystalline cellulose bundle core within each of the elementary fibrils as well as the resistance representing the crosslinks around the cellulose bundle core formed by hemicellulose & pectin embedding each elementary fibril within each of microfibrils and the aromatic net further patched by lignin polymers on the exterior of the microfibrils within each of the macrofibril structure primarily in plant cell wall materials [19,[104][105][106]. The first layer of biomass resistance as a measure of water-excluding nature associated with the crystalline cellulose bundle core within the elementary fibril is also reflected by the crystallinity index (CrI, ranging between 0 and 1) that is experimentally measured for a number of biofuel materials [19,107] but in very few plant feed materials for animals. The crystalline cellulose bundle core within each elementary fibril represents about 100 cellulose glucans that are tightly aggregated via extensive interchain and intrachain hydrogen bonds and van der Waals forces in forming a straight and stable supramolecular insoluble quasi-crystalline structure with a typical crystal size of 4-5 nm, being variable dependent upon biomass sources [19,105,106,108]. Each cellulose glucan is a polymer consisting of D-anhydroglucopyranose joined together by β-1,4-glycosidic bonds, with anhydrocellobiose as the repeating unit [19]. Each microfibril contains a single elementary fibril and is surrounded and embedded in a matrix of non-crystalline hemicelluloses & pectic polysaccharides with microfibril dimensions typically ranging from 2-3 nm to 20-50 nm, as affected by plant species and tissue types [19,105,106]. 
When microfibrils interact with one another to form aggregation via van der Waals forces, macrofibrils are formed and their dimensions vary from 50 to 250 nm in diameter [99,105,106,108]. Thus, plant cell materials typically consist of macrofibril, microfibril and elementary fibril units.
Cell wall biomass materials are porous. Porous materials are classified into three categories: micropores (pore diameters less than 2 nm), mesopores (pore diameters between 2 and 50 nm) and macropores (pore diameters larger than 50 nm) [108]. The micropores (around 1 nm in pore diameter) and the mesopores (between 2 and 5.2 nm in pore diameter) are likely to be the capillaries between adjacent microfibrils, while the macropores (around 20 to 100 nm in pore diameter) are likely to be the space along the external surface between macrofibrils in plant cell wall biomass materials, as was originally proposed by Cowling and Kirk (1976) [98] and further discussed by Guo and Catchmark (2011) [107]. It is also reasonable to believe that micropores represent the majority, whereas mesopores and macropores likely represent only a small proportion, of the total pores in raw biomass materials. Hence, another layer of biomass resistance is the restricted nanoporosity, which reflects the crosslinks formed by hemicellulose and pectic polysaccharides embedding each elementary fibril via non-covalent bonds and the lignin polymers surrounding the exterior of each microfibril via covalent bonds, and is related to the free space across microfibrils and macrofibrils in the biomatrix [109]. Pores in biomass can be open or closed. Open pores are accessible to cellulases for cellulose hydrolysis, whereas closed pores are inaccessible [107]. The drying process, especially drying cell wall biomass materials at a high temperature for a long duration (e.g., 150 °C for 1 h), can cause fibre hornification, an irreversible change of the cell wall structure due to the collapse of pores [109]. Carpita et al. (1979) reported that pore size diameters in various living plant cells ranged between 3.5 and 5.2 nm, as measured by using a solute exclusion technique [18].
Biomass material accessibility and porosity can be experimentally measured with several techniques, including solute exclusion, water vapor sorption and liquid N2 adsorption, and these measurements have been conducted for a number of biofuel materials [19,109]. As reviewed in an earlier section, several previous studies documented that pretreated lignocellulosic biomass materials and purified crystalline cellulose model substrates such as Avicel, α-cellulose, Sigmacell (20, 50 and 100) and Solka-Floc® were associated with much larger pore diameters, ranging between 5.1 and 11.0 nm [94][95][96][97]. Thus, it is reasonable to predict that fresh fibre feeds such as grasses for grazing herbivores have an upper porosity diameter of about 5.2 nm, according to the study by Carpita et al. (1979) [18]. To the best of our knowledge, these types of porosity measurements have not been well reported for other raw plant feed materials for agricultural food animals, such as corn silage, ground hay, DDGS, wheat shorts and middlings, canola meal and soybean meal.
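The pore-size categories and cut-off diameters discussed in this section can be summarized in a small sketch. It uses the three-way classification given in the text (below 2 nm, 2 to 50 nm, above 50 nm) and the upper pore diameters of about 5.2 nm for raw and 11.0 nm for pretreated biomass. The accessibility test is a deliberately crude steric criterion that treats the enzyme as a sphere and ignores shape, flexibility and adsorption; it is an assumption made for illustration, and all names are hypothetical.

```python
# Upper pore diameters (nm) taken from the text.
RAW_PORE_CUTOFF_NM = 5.2          # untreated, raw biomass
PRETREATED_PORE_CUTOFF_NM = 11.0  # physicochemically pretreated biomass


def classify_pore(diameter_nm: float) -> str:
    """Classify a pore by diameter using the three categories in the text."""
    if diameter_nm <= 0:
        raise ValueError("pore diameter must be positive")
    if diameter_nm < 2.0:
        return "micropore"
    if diameter_nm <= 50.0:
        return "mesopore"
    return "macropore"


def can_enter(enzyme_diameter_nm: float, pore_cutoff_nm: float) -> bool:
    """Crude steric check: a spherical enzyme fits if it is no wider than the pore."""
    return enzyme_diameter_nm <= pore_cutoff_nm


# Illustrative inputs: 4.6 nm is the spherical diameter quoted for the two GH5
# enzymes; 10 nm is the lower end of the effective size quoted for CelA.
for label, size_nm in (("4.6 nm enzyme", 4.6), ("10 nm enzyme", 10.0)):
    raw_ok = can_enter(size_nm, RAW_PORE_CUTOFF_NM)
    pretreated_ok = can_enter(size_nm, PRETREATED_PORE_CUTOFF_NM)
    print(f"{label}: raw biomass {raw_ok}, pretreated biomass {pretreated_ok}")
```

Under this simple criterion, an enzyme of about 4.6 nm falls below both cut-offs, whereas one of 10 nm or more is excluded from raw biomass pores, which is consistent with the qualitative argument made in the text.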

Properties of Fungal Biomass Degradation Hemi-Cellulases and Cellulases
Aerobic fungal species, typically including white, brown and gray wood rot fungi, are well characterized microorganisms that employ a multiplicity of synergistic mechanisms to effectively degrade wood-based lignocellulose biomass materials high in lignin [110][111][112][113]. It is equally important to compare the GH5-p4818Cel5_2A specific activity [45] with the specific activities, converted into the international unit (1 IU = µmol D-glucose per mg protein per min), of some of the well characterized aerobic fungal endocellulases and exocellulases reported in the literature. The white rot basidiomycete Irpex lacteus grows on fallen wood and decomposes it by secreting multiple enzymes including cellulases, hemicellulases and lignin-degrading enzymes, and its strain MC-2 is used to produce a commercial enzyme preparation called Driselase marketed by Kyowa Hakko Co., Ltd. [75]. Hamada et al. [114,115] isolated from Irpex lacteus a cellulase gene cel2 encoding two highly active GH7 exocellulases, Ex-1 (a molecular weight of about 53 kDa) and Ex-2 (a molecular weight of 56 kDa), and a cellulase gene cel3 encoding a GH7 exocellulase Cel3 protein (a predicted molecular weight of about 56 kDa) equivalent to the Trichoderma reesei exocellulase Cel7A (CBHI), with specific activities reported only for Ex-1 and Ex-2 (20.  [116]. Toda et al. (2005) also isolated a cellulase gene cen1 encoding a GH5 endoglucanase I termed En-1 (a molecular weight of about 52 kDa) in Irpex lacteus with relatively low specific activities (7.76-25.9 µmol D-glucose/mg protein.min) determined with the soluble substrate CMC [116].
While the activity of the basidiomycete Irpex lacteus GH5 endoglucanase I En-1 in hydrolyzing Avicel was not reported in that study, the researchers did show that En-1 in combination with the exocellulase Ex-1 had dramatic synergistic effects in enhancing the hydrolysis of crystalline cellulose substrates of Avicel and PASC [116]. Zheng and Ding [117] isolated a bi-modular processive GH5 endoglucanase termed EG1 (a molecular weight of about 48 kDa) from an atypical white rot basidiomycete, Volvariella volvacea, with an activity (0.1479 µmol D-glucose/mg protein.min) determined with filter paper; the GH5-p4818Cel5_2A activity on Avicel from the Wang et al. (2019) study [45] was about 46% higher than this value. Clearly, while the exocellulase activities of the white rot basidiomycete fungi have been characterized to be, to the best of our knowledge, the highest levels reported so far, their endocellulase activities are relatively low and limit cellulosic degradation.
On the other hand, very few β-1,4-endoglucanase genes have been shown to be present in the brown rot basidiomycete genome [112]. The characterized brown rot basidiomycete Postia placenta β-1,4-endoglucanases are associated with very low activities. Specifically, the GH5-p4818Cel5_2A activity on Avicel from the Wang et al. (2019) study [45] was about 57-fold higher than the activity (0.0045 µmol D-glucose/mg protein.min) determined with Avicel for the GH5 processive endoglucanase Cel5A in the brown rot basidiomycete Postia placenta [118]. Several lines of evidence suggest that extracellular reactive oxygen species (ROS), specifically hydroxyl radicals produced from the Fenton reaction, play a compensatory biodegradative role in extensively disrupting crystalline cellulose in wood biomass, working in concert with the relatively weak or absent processive and non-processive endocellulase activities in the white and brown rot basidiomycete fungi [112,119]. Last but not least, the white rot and the brown rot basidiomycete fungi possess unique lignin-degradation enzymes [112]. Hence, low or absent endocellulase activities, especially processive endocellulase activities, are the recognized limiting enzymatic factors, and these basidiomycete fungi use ROS as a recognized compensatory mechanism for effective lignocellulosic degradation of wood biomass materials. Given that lignin polymers further encapsulate microfibril units and reduce the porosity of raw lignocellulose biomass materials, possession of lignin-degradation enzymes by these fungi would help their cellulases penetrate to the internal microfibrillar surface for effective hydrolysis.
The most studied aerobic cellulolytic microorganism is the fungus Hypocrea jecorina, originally also called Trichoderma viride and later renamed Trichoderma reesei to honor Drs. Reese and Mandels for their contributions in isolating this fungus during World War II [49]. Trichoderma reesei produces the two most abundant exocellulases, Cel7A (CBHI, making up about 70% of the total cellulase proteins secreted) and Cel6A (CBHII, making up about 10% of the total cellulase proteins secreted), as well as seven endoglucanases including Cel7B (EGI, the most abundant EG), Cel5A (EGII), Cel12A (EGIII, with expansin activity), Cel61A (EGIV), Cel45A (EGV), Cel5B and Cel61B [49]. Variable cellulase activities have been reported for Trichoderma reesei, likely due to different research conditions such as enzyme purity, substrates and assay temperature. The activities of the two Trichoderma reesei exocellulases Cel7A (CBHI) and Cel6A (CBHII) were reported to be 0.0091 and 0.0085 µmol D-glucose/mg protein.min, respectively, determined with filter paper [120]. However, the activities of these two exocellulases were reported to be 0.370 and 0.330 µmol D-glucose/mg protein.min, respectively, determined with Avicel [121]. Meanwhile, an average Trichoderma reesei endoglucanase (EG) activity of 0.420 µmol D-glucose/mg protein.min was determined with Avicel by Aylward et al. [121], and a Trichoderma reesei Cel5A (EGII) activity of 0.0290 µmol D-glucose/mg protein.min was determined with filter paper by Tambor et al. [122]. Overall cellulase activities of the Trichoderma reesei cellulase enzyme mixture products, primarily marketed by Novozymes, in hydrolyzing crystalline cellulose substrates (pretreated wood materials vs. Avicel) were reported to be 1.06 and 0.73 µmol D-glucose/mg protein.min by Yu and Saddler [123] and Aylward et al. [121], respectively.
It is well established that endocellulases and exocellulases act synergistically to drive much higher specific enzyme activities for the Trichoderma reesei cellulase enzyme mixture. A relatively high enzyme protein yield, a high specific enzyme activity and an enzyme production cost reduced by about 10-fold in recent years are the main reasons that the Trichoderma reesei cellulase enzyme mixture is the primary cellulase product on the current market [48,50].
Another aerobic fungus used for industrial cellulase production is Humicola insolens, which produces a similar set of multiple endocellulases and exocellulases but with a much lower enzyme protein yield and lower specific activities [49,50]. Trichoderma reesei and Humicola insolens are not closely related; however, both organisms are brown rot fungi and do not degrade lignin [49]. This is likely because these two brown rot fungi lost their lignin degradation genes during evolution [108]. Tambor et al. [122] screened 55 fungal endocellulases from several fungal species by comparing their specific activities with the Trichoderma reesei Cel5A (EGII) activity of 0.0290 µmol D-glucose/mg protein.min measured with filter paper. The top three selected fungal endocellulases were the Aspergillus niger ApCel5A at 0.120, the Gloeophyllum trabeum GtCel12A at 0.030 and the Sporotrichum thermophile StCel5A at 0.024 µmol D-glucose/mg protein.min, all measured with filter paper [122]. By comparison, the GH5-p4818Cel5_2A activity on Avicel reported by Wang et al. [45] was about 80% higher than the Aspergillus niger endocellulase ApCel5A activity, six-fold higher than the Gloeophyllum trabeum endocellulase GtCel12A activity and eight-fold higher than the Sporotrichum thermophile endocellulase StCel5A activity. Therefore, it can be concluded from this review that while cellulases from the three main aerobic fungi Trichoderma reesei, Humicola insolens and Aspergillus niger are useful and are the predominant commercial cellulases on the current industrial enzyme markets, largely for textile, detergent, wood pulp and paper polishing and food processing, these cellulases are clearly limited in their applications as exogenous feed enzyme additives for the livestock feeding industries and as industrial cellulases for producing lignocellulosic ethanol at commercial scale from raw feedstock biomass materials.
This is largely because these fungal cellulases are not processive endocellulases, need multiple-enzyme synergism to carry out effective degradation, and are bi-modular and relatively large in molecular weight and size, and are thus less effective and efficient in breaking down the raw lignocellulose composite of plant-derived feeds and feedstock.
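All specific activities compared in this section are expressed as µmol D-glucose per mg protein per min. The following minimal Python sketch shows one way such a normalization could be carried out when an assay reports reducing sugar as mg of glucose equivalents. The function name and the input values are hypothetical illustrations and are not taken from any of the cited studies; the only fixed constant is the molar mass of D-glucose (180.16 g/mol).

```python
GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16  # molar mass of D-glucose


def specific_activity(glucose_released_mg: float,
                      protein_mg: float,
                      minutes: float) -> float:
    """Return specific activity in µmol D-glucose per mg protein per min.

    glucose_released_mg: reducing sugar released, as mg glucose equivalents.
    protein_mg: amount of enzyme protein in the assay (mg).
    minutes: incubation time (min).
    """
    if protein_mg <= 0 or minutes <= 0:
        raise ValueError("protein_mg and minutes must be positive")
    # mg / (g/mol) gives mmol; multiply by 1000 to obtain µmol.
    glucose_umol = glucose_released_mg / GLUCOSE_MOLAR_MASS_G_PER_MOL * 1000.0
    return glucose_umol / (protein_mg * minutes)


if __name__ == "__main__":
    # Hypothetical assay: 0.5 mg glucose equivalents released by
    # 0.2 mg enzyme protein over a 60 min incubation.
    print(round(specific_activity(0.5, 0.2, 60.0), 4))
```

Values normalized this way are only comparable when substrate, temperature and pH are also comparable, which is why the substrate is stated alongside each activity in the text.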

Microbial Processive Cellulase Properties of Hydrolyzing Natural Crystalline Substrates
The chemically modified soluble cellulose substrates, such as CMC and HEC, are widely used for biochemically characterizing various endocellulases and exocellulases. However, cellulase hydrolytic activities measured with CMC and HEC are not necessarily related to, and cannot be simply extrapolated to, their potential activities on pure crystalline or amorphous cellulose substrates, pretreated lignocellulose biomass and natural raw lignocellulose biomass materials [4] (the Braunschweig Brenda Enzyme Database at http://www.brenda-enzyme.org). It is a standard research practice to further characterize and compare the hydrolytic potential of cellulases on native lignocellulose biomass by using model insoluble cellulose substrates. Major model insoluble cellulosic substrates include nearly pure celluloses such as cotton fibre, Whatman No. 1 filter paper, bacterial cellulose, bacterial microcrystalline cellulose (BMCC), microcrystalline cellulose (Avicel) and Sigmacell (20, 50 and 100), as well as impure cellulose-containing substrates such as dyed celluloses, α-cellulose and various pretreated lignocellulosic substrates (e.g., Solka-Floc®) [4,124].
There are some recognized intrinsic differences in the physicochemical properties among the major model insoluble cellulosic substrates, including crystallinity index, cellulose size as marked by DP, and intraparticulate pore size [4,19]. The crystallinity index (CrI) differs among the model insoluble native cellulosic substrates, and the variability of the CrI values largely reflects different intramolecular and intermolecular hydrogen bonding among polymer cellulose glucan molecules within each of the elementary fibrillar bundles as well as the different techniques used for measuring the CrI values [4,96,106]. Views on the relationship between hydrolysis rates and CrI values in the major model insoluble cellulosic substrates remain inconclusive. Weimer and Weston [96] and Hall et al. [125] showed that CrI values negatively predicted hydrolysis rates of the major model insoluble cellulosic substrates, whereas Fierobe et al. [126] and Zhang et al. [4] concluded that substrate recalcitrance and hydrolysis rates were not strongly associated with CrI values but reflected the overall accessibility of reactive sites to cellulases. Intrinsic differences among the cellulases used in these studies, such as their mechanisms of action, might have been responsible for the differing conclusions on the role of CrI values.
Since lignin and hemicelluloses are essentially removed from the major model insoluble cellulosic substrates, intraparticulate porosity (radius of 1-10 nm) reflects the free space or capillary structure between the elementary fibrillar bundles in these substrates [19,96,107]. Avicel (PH101, PH102 and PH105) products are processed from wood pulp with removal of the amorphous fraction and are thus associated with intermediate CrI values (0.5-0.6) but relatively small DP values (DP of 150-500) [4]. The main Avicel (PH101, PH102 and PH105) products are further characterized by an intraparticulate pore peak radius between 2 and 3 nm [94]. Thus, model insoluble cellulosic substrates such as filter paper (Whatman #1) and Avicel are widely used to characterize and compare hydrolysis rates of endocellulases and exocellulases for their potential commercial applications, while Avicel is a more suitable substrate for assessing exocellulase activity due to its much smaller chain length.
It is important to note that the GH5-tCel5A1 processive endoglucanase reported by Basit and Akhtar [127] and the GH5-p4818Cel5_2A processive endoglucanase reported by Wang et al. [45] had a specific activity (Table 1, 0.216 vs. 24.8 µmol D-glucose/mg protein.min) in hydrolyzing the model insoluble crystalline cellulosic substrate Avicel. For comparison, the specific cellulase activities of some of the well characterized bacterial endocellulases reported in the literature are converted into the international unit (1 IU = µmol D-glucose per mg pure enzyme protein per min). The GH5-p4818Cel5_2A endoglucanase activity on Avicel reported by Wang et al. [45] was about 20- to 215-fold higher than the range of activities (0.0010-0.0105 µmol D-glucose/mg protein.min) determined with filter paper for the six Thermobifida fusca endoglucanases and exocellulases [120]. The GH5-p4818Cel5_2A endoglucanase activity on Avicel was about 8- to 19-fold higher than the activities (0.011 and 0.024 µmol D-glucose/mg protein.min) determined with Sigmacell-20 for the termite saliva endogenous endoglucanase [128]. While the GH5-p4818Cel5_2A endoglucanase activity on Avicel was about 30% higher than the activity (0.1696 µmol D-glucose/mg protein.min) measured with Avicel for the Clostridium cellulolyticum cellulosomal GH9 processive endoglucanase CelF [129], it was about 36- to 119-fold higher than the activities (0.0018-0.0059 µmol D-glucose/mg protein.min) measured with Avicel for a total of 13 other Clostridium cellulolyticum cellulosomal GH9 endoglucanases [130].
The GH5-p4818Cel5_2A endoglucanase activity on Avicel was about 18% higher than the activity (0.1828 µmol D-glucose/mg protein.min) determined with Avicel for the Clostridium thermocellum noncellulosomal processive endoglucanase GH9B (CelI) [131], and was about 431-fold higher than the activity (0.0005 µmol D-glucose/mg protein.min) measured with bacterial microcrystalline cellulose (BMCC) for the Clostridium thermocellum cellulosomal non-processive endoglucanase GH9C (CbhA) [132]. The GH5-p4818Cel5_2A endoglucanase activity on Avicel reported by Wang et al. [45] was about 9- to 97-fold higher than the range of activities (0.0022-0.0218 µmol D-glucose/mg protein.min) determined with Avicel for the three Saccharophagus degradans GH5 processive endoglucanases Cel5G, Cel5H and Cel5J [133]. The GH5-p4818Cel5_2A endoglucanase activity on Avicel reported by Wang et al. [45] was three-fold lower than the activity (0.8187 µmol D-glucose/mg protein.min) measured with Avicel (PH105) for the Bacillus subtilis GH5 endoglucanase Bscel5 [75]; however, the Bacillus subtilis Bscel5 was a classic non-processive endoglucanase. The GH5-p4818Cel5_2A endoglucanase activity on Avicel was about 24- and 71-fold higher than the activities (0.0088 and 0.0030 µmol D-glucose/mg protein.min, respectively) measured with Avicel for novel GH5 endoglucanases obtained from vermicompost and sugarcane soil metagenomes [61,134]. Thus, the truncated GH5-tCel5A1 reported by Basit and Akhtar [127] and the porcine gut GH5-p4818Cel5_2A endoglucanase reported by Wang et al. [45] had excellent hydrolytic activity on pure crystalline cellulose substrates in comparison with a number of bacterial endocellulases reported in the literature.
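The comparisons above mix "fold higher" and "percent higher" statements. A minimal Python sketch of the arithmetic is given below, assuming that "x-fold higher" means the ratio of the two activities minus one, which is the convention that reproduces the figures quoted in this section. The function names are hypothetical, and the value 0.216 µmol D-glucose/mg protein.min is taken here as the GH5-p4818Cel5_2A activity on Avicel implied by the comparisons in the text.

```python
def fold_higher(activity: float, reference: float) -> float:
    """How many fold higher `activity` is than `reference` (ratio minus one)."""
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return activity / reference - 1.0


def percent_higher(activity: float, reference: float) -> float:
    """Percent by which `activity` exceeds `reference`."""
    return fold_higher(activity, reference) * 100.0


def percent_lower(activity: float, reference: float) -> float:
    """Percent by which `activity` falls below `reference`.

    Note that this is not the mirror image of percent_higher: if A is
    18% higher than B, B is only about 15% lower than A.
    """
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return (1.0 - activity / reference) * 100.0


if __name__ == "__main__":
    gh5_avicel = 0.216  # assumed GH5-p4818Cel5_2A activity on Avicel
    # Versus C. thermocellum GH9C (CbhA) on BMCC, 0.0005.
    print(round(fold_higher(gh5_avicel, 0.0005)))
    # Versus C. thermocellum GH9B (CelI) on Avicel, 0.1828.
    print(round(percent_higher(gh5_avicel, 0.1828)))
    print(round(percent_lower(0.1828, gh5_avicel), 1))
```

Because the reference assays use different substrates, temperatures and incubation times, such ratios indicate relative ranking rather than strictly equivalent comparisons.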
On the other hand, we should also compare the properties of the GH5-tCel5A1 and GH5-p4818Cel5_2A processive endoglucanases, as reported by Basit and Akhtar [127] and Wang et al. [45], with the following two thermophilic (i.e., 45-122 °C) processive endocellulases of bacterial origin that are highly regarded in the current literature for their potential industrial applications owing to their high endocellulase activities in hydrolyzing crystalline cellulosic substrates. The Clostridium phytofermentans GH9 processive endoglucanase Cel9 was cloned and characterized by Zhang et al. [135] to have a specific activity (2.488 µmol D-glucose/mg protein.min, determined with Avicel) about 11-fold higher than that of GH5-p4818Cel5_2A. Clostridium phytofermentans is an obligately anaerobic, mesophilic (i.e., 20-45 °C), cellulolytic Gram-positive bacterium (strain ISDgT = ATCC 700394T) isolated from forest soil by Warnick et al. [136]. Clostridium phytofermentans is also regarded as a biofuel bacterium and a model microorganism for consolidated bioprocessing (CBP), since it can utilize a broad range of carbon sources for the generation of ethanol and other metabolites in one step [78,135]. However, the Clostridium phytofermentans GH9 processive endoglucanase Cel9 is a thermophilic cellulase, because it has an optimal temperature of 65 °C and its activity drops sharply at 40 °C [135]. The researchers believed that the mesophilic Clostridium phytofermentans might have acquired the thermophilic GH9 processive endoglucanase Cel9 via horizontal gene transfer [78]. Liao et al. [76] further reported that the Clostridium phytofermentans endoglucanase Cel9 had a specific activity of only 0.3464 µmol D-glucose/mg protein.min determined at 37 °C with Avicel, which may limit its potential industrial applications in some areas such as exogenous enzyme additives.
Furthermore, the Clostridium phytofermentans endoglucanase Cel9 has a multi-modular structure and thus a relatively large molecular weight and dimensional size for penetration into the internal microfibril surface of raw and pretreated biomass materials, which may be an additional disadvantage relative to the GH5-tCel5A1 and GH5-p4818Cel5_2A processive endoglucanases for industrial applications.
Zverlov et al. [103] cloned and characterized a multimodular free cellulase CelA from the extremely thermophilic cellulolytic bacterium Caldicellulosiruptor bescii (previously called Anaerocellum thermophilum) with a specific activity of 0.2391 µmol D-glucose/mg protein.min with Avicel and 1.6174 µmol D-glucose/mg protein.min with xylan, determined at 72 °C and driven by its bi-modular endo-acting and exo-acting processivity and its bi-functional hydrolytic property. The Caldicellulosiruptor bescii CelA was determined to have an optimal temperature of 85 °C for hydrolyzing Avicel, and this activity dropped sharply to zero at about 45 °C [103], which may limit its potential industrial applications in some areas such as serving as an exogenous feed enzyme additive in livestock feeding. The Caldicellulosiruptor bescii CelA had a large molecular weight of about 230 kDa [103] and an estimated effective size of 10-35 nm [77], thus potentially limiting its effective diffusion to the internal microfibril surface for biomass hydrolysis. It was shown that the Caldicellulosiruptor bescii CelA was effective in excavating extensive cavities on the surface of cellulosic substrates, thus increasing its effective hydrolytic area [77]. In their Science paper, Brunecky et al. [77] demonstrated that the Caldicellulosiruptor bescii CelA was effective in enzymatically converting both pretreated and raw biomass materials at 75 and 85 °C. Berlin [88] commented that the hydrolytic activities of this Caldicellulosiruptor bescii CelA showed that there were now no enzymatic barriers to effective cellulose breakdown. Berlin [88] further noted two other advantages of hydrolyzing biomass with the Caldicellulosiruptor bescii CelA at the high temperatures of 75 and 85 °C: the inhibition of potential bacterial contamination and the fluidity of the fermentation broth.
However, maintaining the high temperatures of 75 and 85 °C to achieve the optimal Caldicellulosiruptor bescii CelA activities will likely be too expensive at the large-scale commercial level for manufacturing lignocellulosic ethanol. On the other hand, the Caldicellulosiruptor bescii CelA cellulase was originally characterized by Zverlov et al. [103], and the Clostridium phytofermentans bacterium expressing the endoglucanase Cel9 was originally isolated and patented by Warnick et al. [136]. Hence, considering their high activity in hydrolyzing insoluble cellulose substrates and other ideal enzymatic properties, such as mesophilic or thermophilic temperature optima, optimal pH, and a small molecular weight and size, the recently characterized GH5-tCel5A1 and GH5-p4818Cel5_2A, as reported by Basit and Akhtar [127] and Wang et al. [45], are outstanding processive endoglucanase candidates for various potential industrial applications.
There are two types of processive cellulases: exocellulases and processive endocellulases [73]. The degree of processivity for exocellulases is usually determined by calculating the ratio of cellobiose to cellotriose based upon quantitative chromatographic analysis of the hydrolytic products of a target exocellulase [73,140,141,142]. The processive nature of endocellulases can be verified by calculating the processivity value, i.e., the ratio of the amount of total soluble reducing sugars to the amount of total insoluble reducing sugar ends, which is usually measured with pure crystalline cellulose substrates such as filter paper incubated with target endocellulases over a number of hours [73,141,142]. The following substrate hydrolytic data, in comparison with literature reports, support the notion that the GH5-p4818Cel5_2A reported by Wang et al. [45] is a novel processive endocellulase. Firstly, the pattern of dramatic reduction in specific viscosity of the CMC solution reported by Wang et al. [45] was similar to the patterns reported for the Clostridium thermocellum noncellulosomal GH9 endoglucanase GH9B (CelI) [131] and the Saccharophagus degradans GH5 endoglucanase Cel5H [133]. Both the Clostridium thermocellum noncellulosomal GH9 endoglucanase GH9B (CelI) and the Saccharophagus degradans GH5 endoglucanase Cel5H were shown to be processive endoglucanases with cellobiose as the main hydrolytic end product [131,133]. Arguably, the carboxymethyl substitution on cellulose in CMC may prevent cellulases from hydrolyzing CMC in a processive manner, which calls into question the suitability of CMC as a substrate for processivity studies [131]. Secondly, pNP-cellobioside has been recognized as a chromogenic substrate for measuring exocellulase activity [74]. The kinetic activities of GH5-p4818Cel5_2A on pNP-cellobioside, as reported by Wang et al. [45], suggest that GH5-p4818Cel5_2A, as a recognized endoglucanase, had an exo-acting catalytic property.
However, McGrath and Wilson [132] cautioned that activity on pNP-cellobioside was a function of where and how well this synthetic substrate compound could bind into the active site of an enzyme and would not predict whether an enzyme was an exocellulase or endocellulase.
Determination of the processivity value with purified cellulose substrates can help define the processive nature of endocellulases, and this has been demonstrated in several past studies. The processivity value was determined with filter paper (Whatman #1) for the Thermobifida fusca processive endoglucanase Cel9A (E4) (at 7.0) [120], for the Clostridium thermocellum noncellulosomal processive endoglucanase GH9B (CelI, at about 50) [131], and for the three Saccharophagus degradans GH5 processive endoglucanases Cel5G, Cel5H and Cel5J (at 4.0-4.6) [133]. The processivity value was measured with Avicel for the brown rot basidiomycete Gloeophyllum trabeum GH5 processive endoglucanase Cel5A (at 11.3) [118]. The processivity value was determined with phosphoric acid-swollen cellulose (PASC) for the Clostridium cellulolyticum cellulosomal GH9 processive endoglucanase CelF (at 19.0) [129]. The processivity value increased from 2.4 to 3.9 when the incubation was prolonged from 0.5 to 3.5 h with the soluble amorphous cellulose substrate RAC for the Clostridium phytofermentans GH9 processive endoglucanase Cel9 [135]. The processivity value also increased from 3.6 to 8.6 when the incubation was prolonged from 0.5 to 24 h with filter paper for the processive GH5 endoglucanase EG1 in an atypical white rot basidiomycete, Volvariella volvacea [117]. Meanwhile, the processivity value was also determined with filter paper for the Thermobifida fusca non-processive endoglucanase Cel6A (E2, at 2.1 and 2.6) [120,133] and the Clostridium thermocellum cellulosomal non-processive endoglucanase GH9C (CbhA, at 1.9) [132]. Watson et al. [133] noted that the processivity values, measured with filter paper for the extensively characterized Thermobifida fusca processive endoglucanase Cel9A, typically ranged between 3.0 and 7.0 for processive endoglucanases. GH5-tCel5A1 and GH5-p4818Cel5_2A were determined to have processivity values of 10 and 4.6, respectively [45,127].
Thus, determination of processivity values is affected by types of substrates and duration of incubation time used.
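The two processivity measures described in this section are simple ratios. The following minimal Python sketch illustrates them under the definitions given above: soluble reducing sugars over insoluble reducing ends for endocellulases, and cellobiose over cellotriose for exocellulases. The function names and the input quantities are hypothetical and are not taken from any cited study.

```python
def endo_processivity(soluble_reducing_sugars_umol: float,
                      insoluble_reducing_ends_umol: float) -> float:
    """Processivity value of an endocellulase: total soluble reducing sugars
    divided by total insoluble reducing sugar ends, both measured on the
    same substrate after the same incubation time."""
    if insoluble_reducing_ends_umol <= 0:
        raise ValueError("insoluble reducing ends must be positive")
    return soluble_reducing_sugars_umol / insoluble_reducing_ends_umol


def exo_processivity(cellobiose_umol: float, cellotriose_umol: float) -> float:
    """Degree of processivity of an exocellulase: cellobiose to cellotriose
    ratio from chromatographic analysis of the hydrolytic products."""
    if cellotriose_umol <= 0:
        raise ValueError("cellotriose amount must be positive")
    return cellobiose_umol / cellotriose_umol


if __name__ == "__main__":
    # Hypothetical time course on one substrate: (hours, soluble µmol, insoluble µmol).
    time_course = [(0.5, 1.8, 0.5), (3.5, 5.6, 1.0), (24.0, 17.2, 2.0)]
    for hours, soluble, insoluble in time_course:
        value = endo_processivity(soluble, insoluble)
        print(f"{hours:>5.1f} h  processivity value = {value:.1f}")
```

Reporting the substrate and incubation time next to each value, as in the hypothetical time course, keeps results comparable across studies, since both factors shift the ratio.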

Properties of Processive Endocellulases in Hydrolyzing Pre-Treated Cellulosic Substrates
While Avicel is widely used as a pure crystalline cellulose substrate with an estimated CrI of 0.5-0.6, it is a microcrystalline cellulose with much smaller molecular weights (DP of 150-500 glucose units) than other crystalline cellulose substrates such as Whatman No. 1 filter paper (DP of 750-2800 glucose units) and Solka-Floc® (DP of 750-1500 glucose units) [4]. Thus, the GH5-p4818Cel5_2A activity reported by Wang et al. [45] was also determined with Solka-Floc® at 0.127 µmol D-glucose/mg protein.min, and the value measured with Avicel was about 70% higher than this value (Table 1). Differences in porosity and molecular weights, as reflected by their different DP values, were likely responsible for this discrepancy in the GH5-p4818Cel5_2A activities measured between these two crystalline cellulosic substrates. Because biological pretreatment of biomass materials is not well established, various physicochemical pretreatments of biomass materials have been established and optimized to release semi-pure pretreated cellulosic substrates for potential industrial-scale saccharification and subsequent fermentation to ethanol [10,24]. Physicochemical pretreatments of biomass feedstock are expensive and represent about 21% of the total expense of producing lignocellulosic ethanol [24]. It should be pointed out that Solka-Floc® is a semi-pure crystalline cellulose product commercially available from the International Fibre Corporation. Solka-Floc® has properties comparable to pretreated cellulosic substrates, including CrI values (0.4-0.7 for both) and DP (DP of 750-1500 vs. 400-1000 glucose units) [4]. Thus, the GH5-p4818Cel5_2A activity determined with Solka-Floc® by Wang et al. [45] should represent the specific activity of this novel enzyme in hydrolyzing pretreated crystalline cellulosic substrates.
Amorphous disordered insoluble celluloses and dissolved soluble celluloses are different entities, and both differ from crystalline celluloses in terms of physicochemical properties and rates of hydrolysis by cellulases for industrial applications [124]. While they need to be further optimized for potential industrial-scale applications, both amorphous insoluble cellulose and soluble cellulose substrates have been developed and widely used in cellulase activity tests [4,143]. Acid swollen cellulose, primarily in the form of phosphoric acid swollen cellulose (PASC), is a typical insoluble amorphous disordered cellulose with a CrI of 0.00-0.04 and a DP of 100-1000 glucose units [19,124]. Regenerated amorphous celluloses (RAC) are soluble amorphous cellulose substrates generated by using concentrated phosphoric acid and cellulose solvents [143], and represent further treated, readily degradable next-generation cellulosic substrates for commercial ethanol production [76,143]. The tCel5A1 processive endoglucanase activity was reported at 24.8 µmol D-glucose/mg protein.min by using RAC at 70 °C, but not at the physiological pH, by Basit and Akhtar [127]. The GH5-p4818Cel5_2A activity reported by Wang et al. [45] was determined at 1.39 µmol D-glucose/mg protein.min by using RAC at 50 °C, and this activity was about 5.4-fold higher than the value measured with Avicel in that study (Table 1). This result for GH5-p4818Cel5_2A activity on RAC is consistent with previous studies showing that endocellulases with activities on crystalline celluloses have much amplified hydrolytic activities on amorphous cellulose substrates, frequently demonstrated with PASC or RAC [118,120,131,133]. Hence, the tCel5A1 and GH5-p4818Cel5_2A endoglucanase activities determined with RAC, as reported by Basit and Akhtar [127] and Wang et al. [45], suggest that these novel enzymes are very effective in degrading amorphous cellulosic substrates derived from further advanced physicochemical processing in the bioethanol industry.

Potential of Processive Endocellulases as Newly Emerging Exogenous Fibre Enzymes for Food Animal Nutrition
To reveal the potential of the tCel5A1 and GH5-p4818Cel5_2A processive endoglucanases, as reported by Basit and Akhtar [127] and Wang et al. [45], as emerging exogenous feed enzyme additives for livestock feeding, especially in cattle, it is imperative to review the current understanding of ruminal cellulolytic enzymology and cellulase activities. Two lines of evidence suggest that ruminal microbial degradation of lignocellulosic feeds is not very efficient in ruminant livestock species, especially in highly productive Holstein dairy cows and double-muscled beef cattle, and this has been a major sustainable development concern on the global stage. Firstly, about 1 billion tonnes of grains, including wheat, corn, barley, oats, rye, millet and sorghum, accounting for an estimated 70% of grains produced by developed countries and one-third or more of the world's cereal grains produced, are used in feeding livestock annually for global animal production; these grains could be directly used to feed some 3.5 billion humans to resolve the world hunger crisis [144]. Furthermore, 40% of feed grains, or 400 million tonnes of grains, are used annually in ruminants, mainly in cattle [144]. Biologically speaking, this is because ruminal microbial degradation of lignocellulosic feeds is not fast enough in ruminants, especially Holstein dairy cows and beef cattle that have been selected and bred for high milk yield and lean growth. Thus, large proportions of processed grains need to be included in rations for these cattle [145,146]. Secondly, it has been well documented that fermentation of feed cellulose in the rumen is not optimal, as supported by the fact that a significant amount of fermentable fibre is recovered in the feces of ruminants [30].
This suboptimal digestive utilization of lignocellulosic fibre feeds, in combination with grain inclusion in rations and diets, has resulted in poor efficiency of carbon and energy utilization in livestock production sectors, which account for 14.5% of human-induced greenhouse gas emissions, exceeding those from transportation [144], and has become a major sustainability concern [34,147]. Selinger et al. [148] noted in their review that exogenous feed enzymes, including cellulases produced by recombinant DNA technology, would be powerful for enhancing efficiency and cutting costs in livestock production. Adeola and Cowieson [32] estimated the global exogenous enzyme market value at about $600 million, with about 60% of this market share dedicated to the single exogenous enzyme product phytase in monogastric feeding and nutrition. Hemicellulases such as β-glucanase and xylanases were recognized as some of the other exogenous feed enzyme products, while cellulases had a negligible market share [32]. Fan [34] and Meale et al. [33] concluded that current exogenous feed cellulase additive products were not efficacious in showing synergy and sufficient activity in breaking down cellulose in vivo, suggesting that new cellulase enzyme products need to be developed for the potentially huge global market.
Enzymatically, several lines of evidence indicate that ruminal microbial cellulases, especially in bovine species, are not likely to be self-sufficient in carrying out fast, efficient and complete degradation of the natural crystalline lignocelluloses associated with fibre feeds. Firstly, Denman et al. [80] and Wang et al. [149] isolated and characterized a GH6 endocellulase celA from the ruminal anaerobic fungus Neocallimastix patriciarum with very high activities of 9.7 and 1.1 µmol D-glucose/mg protein.min, respectively, measured by using Avicel. Liu et al. [150] isolated and characterized a GH5 endocellulase celA from the ruminal anaerobic fungus Piromyces rhizinflata with a high activity of 1.4 µmol D-glucose/mg protein.min measured by using Avicel. Indeed, these reported ruminal anaerobic fungal endocellulase activities were several-fold higher than our GH5-p4818Cel5_2A endoglucanase activity in hydrolyzing Avicel. However, anaerobic fungi only represent a small population (5-10%) of the total microbes in the rumen [151]. Secondly, these ruminal anaerobic fungal endocellulases are not processive in nature, and thus cannot lead to efficient degradation of crystalline cellulose into soluble cellodextrin in vivo. Krause et al. (2003) were the first to recognize and propose that the lack of expression of GH6 and GH7 exo-acting cellulases was likely the primary enzymatic factor limiting efficient degradation of crystalline cellulose in the rumen [30]. Intriguingly, more recent work on microbial metagenomic sequencing and GH gene cataloguing of the cow ruminal microbiome by Hess et al. (2011) [41] confirmed the original view of Krause et al. [30]. No GH6 and only one GH7 cellulase gene were identified out of a total of 2365 cow ruminal microbial cellulase genes catalogued in the work conducted by Hess et al. [41]. 
As characterized in the cellulase systems of the aerobic fungi Trichoderma reesei and Humicola insolens, the type-I and type-II exo-acting cellulases, i.e., cellobiohydrolases, largely have GH6 and GH7 catalytic domains and belong to the GH6 and GH7 families [49]. Thus, the metagenomic analysis-based gene cataloguing by Hess et al. [41] would support the notion that there are essentially very few or no exo-acting microbial cellulase genes, and correspondingly little or no exo-acting cellulase activity, in the bovine rumen.
On the other hand, a large number of GH5 and GH9 cellulase genes were identified and catalogued from cow's ruminal microbiome by Hess et al. [41]. It is known that some of the well-characterized processive endocellulases belong to the GH5 and GH9 cellulase families. A large number of studies have been conducted to isolate and characterize ruminal microbial cellulases by using the classic chromatography-based biochemical protein purification [152,153], gene cloning and expression [69,154], organism genome sequencing based cloning and expression [82,151,155], metagenomic library based functional screening [53,156], and gene-centric metagenomic sequencing, cataloguing and expression [41] during the past three decades. Palackal et al. [156] and Duan et al. [157] screened and characterized some ruminal GH5 endocellulases with non-detectable activities on crystalline cellulose substrates. Ferrer et al. [53] characterized a number of novel cellulases with activities on several tested crystalline cellulose substrates; however, they did not classify catalytic domain GH families on these enzymes, and their enzymes were not purified and assayed by using the international unit for direct comparison with other studies. Fibrobacter succinogenes and Eubacterium cellulosolvens are some of the main fibrolytic bacteria in rumen [158]. Yoda et al. [83] cloned and characterized a large bifunctional Eubacterium cellulosolvens GH5 endocellulase cel5A (127 kDa) showing activities on soluble substrate CMC and hemicellulose xylan but did not report activity on crystalline cellulose substrates. Cavicchioli and Watson [155] cloned and characterized a Fibrobacter succinogenes endocellulase with activities on several tested crystalline cellulose substrates; however, they did not classify a catalytic domain GH family on this enzyme, and their enzyme was not purified and assayed by using the international unit for direct comparison with other studies. Qi et al. 
[154] cloned and characterized two main bifunctional Fibrobacter succinogenes GH5 and GH9 endocellulases of rcCel5H and rcCel9B with activities on soluble substrates CMC and HEC, amorphous substrate Ball-milled cellulose and hemicellulose xylan. While the Fibrobacter succinogenes GH5 endocellulase rcCel5H had non-detectable activity on the crystalline cellulose substrates Avicel and Sigmacel 100, the Fibrobacter succinogenes GH9 endocellulase rcCel9B (67.3 kDa) had very low activities on the crystalline cellulose substrates Avicel (0.05 µmol D-glucose/mg protein.min) and Sigmacel 100 (0.07 µmol D-glucose/mg protein.min) [154]. By comparison, the GH5-p4818Cel5_2A activity on Avicel from this study was about two-fold higher than the GH9 endocellulase rcCel9B activity from the Fibrobacter succinogenes in the study by Qi et al. [154]. Qi et al. [69] further cloned an atypical GH9 gene termed Cel9D (79.4 kDa) in expressing the exocellulase (E.C. 3.2.1.74) activity, and characterized its synergism with other four endocellulases of Cel8B (81.4 kDa), Cel9B (67.3 kDa), Cel45C (37.7 kDa) and Cel51A (119 kDa) that were all cloned from the ruminal Fibrobacter succinogenes. By comparison, the GH5-p4818Cel5_2A activity (1 U = µmol D-glucose/min) on Avicel from this study was about 431-fold higher than the atypical GH9 exocellulase Cel9D activity (0.0005 U/mg protein), 307-fold higher than the endocellulase Cel8B activity (0.0007 U/mg protein), 215 fold higher than the endocellulase Cel9B activity (0.0010 U/mg protein), 1079-fold higher than the endocellulase Cel45C activity (0.0002 U/mg protein) and 359 fold higher than the endocellulase Cel51A activity (0.0006 U/mg protein) from the ruminal Fibrobacter succinogenes in the study by Qi et al. [69]. Thus, as compiled in Table 1, in comparisons with the GH5-tCel5A1 and the GH5-p4818Cel5_2A endoglucanase activities as reported by Basit and Akhtar [127] and Wang et al. 
[45], cattle ruminal bacterial endocellulase and exocellulase activities reported so far in the literature are considerably lower.
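The fold differences quoted above against the Fibrobacter succinogenes enzymes of Qi et al. [69] can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it multiplies each quoted fold difference by the corresponding comparator activity to obtain the GH5-p4818Cel5_2A activity on Avicel that each comparison implies. The function and variable names are hypothetical, and the implied reference activity is a back-calculation from the quoted numbers rather than a value taken from Table 1.

```python
# Fold differences and comparator activities on Avicel as quoted in the text
# for the Fibrobacter succinogenes enzymes of Qi et al. [69].
# Units: U/mg protein, where 1 U = 1 umol D-glucose/min.
quoted = {
    "Cel9D (atypical GH9 exocellulase)": (431, 0.0005),
    "Cel8B": (307, 0.0007),
    "Cel9B": (215, 0.0010),
    "Cel45C": (1079, 0.0002),
    "Cel51A": (359, 0.0006),
}


def implied_reference_activity(fold, comparator_activity):
    """Reference activity implied by a quoted fold difference (hypothetical helper)."""
    return fold * comparator_activity


# Back-calculate the reference activity implied by each comparison.
implied = {
    name: implied_reference_activity(fold, activity)
    for name, (fold, activity) in quoted.items()
}

mean_implied = sum(implied.values()) / len(implied)
spread = max(implied.values()) - min(implied.values())

for name, value in implied.items():
    print(f"{name}: implied reference activity = {value:.4f} U/mg protein")
print(f"mean = {mean_implied:.4f} U/mg protein, spread = {spread:.4f}")
```

If the five implied values agree closely, the quoted fold differences are mutually consistent; a large spread would point to a transcription or rounding problem in one of the comparisons.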
Arguably, highly active processive endocellulases may still exist in cow ruminal microbes, and this possibility cannot be excluded until all of the 2365 catalogued cow ruminal cellulase genes, with their large numbers of GH5 and GH9 genes, are fully expressed and characterized. Aspeborg et al. [159] reviewed that GH5 genes represent one of the largest glycoside hydrolase gene families within the 113 GH families. As summarized by Aspeborg et al. [159], while a few of the GH5 subfamilies did not encode endocellulase enzyme activities, a number of the GH5 gene subfamilies have encoded GH proteins that have lost their catalytic machinery, indicating evolution towards other novel functions. Interestingly, studies by Duan et al. [157] and Liu et al. [160] screened several novel GH5 endocellulases with little or no detectable activity on crystalline cellulose substrates from the buffalo rumen, whereas Bao et al. [54] screened a novel GH5 endocellulase with high activity (0.20 U/mg protein) on Avicel from the yak rumen, and Nguyen et al. [55] screened a novel GH5 endocellulase with very high activity (2.5 U/mg protein) on crystalline cellulose Avicel from the buffalo rumen metagenome. Last but not least, the lignin polymers that wrap around the surface of individual microfibrillar units encapsulate the hemicellulose and cellulose contents of the microfibrils and shield them from effective enzyme penetration and attack. The negative effect of lignin on fibre degradation has been well documented in animal nutrition [145,146,158,161].
Nevertheless, it can be concluded that the overall ruminal microbial cellulase activities, particularly processive endocellulase and exocellulase activities towards crystalline celluloses, are too low to allow fast and efficient lignocellulolytic degradation. The present commercial cellulase additives, which are primarily derived from the aerobic fungi Trichoderma reesei, Humicola insolens and Aspergillus niger, have been viewed as less efficacious exogenous enzyme additives, likely because these enzymes have to work in groups and in proper ratios for optimal synergism and activity, which prevents them from effectively penetrating into internal microfibrillar surfaces to act on fibre feeds high in lignin. Thus, commercialization of GH5-tCel5A1 and GH5-p4818Cel5_2A as emerging exogenous endocellulase additives has the potential to greatly improve the efficiency of plant cell wall based fibre feed utilization in livestock feeding sectors, particularly in the cattle industry.
Some GH5 endocellulases with dual substrate specificity, such as concurrent activities on glucan-mannan or glucan-xylan, have been well documented [159,165]. Pereira et al. [166] reported that the hyperthermophilic Thermotoga maritima GH5 endoglucanase Cel5A had activities on soluble CMC, barley β-glucan, xyloglucan, lichenin, galactomannan and glucomannan; however, it had no activities on the important plant cell wall substrates of crystalline celluloses and xylan. Vlasenko et al. [167] suggested, through bioinformatic phylogenetic analyses, that the GH5 multifunctional properties likely descended from an ancestral enzyme gene. Cohen et al. [118] and Qi et al. [69] reported that the GH5 processive endoglucanase Cel5A in the brown rot basidiomycete Postia placenta and the ruminal Fibrobacter succinogenes GH9D had concurrent activities on crystalline cellulose and xylan. The bi-modular, endo-acting and exo-acting processive and bi-functional thermophilic Caldicellulosiruptor bescii endocellulase CelA was shown to hydrolyze both crystalline cellulose and xylan [103], a feature for which it was highly regarded as an outstanding potential industrial cellulase for saccharification without required physicochemical pretreatment of feedstock [77,88]. However, the Clostridium thermocellum noncellulosomal GH9, the Clostridium phytofermentans GH9 and the Saccharophagus degradans GH5 processive endoglucanases were all characterized to only hydrolyze cellulose substrates [78,131,133].
The current commercial cellulase enzyme products that are primarily derived from the aerobic fungal species Trichoderma reesei, Humicola insolens and Aspergillus niger and the basidiomycete Irpex lacteus do not possess hydrolytic properties beyond crystalline celluloses, CMC, barley glucans and/or xylan. To the best of our knowledge, the GH5-Cel5A1 variants and particularly GH5-p4818Cel5_2A have been the only processive endocellulases characterized so far to have multiple substrate specificity in hydrolyzing multiple plant cell wall substrates including crystalline cellulose, pretreated cellulose and β-glucan, as well as the hemicelluloses xylan, xyloglucan, glucomannan and galactomannan.
In this context, it is known that hemicelluloses are important components of plant cell wall materials, embedding the elemental fibril crystalline cellulose core of each microfibrillar unit, reducing porosity, and hampering cellulases from further penetration into microfibrillar internal surfaces for effective hydrolysis. Physicochemical pretreatment of plant cell wall biomass materials for removal of hemicelluloses accounts for up to 22% of the total projected cost of producing lignocellulosic ethanol at commercial scale [24]. Harris et al. [168] reviewed that the current trend in the developing second-generation lignocellulose and hemicellulose ethanol industry is to use pretreatment of lower severity combined with biochemical conversion technology in order to lower the capital cost and to minimize waste generation. For example, the first commercial scale second-generation fuel ethanol plant, the Beta Renewables plant in Crescentino, Italy, utilizes only steam in its pretreatment without chemical addition [168]. Use of hemicellulases following this steam pretreatment would further increase porosity, constituent soluble sugar yields and fluidity of the pretreated feedstock slurry, since β-glucans, xyloglucans, xylan and arabinoxylans are very viscous [168]. Vincken et al. [169] showed that endocellulases, exocellulases and xyloglucanases together could exert considerable synergy in the degradation of raw plant cell wall materials. Thus, these novel GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases, with their unique multi-functional activities in hydrolyzing crystalline celluloses and hemicelluloses, have excellent potential as exogenous feed enzyme additives for livestock feeding and as highly efficient biocatalysts to enable competitive lignocellulosic ethanol production. 
Further laboratory scale of tests and field experiments need to be conducted to examine the efficacy of these novel GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases in hydrolysis of livestock feed and biofuel feedstock samples.

Optimal Biochemical and Physiological Properties of the Newly Characterized Processive Endocellulases
These novel GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases had a pH optimum in a slightly acidic to neutral pH environment of 5.0-7.0, with the majority (about 60%) of their activity retained over a wide range of pH, from the very acidic pH of 4 to the alkaline pH of 8.5 (Table 1) [45,127]. These results were consistent with the pH optimum and the effects of pH on endocellulase activity response patterns previously reported in three novel GH5 cellulases screened and characterized out of bovine rumen metagenomes [54,55,157] and rabbit cecum [56]. These GH5 endocellulase pH optima and the effects of pH on the endocellulase activities would reflect adaptive evolution of these endocellulases and their host bacteria to the ruminal, gastric, small and large intestinal luminal pH environment and the more alkaline pH environment in the middle and the distal regions of the small intestine in porcine, monogastric herbivore and bovine species. The slightly acidic to neutral pH optimum at 6-7 and the tolerance to a wide range of pH environments for these novel GH5-tCel5A1 and GH5-p4818Cel5_2A are valuable biochemical properties which enable their potential wide range of industrial applications, including use as exogenous feed enzyme additives, saccharification of feedstock for production of lignocellulosic ethanol, as well as a number of other applications in light industries such as textile, detergent, food processing and wood and paper pulp polishing.
The GH5-p4818Cel5_2A endocellulase had an optimal temperature at 50 °C with the majority (above 75%) of its endocellulase activity retained between 37 and 50 °C, which represented a marginal overlap between the mesophilic (20-45 °C) and thermophilic (50-75 °C) temperature zones, whereas tCel5A1 endocellulase activity was shown to be optimal in the thermophilic (50-75 °C) temperature zone [45,127]. Feng et al. [56] reported several novel cellulases with optimal temperatures between 40 and 55 °C screened and characterized out of rabbit cecal metagenome. Bao et al. [54] reported a novel GH5 cellulase with an optimal temperature at 60 °C screened and characterized out of yak rumen metagenome, and Nguyen et al. [55] reported a novel GH5 cellulase with an optimal temperature at 50 °C screened and characterized out of buffalo rumen metagenome. It is known that normal mammalian body core temperature is at about 37 °C and ruminal digesta core temperature is at about 39 °C [54]. The mesophilic forest soil bacterium Clostridium phytofermentans GH9 processive endocellulase was a thermophilic enzyme with an optimal temperature at 65 °C [78]. Horizontal gene transfer might have played a role in the discrepancy between the optimal temperatures of these endocellulases and the temperatures of the environments to which the aforementioned hosting bacteria adapted. These novel GH5-tCel5A1 and GH5-p4818Cel5_2A, with an optimal temperature at 50-70 °C and their major endocellulase activities retained between 37 and 50 °C, will be suitable as exogenous feed enzyme additives in a wide range of livestock feeding, also including pelleted monogastric fish, poultry and swine diets.
For non-herbivore monogastric animal species including swine, poultry and fish, a much higher thermostability is usually required for exogenous enzymes such as phytases and endocellulases because diets for these animals are commonly pelleted at high temperature (60-80 °C) [170]. Thus, a biochemical property of thermostability, as measured by residual activity after being heated at 80 °C for 10 min, will be essentially ideal for GH5-p4818Cel5_2A endocellulase to be potentially used as an exogenous enzyme additive for non-herbivore monogastric animal species [170]. On the other hand, while saccharification of feedstock with cellulases following pretreatment has been tested and developed for potential lignocellulosic ethanol production at various temperatures such as 37, 45 and 50 °C [10,76,171], a higher thermophilic temperature ranging at 60-65 °C is considered to be advantageous for commercial lignocellulosic ethanol production [88,172,173]. Thus, thermostable cellulase such as the GH5-tCel5A1, as reported by Basit and Akhtar [127], reflecting this required hyperthermophilic temperature, would be desirable for potential commercial lignocellulosic ethanol production.
Two approaches are typically used to explore these desired thermophilic cellulases, including (i) the screening of novel and highly active cellulases from thermophilic (50-75 °C) and hyperthermophilic (80-120 °C) microbes through targeted genomic sequencing of extremely thermophilic microorganisms such as Thermotoga maritima for the Cel5A1 and its variants [137,174]; and (ii) the improvement of selected mesophilic cellulases through mutagenesis-based enzyme protein engineering [175,176] and/or glycosylation modification-based enzyme property modification [177]. The first approach, represented by the Cel5A1, may be limited because the selected thermostable cellulases that are expressed and screened out of thermophilic and hyperthermophilic microbes will most likely have an optimal temperature within the thermophilic temperature zone, which will not be suitable for exogenous enzyme additives in livestock feeding sectors. For example, the Clostridium phytofermentans GH9 thermophilic processive endocellulase with an optimal temperature at 65 °C, as reported by Zhang et al. [78], is not ideal to serve as an exogenous enzyme additive in livestock feeding, since this enzyme will lose about 60-70% of its maximal activity once fed through the animals' digestive tract system at a temperature ranging from 37 °C in mammals to 41 °C in poultry. The hyperthermophilic Caldicellulosiruptor bescii bi-functional processive endocellulase CelA was effective in enzymatically converting pretreated and raw biomass materials at 75-85 °C, as demonstrated by Brunecky et al. [77]; however, this enzyme would not be very active in animals' digestive systems as a good exogenous enzyme additive in livestock feeding. Thus, upon further optimization for improving its thermostability, reaching 75-85 °C through effective strategies of enzyme protein engineering and glycosylation modifications, this GH5-p4818Cel5_2A processive endocellulase, reported by Wang et al. 
[45], has the potential to be used for various industrial applications at mesophilic, thermophilic and hyperthermophilic temperature conditions. Further research needs to be conducted to examine the efficacy of this hyperthermophilic processive endocellulase GH5-tCel5A1 and its variants, as reported by Basit and Akhtar [127], as exogenous cellulases for food animal feeding.
The protease resistance of cellulases is generally not an immediate concern for some industrial applications such as lignocellulosic ethanol production, textile and wood and paper pulp bioprocessing. In first-generation grain ethanol production, proteases are used to maximize starch hydrolysis and sugar yields by dismantling starch-gluten complexes [168]. Prospectively, the protease resistance of cellulases will also likely become important for commercial lignocellulosic ethanol production, since inclusion of proteases in biochemical based pretreatment and subsequent enzymatic saccharification of feedstocks would potentially reduce interference from nitrogenous compounds during the breakdown of cellulose and hemicellulose associated with plant cell wall materials. The protease resistance feature is important to consider for novel cellulases that are to be used for animal feed industrial applications and other food processing applications associated with the presence of significant levels of proteases. As reported by Wang et al. [45], GH5-p4818Cel5_2A endocellulase activities on CMC demonstrated a remarkable resistance to the proteolytic activities (5000 U/mL test media) of trypsin, whereas the GH5-p4818Cel5_2A endocellulase activities on CMC were dramatically decreased after 30 min of incubation in the presence of chymotrypsin activity (200 U/mL test media). Trypsin and chymotrypsin activities in small intestinal fluids were reported to range from 43 to 45 U/mL, and from 6.1 to 7.9 U/mL, respectively [178]. Trypsin and chymotrypsin activities used in the experiments by Wang et al. [45] were much higher than these protease activities reported in growing pigs in the literature [178]. Thus, it may be concluded that GH5-p4818Cel5_2A would be somewhat resistant to trypsin activity. 
Further research needs to be pursued to investigate the gastrointestinal stability of promising novel cellulases for their potential development as efficacious exogenous feed fibre enzymes in the agricultural food animal industry.
Both trypsin and chymotrypsin are endopeptidases. Trypsin specifically cleaves Arg and Lys residue-based peptide bonds, while chymotrypsin preferentially hydrolyzes peptide bonds based on the aromatic amino acid residue Trp as well as Tyr, Phe, Leu and Met based peptide bonds [38]. In this context, it should be pointed out that glycoside hydrolases such as endocellulases, which are ubiquitous, typically exhibit a tunnel or a cleft catalytic site that is lined with aromatic Trp residues for processing carbohydrates [86]. GH5-p4818Cel5_2A was of bacterial origin and was expressed in competent non-pathogenic E. coli cells; thus, this GH5-p4818Cel5_2A endocellulase enzyme protein was not likely glycosylated. Therefore, this GH5-p4818Cel5_2A endocellulase, with Trp residues in its catalytic site and without glycosylation, is, in principle, very vulnerable to proteolytic activity by chymotrypsin. On a separate note, the much higher chymotrypsin activity load in the study by Wang et al. [45] in comparison with the levels reported in the Fang et al. [178] study (about 25- to 33-fold, based on the 200 U/mL used versus the 6.1-7.9 U/mL reported) was also likely responsible for the dramatic decline pattern of the GH5-p4818Cel5_2A endocellulase activities shown by Wang et al. [45]. On the other hand, it was reported that in cecal digesta, a high residual trypsin activity was observed, while negligible residual chymotrypsin activity was seen [179]. Thus, the lack of resistance to chymotrypsin activity by this GH5-p4818Cel5_2A endocellulase reported by Wang et al. [45] was also likely due to the fact that the bacteria that expressed this cellulase had evolved to adapt to the cecal environment.
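The protease loads used in the in vitro assays can be compared directly with the physiological activities quoted above for small intestinal fluids. The following is a minimal sketch of that arithmetic, assuming the assay activities (5000 U/mL trypsin, 200 U/mL chymotrypsin) and the physiological ranges (43-45 and 6.1-7.9 U/mL) are expressed in comparable units; the function name is hypothetical.

```python
def fold_excess(assay_activity, physiological_low, physiological_high):
    """Return (min_fold, max_fold) of the assay load over a physiological range.

    All activities are assumed to be in the same units (U/mL).
    """
    # Dividing by the upper physiological bound gives the smallest fold excess;
    # dividing by the lower bound gives the largest.
    return (assay_activity / physiological_high,
            assay_activity / physiological_low)


# Values as quoted in the text: assay loads from Wang et al. [45],
# small intestinal fluid activities from reference [178].
trypsin_fold = fold_excess(5000, 43, 45)
chymotrypsin_fold = fold_excess(200, 6.1, 7.9)

print(f"trypsin: {trypsin_fold[0]:.0f}- to {trypsin_fold[1]:.0f}-fold")
print(f"chymotrypsin: {chymotrypsin_fold[0]:.0f}- to {chymotrypsin_fold[1]:.0f}-fold")
```

The same helper could be reused for pepsin once assay activities under gastric conditions become available.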
Akiba et al. [180] reported that a fungal Aspergillus niger endocellulase was highly resistant to proteases. Morgavi et al. [181] showed that several fungal cellulases were resistant to proteolytic activities of bovine ruminal microbial origin and that the addition of non-enzyme sources of proteins could further dilute the ruminal proteolytic effects on the fungal endocellulase activities. Bai et al. [182] reported a novel GH9 beta-glucanase that was cloned and characterized from thermoacidophilic Alicyclobacillus sp. and was highly resistant to various proteases when this enzyme was expressed in the yeast Pichia pastoris, likely in a glycosylated form. Nguyen et al. [55] reported that a novel GH5 cellulase from buffalo rumen metagenome was highly resistant to proteases. Beckham et al. [177] viewed that O-glycosylation imparts protease resistance because glycans presumably block protease access to the protein backbone. However, pepsin is an important animal gastric protease and its effects on cellulase activities are not well reported in the literature. Woyengo et al. [183] reported that porcine gastric pepsin activity ranged from 143 to 266 PU (pepsin activity unit)/mL digesta fluid. Further efforts should be made to examine the effects of pepsin, trypsin and chymotrypsin activities on both GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulase activity in the presence of feed proteins under close to normal physiological conditions, such as at the porcine physiological body temperature of 37 °C, a typical gastric pH for testing the pepsin activity effect and a typical small intestinal luminal pH for testing the trypsin and chymotrypsin activity effects. 
The expression of both GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases of the bacterial origins in the major commercial fungal protein production platforms such as Trichoderma reesei, Humicola insolens, Aspergillus niger and yeast Pichia pastoris and the subsequent expression effects on GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulase glycosylation and protease resistance should be further investigated for future commercial implications.

Small Molecular Size and Monomodular Structure of the Newly Characterized Processive Endocellulases
These novel GH5-tCel5A1 and GH5-p4818Cel5_2A processive cellulases were characterized to be relatively small monomodular enzyme proteins with molecular weights of about 37 and 42 kDa [45,127]. Erickson [93] viewed that most single modular proteins would fold into one globular domain and polypeptides larger than 50 kDa typically form two or more domains. Thus, small molecular weight monomodular cellulases likely form various spherical shapes with different radii, while bi-modular, larger molecular weight cellulases tend to show ellipsoid and tadpole shapes or other irregular shapes [97,98]. As shown by Wang et al. (2019), the predicted GH5-p4818Cel5_2A endocellulase structure resembles a globular shape. Wang et al. [45] reported that GH5-p4818Cel5_2A was estimated to have a diameter of 4.6 nm calculated according to the model by Erickson [93]. Thus, GH5-tCel5A1, with its lower molecular weight as reported by Basit and Akhtar [127], would have a somewhat smaller diameter compared with GH5-p4818Cel5_2A. Carpita et al. [18] reported that pore size diameters in various plant cell wall materials ranged between 3.5 and 5.2 nm. Several studies documented that physicochemically pretreated lignocellulosic biomass materials and crystalline cellulose model substrates were associated with much larger pore diameters, ranging between 5.1 and 11.0 nm [94-97]. Thus, these GH5-tCel5A1 and GH5-p4818Cel5_2A processive cellulases should be effective in penetrating into microfibrillar internal surfaces of raw biomass materials such as fibre feeds and feedstocks because of their small molecular weight and size.
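The size argument above can be reproduced with the minimal-sphere relation from Erickson's model, R_min = 0.066 M^(1/3) (R_min in nm, M in Da), which treats the protein as a smooth sphere with a partial specific volume of about 0.73 cm³/g. The following is a minimal sketch under that assumption; the function names are hypothetical, and the assignment of 37 kDa to GH5-tCel5A1 and 42 kDa to GH5-p4818Cel5_2A is inferred from the order in which the values are given in the text. For 42 kDa the relation gives a diameter of about 4.6 nm, in line with the estimate reported by Wang et al. [45], and for 37 kDa about 4.4 nm.

```python
def min_sphere_diameter_nm(mass_da):
    """Minimal spherical diameter (nm) of a globular protein of mass `mass_da` (Da).

    Uses R_min = 0.066 * M**(1/3), the smooth-sphere approximation attributed
    to Erickson's model; real proteins are somewhat larger and less regular.
    """
    r_min = 0.066 * mass_da ** (1.0 / 3.0)
    return 2.0 * r_min


def fits_pore(diameter_nm, pore_max_nm):
    """True if the estimated diameter does not exceed the largest pore diameter."""
    return diameter_nm <= pore_max_nm


# Molecular weights quoted in the text (assignment to enzymes is an assumption).
enzymes = {"GH5-tCel5A1": 37_000, "GH5-p4818Cel5_2A": 42_000}

# Pore diameter ranges quoted in the text (nm).
RAW_CELL_WALL_PORES = (3.5, 5.2)      # Carpita et al. [18]
PRETREATED_PORES = (5.1, 11.0)        # pretreated biomass and model substrates

diameters = {name: min_sphere_diameter_nm(m) for name, m in enzymes.items()}

for name, d in diameters.items():
    print(f"{name}: ~{d:.1f} nm; "
          f"within largest raw pores: {fits_pore(d, RAW_CELL_WALL_PORES[1])}; "
          f"within largest pretreated pores: {fits_pore(d, PRETREATED_PORES[1])}")
```

Because the relation gives a lower bound on size, a result close to the upper end of the raw cell wall pore range should be read as "plausibly able to enter the larger pores", not as a guarantee of penetration.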
Very few endocellulases have been reported in the current literature with the combined biochemical features of having a single module, a small molecular weight and size, and being highly active in hydrolyzing crystalline celluloses and/or hemicelluloses. Cohen et al. [118] reported a novel single modular processive GH5 cellulase Cel5A with a molecular weight of 42 kDa from the brown rot basidiomycete Postia placenta, which had about 57-fold lower activity on Avicel in comparison with the novel GH5-p4818Cel5_2A endocellulase reported by Wang et al. [45]. Watson et al. [133] reported that the Saccharophagus degradans GH5 processive endoglucanase Cel5H had a molecular weight of 68 kDa and was bi-modular. Bao et al. [54] reported a novel single modular processive GH5 cellulase with a predicted molecular weight of 40 kDa from yak rumen metagenome with activity (0.200 U/mg protein) on Avicel as well as hydrolysis of β-glucans, which is, in our review, the only reported endocellulase with a small molecular weight and activity in hydrolyzing crystalline celluloses comparable to this novel GH5-p4818Cel5_2A. However, it should be pointed out that this yak ruminal processive GH5 cellulase only had activities on β-1,3/1,4-glucans including soluble and insoluble celluloses and had no activities towards other hemicelluloses [54]. Nguyen et al. [55] reported a novel GH5 cellulase with a molecular weight of 60 kDa from buffalo rumen metagenome without specifying if this enzyme had a distinct second CBM. 
Several other well characterized GH9 cellulases were also bi-modular, including the Thermobifida fusca processive endoglucanase Cel9A with a molecular weight of 90.2 kDa [120], the Clostridium thermocellum noncellulosomal GH9 processive endocellulase with a molecular weight of 93 kDa [131], the ruminal Fibrobacter succinogenes exocellulase GH9D with a molecular weight of 76 kDa [69], and the Clostridium phytofermentans GH9 thermophilic processive endocellulase with a molecular weight of 104.9 kDa [135]. Moreover, a bi-modular, bi-functional endo-acting and exo-acting processive thermophilic Caldicellulosiruptor bescii endocellulase CelA was characterized with a large molecular weight of 230 kDa by Zverlov et al. [103]. This thermophilic Caldicellulosiruptor bescii endocellulase CelA was further characterized to have an estimated effective size of 10-35 nm, and was highly regarded as an outstanding potential industrial cellulase for digesting raw feedstocks because of its unique mechanism of cellulose degradation [77].
Therefore, to the best of our knowledge, these novel GH5-tCel5A1 and GH5-p4818Cel5_2A processive endocellulases are among the only processive endocellulases documented so far that are monomodular and of small molecular weight while having high activity on crystalline celluloses and multi-functional hydrolytic activities towards hemicelluloses, and they should be highly penetrating in effectively digesting raw biomass materials such as fibre feeds and feedstocks as ideal industrial biocatalysts.

Structure and Functionality of the Newly Characterized Processive Endocellulases
Unique functionality of enzyme proteins is supported by their unique structures. It has been well documented that substrate specificity and the mode of action of various glycosyl hydrolases are governed by exquisite details of their three-dimensional structures rather than their global fold, with three active-site topologies: (i) a pocket or crater; (ii) a cleft or groove; and (iii) a tunnel [184]. These unique GH5-tCel5A1 and GH5-p4818Cel5_2A processive cellulases have three major unique biochemical properties within one enzyme molecule in comparison with other endocellulases reported in the literature. Firstly, the GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases are small monomodular enzyme proteins with a molecular weight below 50 kDa and relatively high activity on crystalline celluloses [45,127]. Only two other reported studies in the literature have so far shown that an endocellulase with a single module and such a comparably small molecular weight and size could hydrolyze crystalline cellulose substrates, namely the study by Cohen et al. (2005) [118] on the Cel5A from the brown rot basidiomycete Postia placenta and the study by Bao et al. (2011) [54] on a yak ruminal GH5 cellulase. Secondly, both GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases have exocellulase activity on crystalline celluloses and thus are processive endocellulases. Only two other reported studies in the literature have shown that a processive endocellulase without a CBM could hydrolyze crystalline cellulose substrates, namely the study by Cohen et al. [118] on the Cel5A from the brown rot basidiomycete Postia placenta, and the study by Bao et al. [54] on a yak ruminal GH5 cellulase. 
Classic processive endocellulases are primarily GH9 cellulases, and they all contain a CBM, including the Clostridium thermocellum noncellulosomal endoglucanase GH9B [131], the Thermobifida fusca processive endoglucanase GH9A (E4) [185], and the Clostridium phytofermentans processive endoglucanase Cel9 [135]. Structural analyses by Sakon et al. [185] showed that the presence of a CBM was essential for the processive nature of these GH9 endocellulases. However, Watson et al. [133] reported that the presence of a CBM was not essential for the processive property of the Saccharophagus degradans GH5 processive endoglucanases. Thirdly, apart from hydrolyzing crystalline celluloses, both GH5-tCel5A1 and GH5-p4818Cel5_2A have been shown to be active on several hemicelluloses, i.e., β-glucans, xylan, xyloglucans, mannan, glucomannan and galactomannan. To the best of our knowledge, no other cellulases with such wide substrate specificity have been reported in the literature.
Sequence alignments of several different GH5 processive cellulases, including both GH5-tCel5A1 and GH5-p4818Cel5_2A, have been previously illustrated in a supplemental figure [45]. Wang et al. [45] attempted to predict the structure of the GH5-p4818Cel5_2A endocellulase with the SWISS-MODEL online server. The predicted TIM barrel fold of GH5-p4818Cel5_2A is consistent with the TIM barrel motif superfamily previously reported for the GH5 endocellulase CelA from Clostridium cellulolyticum [186]. A cleft or groove structure was apparent in GH5-p4818Cel5_2A, as was shown for the Clostridium cellulolyticum GH5 endocellulase CelA by Ducros et al. [186] and reviewed by Davies and Henrissat [184]. The high activity of GH5-p4818Cel5_2A on the branched substrates xyloglucan and locust bean gum, as summarized in Table 1, would suggest the presence of a cleft active site in this cellulase structure, since branched substrates such as xyloglucans and locust bean gum could not effectively fit into a tunnel-like or a pocket active site topology [162]. Yaoi et al. [187] further demonstrated that a cleft active site was also essential for exo-mode hydrolysis of xyloglucans by an endocellulase. The two Glu residues that putatively serve as the catalytic acid/base (proton donor) and the nucleophile, including Glu136, are consistent with the strictly conserved residues of the GH5 catalytic domain [186]. Intriguingly, a long tunnel-like active site with multiple Trp residues is clearly visualized along the enzyme protein surface, extending from the C-terminal ends of the β-strands of the TIM barrel [45]. This predicted long tunnel-like active site topology provides additional evidence that GH5-p4818Cel5_2A has an exo-acting mode and is thus a processive GH5 endocellulase [45]. All exocellulases are processive [73].
Exocellulases have a tunnel-like active site topology [72,188,189]. However, Sakon et al. [185] showed that the processive GH9A (E4) endocellulase of Thermobifida fusca had a cleft rather than a tunnel-like active site topology, and that a CBM was essential for this enzyme's activity on crystalline celluloses. McGrath and Wilson [132] argued that substrate binding, or the formation of cellulase-substrate complexes, was essential for characterizing the relationship between cellulase structure and catalysis. Given the multi-functional properties of the novel GH5-p4818Cel5_2A endocellulase, crystallographic ensembles derived from multiple substrate binding states should be generated and used to study its functional structural variations [190]. The structure and functionality of GH5-tCel5A1 have been revealed in detail [127,191]. Thus, it is very likely that GH5-p4818Cel5_2A uniquely combines a cleft and a tunnel-like active site topology in its structure to support its multifunctionality, a hypothesis that needs to be resolved in our future studies. Further crystallization and structural characterization of this GH5-p4818Cel5_2A processive cellulase will facilitate rational design-based mutagenesis to improve the functionality of these enzymes, such as their resistance to mammalian proteases.
To facilitate the effective adoption of these two unique GH5-tCel5A1 and GH5-p4818Cel5_2A processive endocellulases for industrial applications, the following three aspects of research need to be pursued. Firstly, the specific activities of the wild-type GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases on crystalline celluloses and hemicelluloses will need to be further enhanced. Harris et al. [168] articulated that small-scale innovations through enzyme discovery in the laboratory could exert large-scale impacts when rolled out in industrial applications, as demonstrated in commercial ethanol production. Lynd et al. [24] reported in their review that enzyme and pretreatment expenses represent up to about 48% of the total cost of commercial lignocellulosic ethanol production. Directed evolution is a very powerful protein-engineering tool for improving enzyme properties such as specific activity and thermostability without an immediate in-depth understanding of enzyme protein structure and enzyme-substrate interactions [78]. Even though the GH9 processive endocellulase from Clostridium phytofermentans has a large molecular weight (over 100 kDa) and is multi-modular, Ahmad et al. [79] demonstrated this approach by using B. subtilis as a host and applying directed evolution to improve the activity of this enzyme on a solid cellulosic substrate by about two-fold. Efforts ought to be made to further enhance the activities of both GH5-tCel5A1 and GH5-p4818Cel5_2A by directed evolution using B. subtilis as a host.
Once detailed structural and functional relationships have been resolved for GH5-p4818Cel5_2A in the future, a rational design approach can also be undertaken to further optimize the properties of the GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases [175,177,192], considering their small molecular size and monomodular features. Secondly, these unique GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases are currently over-expressed in non-pathogenic E. coli competent cells for laboratory characterization, and the advantages of these competent cells for the characterization of novel enzymes have been reviewed by Uchiyama and Miyazaki [193]. Various engineered E. coli cells, including an endotoxin-free E. coli strain referred to as ClearColi® BL21(DE3) and developed by Mamat et al. (2015), are widely used for the commercial-scale production of pharmacological polypeptide and enzyme products [194]. The use of E. coli competent cells for the commercial-scale production of industrial biomass degradation enzymes is being promoted [195]; however, approval by governmental regulatory bodies would be needed if the resulting biomass enzymes are to be used as exogenous fibre enzyme feed supplements. Several microbial platforms have been developed and used for industrial and biorefinery-based commercial enzyme production via solid state fermentation (SSF) or submerged fermentation (SmF) types of microbial cultivation, including Trichoderma reesei and/or Trichoderma longibrachiatum by Novozymes, Genencor-Danisco-Dupont, Dyadic International, AB Vista Enzymes, Biocatalysts Limited, and the former Iogen Bioproducts (purchased by Novozymes in 2013); Aspergillus niger by Novozymes and Amano Enzymes Inc.; and Bacillus sp. by Maps Limited and Specialty Enzymes & Biotechnologies Co. (SEB) [50].
A logical next step of research would be to engineer and express these two unique GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases in one of the above commercial microbial platforms, i.e., Trichoderma reesei, Trichoderma longibrachiatum, Aspergillus niger or Bacillus sp., by working with one or more of the major industrial enzyme companies towards their potential commercial production.
Thirdly, the concept of consolidated bio-processing (CBP), in which the de-polymerization of structural carbohydrates into their monomeric constituents and the fermentation of hexose and pentose sugars into ethanol take place in a single step in one platform microorganism, has been proposed and applied for the production of lignocellulosic ethanol and other biochemicals. Lynd et al. [24] reported in their review that an effective CBP alone could account for about 41% of the total cost of commercial lignocellulosic ethanol production. Several microbial platforms have also been researched and developed for potential industrial applications, including the engineered CBP organism Clostridium phytofermentans (ATCC 700394) [196], the engineered CBP organism Bacillus sp. [76] and ethanol-fermenting yeasts such as Pichia stipitis and Saccharomyces cerevisiae [50,197]. Thus, direct engineering of optimized versions of these two unique GH5-tCel5A1 and GH5-p4818Cel5_2A endocellulases, along with a highly active β-glucosidase gene, into the CBP organism Bacillus sp. or the Pichia stipitis or Saccharomyces cerevisiae system would potentially allow the development of a cost-effective CBP system for the commercial production of lignocellulosic ethanol from raw feedstock materials.
In summary, while gene-cloning and species-targeted genome sequencing techniques have traditionally been applied for identifying and characterizing cellulase genes, metagenomic approaches, including gene-centric metagenomic analysis and metagenomic expression library screening, have further widened the opportunities for the discovery and characterization of novel microbial cellulase genes from various environmental resources. Upon reviewing the present literature on cellulase discovery and characterization, the two recently characterized GH5-tCel5A1 and GH5-p4818Cel5_2A processive endocellulases are small molecular weight, monomodular and processive β-1,4-endoglucanases with an estimated spherical diameter of 4.6 nm or smaller.
These two unique GH5-tCel5A1 and GH5-p4818Cel5_2A processive endocellulases are highly active on crystalline and pre-treated celluloses and have multifunctionality towards several hemicelluloses including β-glucans, xylan, xyloglucans, mannans, galactomannans and glucomannans. Therefore, these two unique GH5-tCel5A1 and GH5-p4818Cel5_2A processive endocellulases have novel structural and functional properties for potentially important industrial applications.
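As a rough illustration of how a spherical diameter of this order relates to a molecular weight below 50 kDa, the following minimal sketch treats a protein as a compact, unhydrated sphere with a typical partial specific volume of 0.73 cm³/g. This is a common textbook approximation and not necessarily the method used in the cited studies; the function name and the example masses (42 and 50 kDa) are illustrative assumptions and are not measured values for GH5-tCel5A1 or GH5-p4818Cel5_2A.

```python
import math

# Assumed typical partial specific volume of proteins (cm^3 per gram).
PARTIAL_SPECIFIC_VOLUME_CM3_PER_G = 0.73
AVOGADRO = 6.02214076e23      # molecules per mole
NM3_PER_CM3 = 1e21            # cubic nanometres in one cubic centimetre


def min_sphere_diameter_nm(mass_da: float) -> float:
    """Diameter (nm) of the smallest smooth sphere that could hold a protein
    of the given mass (Da), ignoring hydration and any non-spherical shape."""
    if mass_da <= 0:
        raise ValueError("mass_da must be positive")
    # Volume occupied by one molecule, in nm^3.
    volume_nm3 = PARTIAL_SPECIFIC_VOLUME_CM3_PER_G * NM3_PER_CM3 / AVOGADRO * mass_da
    # Invert V = (4/3) * pi * r^3 to get the radius.
    radius_nm = (3.0 * volume_nm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 2.0 * radius_nm


if __name__ == "__main__":
    # Hypothetical masses chosen only to bracket the "below 50 kDa" range.
    for mass_kda in (42, 50):
        diameter = min_sphere_diameter_nm(mass_kda * 1000)
        print(f"{mass_kda} kDa -> about {diameter:.2f} nm")
```

Under these assumptions, a mass of roughly 42 kDa corresponds to a minimum diameter of about 4.6 nm, and 50 kDa to about 4.9 nm. Real enzymes are hydrated and not perfectly spherical, so such values should be read as lower bounds rather than measured dimensions.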