Integrated -Omics: A Powerful Approach to Understanding the Heterogeneous Lignification of Fibre Crops

Lignin and cellulose represent the two main components of plant secondary walls and the most abundant polymers on Earth. Quantitatively one of the principal products of the phenylpropanoid pathway, lignin confers high mechanical strength and hydrophobicity to plant walls, thus enabling erect growth and high-pressure water transport in the vessels. Lignin is characterized by a high natural heterogeneity in its composition and abundance in plant secondary cell walls, even in the different tissues of the same plant. A typical example is the stem of fibre crops, which shows a lignified core enveloped by a cellulosic, lignin-poor cortex. Despite the great value of fibre crops for humanity, however, still little is known on the mechanisms controlling their cell wall biogenesis, and particularly, what regulates their spatially-defined lignification pattern. Given the chemical complexity and the heterogeneous composition of fibre crops’ secondary walls, only the use of multidisciplinary approaches can convey an integrated picture and provide exhaustive information covering different levels of biological complexity. The present review highlights the importance of combining high throughput -omics approaches to get a complete understanding of the factors regulating the lignification heterogeneity typical of fibre crops.


Introduction
In the list of highly sought-after goals for a sustainable exploitation of natural resources, the use of plant-based material to replace fossil carbon-derived products ranks undoubtedly among the top. The increase in the world population, together with rapid industrial development, pushes towards the depletion of petrochemicals and the consequent need to find sustainable sources of raw materials. Plant-sourced raw material, i.e., lignocellulosic biomass, is such a renewable resource that can reduce our dependence on fossil carbon. Plant cell walls are indeed valuable natural products which find a wide array of industrial applications, spanning from conversion into energy to material development. The food-oriented agricultural industry is progressively being redirected towards one that also ensures raw material for pulp and paper, textiles, construction and biofuels [1][2][3]. This aspect is particularly important, as it offers an alternative to forest-sourced biomass. In view of the foreseen deficit in wood caused by the decreasing capacity of the world's forests to supply woody biomass [4,5], much effort is devoted towards finding other sources of renewable raw material. The use of non-woody plants, as well as agricultural byproducts (e.g., corn, wheat, rice, sorghum, barley, sugarcane, pineapple, banana, coconut [6]), which can supply lignocellulosic biomass for different industrial needs, is therefore receiving increasing scientific and societal interest. The advantages of non-woody plants are clear: short growth cycles, moderate irrigation requirements and relatively low lignin content [1,3] are emblematic examples. Although agricultural byproducts such as rice husks can be used for the production of energy, the consistent quality required for most industrial processes can only be met by the cultivation of plants specifically grown for fibres, collectively known as fibre crops. 
These are non-woody fibres, which hold great potential as sources of raw materials for different industrial sectors. A prime example of the impact of fibre composition on the usability of plant biomass is offered by alfalfa (Medicago sativa), which has been selected for centuries for a decreasing lignin content, to increase the digestibility of this queen of forages.
The fibrous mass of these plants is characterized by a low lignin content and by the occurrence of fibres highly enriched in cellulose. These features make them extremely appealing not only as models to study secondary cell wall formation, but also as sources of valuable raw materials used in different industrial sectors. Their fibres (called bast fibres) show physical properties (namely tensile strength) which make them optimal for the woven textile industry, but at the same time their low lignin content is a desirable feature for biomass saccharification. Their woody core fibres (a.k.a. hurd or shives) are excellent for paper production, as insulating material and as a fibrous component in concrete and wall coatings, and once crushed to powder they can be used to make biodegradable plastics. It is hence clear that progress in an integrated study of the mechanisms which regulate the tissue-specific secondary cell wall composition of fibre crops can help devise strategies to maximize the exploitation of their lignocellulosic biomass.

Plant Fibres: Nature's Treasure Trove
Plant lignocellulosic fibres hold great value both for the environment and the economy, as they favor the shift towards a bio-based economy. Moreover, they shape important plant features, such as the mechanical properties of wood and the quality of textiles and paper. True fibres are cells of the sclerenchyma, i.e., plants' mechanical tissue [9], characterized by the presence of secondary cell walls. Several aspects of fibre ontogenesis are still not understood [10,11]; however, studies on model plants have proved very useful to shed light on some mechanisms. It is here worth mentioning the work carried out on Arabidopsis thaliana which has delivered milestone data concerning, for instance, the cascade regulating the transcriptional wiring of secondary cell wall formation [12][13][14][15][16][17][18][19][20][21][22][23][24][25][26]. Moreover, A. thaliana offers a wide collection of cell wall mutants [27][28][29][30][31][32][33][34][35][36][37], and by modulating its growth conditions (such as photoperiod, temperature and hormone treatment [10]) it is possible to affect fibre formation. It was furthermore demonstrated that this herbaceous model plant can be a useful system for the study of the formation of gelatinous fibres (G-fibres) [38]. The limits in the study of fibre formation encountered with Arabidopsis can be overcome with poplar, which has contributed to a deep understanding of the formation of woody fibres in trees [39][40][41][42][43][44][45]. The availability of the genome [46], together with the possibility of downregulating gene expression [47], makes poplar an attractive model for fibre secondary cell wall formation. However, a detailed study on isolated fibres is challenging, because they are usually found mixed with other cell types [11], and therefore a preparation of pure poplar fibres is very difficult to attain. Important data concerning secondary cell wall biogenesis also come from the legume model plant M. truncatula [48][49][50][51], which was indeed shown to be an appropriate system for cell wall studies.
Fibre crops, principally hemp and flax for which the genomes are available [52,53], are excellent systems for the analysis of fibre formation and the great advantage they offer is the possibility of studying tissues with dramatically different lignification patterns in the same plant. The stem of these fibre crops shows a distinct anatomical structure which makes them particularly suitable for molecular studies. Their cortical bast fibres, characterized by the occurrence of a high percentage of cellulose and a low lignin content, envelop a central lignified core (Figure 1), a radial distribution that favors the study of tissues with contrasting composition that are isogenic and grown under exactly identical conditions. Therefore the application of high throughput holistic (proteomics, metabolomics, transcriptomics), as well as targeted approaches (expression analysis of a subset of genes, targeted metabolite profiling), is feasible. The outer tissue can be peeled off from the woody core and the isolated material is devoid of "contamination" coming from the lignified inner tissue. The use of an integrated biology approach can help to shed light on the molecular processes which intervene in the two tissues and determine the contrasting nature of secondary cell walls' composition. Although some molecular studies (mainly concerning flax and hemp tissues) are available in literature (summarized in Table 1), a systems biology approach integrating multilevel -omics data is still missing. In the last decade the approach to biological studies has changed and these are now conducted at the systems level [54,55], through the integration of data from different sources (i.e., integrative systems biology). The integration of all these data flows into a general model which, applied to fibre crops, can provide a depth of analysis never so far attained.

Lignin: The Lord of the (Aromatic) Rings
Plant secondary metabolism comprises an array of complex pathways which shape the biochemical plasticity of plants. Considered dispensable for plant survival, it fuels cells with important organic compounds which mediate relevant events, such as defense and structural support. Within plant secondary metabolism, the phenylpropanoid pathway undoubtedly occupies a central position, as it generates the common precursor, p-coumaroyl-CoA, for the biosynthesis of flavonoids and lignin building blocks. Molecules from this pathway are involved in structural support (lignin), redox homeostasis (chlorogenic acid), reproduction (flavonoids as flower pigments) and many other aspects of plant biology [56,57]. The importance of this metabolic route is evidenced by numerous recent studies aimed at increasing our understanding of this pathway and often related to redirecting cell wall lignification through genetic manipulation [58][59][60][61][62][63]. The main goal of such studies was to increase the digestibility of plant biomass favoring bio-ethanol production, either through a transgenic approach [64][65][66][67][68][69][70][71], or through the incorporation of monolignol analogs/substitutes [72][73][74].
Table fragment (entry for ref. [84]): secondary metabolism-associated genes highly represented in stems; 3 FLAs, laccases overexpressed in core tissue; 3 FLAs overexpressed in outer stem tissue; 6 LTPs, lipid/wax metabolism and photosynthesis-associated genes overexpressed in outer tissue; cultivars differ in cell wall- and biotic stress response-related genes [84]
Lignin, a polyaromatic, hydrophobic, heterogeneous biopolymer, is the main product of plant secondary metabolism and is, by mass, second only to cellulose in global production. Its structural complexity confers an intrinsic "chaotic" nature and often theoretical analyses spanning from fractal geometry to deterministic chaos concepts are invoked to describe the multi-level intricacy of its organization [91]. 
This surprising structural complexity constitutes a sort of chemical "fingerprint": It might be virtually impossible to find in nature two identical lignin macromolecules with the same degree of polymerization or succession of phenylpropane units, and therefore it is often preferred to refer to this aromatic polymer as "lignins" [91]. Recently, improvements to techniques such as atomic force microscopy, FTIR spectroscopy [92], and NMR [93] have greatly helped in fostering the structural study of lignin.
Lignin constitutes an inspiring model for the creation of biomimetic materials with desirable physical properties and this is an issue of particular relevance for the construction sector, which is shifting towards an eco-friendly building concept, through the use of plant-sourced materials instead of petrochemicals. Lignin indeed contributes to the elastic modulus of timber while conferring strength, and is waterproof.
The physiological role of lignin is that of biological "glue" for wall polysaccharides, and its presence impairs saccharification of plant biomass, as it impregnates cellulose, thus hindering its accessibility to hydrolytic enzymes. However, its function is to ensure vascular integrity, fortify the cell wall against pathogen attack and to confer increased hydrophobicity and mechanical rigidity.
Plant secondary cell walls show a sort of "promiscuity" towards lignin structures: They can tolerate variation in the polymer macrostructure, therefore providing extreme flexibility in situations of environmental stress or in case of transgene-induced modifications. The enzymes which intervene in the lignin biosynthetic pathway, the efficiency in energy and carbon retention of which were recently calculated [94], constitute a metabolic "grid" [95,96] which ensures plasticity in case of natural or artificial modifications. Multigene families indeed exist [97][98][99][100][101] and isozymes can function in case of alterations in the pathway [96].
Metabolic channeling in the phenylpropanoid pathway has been demonstrated for phenylalanine ammonia-lyase (PAL) and cinnamate-4-hydroxylase (C4H) [102,103] and recent evidence in the literature has shown the existence of protein-protein/protein-membrane interactions for CYP73A5 and CYP98A3, two Arabidopsis cytochrome P450 enzymes catalyzing the hydroxylation of the phenolic ring of monolignols [104]. The enzymes involved in the pathway therefore seem to present physical cross-talk which fine-tunes the pathway.
Although the availability of different plant mutants defective in genes involved in the lignin deposition route [105][106][107] has contributed to shed light on some of its biosynthetic aspects, certain events, as for instance polymerization and precursors' export, are only starting to be tackled [108,109]. For a recent review on the knowledge on lignin deposition and the gaps in the latter see Liu [110]. The analysis of plants with altered lignification by means of an integrated approach can provide valuable data and can fill the gaps still present in the study of secondary cell wall formation in plants.

The Particular Cell Wall of Bast Fibres
A great heterogeneity in wall lignification exists in nature: Among different species, in the different tissues within the same plant [111] and in response to biotic/abiotic stresses [96,111,112]. Particularly rich in lignin heterogeneity are plant fibres. One of the most striking examples of differential lignification (both qualitative and quantitative) within the same plant is found in fibre crops. These plants contain small amounts of lignin, which is abundant in the core, while the cortex harbors bundles of bast fibres particularly rich in cellulose.
Bast fibres are cortical extraxylary sclerenchymatous structures which provide mechanical support to the conductive elements of the phloem and are typically present in bundles held together by pectins and lignin. Their cell walls are composed primarily of cellulose (which can make up 75%-80% of the dry mass in C. sativa), hemicelluloses (5%-16%) and pectin (1%-4%) [113], while lignin is a minor component (2%-7%), associated with the middle lamella. Some fibre crops, like hemp, display the presence of both primary and secondary bast fibres: Primary fibres are procambium-derived long (from 20 to more than 100 mm in length) and strong fibres containing little lignin, while secondary fibres are shorter (around 2 mm in length), more lignified and derive from the cambium. From the cellular point of view, bast fibres are unique, since they are characterized by the occurrence of a gelatinous type of secondary cell wall (the so-called G-layer), which is rich in crystalline cellulose with low microfibril angle [114,115]. For a summary of the main physical parameters of bast fibres, see Table 2. G-layers are usually found in reaction wood (i.e., tension wood) in response to a mechanical stimulus and can generate high tensional stress. However, bast fibres do not play the same contractile role as tension wood [117,118] and are characterized by the occurrence of a specific type of β-1,4-galactan [119]. The cell wall chemistry, together with their mechanical features, makes bast fibres highly attractive and valuable for the development of biomimetic material and for use as feedstocks with desirable improved saccharification traits [120].
The formation of bast fibres takes place through intrusive growth, a mechanism which increases the number of fibres in cross sections, while keeping their total number unchanged within a specific stem segment [121]. The mechanism of primary fibres' intrusive growth has been studied in more detail than that of secondary fibres and it was shown that the procambium close to the apical meristem is the starting site of primary fibres' elongation, which in flax was proposed to occur through diffused growth [11][121][122][123]. Studies on flax bast fibres have shown the occurrence of a galactan-rich layer (Gn-layer) which is progressively modified through the activity of β-galactosidase into the cellulosic G-layer [83]. Moreover, the process of G-layer formation in flax is preceded by the accumulation of "bicolor" Golgi vesicles which fuse forming large vacuoles [124,125]. These vacuoles then fuse with the plasma membrane and release their cargo in a "syringe-like" fashion; this mechanism ensures the maximal delivery of the vesicles containing matrix polymers of the Gn-layer to sites where cellulose is present, when the cell wall is still in its construction phase [125]. The stem of fibre crops shows not only a radial distribution of lignification pattern (the lignification degree increases as we move centripetally in the stem), but also a longitudinal distribution of cell wall metabolic stages. Along the axis of flax, it was demonstrated that a physically distinctive point is present, called the "snap point" [122], which is a zone where the mechanical properties of fibre cells change dramatically. The region above the snap point is characterized by cell elongation through intrusive growth, while below it cell wall thickening takes place. Below the snap point cell walls show a typical bipartite appearance [83,122,124]: A layer with loosely-packed structure (that is the Gn-layer) and one more ordered and homogeneous (the G-layer). 
At maturity the Gn-layer is remodeled into a G-layer. Interestingly, the lignin present in flax and hemp bast fibres is condensed [3,76,78], with a high occurrence of G and H units (H units are 5% in the core and 13% in flax bast fibres). This characteristic, together with the high crystallinity of cellulose, might have consequences on the digestibility of fibre crops' cellulose, which, although only enrobed by a low amount of lignin, is extremely recalcitrant to saccharification.
It was recently shown that the typical hypolignification of flax stem is accompanied by the high accumulation of phenolic compounds and by a very active monolignol metabolism [86]. Using lignomics, 81 different phenolic compounds were identified in the outer stem tissues, 65 of which were reported for the first time in flax. These results show the complexity of phenylpropanoid metabolism in flax stems (and most likely in other fibre crops), which is transcriptionally regulated and translates into a range of organic compounds belonging to intricate metabolic routes which are still not fully unveiled.

Integrated -Omics to Study Differential Lignification Patterns in Fibre Crops
Secondary cell wall biosynthesis is a good example of a complex physiological system, as it involves the coordinated expression of cellular activities at different biological levels (i.e., transcriptional, protein- and metabolite-level). High throughput analyses can deliver a wealth of data, but generally only convey partial information [126]. In this post-genomic era, where the amazing progress in sequencing allows us to reach resolutions and speed never before attained, the study of organisms has moved towards an integrated vision, which tries to explain cellular responses to natural or induced perturbations (i.e., a certain phenotype) by combining high throughput -omics data [127]. This shift towards systems biology is witnessed by the progressively increasing number of bioinformatics tools/portals enabling the integration of -omics data (e.g., Paintomics [128]; MADMAX [129]; BESC [130]; COVAIN [131]). The availability of microarray and metabolomics data allows not only the in silico analysis of gene co-expression networks, but also of gene-to-metabolite connections [126]. This meta-analysis of -omics data can be further extended to genomics and proteomics, thus enabling for instance metabotyping at the ecotype-level [126,132]. DoE (design of experiments) and multivariate analysis can then be applied to systems biology to better interpret the data and deduce models [133].
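The idea of gene-to-metabolite connections can be illustrated with a minimal sketch, assuming that transcript and metabolite abundances have been measured on the same set of samples and that a simple Pearson correlation with an arbitrary cutoff is an acceptable first-pass criterion. The gene names, metabolite names, values and the threshold of 0.9 below are hypothetical placeholders, not data from any of the cited studies, and the tools listed above implement far richer analyses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0 here
    return cov / sqrt(vx * vy)


def gene_metabolite_edges(transcripts, metabolites, threshold=0.9):
    """Return (gene, metabolite, r) for every pair with |r| >= threshold."""
    edges = []
    for gene, g_profile in transcripts.items():
        for met, m_profile in metabolites.items():
            r = pearson(g_profile, m_profile)
            if abs(r) >= threshold:
                edges.append((gene, met, round(r, 3)))
    return edges


# Hypothetical, made-up abundances across four samples (e.g., stem segments).
transcripts = {
    "gene_A": [1.0, 2.0, 3.0, 4.0],   # rises along the samples
    "gene_B": [4.0, 3.0, 2.0, 1.0],   # falls along the samples
    "gene_C": [2.0, 2.1, 1.9, 2.0],   # roughly constant
}
metabolites = {
    "metabolite_X": [10.0, 20.0, 30.0, 40.0],
}

edges = gene_metabolite_edges(transcripts, metabolites)
for gene, met, r in edges:
    print(f"{gene} -- {met}: r = {r}")
```

In this toy data set only gene_A (positive) and gene_B (negative) are linked to metabolite_X; gene_C falls below the cutoff. A real analysis would need replicate-aware statistics and multiple-testing correction, which this sketch deliberately omits.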
Only a handful of studies are available in the literature on fibre crops' transcriptomics (summarized in Table 1). This is probably due to the lack of genomes for many fibre crops, which makes bioinformatics analysis more challenging, although molecular studies aimed at amplifying and studying genes involved in lignin and cellulose biosynthesis from fibre crops are present in the literature [134][135][136]. The majority of the transcriptomics data come from microarray and cDNA library studies on flax and hemp aiming at understanding the molecular events behind cell wall dynamics in stem/hypocotyl tissues [77,79,81][83][84][85][86]. These studies have shown how genes involved in cell wall biosynthesis and remodelling display typical expression patterns which mark specific stages of development in the different stem tissues and along different regions of the stem axis. Hypolignification was indeed demonstrated to be associated with the low abundance of monolignol biosynthetic gene transcripts, laccases and some peroxidases in stem outer tissues and with the occurrence of transcription factors involved in lignin biosynthetic pathway repression (i.e., orthologs of AtMYB4, EgMYB1, ZmMYB31/42 [86]). The establishment of the parallelism between sequential stages of hypocotyl development (namely 7, 9 and 15 days) and the differentiation of bast fibres along the top, middle and bottom region of the stem [83] has offered a supplementary tool in the study of phloem fibre development. It was indeed demonstrated that flax phloem fibre development is strictly synchronized with hypocotyl elongation and that the hypocotyl and stem show a set of common genes activated during their specific developmental/differentiation stages. Elongating hypocotyls showed enrichment in transcripts involved in photosynthesis, transport, hormone signaling, as well as primary cell wall deposition [83]. 
Arabinogalactan proteins (AGPs) and β-galactosidase were enriched in the transition between elongation and secondary wall deposition, while chitinases and glycosyl hydrolases (GHs) like KORRIGAN were abundant at later stages [83].
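Tissue-contrast comparisons of this kind are commonly summarized as a log2 fold change between the two tissues. The following is a minimal sketch, assuming normalized expression values for outer and inner stem tissue, a pseudocount of 1 to avoid division by zero, and a two-fold cutoff; the gene labels and numbers are invented for illustration and do not reproduce any result from the studies cited above.

```python
from math import log2


def log2_fold_change(outer, inner, pseudocount=1.0):
    """log2 ratio of mean outer-tissue to mean inner-tissue expression."""
    mean_outer = sum(outer) / len(outer)
    mean_inner = sum(inner) / len(inner)
    return log2((mean_outer + pseudocount) / (mean_inner + pseudocount))


def classify(expression, cutoff=1.0):
    """Label each gene by the tissue in which it is at least 2**cutoff-fold higher."""
    calls = {}
    for gene, (outer, inner) in expression.items():
        lfc = log2_fold_change(outer, inner)
        if lfc >= cutoff:
            calls[gene] = "outer-enriched"
        elif lfc <= -cutoff:
            calls[gene] = "inner-enriched"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical replicate values: (outer tissue, inner tissue) per gene.
expression = {
    "monolignol_gene": ([7.0, 7.0, 7.0], [31.0, 31.0, 31.0]),
    "fla_gene": ([15.0, 15.0, 15.0], [3.0, 3.0, 3.0]),
    "reference_gene": ([9.0, 9.0, 9.0], [9.0, 9.0, 9.0]),
}

calls = classify(expression)
for gene, label in calls.items():
    print(gene, label)
```

The sketch uses a fixed cutoff only; a real microarray or sequencing analysis would add a statistical test across replicates before calling a gene differentially expressed.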
Next generation sequencing (NGS) techniques are only now starting to be applied to fibre crops, for instance to perform genome-wide SNP discovery (which can facilitate quantitative trait locus (QTL) identification [137]) and to carry out de novo transcriptome assembly to identify crucial cell wall-related genes, such as cellulose synthases (CesAs) [90]. Being a sequencing-based technology, NGS can reach accuracies which go beyond the limits imposed by microarray technology. It extends the limits of detection to low-abundance transcripts and requires no previous knowledge of genomes, as it can be used to create sequence assemblies and to map reads, exon/intron boundaries and splicing variants [138]. NGS technologies are also finding fertile ground in the study of microRNAs (miRNAs; the so-called "miRNomics"). Increasing evidence on the role of miRNAs in the regulation of cell wall metabolism exists in the literature [87,139,140], and they constitute very interesting targets for engineering plant secondary cell walls to meet industrial and bioenergetic needs [139]. In poplar, for instance, unique miRNAs targeting a cellulose synthase-like gene (miRX41), as well as NAC transcription factors and a xyloglucan endotransglycosylase/hydrolase (XTH16) (miRX50), have recently been identified through SOLiD ABI sequencing technology [139]. A previous study also showed that poplar-specific miRNAs (not present in Arabidopsis) responsive to mechanical stress exist, and that they are associated with the biosynthesis of cell wall metabolites [141]. The sensitivity of NGS applied to the mining of miRNAs involved in cell wall metabolism in fibre crops' stems can help identify novel candidates targeting transcription factors and/or genes involved in the heterogeneous lignification pattern observed.
As for transcript and metabolite profiling, most work on proteome abundance in fibre crops has focused on flax. Recently Day et al. reported the identification of 152 predicted cell wall proteins in flax stems using a sequential salt extraction [89]. The majority of cell wall proteins are involved in sugar and/or glycoprotein-related events and include numerous glycosidases but also chitinases [89]. Of the proteins known to be involved in lignin deposition and thus secondary metabolism, several class III peroxidases were identified [142,143], the gene expression of which was previously shown to differ between outer (hypolignified) and inner flax stem tissues [88]. One of these peroxidases was previously identified as a cell wall protein in another study on flax [86].
While the previous studies isolated the different tissues or compared the proteome of the stems at different developmental stages, the most targeted study on the fibre proteome was reported by Hotte and Deyholos [82]. In this study DIGE was used to compare the protein abundance in isolated fibres versus the surrounding cortical tissue. Of the 240 spots that were more intensely stained in the fibres, corresponding to enriched proteins, 51 were identified [82]. This study indicates that fibres are, as expected, devoid of proteins related to photosynthesis and enriched in proteins involved in cell wall deposition. Of the 10 proteins with the highest enrichment factor, 7 are directly involved in cell wall polysaccharide metabolism, the most important one being β-galactosidase. Another enzyme involved in cell wall polysaccharide metabolism is rhamnose biosynthetic enzyme. Fructokinase was also identified in several spots that were more abundant in the fibres. Given the low lignin content in early stage developing flax fibres, just below the snap point, it is not surprising that no enzymes related to phenylpropanoid metabolism were found. However, the enrichment of a secretory peroxidase might indicate the start of lignin deposition.
Although proteomics has proven its power to contribute to the elucidation of pathways activated during cell wall development, the enzymes responsible for the biosynthesis of secondary metabolites, mainly the phenylpropanoid pathway with monolignols as major product, are of low abundance and therefore generally not detected. Furthermore key steps in phenylpropanoid metabolism are catalyzed by cytochrome P450-like membrane-anchored proteins [144,145], making dedicated approaches necessary for their proteome-level study.

Conclusions and Future Perspectives
The stem of fibre crops represents one of the best examples in nature of a contrasting lignification pattern existing within the same plant. This heterogeneous secondary cell wall composition allows the use of fibre crops' lignocellulosic biomass for different industrial sectors. Although recent studies have helped to shed light on some aspects controlling the different lignification profiles in these plants, an integrative study obtained by simultaneously interrogating the organisms at different levels of biological complexity is still missing. To fully understand the molecular events responsible for secondary cell wall biogenesis in fibre crops, it will be necessary to integrate high throughput -omics analyses with accurate experimental design, bioinformatic tools for meta-analysis and co-visualization of the data, and mathematically supported model inference. The integrated knowledge thus obtained can be applied to test the models through functional genomics: Modifications of cell wall metabolic networks can be engineered and used to improve the extractability of fibres (i.e., retting), the agronomic traits of existing cultivars, or to create application-specific fibres.