The Importance of Extraction Protocol on the Analysis of Novel Waste Sources of Lignocellulosic Biomass

As the utilization and consumption of lignocellulosic biomass increase, so too will the need for an adequate supply of feedstock. To meet these needs, novel waste feedstock materials will need to be utilized. Exploitation of these novel feedstocks will require information both on the effects of solvent extraction on the succeeding analysis of potential novel feedstocks and on how accurate current methodologies are in determining the composition of novel lignocellulosic feedstocks, particularly the carbohydrate and lignin fractions. In this study, the effects of solvent extraction on novel feedstocks, including tree foliage, tree bark and spent mushroom compost, with 95% ethanol, water and both sequentially were examined. Chemical analyses were carried out to determine the moisture content, ash, extractives, post-hydrolysis sugars, Klason lignin (KL) and acid-soluble lignin (ASL) within the selected feedstocks. The effect of extraction was seen most strongly for Klason lignin, with a strong association between higher Klason lignin determinations and greater amounts of non-removed extractives (tree foliage and bark). Higher Klason lignin levels are reported to be due to the condensation of non-removed extractives during hydrolysis; hence, the lower Klason lignin determinations following extraction are more accurate. In addition, total sugar determinations were lower following extractions. This is because of the solubility of non-cell-wall carbohydrates; thus, the determinations following extraction are more accurate representations of structural cell-wall polysaccharides such as cellulose. Such determinations will assist in determining the best way to utilize novel feedstocks such as those analyzed in this work.


Introduction
The need for sustainable and environmentally friendly sources of fuel and platform chemicals has become critically important in the energy field [1]. Use of non-fossil-fuel-based chemicals and transport fuels will be required to meet targets proposed by the US government and the EU for the increased use of renewable energy [2]. Biomass-derived biofuels are predicted to make up a substantial portion of these non-fossil-fuel-based chemicals and transport fuels. Second-generation feedstocks ("lignocellulosic feedstocks", composed of cellulose, hemicellulose and lignin) deriving from non-food plants, agricultural residues and municipal wastes offer several advantages over first-generation feedstocks: they do not use high-quality arable farmland, are not considered food sources, and are considered to be economically attractive, being both low cost and abundant [1,3,4].
Key to exploiting lignocellulosic biomass is the depolymerization of the lignocellulosic matrix to obtain smaller molecules (such as glucose from cellulose) that can be used directly or undergo additional transformation into platform chemicals and biofuels [3]. Despite the abundance of lignin within lignocellulosic materials, the major focus of biorefining has been on the utilization of cellulose and hemicellulose. However, the economic and environmental sustainability of biorefineries will require the valorization of lignin along with the carbohydrate fractions into the future [5].
As the utilization and consumption of lignocellulosic biomass increase, so too will the need for an adequate supply of feedstock. Studies have projected that 476 million tons of lignocellulosic biomass will be required to meet the growth in the market for bio-based products by 2030. To meet these needs, the wide-scale utilization of non-traditional waste biomass sources will be required [6]. Non-traditional lignocellulosic feedstocks outside of the traditional agricultural residues and energy crops will need to be adopted by biorefineries. Many potential novel feedstocks such as tree foliage and tree bark remain uncharacterized and thus there is a lack of sufficient analytical data and composition investigations reported in the literature.
Accurate analysis of the composition of these novel lignocellulosic feedstocks, and of the effects of solvent extraction on that analysis, is therefore essential to determine the optimal route of conversion to platform chemicals and biofuels. Currently, the analytical methodologies most widely used are those created by the National Renewable Energy Laboratory (NREL) [7][8][9][10].
Understanding the effect of extraction on novel lignocellulosic feedstocks is important to the future utilization of such materials for biorefineries. Here, we report on the effectiveness of current extraction methods and analytical methodologies on the analysis of three distinct non-traditional types of non-herbaceous feedstock: tree foliage, tree bark and spent mushroom compost (SMC). These are considered wastes in their respective industries and are widely available. Tree foliage and tree bark make up a large portion of forest residues. Depending on the age of the tree, the species and the specific wood/foliage mix, forest residues made up of bark and foliage typically range from 50 to 100 tonnes per hectare on a dry-mass basis [11]. SMC can be defined as the substrate remaining after mushroom production and harvesting. This is an underutilized resource that has the potential to be upcycled into high-value products [12]. Approximately 5 kg of SMC results from every kilogram of mushrooms grown and over three million tonnes of SMC are produced in Europe each year [13]. Hayes and Hayes examined these as potential feedstocks for biorefineries; however, they observed that analytical data were lacking and no extraction-based studies were reported [11].

Experimental Samples
Testing was carried out on tree foliage, tree bark and spent mushroom compost. These samples are discussed below.

Tree Foliage and Tree Bark
The tree foliage samples consisted of Norway spruce, Sitka spruce and Lawson cypress, while the tree bark samples consisted of Norway spruce, Lodgepole pine and Douglas fir. These samples were obtained from forestry plantations in Ireland. For each of the feedstock types, three different samples were selected. All samples were ground using a Retsch rotary mill and sieved to give particle sizes of 180-850 μm.

Spent Mushroom Compost (SMC)
Three SMC samples were obtained from various mushroom production facilities in Ireland. These are referred to as SMC1, SMC2 and SMC3, respectively, hereafter. SMC1 and SMC2 both had grown two harvests (flushes) of mushrooms and were sanitized by heating at 70 °C for 12 h. SMC3 had grown three mushroom harvests and was not sanitized. All samples were ground using a Retsch rotary mill and sieved to give particle sizes of 180-850 μm.

Analytical Methodology
Chemical analyses were carried out to determine the content of moisture, ash, extractives, post-hydrolysis sugars, Klason lignin (KL) and acid-soluble lignin (ASL) within the selected samples. All samples were subjected to extraction using three different solvent schemes (see below) before analysis in duplicate.

Moisture Content
Moisture content was calculated from the weight loss (sample weighed pre-drying and post-drying on a Mettler Toledo analytical balance) from a sample of weight 300 ± 30 mg dried overnight in a convection oven at 105 °C.
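The weight-loss calculation described above can be sketched as follows. This is a minimal illustration that assumes moisture is expressed as a percentage of the as-received (wet) sample mass; the function name and the example masses are hypothetical and are not measured data from this study.

```python
def moisture_content_percent(mass_before_mg: float, mass_after_mg: float) -> float:
    """Moisture as a percentage of the as-received (wet) sample mass.

    Assumption: wet-basis reporting; the text only states that moisture
    was calculated from the weight loss on drying at 105 °C.
    """
    if mass_before_mg <= 0:
        raise ValueError("initial mass must be positive")
    if mass_after_mg < 0 or mass_after_mg > mass_before_mg:
        raise ValueError("dried mass must lie between 0 and the initial mass")
    return 100.0 * (mass_before_mg - mass_after_mg) / mass_before_mg


# Hypothetical masses for illustration only: 300 mg before, 276 mg after drying.
example_moisture = moisture_content_percent(300.0, 276.0)
```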

Ash
A Nabertherm L-240H1SN muffle furnace was used to generate ash from the biomass samples, operating with a ramping temperature program, which peaked at 575 °C and was maintained for 3 h, as described by the NREL [14]. This achieved complete incineration of samples for ash analysis.

Extractives
All extractions were performed using a Dionex Accelerated Solvent Extractor (ASE) 200 unit. Extractions were carried out using 95% ethanol (EE), water (WE) and finally sequential extraction with deionized water followed by 95% ethanol (SE). The ASE 200 unit was programmed according to the NREL standard program for automated extraction [15]. ASE cells with a volume of 11 mL were used at a pressure of 1500 PSI using analytical-grade N2, a temperature of 100 °C, a heat time of 5 min and a static cycle time of 7 min. The samples went through three static cycles and the total flush volume was 150%. The solvent used for extraction (removed at the end of each cycle) was dispensed into a vial corresponding to the cell; this solution contained the extractives in suspension.

Lignocellulosic Sugars and Lignin
Lignocellulosic sugars (glucose, xylose, mannose, galactose, arabinose and rhamnose) and lignin were determined using the standard NREL procedure [16]. This involved a two-stage acid hydrolysis of samples (treating 300 ± 30 mg of sample with 72% H2SO4 before water addition and autoclaving) followed by gravimetric filtration to separate hydrolysate from acid-insoluble residue (AIR). KL was calculated by the difference between AIR and its ash content. ASL was measured using an Agilent 8453 UV-Visible spectrophotometer, whereby part of the hydrolysate was analyzed to determine absorbance at 200-205 nm, which is converted to ASL [17].
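The Klason lignin calculation described above (acid-insoluble residue minus its ash content) can be sketched as follows. Expressing the result as a percentage of the dry sample mass is an assumption made for illustration, as are the function name and the example masses, which are not data from this study.

```python
def klason_lignin_percent(air_mg: float, air_ash_mg: float, sample_dry_mg: float) -> float:
    """Klason lignin as a percentage of dry sample mass.

    KL is taken as the acid-insoluble residue (AIR) minus the ash
    contained in that residue, as stated in the text.
    """
    if sample_dry_mg <= 0:
        raise ValueError("dry sample mass must be positive")
    if not 0 <= air_ash_mg <= air_mg:
        raise ValueError("ash in AIR must lie between 0 and the AIR mass")
    return 100.0 * (air_mg - air_ash_mg) / sample_dry_mg


# Hypothetical masses for illustration only.
example_kl = klason_lignin_percent(air_mg=90.0, air_ash_mg=6.0, sample_dry_mg=280.0)
```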
The sugars released by hydrolysis were analyzed by ion chromatography, as described previously by Hayes [18]. Briefly, the hydrolysates were diluted (1 in 5) with a solution containing a known concentration of fucose (used as the internal standard). Following this, the hydrolysates were filtered using 0.2 μm Teflon syringe filters, and placed into vials for testing on a DIONEX ICS-3000 ion chromatography system (Thermo Fisher Scientific Inc., Sunnyvale, CA, USA). This system is made up of an electrochemical detector (using Pulsed Amperometric Detection, PAD), a gradient pump, a temperature-controlled column (with detector), and an AS50 autosampler (Thermo Fisher Scientific Inc., Sunnyvale, CA, USA). The autosampler injected 10 μL of the sample. Sugar separation was achieved using a CarboPac PA1 guard column and an analytical column connected in series. Hydrophobic materials were removed via in-line solid-phase extraction, using a Dionex NG1 guard column placed between the injection valve and the PA1 guard column. The eluent flow bypassed this NG1 guard column 2 min after injection. Sugar separation took place over 16 min using deionized H2O as the eluent, with a flow rate of 1.1 mL/min, and a column/detector temperature of 18 °C. The standard Dionex "carbohydrates" waveform was used for analysis (Dionex, 2000). After 16 min, the column was regenerated and re-equilibrated. This involved a 30 s ramp to an eluent concentration of 240 mM NaC2H3O2 in 400 mM NaOH, with these conditions being maintained for 2 min. Following this, deionized water was used for 30 s as the only eluent. This elution was sustained for 15 min before injection of the subsequent sample. PAD requires alkaline conditions for carbohydrate analysis; therefore, NaOH (300 mM) was added to the eluent stream post-column using a Dionex GP40 pump at a flow rate of 0.3 mL/min.
The chromatographic settings permitted, in a single injection, the determination and resolution of fucose, arabinose, galactose, rhamnose, glucose, xylose, and mannose. Relative response factors were determined from sugar standard samples injected at fixed intervals in the analytical sequence.
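The internal-standard quantification outlined above can be illustrated with a short sketch. It assumes the conventional relative response factor definition (analyte-to-fucose peak area ratio divided by the corresponding concentration ratio in a standard); the function names, peak areas and concentrations are hypothetical, and the exact calculation used in [18] may differ.

```python
def relative_response_factor(area_sugar: float, area_fucose: float,
                             conc_sugar: float, conc_fucose: float) -> float:
    """RRF of a sugar relative to the fucose internal standard, from a standard run."""
    return (area_sugar / area_fucose) * (conc_fucose / conc_sugar)


def sugar_concentration(area_sugar: float, area_fucose: float,
                        conc_fucose: float, rrf: float,
                        dilution_factor: float = 1.0) -> float:
    """Sugar concentration in the original hydrolysate.

    conc_fucose is the fucose concentration in the injected (diluted) solution;
    dilution_factor scales the result back to the undiluted hydrolysate
    (5 for the 1-in-5 dilution described in the text).
    """
    return (area_sugar / area_fucose) * conc_fucose / rrf * dilution_factor


# Hypothetical peak areas and concentrations (arbitrary units), illustration only.
rrf_glucose = relative_response_factor(200.0, 100.0, 10.0, 10.0)
glucose_in_hydrolysate = sugar_concentration(300.0, 100.0, 10.0, rrf_glucose,
                                             dilution_factor=5.0)
```

Normalizing each sugar peak to the fucose peak in the same injection compensates for run-to-run variation in injection volume and detector response, which is the usual motivation for an internal standard.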

Extractives
The data for the three different extraction schemes are compared in Figure 1 (full results can be seen in Tables 1, 2 and 3). The extractive determinations for tree foliage and tree bark were higher than those for SMC samples. Considerable variation can be observed between the individual wood species for specific extraction protocols for both foliage and bark. This indicates a degree of structural and compositional variation between species, while determinations for the SMC samples were broadly similar to each other. Sequential water-ethanol extraction (SE) resulted in the highest extractive determinations for all samples. In the case of tree foliage and tree bark samples, extractive determinations as a result of SE were all less than the sum of values from ethanol extraction (EE) and water extraction (WE), indicating that extractives were not mutually exclusive to a solvent. For the SMC samples, the sum of EE and WE values corresponds to the SE determinations.
When considering individual solvents, ethanol had greater extractive solvation for one tree foliage sample (Lawson cypress, Table 1) and two tree bark samples (Lodgepole pine and Douglas fir, Table 2), whereas water showed greater extractive solvation for two tree foliage samples (Norway and Sitka spruces, Table 1), one tree bark sample (Norway spruce, Table 2), and all SMC samples (Table 3). These differences are attributed to variation in the types of extractives among different species of tree.

Klason Lignin
KL accounted for the largest single determined component in all samples which did not undergo extraction (NE), and was invariably lower following extraction. SE resulted in the lowest KL determination for all samples. Except for Norway spruce, EE resulted in lower KL determinations than WE for tree foliage and tree bark samples. The degree to which KL determinations were reduced following extraction was far less for SMC samples compared to tree foliage and bark.
Previous studies have noted that for hardwood samples, extraction using a single solvent yielded a higher lignin value compared to when multiple solvents were employed. It was suggested that if not removed, acid-insoluble extractives could condense and act as contaminants by contributing to lignin yields arising from acid hydrolysis, resulting in overestimation and inaccuracy in lignin determinations [19]. The reductions in KL determinations resulting from solvent extraction observed here can be attributed to this phenomenon, with the determinations arising from SE considered to be the most accurate estimations of KL in a given sample. The correlation between extractives and KL determinations was further investigated by constructing scatter plots. Since SE determinations are the most accurate reflections of true KL content, it is most appropriate to plot the differences between KL determined for NE/EE/WE and KL content determined for SE. These differences were plotted against the residual extractives content of samples, which were calculated according to Equation 1:
E_R = E_{SE} - E_i    (1)
where E_R = extractives remaining following extraction with a chosen solvent, E_{SE} = extractives content following sequential water-ethanol extraction, and E_i = extractives content for a single solvent (i.e., EE or WE; for NE samples, no extractives are removed). Consequently, data points of EE samples represent samples containing residual water extractives, data points for WE samples represent residual ethanol extractives, and data points for NE samples represent residual water and ethanol extractives.
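The residual extractives calculation can be sketched as below, on the assumption (implied by the symbol definitions above) that the residual content is the sequential-extraction value minus the single-solvent value, with nothing subtracted for unextracted samples. The function name and the percentages are hypothetical and are not values from Tables 1-3.

```python
def residual_extractives(e_se: float, e_single: float = 0.0) -> float:
    """Extractives remaining after a single-solvent extraction (E_R = E_SE - E_i).

    e_se:     extractives content from sequential water-ethanol extraction (SE)
    e_single: extractives content from one solvent (EE or WE); 0 for NE samples
    """
    return e_se - e_single


# Hypothetical extractives contents (% dry mass), illustration only.
e_se, e_ee, e_we = 30.0, 18.0, 20.0
residuals = {
    "NE": residual_extractives(e_se),        # residual water and ethanol extractives
    "EE": residual_extractives(e_se, e_ee),  # residual water extractives
    "WE": residual_extractives(e_se, e_we),  # residual ethanol extractives
}
```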
Scatter plots for each feedstock are presented in Figure 2. R² values of 0.7818, 0.8070 and 0.3244 were calculated for tree foliage, tree bark and SMC, respectively, which indicate a relatively robust correlation within both tree foliage and bark samples, but no obvious relationship in the case of the SMC samples. These data indicate that in the case of tree foliage and bark, greater residual extractives content in the samples will result in a higher determination of KL. It is important to note that while a similar trend has been reported in herbaceous biomass [10], it was not possible to predict the magnitude of the effect for a particular feedstock. Previous studies have observed a relative reduction in KL of 10-16.5% following EE and 18-26% following WE, in comparison to the native samples. The corresponding reductions resulting from EE observed in this work are compared with Thammasouk et al. [10] in Table 4, and were significantly higher in the case of tree bark and foliage (26-38%) but significantly lower in the case of SMC (0-15%), with the corresponding reductions due to WE exhibiting a broadly similar trend. This comparison highlights the necessity to evaluate the effects of solvent extraction on KL analysis for each individual feedstock. Scatter plots were constructed using data exclusive to the two individual solvents, ethanol and water, to help determine which solvent had the greater effect on KL measurement. These data are shown in Figure 3a,b for residual water extractives and residual ethanol extractives, respectively. An R² value of 0.5579 was derived for residual water extractives, indicating that, in general, there is little difference between the KL determinations of EE and SE. In the case of residual ethanol extractives, the R² value was 0.9025, suggesting that extracting solely with water will result in samples with sufficient residual extractives to significantly impact subsequent KL determinations.
Therefore, it appears that in cases where only one solvent can be used, ethanol is more suitable than water for accurate KL determination in lignocellulosic feedstocks. The valorization of lignin has developed into a major area of research and development for biorefining. Such efforts will require robustly accurate lignin data to correctly calculate yields and assess the economic value of potential valorization pathways. Improving the accuracy of lignin determination is an increasingly important area of biomass analysis [20]. Both the literature discussed here, which relates to traditional biomass feedstocks, and the results for the novel feedstocks presented in this paper make clear that solvent extraction plays a critical role in accurate analysis and should be considered before further analytical procedures are carried out to quantify the lignin fraction.
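The R² values quoted for the scatter plots can be reproduced in principle with an ordinary least-squares fit of the KL difference against residual extractives. The sketch below assumes a simple straight-line fit (the fitting procedure is not specified in the text); the function name is hypothetical and the data points are placeholders, not the data behind Figures 2 and 3.

```python
from typing import Sequence, Tuple


def linear_fit_r2(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares line y = slope * x + intercept and its coefficient of determination."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2 points)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain more than one distinct value")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # equals 1 - SS_res/SS_tot for a straight-line fit
    return slope, intercept, r_squared


# Placeholder points: x = residual extractives (%), y = KL(scheme) - KL(SE) (%).
slope, intercept, r2 = linear_fit_r2([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```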

Post-Hydrolysis Sugars
Post-hydrolysis sugar determinations for NE samples were considerably higher in the tree foliage and bark samples than in SMC. These determinations consisted of individual measurements of glucose, xylose, mannose, galactose, arabinose and rhamnose (an example chromatogram can be seen in Supplementary material). The most abundant individual sugar determined for all samples was glucose, followed by mannose in the case of tree foliage and tree bark, and xylose for SMC. Total lignocellulosic sugar determinations were reduced because of extraction in the case of all samples, with each feedstock showing different behavior and values.
Reductions in post-hydrolysis sugar determinations following extraction for tree foliage showed a clear trend (Table 1): EE resulted in a slight reduction, while WE and SE resulted in a larger and similar reduction. Due to the insolubility of cellulose and hemicellulose in ethanol, the reductions arising from EE are likely to be caused by the removal of free mono-/oligosaccharides. This has previously been reported by Sun and Sun, for different materials, who termed the extraction of non-lipophilic components by ethanol "co-extraction" [21]. Based on FTIR analysis of the extracts, they suggested that any extracted sugars are likely to be oligosaccharides. These reductions were also observed for different biomass sources by Thammasouk et al., who suggested that low-molecular-weight sugars were being extracted [10]. The reduction in determined sugar because of WE and SE can be attributed to the solubility of many free sugars and oligosaccharides in water. A study by Chen et al. analyzed aqueous extracts from four switchgrass samples and found that 25-32% of these extracts were comprised of free sugars and oligosaccharides, indicating that a considerable portion of sugars were being removed [22].
For the tree bark samples, EE, WE and SE showed similar, slight reductions in post-hydrolysis sugar determinations (Table 2). Bark is a tough material which gives trees strength, support and physical structure, with most of its lignocellulosic sugars being insoluble structural polysaccharides such as crystalline and amorphous cellulose. More specific examples include hemicelluloses such as highly branched arabinan, which are common in pines, and callose, which is a form of glucan [23]. Two soluble oligosaccharides, raffinose and stachyose (both consisting of galactose and glucose), are present in minor amounts in some barks [23]. Considering this, it is likely that any reductions in post-hydrolysis sugar determinations in tree bark resulting from extraction are due to the removal of free sugars and/or very small amounts of these soluble oligosaccharides.
Post-hydrolysis sugar determinations in samples of SMC were always lower as a result of SE compared to NE; however, there was no clear distinction between EE and WE (Table 3). A previous compositional study on 63 different SMC samples taken from various sites within Ireland determined that SMC contained an average of 38% cellulose and 19% hemicellulose with respective ranges of 18-62% and 2-41% [24]. These contrast with the results shown in this study, with the total post-hydrolysis sugar determinations of NE SMC samples close to the sum of the minimum cellulose and hemicellulose values but lower than the sum of the average values reported previously [24]. While these differences may be due to variation in sample source, it is more likely that they arise from differences between the methodologies employed. The previous work used detergent fiber methods outlined by Goering and Van Soest [25], which are routinely used in forage analysis. These methods produce acid detergent fiber (ADF) [contains cellulose and lignin] and neutral detergent fiber (NDF) [contains residual cell-wall content]. The difference between NDF and ADF is used to calculate non-cellulosic polysaccharides, typically assumed to be hemicellulose. Reliability problems arise with these methods, as cellulose may be overestimated due to the presence of contaminating compounds such as ash and protein in ADF [26][27][28] and hemicellulose may be overestimated due to the variable responses of hemicellulose polysaccharides of differing structure to the extraction conditions in the production of NDF [29].
Considering (i) the difficulty in unambiguously identifying trends in post-hydrolysis sugar determinations following extraction in SMC, and (ii) the differences in results between this and previous studies [24], it is suggested that a more comprehensive and detailed study of the carbohydrate composition of SMC is necessary. In addition, since extraction caused the solubilization and removal of some non-cell-wall carbohydrates, determinations for total lignocellulosic sugars following extraction are considered to be a more accurate representation of structural cell-wall polysaccharides.

Acid-Soluble Lignin
ASL is frequently analyzed in the literature by absorbance at 205 nm, which is converted to ASL based on Beer's law [17]. However, ASL determinations by this method must be treated with caution, as some carbohydrate derivatives can also absorb within the 190-205 nm region, resulting in an overestimation of ASL; any changes observed following extraction may therefore be due to changes in these derivatives rather than in ASL itself. ASL analysis and related difficulties in accurate determination are reviewed in Hatfield and Fukushima [17].
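The Beer's law conversion mentioned above can be sketched as follows. The absorptivity, filtrate volume, dilution factor and sample mass in the example are placeholder assumptions chosen for illustration; they are not values reported in this study, and the appropriate absorptivity depends on the feedstock and wavelength.

```python
def acid_soluble_lignin_percent(absorbance: float,
                                absorptivity_l_per_g_cm: float,
                                filtrate_volume_l: float,
                                dilution_factor: float,
                                sample_dry_g: float,
                                path_length_cm: float = 1.0) -> float:
    """ASL as a percentage of dry sample mass via Beer's law (A = epsilon * c * l)."""
    if min(absorptivity_l_per_g_cm, path_length_cm, sample_dry_g) <= 0:
        raise ValueError("absorptivity, path length and sample mass must be positive")
    # Lignin concentration in the undiluted hydrolysate (g/L).
    conc_g_per_l = absorbance * dilution_factor / (absorptivity_l_per_g_cm * path_length_cm)
    asl_mass_g = conc_g_per_l * filtrate_volume_l
    return 100.0 * asl_mass_g / sample_dry_g


# Placeholder inputs (assumptions, not study data).
example_asl = acid_soluble_lignin_percent(
    absorbance=0.55,
    absorptivity_l_per_g_cm=110.0,  # assumed absorptivity, illustration only
    filtrate_volume_l=0.087,        # assumed hydrolysate volume
    dilution_factor=10.0,
    sample_dry_g=0.300,
)
```

Because any species absorbing near 205 nm contributes to the measured absorbance, this conversion inherits the overestimation risk noted above.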
In analyzing the tree foliage ASL in this study, determinations after extraction did not show a discernible trend for Sitka spruce. For Norway spruce and Lawson cypress, ASL was lower after SE. In the tree bark samples, no discernible trend was observed in ASL determinations for Lodgepole pine while Norway spruce and Douglas fir exhibited reductions in ASL following extraction. The ASL determination for Norway spruce was reduced to a greater degree by WE compared to EE, whereas the opposite was true for Douglas fir. SE resulted in the largest determined ASL reductions for both these samples. ASL for SMC samples exhibited a reduction in all but one extraction. EE resulted in a decrease in ASL for SMC1 and SMC2. WE and SE led to reductions in determined ASL for all SMC samples. WE had a greater effect on ASL determination than EE, while SE had the largest effect, which was similar to WE. These trends correlate with respective extractive determinations; however, it is important to be cautious due to the methodology employed.

Ash
Biomass ash is composed of trace minerals and inorganic ions, both of which are found in low concentrations in tree foliage, where they are a requirement for photosynthesis [23]. Ash in the NE tree foliage and bark samples can be considered relatively low (Tables 1 and 2). This is similar to the findings of Vassilev et al., who studied 86 different samples from various lignocellulosic feedstocks and reported an average ash content of 6.8% (range 0.1-46%) [30]. EE did not affect the ash determinations in tree foliage samples. WE and SE resulted in reduced ash determinations, which can be attributed to the solubility of certain inorganic ions in water, particularly those of alkali metals [22,30]. Ash determinations for NE SMC samples are much greater than the average reported by Vassilev et al. but are similar to a report by Maher et al., who found that Irish SMC samples contained an average of 39% ash [30,31]. EE did not affect ash determinations in the SMC samples; however, large reductions were observed as a result of WE and SE, which were very similar to each other. Much like those observed in the tree foliage and bark samples, these reductions can be attributed to the solubility of inorganic ions such as those of alkali metals in water [22,30]. This trend suggests that a major part of the observed increases in determined extractives resulting from WE compared to EE is extracted ash. Ash determinations following EE can be referred to as "ethanol-insoluble ash", while determinations resulting from WE can be termed "water-insoluble ash". These determinations are not accurate reflections of true ash content in the samples, but they allow for insight into ash behavior. This is particularly relevant to biorefineries that wish to achieve lower ash contents before thermochemical processing, where ash can act as an impeding substance to product yields and process efficiency [2].

Mass Closures
The total mass closure of each sample was determined by summing the quantities of all measured components in a sample. For tree foliage and tree bark, the NE samples always resulted in the lowest mass closure. This is easily attributed to the lack of an extractives measurement in comparison to extracted samples. Extraction greatly increased the total mass closure for tree foliage samples, with mass closures generally increasing as extractives increased. SE and WE resulted in very similar mass closures for these samples.
In the case of tree bark, mass closures increased by a small margin following extraction, with similar closures resulting from all extraction methods. While one may expect greater increases in mass closure such as those seen in the tree foliage samples, the similarity between closures for the tree bark samples can be attributed to the fact that as the extractives measurements increased, KL determinations fell at a rate proportionally greater than that observed for the tree foliage samples.
Mass closures of the SMC samples were all relatively close to 100% regardless of extraction, with the exception of the NE sample SMC3 which was higher than others. Variation between the mass closures can be attributed to sample heterogeneity. It is noteworthy that the SE samples provided mass closures which were the closest to 100% without exceeding it for all three SMC samples.
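The mass closure calculation described in this section amounts to summing the measured components. A minimal sketch is given below; the function name, component names and percentages are hypothetical and are not results from Tables 1-3.

```python
from typing import Mapping


def mass_closure_percent(components: Mapping[str, float]) -> float:
    """Total mass closure: the sum of all measured components (% of dry mass)."""
    if any(value < 0 for value in components.values()):
        raise ValueError("component contents cannot be negative")
    return sum(components.values())


# Hypothetical compositions (% dry mass), illustration only.
extracted_sample = {
    "extractives": 20.0,
    "klason_lignin": 25.0,
    "acid_soluble_lignin": 2.0,
    "post_hydrolysis_sugars": 40.0,
    "ash": 5.0,
}
# An unextracted (NE) sample has no extractives measurement, which is why
# its closure is lower even though its KL determination is inflated.
unextracted_sample = {
    "klason_lignin": 33.0,
    "acid_soluble_lignin": 2.0,
    "post_hydrolysis_sugars": 42.0,
    "ash": 5.0,
}

closure_extracted = mass_closure_percent(extracted_sample)
closure_unextracted = mass_closure_percent(unextracted_sample)
```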

Discussion
Improved accuracy in lignin determinations and more efficient chromatographic quantification of extractives are of key importance in biomass analysis for energy crops [20]. Wet chemical analysis of lignocellulosic biomass is generally considered to be laborious and slow. Increased speed of chemical analysis and throughput of samples is desirable and a major field of research. To speed up analysis time, quantitative prediction models have been developed for biomass components by use of near-infrared spectroscopy (NIRS) and chemometrics [20]. From a theoretical perspective, the concept is rather simple; however, it requires a substantial amount of laboratory work to build a predictive model. A large calibration set of samples and data is required, which must be representative of the samples of interest and contain component values distributed across the range of expectation for each component [20]. NIR spectra of each sample must be recorded prior to analysis; these spectra are used to regress against reference chemical data. All the reference data applied to build a predictive model should be obtained from following the conventional NREL methodology. Finally, the method of partial least squares is used to regress the spectra against the reference chemical data, which forms a predictive mathematical model [20].
The results reported here have potential implications on the development of calibration sets for NIRS models. The accuracy of models developed by NIRS and partial least squares regression is entirely dependent on the data obtained from the reference chemical analysis. While spectra used for the model will be obtained from the unextracted sample, the selection of accurate datasets to regress with the spectra is of extreme importance.
KL determinations are substantially more accurate following solvent extraction. This is of significant importance with respect to biorefining and valorization of lignin. Therefore, it is more appropriate to use compositional datasets where extraction has taken place when KL is being modelled for a sample. Extraction was shown to result in variable reductions in total sugar determinations for many samples. These reductions were attributed to the solubilization and removal of non-cell-wall carbohydrates, such as free monosaccharides in ethanol and various mono-/oligosaccharides in water. Thus, it can be concluded that lignocellulosic sugar determinations following extraction are more accurate representations of structural cell-wall polysaccharides such as cellulose and hemicellulose. Therefore, it is more appropriate to use compositional datasets where extraction has taken place when modelling the structural polysaccharides, as these are the main origin of sugars used in biorefining processes.
Ash determinations were reduced following extraction with water, yielding determinations of "water-insoluble ash", rather than true ash content. Therefore, it is important to use ash datasets from unextracted samples when developing an ash model, as ash can act as a major inhibiting substance for yields and products in thermochemical biorefining. There is scope for the development of ethanol-insoluble and water-insoluble ash models; however, as biorefineries will often want/need to reduce ash content to as low a value as possible, models which can predict for lower ash values following a pretreatment, be it extraction or otherwise, are of value to potential biorefineries.
Future work will look to use these data to develop quantitative prediction models for biomass components from novel waste sources such as those investigated in this study [32]. A limitation of this study is that only NREL-approved methods (including the extraction procedures) were used. Future work will look to expand into different methodologies, such as different extraction techniques and different analytical techniques.

Conclusions
Accurate lignin determination is of key importance in the valorization of the component within feedstocks. Hydrolysis is used in the analysis of the most important components of biomass: cellulose, hemicellulose and lignin. The determinations of non-herbaceous biomass components exhibited lower values following solvent extraction. Klason lignin determinations were always lower due to the removal of extractives, which would otherwise falsely elevate their value, with ethanol extraction having a greater impact than water extraction. Post-hydrolysis sugars were always reduced in our hands following extraction; however, reductions were due to the solubility of non-cell-wall carbohydrates in relatively minor amounts. Since structural cell-wall polysaccharides are insoluble in ethanol and water, determinations of hydrolysis sugars following extraction are considered to be a more accurate representation of these carbohydrates. Ash and protein determinations were only affected as a result of extraction with water, exhibiting reductions; thus, the determinations in unextracted samples were deemed most accurate for these components. Overall, the results suggest that extraction before hydrolysis is crucial where accurate lignocellulosic analysis is required. Where only one solvent is used, ethanol extraction was shown to provide lignocellulosic data closest to that seen after full sequential extraction. Variations in composition as a result of extraction can have an impact on predictive near-infrared spectroscopy models. Due to this, it is suggested that for maximum accuracy in prediction, datasets of extracted samples should be used for modelling Klason lignin and lignocellulosic sugars, while datasets of unextracted samples should be used for modelling ash.
In conclusion, these results will help to build a foundation for a more comprehensive database of the properties of novel feedstock materials, which will aid in the development of comprehensive strategies for the selection and preparation of feedstock to be used in biorefinery settings.