Modelling the Yield and Estimating the Energy Properties of Miscanthus x Giganteus in Different Harvest Periods

Abstract: This research aims to use artificial neural networks (ANNs) to estimate the yield and energy characteristics of Miscanthus x giganteus (MxG), considering factors such as year of cultivation, location, and harvest time. In the study, which was conducted over three years in two different geographical areas, ANN regression models were used to estimate the lower heating value (LHV) and yield of MxG. The models showed high predictive accuracy, achieving R² values of 0.85 for LHV and 0.95 for yield, with corresponding RMSEs of 0.13 and 2.22. A significant correlation affecting yield was found between plant height and number of shoots. In addition, a sensitivity analysis of the ANN models showed the influence of both categorical and continuous input variables on the predictions. These results highlight the role of MxG as a sustainable biomass energy source and provide insights for optimizing biomass production, influencing energy policy, and contributing to advances in renewable energy and global energy sustainability efforts.


Introduction
The transition from conventional fossil-fuel-based energy sources to renewable alternatives represents a critical point in our quest for sustainable energy production. In this context, biomass is emerging as an important component of the renewable energy spectrum, known for its sustainability and versatility [1]. Lignocellulosic biomass, which includes agricultural, forestry, and industrial by-products, is widely used to produce heat, electricity, and direct combustible fuels [2]. As the demand for renewable energy sources and biomass increases, research into efficient, cost-effective methods of biomass production becomes imperative [3]. Biomass energy represents a renewable energy source that has the potential to contribute significantly to the fulfilment of global energy sustainability [4] and to provide numerous environmental and economic benefits, including significant reductions in CO2, SO2, and NOx emissions [5]. Among lignocellulosic plants, Miscanthus × giganteus (MxG) is characterized by its suitability for biomass production and thrives on less fertile soils [6]. Its high yield per unit area and adaptability to a range of conditions, including heavy-metal-contaminated soils and resistance to pests, emphasize its potential as a leading energy crop [7,8].
The quality and composition of MxG biomass are influenced by a variety of factors: management practices, fertilization measures, harvest period, climatic conditions, genetic material, and geographical location [9]. Interestingly, variations in harvest timing, such as during autumn or spring, can cause significant physiological changes in the plant, affecting yield and nutrient concentration [10]. In the broader context of biomass energy, MxG was identified as the top performer in a comprehensive global analysis comparing the yields of different energy crop candidates. This analysis found that MxG consistently outperformed other commonly studied energy crops such as Erianthus and Phragmites australis, demonstrating its superiority as a biomass source [11].
Given the growing importance of biomass as an energy source, the application of mathematical modeling, especially machine learning techniques such as artificial neural networks (ANNs), plays a crucial role in this sector. These models are used to estimate and simulate various aspects of biomass production, including plant growth kinetics and energy parameters [12,13]. In addition, ANN models have been recognized for their effectiveness in modeling the energy value of biomass. They offer a way to achieve accurate estimates without the high time and resource investments required by traditional laboratory methods [14-16]. Baruah et al. (2017) [17] conducted research to model biomass gasification using ANN. The authors state that ANN models can be successfully used in the evaluation of energy applications related to biomass as a feedstock. Uzun et al. (2017) [18], in turn, attempted to improve existing predictive models to estimate the calorific value of biomass based on the input variables of proximate analysis. The authors concluded that the application of these models can significantly facilitate the estimation of the energy value of biomass as well as intelligent decision making regarding the potential energy applications of these feedstocks. Ighalo et al. (2022) [19] reported that, based on the ultimate and proximate input variables, a more reliable model (with a lower error rate) can be created compared to the already developed empirical equations for estimating the calorific value of biomass. Darvishan et al. (2018) [20] conducted research to estimate the HHV of biomass through a multilayer perceptron (MLP) ANN model. The authors used the ultimate biomass analysis as the input variable and achieved a high coefficient of determination for both learning (R² = 0.99) and testing (R² = 0.99) of the model. Based on the overlap between real and model data and the calculated model error, the authors concluded that the MLP-ANN model can be used for the practical estimation of energy values.
To predict the yield and area of the activated carbon circulation, Liao et al. (2019) [21] created an ANN model incorporating the input data of the ultimate and proximate analyses and the activation conditions of the circulation. A total of 168 data samples from the literature were used for the study. The ANN model performed with high efficiency in the evaluation (R² > 0.90). To estimate the yield and total biomass of wheat grains, Mehnatkesh et al. (2009) [22] used ANN and multiple linear regression (MLR) models. The authors found that the ANN models were superior in modeling performance, based on the calculated statistical parameters of accuracy and modeling error.
The main objective of this research is to develop and explore the use of ANN modeling to estimate the yield and energy characteristics of MxG, considering various factors such as year of cultivation, location, and harvest time, to understand MxG's potential as a sustainable biomass energy source. By using advanced modeling techniques such as ANNs, a comprehensive insight into the yield and energy properties of MxG under different conditions will be gained. The significance of this research goes beyond theoretical analysis and offers practical implications for optimizing biomass production and contributing to more efficient and sustainable energy systems. It is expected that the results of this study will inform future energy policy, support technological progress in the field of renewable energy, and contribute to addressing global challenges in energy sustainability and environmental protection.

Experiment
The study was conducted at two locations, Sljeme (45.9° N, 15.948° E) and Bistra (45°54′ N, 15°51′ E), which were selected due to their different environmental conditions to evaluate the geographical effects on growth and yield of MxG. The research was conducted over a period of three years (2012-2014) from the establishment of the crop, taking into account annual weather fluctuations and observing long-term crop trends. Each year included three harvest periods (autumn, winter, and spring) aligned with common agricultural practices and local climate to assess the impact of harvest timing on biomass quality and quantity. The biomass was harvested manually to ensure uniform sampling and minimize the possibility of losses during collection.

Laboratory Analysis
In the laboratory of the University of Zagreb's Faculty of Agriculture, the MxG biomass underwent detailed analysis utilizing established test procedures. The research included a comprehensive ultimate analysis to measure elements such as carbon (C), hydrogen (H), nitrogen (N), oxygen (O), and sulfur (S). This was performed using the Vario Macro CHNS analyzer [23], following the guidelines of EN 15104:2011 [24] and EN 15289:2011 [25]. Additionally, the higher heating value (HHV) was determined with an adiabatic bomb calorimeter [26], adhering to the CEN/TS 14918:2005 method [27].
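The text reports a measured HHV but models the LHV, and it does not state how one was derived from the other. As a minimal sketch, the block below uses the dry-basis relation commonly quoted from the solid biofuel calorific value standards (net value = gross value minus 212.2·H minus 0.8·(O + N), in J g⁻¹ with mass fractions in % of dry matter). Whether the authors used exactly this relation is an assumption, and the input values in the example are hypothetical.

```python
def lhv_dry_mj_per_kg(hhv_mj_per_kg, h_pct, o_pct, n_pct):
    """Estimate the dry-basis LHV (MJ/kg) from the dry-basis HHV (MJ/kg).

    Assumed relation (J/g, mass fractions in % of dry matter):
        LHV = HHV - 212.2 * H - 0.8 * (O + N)
    """
    hhv_j_per_g = hhv_mj_per_kg * 1000.0
    lhv_j_per_g = hhv_j_per_g - 212.2 * h_pct - 0.8 * (o_pct + n_pct)
    return lhv_j_per_g / 1000.0


# Hypothetical sample, not taken from the study's dataset.
example_lhv = lhv_dry_mj_per_kg(hhv_mj_per_kg=18.0, h_pct=4.0, o_pct=46.0, n_pct=0.5)
print(round(example_lhv, 3))
```

The hydrogen term dominates the correction, which is consistent with the positive association between H and the heating value discussed in the Results.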

Data Processing
After the research was completed, the data from the analyses were collected in a database and standardized for further analysis. The collected data were categorized according to the year of the study, the location, and the harvest period (a total of 108 samples). Descriptive statistical analysis, analysis of variance (ANOVA), and Tukey's honestly significant difference (HSD) post hoc test were performed to determine the differences within groups and samples.
The final part of processing the statistical data included univariate analyses to determine the influence of parameters such as study year, location, and harvest time on the changes in the variables studied.
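The ANOVA step described above can be illustrated with a minimal sketch of the one-way F statistic, written with the Python standard library only. The function name and the yield values are hypothetical and serve only as an illustration; the study's own software and the Tukey HSD step are not reproduced here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical yields (t DM per ha) for three harvest periods.
autumn = [20.0, 22.0, 24.0]
winter = [18.0, 20.0, 22.0]
spring = [14.0, 16.0, 18.0]
f_stat, df_b, df_w = one_way_anova_f([autumn, winter, spring])
print(f_stat, df_b, df_w)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom indicates that at least one group mean differs; a post hoc test such as Tukey's HSD then identifies which pairs differ.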

Artificial Neural Networks
ANN regression models are used in regression problems because they can capture complex non-linear relationships between input and output values by learning from a portion of the data and adapting accordingly [28]. The basic structure of an ANN model consists of an input layer, hidden layers, and an output layer [29]. In response to the need for effective modeling, two ANN models were developed to estimate the lower heating value (LHV) and yield of MxG. Both ANNs had an architecture with input, hidden, and output layers. When creating the ANN models, the data were first standardized and cleaned. The data were then divided into a training part (70%), a testing part (15%), and a validation part (15%). The input variables were selected according to the output value to be modeled, and the number of artificial neurons in the hidden layer was varied at random between 1 and 20. To ensure consistency and reliability of the modeling process, both ANNs were trained for 100,000 cycles. In the last step, the models were evaluated by calculating statistical parameters that compare the experimental data with the data obtained by the model, as described in detail in the next section. The output values of the models were calculated using Equation (1) [30,31]:
$$Y = f_2\left(W_2 \cdot f_1\left(W_1 \cdot X + b_1\right) + b_2\right) \tag{1}$$
where Y stands for the output value, f for the transfer functions between the layers, W for the weighting coefficients, b for the bias thresholds, and X for the input data of the model; the subscripts 1 and 2 denote the hidden and output layers, respectively.
Figure 1 shows the developed ANN models with a 3-layer structure. Figure 1a shows the architecture of the ANN model for estimating the LHV, which is based on the continuous input variables of carbon (C), hydrogen (H), nitrogen (N), sulfur (S), and oxygen (O). Figure 1b shows the architecture of the yield estimation model, which uses the number of shoots (NoS), plant height (PH), precipitation in the analyzed area (Per.), and temperature as continuous input variables. Year, location, and harvest period are used as categorical input variables in both models.
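The preprocessing and forward computation of a network with one hidden layer can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the tanh hidden transfer function, the identity output function, the random seed, and the weights in the example are all assumptions.

```python
import math
import random


def standardize(column):
    """Scale a list of numbers to zero mean and unit (population) standard deviation."""
    mu = sum(column) / len(column)
    sd = math.sqrt(sum((v - mu) ** 2 for v in column) / len(column))
    return [(v - mu) / sd for v in column]


def split_indices(n, seed=0):
    """Shuffle sample indices and split them 70/15/15 into three subsets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_first = round(0.70 * n)
    n_second = round(0.15 * n)
    return idx[:n_first], idx[n_first:n_first + n_second], idx[n_first + n_second:]


def forward(x, W1, b1, W2, b2):
    """One-hidden-layer forward pass: tanh hidden units, linear output units."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


# 108 samples, as in the study; the seed is arbitrary.
part_a, part_b, part_c = split_indices(108)

# Hypothetical weights: two inputs, one hidden neuron, one output.
y = forward([0.5, 5.0], W1=[[1.0, 0.0]], b1=[0.0], W2=[[2.0]], b2=[1.0])
print(len(part_a), len(part_b), len(part_c), y)
```

Training (adjusting the weights and biases over many cycles) is deliberately left out; the sketch only shows how standardized inputs flow through the layers to produce an output.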

Evaluation of the Model
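Once the ANN model was created, statistical parameters were calculated to determine the accuracy and performance of the model in estimating the output values: the coefficient of determination (R²) (2), the root mean square error (RMSE) (3), the mean bias error (MBE) (4), and the mean absolute error (MAE) (5). These values were used to assess the overlap between the real data and the data obtained by the model, as well as the modeling error. The statistical parameters mentioned were calculated according to the following formulas, given here in their standard forms [32]:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{2}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2} \tag{3}$$
$$\mathrm{MBE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i) \tag{4}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right| \tag{5}$$
where y_i is the experimental value, ŷ_i the value obtained by the model, ȳ the mean of the experimental values, and n the number of samples.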
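The four evaluation parameters can be computed with a few lines of standard-library Python. The sketch below assumes the usual definitions (R² as one minus the ratio of residual to total sum of squares, and errors taken as predicted minus observed); the function name and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, RMSE, MBE, and MAE for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [p - o for o, p in zip(observed, predicted)]  # predicted minus observed
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MBE": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
    }


# Hypothetical LHV values (MJ per kg), not taken from the study.
observed = [17.0, 17.2, 17.4, 17.6]
predicted = [17.1, 17.1, 17.5, 17.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

In this example the over- and under-predictions cancel, so the MBE is close to zero while the MAE and RMSE are not; this is why the bias and the error magnitude are reported separately.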

Method of Sensitivity Analysis
The final part of the modelling involved a sensitivity analysis of the models to determine the influence of the individual input variables on the output values (LHV and yield). Sensitivity analysis measures both the importance and the impact of the input variables on the output variables [33]. A variance-based method of global sensitivity analysis was used to quantify the influence of categorical (year, harvest period, location) and continuous (C, H, N, S, O, NoS, and PH) input variables on the variance in the output.
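To make the idea of attributing output variance to an input concrete, the sketch below estimates a first-order variance-based index, S = Var(E[Y | X]) / Var(Y), for a single categorical input by grouping the outputs by level. It is an illustration under stated assumptions, not the procedure used in the study: the function name and the data are hypothetical, and continuous inputs would need binning or a sampling-based estimator.

```python
from collections import defaultdict


def first_order_index(levels, outputs):
    """Estimate Var(E[Y | X]) / Var(Y) for one categorical input X."""
    n = len(outputs)
    grand_mean = sum(outputs) / n
    total_var = sum((y - grand_mean) ** 2 for y in outputs) / n
    groups = defaultdict(list)
    for level, y in zip(levels, outputs):
        groups[level].append(y)
    # Variance of the per-level means, weighted by the number of samples per level.
    between = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                  for ys in groups.values()) / n
    return between / total_var


# Hypothetical model outputs for 3 years x 2 locations.
year = [1, 1, 2, 2, 3, 3]
location = ["A", "B", "A", "B", "A", "B"]
yield_pred = [10.0, 12.0, 20.0, 22.0, 30.0, 32.0]

s_year = first_order_index(year, yield_pred)
s_location = first_order_index(location, yield_pred)
print(s_year, s_location)
```

In this constructed example almost all of the output variance is attributable to the year and very little to the location, which mirrors the kind of ranking a relative-importance plot conveys.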

Results
Table 1 shows a comprehensive analysis of the variations in the biomass composition of MxG over different years, locations, and harvest periods. For the Bistra site, a general trend towards increasing C content was observed during the three years of the experiment, with more pronounced fluctuations in the second year of the experiment. The proportion of H and O did not fluctuate significantly, while N and S showed less variability in the different harvest periods. The highest LHV value for the Bistra site was measured in the third year of the experiment at the first harvest (17.38 MJ kg⁻¹). At the Sljeme site, a similar trend in carbon content was observed in the three years of the trial, while H and O showed slight fluctuations, but without a clear pattern.
Table 2 shows a comprehensive analysis of the measured variables of plant height, number of shoots, and yield of MxG biomass. The differences in these variables were compared across the three trial years, two planting sites, and three harvest periods. It is clear that plant height increases over time, with the maximum value being recorded at the 'Sljeme' site in the second year and the second harvest period (3.45 m). Conversely, the maximum height measured at the 'Bistra' site was 2.89 m in the third year (across all three harvest periods).
The highest number of shoots and biomass yield were measured at the 'Sljeme' site in the third year of the experiment (61.25 per m²; 30.85 t DM ha⁻¹). The study on sewage sludge fertilization by Voća et al. (2021) [3] reported an average yield of 20.55 t DM ha⁻¹ and an average NoS of 80.11 per m².
Figure 2 shows the recorded temperatures and precipitation at the Bistra and Sljeme sites in the three years and harvest periods of the experiment. Figures 3 and 4 show heat maps of the correlation coefficients between different variables in the dataset for the LHV and yield estimation models, respectively.
Heat maps are a popular visualization tool in the field of biology and the related sciences [34] and are effective techniques for finding patterns and connections between multidimensional data [35].
The heat maps are color-coded to show the strength and direction of the correlation; red shades indicate a positive correlation, blue shades a negative correlation, and the intensity of the color indicates the strength of the correlation. A correlation coefficient can be between −1 and 1, where −1 stands for a perfect negative correlation, 0 stands for no correlation, and 1 stands for a perfect positive correlation. Figure 3 shows that the LHV has a moderate negative correlation with the harvest period (HP) (correlation coefficient −0.34, p = 0.04), indicating a statistically significant but not too strong inverse relationship between these two factors. On the other hand, there is a moderate and statistically significant positive correlation between the LHV and H (correlation coefficient 0.54, p < 0.01), indicating that as H increases, the LHV also increases. The LHV also shows a weak positive correlation with S (correlation coefficient 0.31, p < 0.01). For other variables, the correlations are very low or non-existent, which means that there is no significant relationship between the LHV and these variables in the available dataset.
In Figure 4, the heat map shows a robust positive correlation between 'PH' and 'NoS' with a coefficient of 0.77, which means that an increase in 'PH' is usually accompanied by an increase in 'NoS' and vice versa. Conversely, 'PH', with a coefficient of −0.36, shows a significant negative correlation with 'Temperature', indicating that higher 'PH' values are often associated with lower temperatures. In addition, there is a clear positive correlation between 'Yield' and 'Year' as well as 'NoS', with coefficients of 0.84 and 0.74, respectively, which means that higher yields are associated with later years and higher 'NoS' values.
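Each cell of such a heat map is a correlation coefficient between two variables. Assuming the Pearson coefficient is meant (the text does not name the type), a minimal standard-library sketch is given below; the function name and the sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical paired measurements, not taken from the study.
r = pearson_r([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
r_inverse = pearson_r([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])
print(r, r_inverse)
```

Computing this coefficient for every pair of variables gives the symmetric matrix that the heat map colors, with values near 1 or −1 at the saturated ends of the color scale.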
Tables 3 and 4 show the results of the univariate analysis carried out to determine the influence of the individual parameters of year, location, and harvest period on the biomass variables of MxG.
Table 3. Univariate analysis of the influence of the parameters of year, location, and harvest period and their interactions on the change in the ultimate analysis and the LHV of MxG biomass.
Table 3 shows that the change in C is influenced by the year of the experiment and the harvest period, as well as by the interaction of these parameters. The proportions of H and O are subject to changes in the analyzed parameters, while N is not influenced by the location, and S is not influenced by the year of the experiment. The LHV is statistically significantly influenced by all analyzed parameters and their mutual interactions.
Table 4. Univariate analysis of the influence of the parameters of year, location, and harvest period and their interactions on the change in the measured properties of MxG biomass.
Figure 5 shows a high degree of overlap between real and modeled data, which is reflected in the high R² value. To allow a more objective evaluation, the RMSE (0.13), MBE (0.02), and MAE (0.10) modelling errors are also reported. Considering the low error values, the model showed a high efficiency in modelling the LHV of biomass.
After creating the model, it was important to analyze the influence of certain continuous and categorical input variables on the output of the LHV (Figure 7) and yield (Figure 8). The individual variables are presented as numerical values (coefficients) to indicate the importance of a single parameter. In contrast to the previous model, one of the categorical variables (location) was rated as having the lowest relative importance in the modeling, while HP and year contributed the most to the optimal solution when modeling the biomass yield of MxG.

Discussion
The laboratory analysis revealed average ultimate analysis values for MxG biomass, with proportions of C at 48.75%, H at 3.92%, O at 46.78%, N at 0.49%, and S at 0.08% and an LHV of 17.20 MJ kg⁻¹. Wilk and Magdziarz (2017) [37] reported values of C (46.30%), H (5.93%), N (0.37%), S (0.09%), and O (45.57%) with an LHV of 16.60 MJ kg⁻¹. Greenhalf et al. (2013) [38] emphasized the significance of the harvesting period on the properties of MxG products. They reported that early harvesting is crucial for maximizing biomass yield due to the higher moisture content, while the maximum LHV values were observed in later harvest periods.
The highest NoS and biomass yield were measured at the 'Sljeme' location in the third year of the experiment, with 61.25 per m² and 30.85 t DM ha⁻¹, respectively. Voća et al. (2021) [3] reported an average yield of 20.55 t DM ha⁻¹ and an average NoS of 80.11 per m². Battaglia et al. (2019) [39] found that the year of growth significantly influenced biomass yields, recording a maximum yield of 18.30 t DM ha⁻¹. Conversely, Anderson et al. (2011) [40] observed that the maximum MxG biomass yield could reach up to 40 t DM ha⁻¹ in certain areas in Europe, particularly in the range of 3-5 years of growth. Szulczewski et al. (2018) [41] proposed a new method for estimating MxG biomass yield, correlating it with the number of shoots and using simple biometric properties. Meehan et al. (2013) [42] highlighted that the timing and method of MxG harvesting significantly influence its quality. They observed an increase in the LHV across harvesting times, which they attributed to the decrease in moisture content from 62% to 27% over the study period. Chupakhin et al. (2021) [43] state that to study biomass production, it is crucial to examine plant height and number of shoots, which are key indicators for yield estimation.
Ouattara et al. (2022) [44] conducted research in France from 2013 to 2019 and applied several treatments to different species of miscanthus. The authors focused on identifying and analyzing the effect of indicators such as water stress and the number of frost days on changes in yield. They concluded that, in both miscanthus species tested, the parameters examined affected yield variability. The results showed that the site-year variability was greater for MxG than for Miscanthus sinensis. For example, the yield range in one of the MxG treatments was 0.8 to 20.5 t ha⁻¹, while yields in Miscanthus sinensis were more consistent, ranging from 2.27 to 11.9 t ha⁻¹ for different treatments.
The ANN model created for estimating the LHV of MxG biomass demonstrated high performance, evidenced by a high R² value of 0.89. The model's accuracy was further corroborated by low error rates: an RMSE of 0.13, an MBE of 0.02, and an MAE of 0.10. [...] [47] stated that ANNs are suitable tools for yield estimation. Basir et al. (2021) [48] developed an ANN model for rice yield prediction. The model was developed based on four input parameters, including density, number of shoots per seedling, crop spacing, and water level. With the data split into learning and testing sets, the evaluation of the model resulted in a high R² value (0.99), with an MSE of 20.95. The authors state that the ANN model achieved high efficiency in yield modeling and that it can replace existing numerical models as a tool. They also emphasize the importance of increasing the number of input parameters to create more accurate models with greater generalizability.
To improve the models, future research should focus on the expansion of the database and on the possibilities for expanding the input variables in the models. In addition, the integration of hybrid models that could provide better prediction results should be investigated. Increased accuracy or a reduction in modeling error would open the possibility for better planning of the cultivation and care of crops and the use of their biomass in energy production, which is a step forward in creating a sustainable energy system.

Conclusions
Exploring the capabilities of ANN to estimate the yield and energy characteristics of MxG in this study has led to significant findings. The developed ANN models showed high accuracy, with R² values of 0.85 for the lower heating value (LHV) and 0.95 for yield. The corresponding RMSE values were 0.13 for the LHV and 2.22 for yield, together with MBEs of 0.02 (LHV) and −0.26 (yield) and MAEs of 0.10 (LHV) and 1.65 (yield), demonstrating their robust predictive power in agriculture. Of note was the discovery of a strong correlation between plant height, number of shoots, and total yield of MxG, a finding that emphasizes the complex interplay between plant physiology and biomass production. The implications of this research go beyond the realms of agricultural modelling. By demonstrating the viability of MxG as a sustainable biomass energy source, this study contributes to the broader discourse on renewable energy resources. In a world grappling with the challenges of climate change and the need for sustainable energy solutions, findings such as these are invaluable. They provide a scientific basis for policy makers and industry representatives to make informed decisions about energy production and environmental protection. For the future, it is recommended that the practical applications of this study be explored in the real world of agriculture. Such applications could lead to more efficient biomass production strategies and optimize MxG cultivation for energy use. Future research should aim to further refine these ANN models, possibly incorporating different environmental variables and exploring other biomass sources.

Figure 1. Architecture of the ANN model for estimating (a) LHV and (b) yield of MxG.

Figure 2. Recorded temperatures and precipitation for the duration of the study during different harvest periods at the (a) Bistra and (b) Sljeme sites.

Figure 3. Heatmap of the correlation coefficient of ultimate analysis data and energy values of MxG.

Figure 4. Heatmap of the correlation coefficient of the investigated measured properties of MxG.

Figures 5 and 6 show scatter plots of actual and predicted data using the ANN models for the LHV and the biomass yield of MxG. Scatter plots, which are widely used in data visualization, represent each element of a dataset as a salient point aligned on two intersecting, continuous axes [36].

Figure 5. Scatterplot of the overlap between actual and predicted data for the estimation of LHV biomass of MxG with the data split into training, testing, and validation.

Figure 6. Scatterplot of the overlap between actual and predicted data for biomass yield estimation of MxG with the data split into training, testing, and validation.

Figure 6 clearly shows a high degree of overlap between real and model data, which is expressed by the high value of the coefficient of determination (R² = 0.95). In contrast to the previous model, despite the high degree of overlap, the model showed a slightly larger modeling error concerning the given dataset, as indicated by the calculated statistical metrics RMSE (2.22), MBE (−0.26), and MAE (1.65).

Figure 7. Sensitivity analysis of the relative importance of the continuous and categorical input variables of the ANN model for the output of the LHV.

Figure 7 shows that the categorical variables have the greatest influence on the modelling of the LHV, while the continuous variables (N, C, S, H, and O) are less important in the modelling of this output.

Figure 8. Sensitivity analysis of the relative importance of the continuous and categorical input variables of the ANN model for the yield output.

Table 1. Analysis of the composition of MxG biomass over different years, locations, and harvest periods.
Letters in the column represent differences between the analyzed samples according to the Tukey post hoc HSD test (p ≤ 0.05). Statistical significance: * p ≤ 0.01.

Table 2. Analysis of measured variables of MxG biomass over different years, locations, and harvest periods.