1. Introduction
Population and industrial growth have significantly increased the global energy demand for industrial and commercial activities. For some centuries, hydrocarbons such as petroleum, natural gas, and coal have been the primary sources of energy [1]. About 80% of the energy for human activities is generated from fossil fuels, which are non-renewable; their depletion has become a global challenge, while their extensive use is creating a new and significant environmental problem. Fossil fuel combustion is reported to be responsible for about 89% of the world’s greenhouse gas emissions, including CO₂ emissions [2]. The finite nature of fossil fuels, the attendant environmental concerns, and the surge in global energy demand have caused a shift towards renewable energy resources. Among these resources, biomass is unique because it can be converted into solid, liquid, or gaseous fuels, offering a versatile and sustainable alternative for electricity generation, heating, and transportation. The application of biomass for renewable energy generation is regarded as a sustainable technology that can meet energy needs and, at the same time, reduce the emission of greenhouse gases. Additionally, biomass use is cost-effective and minimizes the waste released into the environment [3]. Lignocellulose biomass has been identified as the most available renewable energy source globally, and it occurs in the form of agricultural residues, energy crops, softwood, hardwood, and grasses. Lignocellulose materials are composed of lignin, cellulose, and hemicellulose, and their large proportion of hemicellulose and cellulose makes them a promising feedstock for biogas production. The biogas released from lignocellulose feedstocks is an environmentally friendly energy source that is widely accepted as a substitute for fossil fuels [4].
Despite the availability and potential of lignocellulose feedstocks, their major limitation is poor biodegradability: the lignin content reduces the available surface area and, together with cellulose crystallinity, limits enzymatic hydrolysis [5]. Pretreatment helps to overcome this challenge through effective delignification, enhanced digestion of the hemicellulose and cellulose portions, and improved biomethane yield. Pretreatment techniques are categorized as thermal, biological, physical/mechanical, chemical, nano-additive, and combined. Irrespective of the pretreatment method selected, the primary interest is to enhance the availability of hemicellulose and cellulose during enzymatic hydrolysis and subsequent biogas release [6]. Among these, chemical pretreatment is the most popular technique; it can alter the crystalline arrangement of cellulose and remove the hemicellulose and lignin portions of the lignocellulose feedstock. Various chemical agents, such as acids, alkalis, oxidizing compounds, and solvents, have been investigated for lignocellulose feedstock pretreatment [7,8]. Acidic pretreatment using sulphuric, hydrochloric, acetic, and formic acids is an efficient chemical method that solubilizes the hemicellulose content and partially solubilizes lignin, making the cellulose more available for further enzymatic hydrolysis [9]. The literature contains many studies on the anaerobic digestion (AD) of lignocellulosic biomass subjected to acidic pretreatment. The biomethane yield of groundnut shells was reported to increase by 178% when 0.5% v/v H₂SO₄ was used for pretreatment for 15 min at 90 °C [10]. Pretreating corn straw with 2% v/v HCl increased the biogas released by 115% compared with the untreated feedstock [8]. The methane yield improved by 8.9% when the wheat plant was subjected to H₂SO₄ treatment for 60 min [11], while thermal dilute H₂SO₄ pretreatment of cassava residues increased the methane released by 56.96% [12]. One of the significant challenges of acid pretreatment is the release of inhibitory compounds such as phenolic acids, furfurals, 5-hydroxymethylfurfural, and aldehydes. Studies on the acidic pretreatment of lignocellulose have focused mainly on H₂SO₄, with limited work on HCl. Therefore, more studies are required on the potential of HCl pretreatment for biomethane optimization. HCl is a strong acid like H₂SO₄, so it should be able to overcome the recalcitrance of lignocellulose feedstock at a cost comparable to that of H₂SO₄.
The efficiency of biogas production is significantly influenced by feedstock composition, process parameters, and microbial dynamics, which makes optimization challenging. Consequently, intelligent models have proven effective for feedstock management and real-time decision-making, and interest in advanced computational techniques for data-driven insights into biogas research has grown recently. Different mathematical models, including the modified Gompertz, logistic, and first-order models, have been investigated to predict AD yield [13]. However, a paradigm shift from these classical models to artificial intelligence (AI) and machine learning (ML)-based models has been noted, because the latter can capture the complex non-linear relationships in the AD process. An Artificial Neural Network (ANN) was used to predict and monitor the biogas released from cassava wastewater, with the retention time, pH, and calcium eggshell concentration selected as input parameters. The model predicted the yield accurately, with a coefficient of determination (R²) of 0.9999 [14]. An Adaptive Neuro-fuzzy Inference System (ANFIS) was used to predict and optimize the biomethane yield from the anaerobic co-digestion of Xyris capensis and duck waste, with the mixing ratio, temperature, and retention time as input parameters and biomethane yield as the output parameter. The observed and ANFIS-predicted cumulative biomethane yields were 478.42 and 436.20 mL CH₄/gVS added, respectively, with a root mean square error of 2.7629 [15]. The biogas released from mushroom was modeled using ANFIS with the temperature, C/N ratio, and retention time as inputs; an R² value of 0.9997, reported as 99% accuracy, was obtained [16]. The anaerobic co-digestion of tea waste and cow manure was also modeled using ANFIS, and the model was reported to have around 99% accuracy [17]. However, few studies optimize and predict the biomethane yield with the pretreatment process parameters included as model inputs.
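As a rough illustration of the classical kinetic models and the error metrics mentioned above, the following sketch uses only the Python standard library. The modified Gompertz and first-order equations are written in their commonly used textbook forms; the parameter values (`A`, `mu`, `lam`, `k`) and the sampling days are hypothetical and are not taken from any of the cited studies.

```python
import math


def modified_gompertz(t, A, mu, lam):
    """Cumulative yield at day t.

    A   : maximum yield potential (same unit as the yield)
    mu  : maximum production rate (yield unit per day)
    lam : lag phase (days)
    """
    return A * math.exp(-math.exp(mu * math.e / A * (lam - t) + 1.0))


def first_order(t, A, k):
    """First-order kinetic model with rate constant k (1/day)."""
    return A * (1.0 - math.exp(-k * t))


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical example: treat a Gompertz curve as the "observed" data and
# check how well a first-order curve with an assumed k describes it.
days = list(range(0, 41, 5))
observed = [modified_gompertz(t, A=400.0, mu=25.0, lam=3.0) for t in days]
predicted = [first_order(t, A=400.0, k=0.08) for t in days]

print("RMSE:", round(rmse(observed, predicted), 2))
print("R^2 :", round(r_squared(observed, predicted), 4))
```

In practice the model parameters would be fitted to measured cumulative yields with a non-linear least-squares routine rather than assumed, and the same two metrics can be used to compare a kinetic model against an ANN or ANFIS prediction.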
In addition to the experimental investigations, the complex microbial interactions in the bio-digestion process and the pretreatment dynamics necessitate intelligent, data-driven strategies that can interpret system behavior, identify hidden patterns, and facilitate biomethane process optimization. This study addresses this gap by integrating experimental and advanced computational analysis to provide in-depth, data-driven insights for optimizing biomethane production from the AD of Xyris capensis subjected to acidic (HCl) pretreatment. The biomethane yield was determined experimentally by subjecting Xyris capensis to different HCl pretreatment conditions, with digestion conducted under mesophilic anaerobic conditions for 40 days. We use the experimental dataset to develop advanced statistical methods for parameter profiling, feature ranking, dimensionality reduction, cluster analysis, and neuro-fuzzy-based predictive modeling. The operational parameters of the digestion process and the pretreatment conditions serve as input variables, while the biomethane yield serves as the output of the neuro-fuzzy model. Many studies have developed machine learning-based models in biogas research. However, little or no attention has been given to assessing the impact of data clustering on neuro-fuzzy models in this field. Thus, this study investigates the effects of prominent data clustering techniques and hyper-parameters on ANFIS tuned with Particle Swarm Optimization (PSO). This integrated approach enables a comprehensive, data-driven enhancement of the process output of the anaerobic digestion of biomass resources.
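To make the PSO component more concrete, the sketch below shows a generic particle swarm loop in plain Python. It is not the ANFIS-PSO configuration used in this study: the objective here is a toy sphere function standing in for a model's prediction error, and the swarm size, iteration count, and the inertia and acceleration coefficients are illustrative assumptions.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=200,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Random initial positions inside the bounds, zero initial velocities.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    g = min(range(n_particles), key=personal_best_val.__getitem__)
    global_best, global_best_val = personal_best[g][:], personal_best_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull towards personal and global bests.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + c_social * r2 * (global_best[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)

            val = objective(positions[i])
            if val < personal_best_val[i]:
                personal_best[i], personal_best_val[i] = positions[i][:], val
                if val < global_best_val:
                    global_best, global_best_val = positions[i][:], val

    return global_best, global_best_val


def sphere(x):
    """Toy objective with its minimum (0) at the origin."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimize(sphere, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print("best position:", best_x)
print("best value   :", best_val)
```

In an ANFIS-PSO setting, each particle would instead encode the membership-function and consequent parameters of the fuzzy system, and the objective would be the training error of the resulting model.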
This research aims to investigate the impact of acidic pretreatment on the biomethane yield of Xyris capensis through the following objectives: (i) experimental investigation of the biomethane yield under different HCl concentrations, exposure times, and digestion retention times in mesophilic anaerobic conditions for 40 days; (ii) assessment of the linear correlation between digestion parameters, pretreatment conditions, and biomethane yield; (iii) statistical assessment and visualization of the impact of acidic pretreatment on biomethane yield using a two-sample independent t-test; (iv) feature ranking of digestion and pretreatment parameters based on their relative importance and influence on biomethane yield prediction using a decision tree-based feature importance assessment; (v) identification of the dominant features influencing the energy yield of the biodigestion of Xyris capensis through dimensionality reduction using PCA; and (vi) investigation of the impact of data clustering and hyper-parameter optimization on the ANFIS-PSO model for biomethane yield prediction. The machine learning techniques, namely PCA, k-means, and ANFIS-PSO, were chosen for their unique strengths and capability in addressing the multidimensional complexity of the bio-digestion process and the dataset. Integrating lab-scale bio-digestion and data-driven methods in this research presents a robust and holistic framework that enhances anaerobic digestion optimization. This research establishes an innovative and scalable approach for optimizing bioenergy, marking a substantial contribution to sustainable biogas development.
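Objectives (ii) and (iii) rest on two standard statistics, which can be sketched in plain Python as follows. The yield and retention-time values are hypothetical and serve only to exercise the functions. The text does not state which variant of the two-sample independent t-test is applied, so the unequal-variance (Welch) form is assumed here.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Unbiased sample variance (divides by n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)


def welch_t(a, b):
    """Welch two-sample t statistic and its approximate degrees of freedom."""
    va = sample_variance(a) / len(a)
    vb = sample_variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical daily yields (arbitrary units) for two groups of digesters.
untreated = [10.0, 12.0, 11.0, 13.0, 12.0]
pretreated = [15.0, 17.0, 16.0, 18.0, 17.0]
t_stat, dof = welch_t(pretreated, untreated)
print("t =", round(t_stat, 3), "df =", round(dof, 1))

# Hypothetical retention times (days) against cumulative yield.
retention_days = [5.0, 10.0, 15.0, 20.0, 25.0]
cumulative_yield = [40.0, 95.0, 160.0, 210.0, 240.0]
print("Pearson r =", round(pearson_r(retention_days, cumulative_yield), 3))
```

The p-value for the t statistic would be obtained from the t distribution with the computed degrees of freedom, which a statistics library provides directly.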