Modeling Dark Fermentation of Coffee Mucilage Wastes for Hydrogen Production: Artificial Neural Network Model vs. Fuzzy Logic Model

Abstract: This study analyzes and estimates hydrogen production from coffee mucilage mixed with organic wastes by dark anaerobic fermentation in a co-digestion system, using an artificial neural network and a fuzzy logic model. Different ratios of organic wastes (vegetable and fruit garbage) were combined with coffee mucilage, which increased the total hydrogen yield by providing suitable sources of carbon, nitrogen, minerals, and other nutrients. A two-level factorial experiment was designed and conducted in a 20-L bioreactor with the independent variables mucilage/organic wastes ratio, chemical oxygen demand (COD), acidification time, pH, and temperature, in order to demonstrate the predictive capability of the two modeling approaches. An artificial neural network with three layers of 5-10-1 neurons was developed. Trapezoidal fuzzy functions and an inference system in the IF-THEN format were applied for the fuzzy logic model. The fit between the experimental hydrogen production and the model predictions of the cumulative hydrogen yield gave a correlation coefficient (R²) of > 0.7866 for the artificial neural network and > 0.8485 for the fuzzy logic model. Further tests of anaerobic dark fermentation with predefined factors at given experimental conditions showed that the fuzzy logic model predictions had a higher quality of fit (R² > 0.9508) than those from the artificial neural network model (R² > 0.8369). The findings of this study confirm that coffee mucilage is a potential resource for a renewable energy carrier, and that the fuzzy-logic-based model is able to predict hydrogen production with a satisfactory correlation coefficient and a predictive capacity exceeding that of the artificial neural network model.


Introduction
Fossil resources, including petroleum, coal, and natural gas, have been crucial energy sources for the development of worldwide industry, technology, and investment; they satisfy approximately 80% of the global energy demand [1-5]. The rapid depletion of limited fossil fuels, environmental contamination, greenhouse gas emissions, and climate change call into question the long-term viability of a fossil fuel-dependent energy system [2,6,7]. There are diverse sustainable energy alternatives, such as wind, solar, nuclear, hydroelectric, and second-generation fuels from lignocellulosic biomass. Hydrogen gas has received scientific attention and thought factors [10,32,33]. The neural function, the most advantageous tool in the artificial neural network (ANN), is trained with input/output variables, calculates the weights and coefficients, and determines the optimal conditions that give the lowest difference between the experimental data and the model output [8,30]. The fuzzy-logic-based approach has also received attention from many researchers and has been applied in various fields, including ecosystems, environmental science, and energy evaluation, as well as the prediction of anaerobic digestion processes [34-37]. Unlike the ANN, the fuzzy-logic-based model can make predictions without accurate knowledge of the process system and the interactions among its parameters [38]. Since a general fuzzy system converts numerical input and output variables into linguistic levels (such as high and low), it can model the anaerobic fermentative process and predict the cumulative hydrogen yield under various types of substrate and interactions of the bacterial population [13,39-41]. To our knowledge, there is little to no literature on anaerobic fermentative hydrogen production from coffee mucilage combined with organic waste or on its analytical modeling.
The current manuscript addresses the prediction of hydrogen production and yield in a batch system while varying the mucilage/organic waste ratio, pH, acidification time, chemical oxygen demand, and temperature. Two-level factorial experiments were performed, and their hydrogen profiles were modeled with the ANN and fuzzy-logic-based approaches. Furthermore, the predictions of the two models were validated with additional anaerobic fermentations under similar conditions, and the measured data were compared to the modeled data.

Raw Materials
Coffee mucilage wastes (Castillo variety coffee, obtained with a coffee demucilager machine) were provided by the Casa de Sabaneta farm (Sabaneta, Colombia), situated 1551 m above sea level with an average temperature of 23 °C. The fruit-vegetable organic wastes (crisp lettuce, Tommy Atkins mango, Valencia orange, guava, and papaya) were taken from the Central Mayorista de Antioquia (Antioquia's Wholesale Market, Medellín, Colombia). The raw materials were not sterilized and were stored at 4 °C prior to use. All other reagents and chemicals in the current study were purchased from Sigma Aldrich (St. Louis, MO, USA).

Experimental Design and Data Collection
In order to analyze the anaerobic fermentative performance, two-level half-factorial experiments were designed with the Minitab 16 software program (Minitab Inc., State College, PA, USA), following our previous study [13]. The resulting 26 sets were performed in a 20-L bioreactor with an actual working volume of 13 L. Raw coffee mucilage samples do not require any supplements, such as carbon/nitrogen nutrients or an initial microbial culture, to transform the fermentable sugars in the substrate into hydrogen, since the samples already contain appropriate nutrient sources, minerals, and microorganisms [13,19,20]. Our earlier work found that 7 species were isolated after anaerobic dark fermentation, and 4 species (Micrococcus luteus, Kocuria kristinae, Streptococcus uberis, and Brevibacillus laterosporus) were relatively highly involved in hydrogen production. Furthermore, an increased hydrogen yield was observed when co-cultivation (bacterial consortium) was applied with K. kristinae and S. uberis, suggesting that the bacterial population could change the metabolic pathways and/or biochemical/molecular interactions that lead to efficient dark fermentation [13]. Briefly, a two-level factorial experiment was designed in which three different ratios (w/w) of coffee mucilage to organic wastes (8:2, 5:5, and 2:8) were prepared, and additional control runs with only coffee mucilage or only organic wastes were added [13]. Each anaerobic digestion was carried out with the independent variables of substrate ratio (%), temperature (30-40 °C), chemical oxygen demand (COD) (20-60 g O₂/L), acidification time (24-72 h), and pH (5.0-6.5). The initial pH was regulated by adding agricultural lime (95% calcium carbonate (CaCO₃), 2% humidity, and 54% soluble calcium oxide (CaO)). The determination of COD [42,43], the operation of the bioreactor, and the biogas collection followed our previous work [13].
The detailed experimental design and the resulting hydrogen profiles, namely maximum hydrogen content (%), daily hydrogen production (L H₂/day), and cumulative hydrogen (L H₂), are summarized in Table 1.
Table 1. Experimental data used for the ANN and fuzzy logic model. Each anaerobic fermentation was performed in duplicate under the given experimental conditions, and the hydrogen produced was collected. The data were statistically analyzed at the 95% confidence level.
Hydrogen gas in each test was collected in gas sampling Tedlar bags (Restek, LA, CA, USA) at intervals and quantified by gas chromatography (3000 MicroGC system, Agilent, San Jose, CA, USA) as described in our previous study [13]. The gas chromatograph was equipped with a molecular sieve 5A column (10 mm × 0.32 mm) connected to a thermal conductivity detector (TCD). Argon was used as the carrier gas at a flow rate of 0.9 mL/min; the injector, column, and detector temperatures were kept constant at 60, 80, and 300 °C, respectively. A gas meter (G 2.5 volumetric, Metrex, Popayan, Cauca, Colombia) was used, which operated with a precision of 0.04 m³/h and a maximum working pressure of 40 kPa. Statistical analysis of the daily hydrogen production, hydrogen content, and accumulated hydrogen was performed with the t-test in Minitab 16 at the 95% confidence level.

Analytical Modeling Approach-Artificial Neural Network (ANN)
In order to develop the hydrogen yield prediction model for anaerobic fermentative digestion, the ANN approach employed a multilayer perceptron-type neural network [44,45]. The backpropagation algorithm, the most widely used training algorithm for ANNs, was employed for training and propagation of the error. The main advantage of this algorithm is its ability to calculate the difference (error) between the output of the neural network and the actual output and to propagate it backwards through the designed system. The algorithm adjusts the weights of the connections in the input and hidden layers; repeating this procedure minimizes the error between the experimental data and the model-calculated prediction [8,30,46]. The five input factors chosen in the current work were: (i) substrate ratio ($x_1$), (ii) acidification time ($x_2$), (iii) chemical oxygen demand ($x_3$), (iv) pH ($x_4$), and (v) temperature ($x_5$). The interactions between the independent variables on hydrogen production were considered in a hidden layer of the neural network; one dependent variable (hydrogen production) was set up in the output layer ($Y_1$). The training was supervised, with the experimental data obtained for each combination of input variables serving as the training pattern. The initial weights ($w$) for the connections between the layers were assigned randomly with the nn-tool option in Matlab R2012a. The designed artificial network model is presented in Figure 1. To obtain the global output ($O^{3}$, Equation (1)), the hidden layer and the output layer used a sigmoid function and a linear function, respectively.
The generalized form of the network output was expressed in Matlab R2012 syntax (MathWorks, Inc., Natick, MA, USA) as given in Equations (1) and (2), where $w^{1}_{i,k}$ is the weight of the connection between input variable $i$ in layer 1 and neuron $k$ in layer 2; $w^{2}_{k,y}$ is the weight between the output of neuron $k$ and the output neuron in layer 3 ($y$); $n$ is the number of inputs in layer 1 (equal to 5); $m$ is the number of neurons in the hidden layer (10); $t$ is the learning rate (taken as zero); LW(2,1) is the weight vector of layer two ($Y \times K$, with $Y$ outputs and $K$ neurons); and IW(1,1) is the weight matrix of layer one ($K \times X$, with $K$ neurons and $X$ inputs).
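As an illustration of the network output described above, the following minimal Python sketch runs one forward pass through a 5-10-1 perceptron with a sigmoid hidden layer and a linear output. The function names, the random weights, and the example input are assumptions for illustration only and are not the trained values of this study; the offsets `b1` and `b2` are hypothetical terms that default to zero.

```python
import math
import random

def logsig(z):
    """Logistic sigmoid, used here for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, IW, LW, b1=None, b2=0.0):
    """Return (hidden outputs, network output) for one input vector.

    IW: K x X list of lists (input-to-hidden weights).
    LW: list of K hidden-to-output weights.
    b1, b2: hypothetical offsets, zero by default.
    """
    if b1 is None:
        b1 = [0.0] * len(IW)
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(IW, b1)]
    output = sum(w * h for w, h in zip(LW, hidden)) + b2  # linear output layer
    return hidden, output

rng = random.Random(0)  # fixed seed so the sketch is reproducible
IW = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(10)]
LW = [rng.uniform(-1.0, 1.0) for _ in range(10)]
x = [0.8, 0.0, 1.0, 1.0, 0.0]  # hypothetical normalized inputs x1..x5
hidden, y = forward(x, IW, LW)
```

The shapes mirror the text: `IW` has one row per hidden neuron (10 × 5) and `LW` has one weight per hidden neuron (1 × 10).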
The error in the network output ($E$, Equation (3)) was calculated as the difference between the experimental value of the response for a combination of the inputs and the predicted network output ($O^{3}$). This error was distributed among the outputs of the previous layer, from which the error in layer two was determined ($E^{2}_{k}$, Equation (4)). Once the error was distributed, the connection weights between elements were updated using Equation (5), where $O^{2}_{k}$ is the output value of neuron $k$ in layer 2; $w^{2}_{k,y}$ is the weight between the output of neuron $k$ in layer 2 and the output neuron in layer 3; $E^{2}_{k}$ is the output error of neuron $k$ in layer 2; $w^{p}_{st}$(current) is the updated or recalculated weight of the connection between the $s$-th neuron of the $p$-th layer and the $t$-th neuron of the $(p+1)$-th layer; $w^{p}_{st}$(previous) is the weight of the same connection before updating; $x_{st}$ is the input of the $s$-th neuron into the $t$-th neuron; and $\alpha$ is the learning constant (taken as 0.3) [44].
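The error distribution and weight update can be sketched in Python as a standard delta-rule backpropagation step for a sigmoid hidden layer and a linear output. This is an assumed textbook form, not necessarily the exact Equations (3)-(5) of this study; `train_step`, the initial weights, and the sample values are hypothetical, while the learning constant of 0.3 and the error limit of 0.002 are the values quoted in the text.

```python
import math
import random

ALPHA = 0.3  # learning constant quoted in the text

def logsig(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, target, IW, LW, alpha=ALPHA):
    """One per-sample backpropagation update; returns the error before updating."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x))) for row in IW]
    output = sum(w * h for w, h in zip(LW, hidden))
    error = target - output  # network output error
    # Error distributed to each hidden neuron, scaled by the sigmoid derivative
    hidden_err = [h * (1.0 - h) * w * error for h, w in zip(hidden, LW)]
    # Update hidden-to-output weights: w(current) = w(previous) + alpha * input * error
    for k, h in enumerate(hidden):
        LW[k] += alpha * h * error
    # Update input-to-hidden weights with the distributed error
    for k, row in enumerate(IW):
        for i, xi in enumerate(x):
            row[i] += alpha * xi * hidden_err[k]
    return error

rng = random.Random(1)  # small random initial weights (assumed range)
IW = [[rng.uniform(-0.1, 0.1) for _ in range(5)] for _ in range(10)]
LW = [rng.uniform(-0.1, 0.1) for _ in range(10)]
x, target = [0.8, 0.33, 1.0, 1.0, 0.0], 0.5  # hypothetical normalized sample
errors = [train_step(x, target, IW, LW) for _ in range(100)]
```

Repeating the step on the same sample drives the error towards zero, which is the stopping behavior the training procedure relies on; with a full data set the step would be applied to each training sample in turn.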
For the training and testing of the network, the 26 experimental data sets were randomly distributed into two groups: 18 for training and the remaining 8 for testing the network. The training consisted of sequentially passing the values of the input variables (substrate ratio, acidification time, chemical oxygen demand, pH, and temperature) for each of the 18 trials through the network and obtaining the corresponding network output. In each case, the outputs of the hidden layer neurons, the network error, the backward distribution of the error, and the updated weights were obtained. This procedure was repeated iteratively until the error reached a value of less than or equal to 0.002, defined as the maximum accepted value. The input variables were normalized (values between 0 and 1) before the training and testing of the network, and the outputs were then decoded into values in their original domain.
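The data preparation described above can be sketched as follows. Min-max scaling is assumed here because the text only states that values lie between 0 and 1; the helper names and the seed are hypothetical.

```python
import random

def minmax_normalize(values, lo, hi):
    """Scale values from the domain [lo, hi] to [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]

def minmax_decode(values, lo, hi):
    """Map normalized values back to the original domain [lo, hi]."""
    return [v * (hi - lo) + lo for v in values]

def split_indices(n_total=26, n_train=18, seed=0):
    """Randomly assign run indices to a training group and a test group."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return sorted(idx[:n_train]), sorted(idx[n_train:])

temperatures = [30.0, 35.0, 40.0]                       # degrees C, range from the text
scaled = minmax_normalize(temperatures, 30.0, 40.0)     # [0.0, 0.5, 1.0]
restored = minmax_decode(scaled, 30.0, 40.0)            # back to the original values
train_idx, test_idx = split_indices()                   # 18 training runs, 8 test runs
```

Keeping the same `lo` and `hi` for encoding and decoding is what allows the network output to be reported in its original units.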

Models with Fuzzy Logic
In order to develop the fuzzy-logic-based model, five input variables (substrate ratio, acidification, chemical oxygen demand, pH, and temperature) were used, and the output data from the tests corresponded to the dependent variables. Numerical input and output data were converted into linguistic levels (very low, low, medium, high, and very high) using trapezoidal membership functions (Figure 2). The fuzzification process, including inference operators, minimization, and products, provides an understanding of the variables and membership functions from multiple-input data [47-49]. In the membership function definitions, $\mu_i$ corresponds to the membership function for the fuzzy sets very low, low, medium, high, and very high; $a_i$, $b_i$, $c_i$, $d_i$, $e_i$, $f_i$, $g_i$, $h_i$, $i_i$, and $j_i$ are the vertices of the fuzzy sets; and $x$ is the value of each variable (substrate ratio, acidification, chemical oxygen demand, pH, and temperature), whose domain was defined according to the database. For the response variables, the domain was defined according to the results achieved during the tests. The fuzzy inference system followed the Mamdani method, with fuzzy inference rules in the antecedent-consequent format given by linguistic expressions of the IF-THEN form. The AND operator was used to evaluate the rules (minimum method, or intersection of two fuzzy sets, Equation (6)), while the aggregation was calculated with the OR operator (maximum method, or union of two fuzzy sets, Equation (7)): $\mu_{A \cap B}(x) = \min(\mu_A(x), \mu_B(x))$ (6) and $\mu_{A \cup B}(x) = \max(\mu_A(x), \mu_B(x))$ (7), where $A$ and $B$ are fuzzy sets in $X$ defined by the pairs $A = \{(x, \mu_A(x)) \mid x \in X\}$ and $B = \{(x, \mu_B(x)) \mid x \in X\}$, and $\mu_A(x)$ and $\mu_B(x)$ are the membership functions of the fuzzy sets $A$ and $B$, respectively, over the domain of the variable $x$. The fuzzy outputs were transformed back to values in the original domain with the centroid method described in the earlier study [38].
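The inference chain described above (trapezoidal membership, minimum for AND, maximum for aggregation, centroid defuzzification) can be sketched in Python as below. All set vertices, the two rules, and the 0-30 L output domain are hypothetical placeholders chosen for illustration; they are not the membership functions or the rules of this study.

```python
def trapmf(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a <= b <= c <= d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge

def mamdani(inputs, rules, out_sets, out_domain, steps=200):
    """Mamdani inference: min for AND, max for aggregation, centroid output."""
    lo, hi = out_domain
    ys = [lo + (hi - lo) * j / steps for j in range(steps + 1)]
    aggregated = [0.0] * len(ys)
    for antecedents, label in rules:
        # Firing strength: AND of all antecedents via the minimum
        strength = min(trapmf(inputs[name], *params)
                       for name, params in antecedents.items())
        for j, y in enumerate(ys):
            clipped = min(strength, trapmf(y, *out_sets[label]))
            aggregated[j] = max(aggregated[j], clipped)  # OR via the maximum
    total = sum(aggregated)
    if total == 0.0:
        return None  # no rule fired for these inputs
    return sum(y * m for y, m in zip(ys, aggregated)) / total  # discrete centroid

# Hypothetical sets and rules, for illustration only
PH_LOW, PH_HIGH = (5.0, 5.0, 5.5, 6.0), (5.5, 6.0, 6.5, 6.5)
T_LOW = (30.0, 30.0, 32.0, 36.0)
H2_SETS = {"low": (0.0, 0.0, 5.0, 10.0), "high": (15.0, 25.0, 30.0, 30.0)}
RULES = [
    ({"pH": PH_HIGH, "temperature": T_LOW}, "high"),  # IF pH high AND T low THEN H2 high
    ({"pH": PH_LOW}, "low"),                          # IF pH low THEN H2 low
]
crisp = mamdani({"pH": 6.5, "temperature": 30.0}, RULES, H2_SETS, (0.0, 30.0))
```

The centroid is approximated on a grid of output values, so a finer `steps` setting trades run time for a closer approximation of the continuous centroid.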

Implementation of the Artificial Neural Network and Fuzzy Logic Models
Although the main aim of the overall study was the production of hydrogen gas, this paper concentrates on modeling and comparing two modeling approaches, following on from our previous work [13], to estimate their predictive capability. Briefly, the carbohydrates and sugars in coffee mucilage were metabolized via the Embden-Meyerhof (glycolytic) pathway, which is able to produce hydrogen molecules by accepting/donating protons between oxidized nicotinamide adenine dinucleotide (NAD+) and its reduced form (NADH). The highest hydrogen yield (25.94 L, equivalent to 1.21 L H₂/L substrate) was obtained with the blended substrate (8 mucilage: 2 organic wastes) at 30 °C, an acidification time of one day, pH 6.5, and a chemical oxygen demand of 60 g O₂/L, whereas little to no hydrogen production was observed in the other tests with different substrate ratios [13]. This anaerobic digestion does not require any additional carbon/nitrogen sources, light, or initial microorganisms, because the raw materials already contain the resources needed for dark fermentation [13,19,20].
The artificial neural network (ANN) model architecture (5-10-1) was built for both the training of the model-calculated prediction and the experimental phase (Figure 1). The target error of less than or equal to 0.002 was reached with a hidden layer of 10 neurons. Since increasing the number of neurons in the hidden layer could result in high error values and unacceptable data [50], the hidden layer was fixed at 10 neurons. The data for the cumulative hydrogen production variable were passed through the network to obtain the error; training continued for 175 iterations, resulting in a coefficient of determination of 0.7866 (Figure 3A). The final weight matrix between the inputs and the 10 neurons of the hidden layer (IW(1,1)) and the weight vector between the 10 neurons of the hidden layer and the output (LW(2,1)) were determined at the lowest error. The weight matrix and the weight vector (accumulative hydrogen production) are tabulated in Table 2.
Table 2. Artificial neural network model training parameters for the final weight matrix and vector of the accumulative hydrogen production. The positive (+) and negative (-) signs represent an increase or decrease in the connection between neurons, respectively.
Similar to the neural network model, the fuzzy logic-based model was applied to forecast hydrogen gas production from the two-level half-factorial design experiments. The membership functions for the five input variables defined five fuzzy sets corresponding to five linguistic levels (very low, low, medium, high, and very high). The membership functions and their ranges for each input and output variable are summarized in Table 3. A total of 1595 linguistic rules in the IF-THEN format were used to develop the fuzzy logic-based model from the 26 experimental data sets by testing different input parameters.
There are two types of fuzzy reasoning systems (Mamdani and Takagi-Sugeno) for applying the combination of input variables and membership functions to estimate the output [47,51]. Since Mamdani's approach is better suited than the Takagi-Sugeno method to applying fuzzy inference through the interpretation of fuzzy rules, Mamdani's fuzzy inference method was used to forecast the hydrogen production. The input domains were defined as 0 to 100 for the substrate ratio, 0 to 83 g O₂/L for COD, 0 to 3 days for acidification, 30 to 40 °C for temperature, and 0 to 6.5 for pH. Figure 3B presents the correlation (R² = 0.8485) between the experimental cumulative hydrogen production and the values predicted by the fuzzy model. Even though some experimental tests showed little to no hydrogen production, both models assessed and predicted the accumulative hydrogen yields. Comparing the hydrogen predictions of the two models, the fuzzy model fitted the data better (R² = 0.8485) than the ANN model (R² = 0.7866).
In order to obtain a more accurate prediction with a higher correlation coefficient in both the ANN and fuzzy models, the poorly performing two-way interaction terms, including substrate*COD and acidification*pH, were excluded, and refined architectures were simulated and created. The experimental and predicted hydrogen production by the ANN and fuzzy models are compared in Figure 4. Correlation coefficients (R²) of 0.8369 and 0.9508 were achieved in the validation of the ANN and fuzzy models, respectively, indicating that the models accounted for 83.69% and 95.08% of the variation in hydrogen production in this experimental design. Both models predicted the hydrogen production under the given conditions with good agreement, and the fuzzy model performed better than the ANN model (R² of 0.9508 vs. 0.8369).
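For readers who want to reproduce this kind of comparison, the sketch below computes R² between experimental and predicted values in two common ways: as the squared Pearson correlation (the fit of a straight line through the predicted-versus-experimental plot) and as one minus the ratio of residual to total sum of squares. The text does not state which definition was used, so both are shown; the function names and the four data points are hypothetical, not data from this study.

```python
def pearson_r2(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)

def residual_r2(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

observed = [0.0, 5.0, 10.0, 20.0]    # hypothetical cumulative H2 values (L)
predicted = [1.0, 4.0, 12.0, 19.0]   # hypothetical model outputs (L)
r2_line = pearson_r2(observed, predicted)
r2_resid = residual_r2(observed, predicted)
```

The two values coincide only when the predictions are unbiased along the 1:1 line, so it is worth stating which one is reported when comparing models.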
In both models, an adequate fit was observed in the linear trends between the predicted and experimental hydrogen production. Training the prediction model by adjusting the connection weights between the input and hidden neurons improved the correlation coefficient. The R² values for hydrogen production confirmed that the models were able to predict hydrogen generation in the dark fermentation phase close to the experimental results. The correlation coefficients in this study are comparable with those of earlier modeling studies on hydrogen production from other substrates, which are summarized in Table 4. A similar or higher R² value was obtained for both the ANN and fuzzy models compared with comparable modeling studies on hydrogen production. Hydrogen production from simple sugars, such as glucose and lactose, tends to be fitted more closely, whereas mixed waste substrates have complex compositions that may strongly affect cell growth, substrate consumption, metabolic pathways, and the biochemical conversion associated with hydrogen production under anaerobic fermentative conditions [13]. It is worth noting that the prediction and estimation of hydrogen production from pure substrates via anaerobic fermentation is more accurate in a modeling study than that from mixed substrates, owing to the additional factors that affect hydrogen production [53,54]. However, the supplementation of essential sources (phosphorus, iron, nitrogen, minerals, and metals) from waste materials could positively impact anaerobic cell growth and support hydrogen-producing metabolism, improving the final hydrogen production. Margarida et al. [55] highlighted that the carbon/nitrogen ratio is a primary factor for anaerobic fermentative digestion, and they found that the additional nitrogen provided by organic wastes improved the hydrogen yield by increasing the C/N ratios.
Furthermore, phosphorus (P) plays a role in adenosine triphosphate synthesis; additional phosphorus could support the enzyme linkage and metabolic pathways for hydrogen production during anaerobic fermentation [54].

Conclusions
The hydrogen production from unutilized coffee mucilage combined with organic waste was predicted by two different approaches, the ANN and fuzzy models. Anaerobic digestion was carried out with five input factors (substrate ratio, COD, acidification time, pH, and temperature) using a two-level factorial design, and the resulting hydrogen yields were analyzed and fitted. This study confirmed that the ANN and fuzzy models, built on the experimental results for bio-digestion of the waste substrate in a batch reactor, were predictive, achieving R² values of 0.8369 and 0.9508, respectively. The fuzzy model had a greater predictive capacity than the ANN model with the response to the interaction of independent factors:
Hydrogen yield (L) = -137.74 - 0.93*substrate ratio - 0.00124*COD + 7.65*acidification + 23.79*pH + 5.88*temperature - 0.09*substrate ratio*acidification + 0.22*substrate ratio*pH - 0.0000988*COD*acidification + 0.000258*COD*pH + 0.2*acidification*temperature - 1.14*pH*temperature - 0.00182*substrate ratio*substrate ratio - 1.7*acidification*acidification.
The use of coffee waste as an alternative source for energy production can therefore be considered, suggesting a practical and efficient application for industrial strategy.
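The response equation quoted above can be evaluated directly, as in the following minimal sketch. The coefficients are copied from the equation; the function and argument names are hypothetical, and the input units are an assumption because they are not stated with the equation (substrate ratio as percent mucilage, COD in mg O₂/L, acidification in days, temperature in °C).

```python
def hydrogen_yield(substrate_ratio, cod, acidification, ph, temperature):
    """Evaluate the quoted response equation for the hydrogen yield (L).

    Units are assumed, not taken from the text: substrate ratio in %,
    COD in mg O2/L, acidification in days, temperature in degrees C.
    """
    s, c, a, p, t = substrate_ratio, cod, acidification, ph, temperature
    return (-137.74
            - 0.93 * s - 0.00124 * c + 7.65 * a + 23.79 * p + 5.88 * t
            - 0.09 * s * a + 0.22 * s * p
            - 0.0000988 * c * a + 0.000258 * c * p
            + 0.2 * a * t - 1.14 * p * t
            - 0.00182 * s * s - 1.7 * a * a)

# Hypothetical call using the best-performing conditions reported in the text
estimate = hydrogen_yield(substrate_ratio=80, cod=60000,
                          acidification=1, ph=6.5, temperature=30)
```

Because the equation is a fitted polynomial, it should only be evaluated inside the experimental ranges used to build it; outside those ranges the quadratic and interaction terms can produce physically meaningless (for example, negative) yields.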