A Review of Modeling Bioelectrochemical Systems: Engineering and Statistical Aspects

Bioelectrochemical systems (BES) are promising technologies to convert organic compounds in wastewater to electrical energy through a series of complex physical-chemical, biological and electrochemical processes. Representative BES such as microbial fuel cells (MFCs) have been studied and advanced for energy recovery. Substantial experimental and modeling efforts have been made for investigating the processes involved in electricity generation toward the improvement of the BES performance for practical applications. However, there are many parameters that will potentially affect these processes, thereby making the optimization of system performance hard to be achieved. Mathematical models, including engineering models and statistical models, are powerful tools to help understand the interactions among the parameters in BES and perform optimization of BES configuration/operation. This review paper aims to introduce and discuss the recent developments of BES modeling from engineering and statistical aspects, including analysis on the model structure, description of application cases and sensitivity analysis of various parameters. It is expected to serves as a compass for integrating the engineering and statistical modeling strategies to improve model accuracy for BES development.


Introduction
Bioelectrochemical systems (BES) are emerging technologies that apply microorganisms to transform chemical energy in wastewater to electrical energy through multiple microbial-electrochemical reactions [1][2][3][4].The increasing energy demand and desire for environmental sustainability motivates the technology development for energy recovery from wastes over the past decades [5][6][7].Among the emerged concepts in energy recovery and wastewater treatment, BES have received significant attentions and are extensively studied with different configurations and designs, including microbial fuel cells (MFCs), microbial electrolysis cells (MECs), and microbial desalination cells (MDCs), as well as the combination of up-flow constructed wetland and MFC [2,[8][9][10][11][12][13][14][15][16].The performance of BES is affected by biological, physical-chemical and electrochemical aspects (e.g., microbial community, configurations, electrode material, etc.) [17][18][19].Considerable studies have been conducted to optimize the above aspects [1,7,8,[20][21][22][23][24].However, the BES development for energy recovery from wastewater treatment still has bottlenecks [18,25].For example, BES may not replace anaerobic digesters to treat the high-strength wastewater [26], and the scale-up BES cannot achieve a similar performance in current generation as that of the lab scale systems [18,27,28].How to improve the BES performance is still challenging, and thorough understanding of these significant aspects that limit the overall BES performance is needed for future improvement.
A variety of works have been conducted to study the effect of design and operational parameters on BES performance, such as reactor configurations and scales, electrode materials, electrode surface areas and the types of electron donors [8,[29][30][31][32][33][34][35][36][37].However, because of the lack of sensing technology, some aspects are hard to be measured in situ, such as microbial community composition, biofilm thickness and growth rate, redox mediator transferring, and substrate utilization rate of bacteria attached on the electrode [38][39][40][41][42][43][44].These hard-to-measure aspects are likely to be key components for understanding the mechanisms of electricity generation and improving the efficiency of BES.
Mathematical modeling can be a powerful platform to investigate the effect of above aspects on the overall BES performance [19,23,[45][46][47].The use of mathematical models may provide analytical description of these hard-to-measure aspects, and thus reduce the measurement effort.The versatility of mathematical models enables the conversion of complexly systematic phenomena into a relatively simple series of mathematical expressions to describe the effect of each component on the overall output [19,48].In general, mathematical modeling is performed in two approaches.One approach is to derive model based on engineering/physical laws governing the system processes, and this is called engineering modeling [49,50].In this review, most engineering models are deterministic models, because the output of the model is determined by the parameter values and the initial conditions [51].For example, differential equations (DEs) are used as main mathematical equations in the engineering models due to their strong ability to embody the dynamics of BES [19,[52][53][54][55].The parameter values and initial conditions are set depending on specific situations to operate the model.The main equations are combined with physical, chemical and electrochemical principles to solve specific problems.For example, Monod-type equations are created for describing the growth curve of specific microorganisms in a certain living environment [40,56,57].Nernst-Michaelis-Menten equations are applied to calculate the amount of Nernstian electron transferred to embody bio-anode kinetics [58][59][60], and Nernst-Planck equations are applied to represent the ion diffusion through the interface or membrane of a system [61][62][63][64].The other approach is statistical modeling, which is developed based on the data collection from BES [49,50], and is also valuable for characterizing the system input-output relationships, especially when there is limited engineering-domain knowledge to characterize the complex mechanisms of electricity generation and organic material removal.In addition, the statistical models can better capture the system uncertainty and remedy the error originated from engineering models for better system quantification [49].
To help satisfy the increasing needs for mathematical models to achieve better development of BES for wastewater treatment and energy recovery, this review aims to summarize and discuss different modeling strategies for BES in the categories of engineering models and statistical models.It is significantly different from a recent review paper of mathematical models that focuses on overview of model studies [19].A general modeling methodological tree is presented in Figure 1.The motivations, assumptions, advantages, limitations and future directions of various modeling strategies summarized in this paper can be a reference for scholars to choose a proper modeling strategy according to their objectives.Multi-task Lasso Model; ANN: artificial neural network; GP: genetic programming; SVM: support vector machine; RVM: relevant vector machine).

Overview
BES have multiple processes of physical-chemistry, microbiology and electrochemistry, which are dynamically related to each other.Such dynamic processes are governed by many key parameters, which influence the BES performance.For example, the energy recovery of a MFC is impacted by parameters about substrate concentration, substrate type and system configuration [65].Understanding the relationships among these parameters and their dynamic processes is important to optimize the system for electricity generation and wastewater treatment [19,66,67].Specifically, DEs are the most frequently applied techniques [52][53][54][55].Based on the common mathematical methods used in BES, in the following, DEs are divided into ordinary differential equations (ODEs) and partial differential equations (PDEs) to discuss their motivations, advantages and limitations.

Overview
BES have multiple processes of physical-chemistry, microbiology and electrochemistry, which are dynamically related to each other.Such dynamic processes are governed by many key parameters, which influence the BES performance.For example, the energy recovery of a MFC is impacted by parameters about substrate concentration, substrate type and system configuration [65].Understanding the relationships among these parameters and their dynamic processes is important to optimize the system for electricity generation and wastewater treatment [19,66,67].Specifically, DEs are the most frequently applied techniques [52][53][54][55].Based on the common mathematical methods used in BES, in the following, DEs are divided into ordinary differential equations (ODEs) and partial differential equations (PDEs) to discuss their motivations, advantages and limitations.

Introduction of Ordinary Differential Equations
ODE is a type of DE to describe the dynamic processes, focusing on only one variable.A classical form of ODE is shown in Equation (1) [68]: The objective of ODE is to find the relationships between input x and output y by describing the differential changes of output under change of input [52,68].

Ordinary Differential Equation Stereotypes in Bioelectrochemical Systems
ODE is a convenient and efficient tool to describe the relationships of various variables in BES, such as the impact of growth and decay of microorganisms on the dynamic response of the change of substrate concentration and microbial population on the anode [54,69].ODEs allow fast computation and provide an advantageous platform for real-time process control and optimization of BES [70,71].Some ODE stereotypes can successfully model the dynamics of substrate concentrations and various microbial populations in BES, and these ODE stereotypes are also the foundations for the future ODE development [39,54,72].
ODE stereotypes are constructed based on the principle of mass balance to illustrate the dynamic response of substrate concentration and biomass in the bulk solution in anode chamber [39].Several parameters were considered in the ODE: the rate of substrate consumption, the rate of growth and death of biomass in the bulk solution, and the rate for the electrochemical oxidation of reduced mediator at the anode surface.These considerations have effectively tracked the source and sink for the dynamic change of substrate and biomass in the bulk solution at each moment in time, and a generic form of ODEs was constructed [39].However, the ODE cannot predict the dynamic change of microbial biomass on the biofilm.The equations simplified the microbial community into overall biomass, which cannot specify and distinguish the contribution of each type of functional microbial species (e.g., electricigenic and acetoclastic methanogenic microorganisms) in the anode chamber [39].
More general ODE stereotypes were developed with more focus on the competition and contribution among various types of microorganisms in the anode chamber of BES, including two types of BES (i.e., MFC and MEC) [54,69].They successfully integrated multiple kinetic equations such as Monod equations to take many parameters into account, including substrate consumption rate, biomass growth rate, and even the impact of mediator on the dynamic simulation (Figure 2).The related ODE stereotype equations are listed and explained in the Table S1 in Supplementary Materials.

Applications of Ordinary Differential Equation Stereotypes
The above ODE stereotypes are further developed for other BES, such as MDC, osmotic-microbial fuel cell (OsMFC), pressure-retarded osmosis/microbial electrolysis cell (PRO-MEC), and membrane bioelectrochemical reactors (MBERs).The ODEs applied in these BES are developed These ODEs could simulate the system output affected by glucose supply by constructing two-step degradation using acetate as intermediate substrate [69].Moreover, a multi-population dynamic model involves ODEs looking into the various contributions of electricigenic and acetoclastic methanogenic microorganisms on the system performance, including substrate consumption and current generation [54,69].

Applications of Ordinary Differential Equation Stereotypes
The above ODE stereotypes are further developed for other BES, such as MDC, osmotic-microbial fuel cell (OsMFC), pressure-retarded osmosis/microbial electrolysis cell (PRO-MEC), and membrane bioelectrochemical reactors (MBERs).The ODEs applied in these BES are developed with addition of certain equations on the stereotype models mentioned before, to enrich the modeling functions for other physical and chemical processes (Table S2, Supplementary Materials).For instance, the ODEs were modified with further consideration of salt flux through the ion exchange membrane, to estimate the performance of MDC [73,74] (Figure 3).The upgraded model can predict the salt concentration in the desalination, anode and cathode chambers, and be used to study the impact of different parameters on the system performance to optimize the salt removal efficiency.The model can even predict the dynamic response of the system output by changing the influent concentration of acetate, the external resistance, and the influent salt concentration.The developed ODEs can accurately simulate the quantitative influence of different parameters on MDC performance.The best acetate flow rate, influent salt concentration, and salt solution flow rate were found according to the model output [73], and thus the developed ODEs can be applied as valuable tools for future MDC development and optimization.

Applications of Ordinary Differential Equation Stereotypes
The above ODE stereotypes are further developed for other BES, such as MDC, osmotic-microbial fuel cell (OsMFC), pressure-retarded osmosis/microbial electrolysis cell (PRO-MEC), and membrane bioelectrochemical reactors (MBERs).The ODEs applied in these BES are developed with addition of certain equations on the stereotype models mentioned before, to enrich the modeling functions for other physical and chemical processes (Table S2, Supplementary Materials).For instance, the ODEs were modified with further consideration of salt flux through the ion exchange membrane, to estimate the performance of MDC [73,74] (Figure 3).The upgraded model can predict the salt concentration in the desalination, anode and cathode chambers, and be used to study the impact of different parameters on the system performance to optimize the salt removal efficiency.The model can even predict the dynamic response of the system output by changing the influent concentration of acetate, the external resistance, and the influent salt concentration.The developed ODEs can accurately simulate the quantitative influence of different parameters on MDC performance.The best acetate flow rate, influent salt concentration, and salt solution flow rate were found according to the model output [73], and thus the developed ODEs can be applied as valuable tools for future MDC development and optimization.Forward osmosis (FO) is reported to be competent to help MFCs to simultaneously achieve wastewater treatment, water extraction from the wastewater, and electricity generation together Forward osmosis (FO) is reported to be competent to help MFCs to simultaneously achieve wastewater treatment, water extraction from the wastewater, and electricity generation together [75], but quantitative impact of various parameters on the performance still needs to be investigated to improve the system output.ODEs are further developed to evaluate the application value of FO in some BES applications, including OsMFCs and PRO-MEC [76,77].Additional ODEs related to the salt flux between draw and feed solution were used to predict the performance of an OsMFC, with consideration of water and salt flux through the FO [76].The model construction was proposed to analyze the impact of FO on the organic degradation in the anode and current generation [76].The OsMFC model successfully validates the experimental results that higher catholyte conductivity Energies 2016, 9, 111 6 of 27 can increase the current generation, and reveals that increased water flux through FO can decrease electrical resistance of membrane and generate higher current, which proves the important value of FO membrane in increasing current output of BES.The equations of water flux through FO were integrated into the previously developed ODE stereotype, on purpose of simulating the energy generation from the process of pressure retarding in the unit of PRO, to study the impact of given pressure on the electricity generation of the PRO unit [77].The developed model can obtain the relations among pressure supply, electricity supply to drive MEC and the hydrogen production.The ODEs of PRO-MEC further reveal the impact of time-dependent change of proton concentration (e.g., migration and diffusion of proton to the cathode and further reduction to hydrogen gas) on the reduction potential and the current generation.
BES are combined with membrane bioreactor (MBRs) to form the system called MBERs for outstanding effect of wastewater treatment and nutrient removal (e.g., nitrogen) from the entire system [78].However, quantitative analysis is still needed to improve the MBERs performance, and the ODE stereotypes were developed to simulate the system output of MBERs [79].The model can predict the growth and loss of "cake" layer on the membrane reactor, to indicate the impact of biofouling on the system performance.The contribution of this model is to accurately model three MBERs with various configurations [80].The versatility of the configuration application can cover many important properties in a MFC-MBR system, such as membrane fouling, nitrogen removal, cake formation and detaching based on various cross-flow rates through the membrane.
Substrate diffusion in the biofilm matrix is another important factor on the biological activity attached on the electrode, and understanding of the parameter is significant to improve the activity of microorganisms on the electrode for greater electricity generation.The ODE stereotypes were modified to enrich the mathematical expressions of mass balance and substrate diffusion between the interface of biofilm and bulk solution, to simulate the biological activity inside the biofilm [44,81].The model is able to simulate the performance of reverse MFC, especially the biofilm thickness and acetate production rate in the cathode [81].The conception of finite difference approach was applied to make ODEs workable to simulate the mass diffusion and the impact of pH change on the substrate consumption and biofilm performance attached on the electrode (Figure 4) [44].The updated ODEs become more powerful for analyzing the dynamic response of biofilm accompanied with the pH change on the substrate consumption and current generation.

Advantages and Limitations
ODEs are an easy and convenient method to reflect the macro scale dynamic response against the change of other parameters.ODEs have advantages in terms of low computational load and relatively easy to be adjusted [19,73,80].
However, ODEs also have intrinsic limitations.ODEs can only trace the time-dependent Reproduced with permission from reference [44].

Advantages and Limitations
ODEs are an easy and convenient method to reflect the macro scale dynamic response against the change of other parameters.ODEs have advantages in terms of low computational load and relatively easy to be adjusted [19,73,80].
However, ODEs also have intrinsic limitations.ODEs can only trace the time-dependent performance of one input [82].Thus, when facing problems related to distance such as geometry of the biofilm and the distribution of ion concentration in the ion exchange membrane between anode and cathode, ODEs cannot sufficiently study the problems.Though the finite difference concept was successfully used to divide the biofilm thickness into multiple finite sections for ODE to predict the mass diffusion into or out of each section of the biofilm one by one (Figure 4), the distance-related problems cannot be handled by ODEs in general [44].To successfully apply ODEs to perform BES modeling, multiple assumptions are needed to simplify the problems, and these assumptions are important error sources for ODEs.Most assumptions are relevant to the elimination of spatial factors, because time is already taken into account as one independent variable and spatial factor (e.g., distance) is difficult to be another independent variable.Assumptions related to the above ODE stereotypes and applications are listed and explained with the relevant key kinetic equations in Supplementary Materials (Tables S1-S3).Some important examples of the assumptions are discussed as follows: First, both anode and cathode compartments are treated as continuously stirred tank reactors with the feature of completely mixing [73,83], and biofilm is distributed uniformly attached on the electrode, and biomass retention is described by two phase growth-wash-out model [38,54,84].This assumption may cause the delayed dynamic response of biofilm attached on the electrode against the change of environmental factors, especially the biological components such as biofilms on the electrode and bacteria in the bulk solution.For example, ODEs were successfully applied to describe the impact of catholyte conductivity (adjusted by varying NaCl concentrations) on the current generation and water flux through the membrane of OsMFC; however, when the catholyte was changed from 35 g¨L ´1 to 2 g¨L ´1 NaCl, leading to decreased conductivity and current generation, the anolyte COD changed very slightly, which was overestimated by model expectation for greater anolyte COD consumption [76].Because microorganisms are the dominant community on the anode, the model output of COD change is expected to be strongly correlated to the change of current generation and anodophilic bacteria on anode [54,73,76,77].However, the actual change of COD is much smaller than the modeling results when changing the experimental conditions, possibly because of delayed response of biological factors for the changing conditions in these cases [76,77].
Second, mass transportation through the membrane due to electro-migration or chemical gradient between the anode and the cathode is assumed to be sufficiently fast to ignore the thickness of ion exchange membrane [85,86].This assumption ignores the impact of ion exchange membrane on the ion transportation, and will generate errors for the gain and loss of ions in anode and cathode chambers.For example, Donnan effects on the ion exchange membrane would lead to strong effect of ion adsorption onto the membrane, which makes electrical potential drop through the thickness of ion exchange membrane [87].The accumulated ions on the solution/membrane interface would have repulsion effect on the same charged ions transferring through the membrane from the solution [88,89], which will have extra ion transferring impedance to cause the interruption on the ion transportation.
Other assumptions are also required to construct ODEs.For example, one can assume the ideal mixing to eliminate concentration gradient, because only one independent variable (i.e., time) can be taken into account while other variables (e.g., distance) cannot be considered in ODE models.The intrinsic drawback of ODE needs other mathematical principles to solve by simultaneously considering more independent variables in the model.

Introduction of Partial Differential Equations
Most BES are impacted with many temporal and spatial parameters.Thus, more complex DEs are needed, which should not only consider the temporal factors, but also the spatial factors for predicting the high dimensional problems.PDEs can deal with multiple inputs simultaneously with the general form shown in Equation ( 2) [90]:

Partial Differential Equation Stereotypes in Bioelectrochemical Systems
Some high-dimensional problems, including biofilm thickness, microbial distribution on the electrode, and the ion transportation through the membrane, are all significant to study the working mechanisms of BES performance to enhance knowledge for performance optimization [38,63,91].Previously, assumptions are needed for ODEs to solve the above problems.Upcoming PDEs can help analyze these assumptions that ODEs cannot deal with.Some PDE stereotype equations are also created to reveal the dynamic response of the biofilm, and the mass transportation through the ion exchange membrane in three-dimension (3D) (Table S4, Supplementary Materials) [38,39,63,91].These PDE stereotypes lay a cornerstone for the future applications of PDEs for various objectives.
A stereotype PDE was developed by integrating Monod and Nernst equations together, to simulate the electron production rate and the local potential of the biofilm and to understand the kinetic and electrochemical property of the biofilms on substrate consumption and electrons production [38,92].This PDE has developed a concept of conductivity of biomass biofilm to represent electrical conductivity through 3D pattern of the biofilm and to make spatial prediction for impact of biomass thickness and accumulation of inner biomass on current generation.The PDE successfully overcomes the limitation of ODE to quantitatively indicate the relation between specific detachment rate and density of biomass on the electrode.
Another PDE was assisted with Nernst-Planck equations to simulate the ion transportation through the ion exchange membrane [63].The simulation was performed in the process of diffusion and electro-migration, and the ion transport process is shown in Figure 5.The model is able to show the ion gradient in the ion exchange membrane, which is promising to simulate ion transportation that ODEs cannot deal with.Moreover, it can describe the pH gradient in the ion exchange membrane to help explain the relation of pH gradient between two chambers and ion exchange membrane.A series of general PDEs were further developed on the basis of original ODEs.The PDEs can predict the microbial activity on the biofilm to consume the substrate for growth and electron generation, which considered substrate diffusion and the biofilm geometry together [39,72], and even the effect of pH on the model prediction [91].A series of general PDEs were further developed on the basis of original ODEs.The PDEs can predict the microbial activity on the biofilm to consume the substrate for growth and electron generation, which considered substrate diffusion and the biofilm geometry together [39,72], and even the effect of pH on the model prediction [91].

Applications of Partial Differential Equation Stereotypes
The above stereotype PDEs are further developed and applied for specific problems in BES, such as impact of oxygen diffusion on substrate concentration profile in single-chamber MFCs, quantitative description of the Donnan effect, and computational fluid dynamics (CFD) to predict mass distribution inside the tubular reactor.The relevant equations are listed and explained in Supplementary Materials (Table S5).
Single-chamber MFCs are a system without ion exchange membrane to separate the anode and cathode chamber, because the design can reduce the membrane resistance for higher current generation [34,93].However, the oxygen diffusion from the cathode to the anode will be greater in the absence of the membrane impedance, and thus the activity of the microorganisms will be impacted by the oxygen transportation [34].Investigating the impact of transported oxygen on the system performance is important to evaluate and optimize the system structure.A previously developed stereotype PDE was modified to simulate the impact of substrate concentration, oxygen diffusion, cathode thickness (material: polytetra-fluoroethylen (PTFE)), and biofilm thickness on the voltage output of the entire system [94].It was found that at the condition of high substrate supply, the biofilm thickness had greater impact on voltage output than the PTFE layer [94].
PDE is also applied to investigate the fluid flow pattern in the BES by CFD.CFD can be applied to study the flow dynamics to improve the condition of substrate distribution to increase electricity generation [95].Kim et al. applied CFD to simulate the flow dynamics in twelve MFCs with different internal structures and to analyze the impact of anode surface on the electricity generation; it was found that power generation increased exponentially with increased anode surface, due to greater bio-electrochemical reaction space [96].
The dynamic change of biofilm on the electrode is another important problem, because the biofilm growth and detaching is strongly relevant to substrate degradation and electricity generation [97,98].A PDE is used to simulate the dynamic response of Geobacter of the biofilm and understand how the species react against the change of external operating conditions.The results reveal the importance of substrate concentration and pH gradient on the biofilm growth and electricity generation through the entire biofilm [97,98].
The simulation on the ion transportation through the ion exchange membrane can serve as a monitor to indicate the relations between the activities of specific ions and system output.Ammonium is a significant species deserving to be monitored and analyzed, because the ammonium is considered to be main carrier of protons through proton exchange membrane from anode to cathode, and the accumulation of ammonia in cathode enhances the possibility to recover the ammonia from cathode due to high pH in cathode [99,100].A PDE model on the basis of Nernst-Planck equation is created to successfully describe the current-driving ammonia evaporation and recovery in the cathode, and the model can describe in detail the impact of different factors on the ammonia recovery including flow rate of gas in the cathode and back diffusion of ammonia from cathode to anode, which makes great contribution to improve ammonia treatment and recovery from the wastewater in BES [85].

Advantages and Limitations
The primary advantage of PDEs is the higher ability to solve the spatial problems [82].PDEs can consider the impact of not only temporal factors, but also the spatial factors on the system performance.Thus, PDE can trace the flow dynamics and biomass growth in the angle of high dimensions in BES [39,41,101].However, because PDEs consider the impact of more factors on the BES performance, the trade-off is the increased time required for PDE construction [82].PDEs structure is generally more complex than ODEs, because a larger number of parameters needs identification.The complexity increases the simulation time, particularly in the multidimensional space dependent models [82].
There are still some limitations even if PDEs are used to substitute ODEs in the model structure, and the details of PDE assumptions are presented and explained in Supplementary Materials (Tables S4-S6).For example, it was assumed that organic substrate is converted by reducing redox mediator, and the redox mediator is to complete the electron transferring mechanism in bulk solution and biofilm [39,91].However, in practical application, some operational steps can make deviation between experimental data and model outputs.When voltage drops to a baseline, replacing the medium will remove some mediator from the bulk liquid, leading to abrupt concentration drop of the mediator in the solution.After medium refreshment, the change of the mediator concentration causes decreasing current in the model while experimental data indicates current increase after acetate addition [39].The deviation exists due to certain assumptions in this particular model [91].More advanced alternative models are required to improve the model accuracy and precision.

Sensitivity Analysis of Model Parameters
The parameters in engineering models need to be adjusted to predict the model output accurately and speed up the model construction.Sensitivity analysis can allow us to determine which parameters are the most influential factor on the model output, and select the most relevant parameters to be adjusted in less computational time [102,103].A common method to perform the sensitivity analysis is to vary each parameter while fixing the other parameters, one factor at a time, to check the impact of the parameter on output [83,104].
Another function of sensitivity analysis is to evaluate the parameters' relative importance for the output [73,105,106].The importance comparison among all parameters can quantitatively assist the explanation for the relationships among various factors in the system.For example, model construction consists of three parts: model fitting, validation and prediction [73,107,108].Sensitivity analysis is one part in the process of model fitting, to determine the impact of a specific parameter on the system output [73].In general, relatively more sensitive parameters will take priority to be adjusted and to fit the experimental data under specific situations in model fitting.In the MDC model, one parameter (k r ) has the highest sensitivity, whose physical meaning indicates how fast the internal resistance of the system responds to the change of microorganism concentration [73].In this case, changing flow rate is applied in model fitting to obtain the experimental data and to give guidance for development of mathematical equations, assumption determination and parameter adjustment to make model structure.Because of high sensitivity, k r significantly impacts the model accuracy and precision.High sensitivity of k r is reasonable because the dynamic change of microorganism concentration in a MFC strongly affects the efficiency and availability of electron transferring to the anode.k r considers and represents the overall impacts of substrate diffusion and reaction kinetics on the anode [109][110][111].Determination of highly sensitive parameters can reduce the computational time to make satisfactory model.Once the model fitting and constructing is finished, model validation is applied to verify the model reliability, by changing operating conditions of other aspects (e.g., organic concentration) [73].All parameters must be fixed in the model validations, except for the target parameter being studied in each test, to guarantee the model applicability [73].
Another significant parameter, mediator yield (Y m ), represents the production of intracellular mediator in the anodophilic bacteria [54].The anodophilic bacteria in the electrode biofilm takes the electron from the substrate oxidation, reduces the intracellular mediator to finally transfer electrons to the electrode by direct contact or nanowires [54].Higher Y m means stronger conversion of the oxidized mediators to reduced mediators and higher electron transferring activity accompanied with higher concentration of anodophilic bacteria on the electrode [8,112,113].Thus, determination of Y m in BES modeling is critical to understand and quantify the impact of mediators for the current generation.BES development would require more parameters to make an accurate model to optimize their applications.For example, submersible microbial fuel cell (SMFC) is a system to place anode in the anaerobic sediment, and the anode connects with the air-cathode to complete a closed circuit and generate electricity by degrading organics in the sediment [114].To improve SMFC performance, it is helpful to construct an effective model to simulate the changes of DO and current generation under different conditions [115][116][117].However, the biological aspects, such as biofouling in the environmental waters and competition among different bacteria for COD degradation and nutrient removal, need more parameters to trace their change and activity.Sensitivity analysis is a strong tool to help select the parameters.For example, biofouling formation and detachment strongly impacts electrochemical reaction on a cathode due to bacterial competition for oxygen [18,118].To address this issue, the researchers adopted the increase rate of resistance in their model to describe the difficulty of water filtration through the membrane under biofouling [80,119].Sensitivity analysis can also assist in parameter tuning for accelerated model construction.Other innovated applications in BES also need new parameters to analyze the system performance, such as growth rate of algal biomass for integrated photo-bioelectrochemical system (IPB), and the constant for hydrogen production rate in microbial electrolysis and desalination cells (MEDC) [120][121][122].
In summary, sensitivity analysis is an important method to identify the most influencing parameters on the model performance.It can not only reduce the working load of parameter adjustment, but also reflect the importance of certain parameters on the system output and help understand the working mechanisms and interrelations among immeasurable factors in MFC [73,123].The current sensitivity analysis frequently applied in BES is one-factor-at-a-time method (OFAT).This method has intrinsic drawback because it cannot completely reflect the systematic impact of various parameters together on the performance.Some important and sensitive parameters might be hidden if other parameters are not set in the suitable range, which will limit the sensitive output of these actually important parameters [124].Thus, more efficient and accurate approaches of sensitivity analysis are required to solve complicated problems [124].

Model Validation
Model validation is a significant process after finishing model construction and adjustment.A principle to validate the model is to see if experimental data can match the modeling data [54,73].The model validation is required to enhance the model reliability and practicability.In addition, the model validation can give significant information of what physical, chemical or biological reactions or phenomena are not explained by the model, which will need further model modification.There are several standards to calculate the modeling error, such as mean square errors (MSE), root mean square error (RMSE), and coefficient of determination (R 2 ) [54,63,69,73].
The results of model validation of above engineering models are listed in Supplementary Materials (Table S7) (models without showing validation results are not listed), as a reference to compare the model accuracy.However, it is difficult to set a general standard to determine if the model has fit the experimental data well, because of the difference in BES configurations and operating conditions.Whether a model is acceptable will depend on both validation results and specific case conditions.
A common standard to optimize the parameter for most accurate output is to minimize the root mean square error (RMSE) by adjusting the model structure and parameters to fit the experimental data shown in Equation (3) [54]: Normally, RMSE below 20% is acceptable for the model validation, but acceptable level is difficult to be fixed due to variously actual cases [73,74,76,77,80,83].For example, Yuan et al. [77] constructed an engineering model to successfully predict the dynamic volume profile of feed and draw solution with RMSE smaller than 2.5%; although the RMSE for a MEC model was relatively high (13%-23.6%),the predicted output was still acceptable because the error came from overestimation of organics in the model, which was related to microbial degradation of organics in anode featured with relatively higher unstableness and delayed response of biological factors than other physical attributes.There are also some cases with high RMSE or intrinsic limitations indicating that engineering models are not powerful enough to deal with specific outputs [49,63,76,80].For example, a model was proposed to simulate the performance of a MBER with fluidized-bed system with granular activated carbon (GAC) as media [80].A significant large RMSE of 55.7% was obtained when changing the anodic hydraulic retention time (HRT) to 5 h [80].The huge error may come from the delayed acclimation of electrochemical active bacteria on the electrode, and it is possible that most of enhanced organics are consumed by non-electrochemical active microorganism on the GAC surface instead of electrode [80].However, the error can be utilized to derive more impact of other factors, which is not considered in the original model structure (e.g., organic consumption by non-negligible microorganisms attached on GAC in above case).
With higher demand for model application in BES, more advanced engineering models are needed.Even engineering models themselves cannot satisfy the model demand, and another model system, statistical model, can be a complementing method to help engineering models to increase the prediction ability, which will be discussed in next section [49].

Overview
Statistical models are constructed based on the measured data from the BES system.The statistical models are useful to reveal the input-output relationships and identify the significant parameters for system quantifications, especially when the engineering models are hard to be constructed in a timely manner.The uncertainty can also be better quantified in statistical models than engineering models.On the other hand, the conclusions drawn from the statistical models needed to be validated with the engineering knowledge.A large variety of statistical models, ranging from regression based methods to data mining methods, can be used for the BES modeling.In the following, we will focus on MFCs as representative BES.
The performance of a statistical model depends heavily on the type and quality of the collected data.In general, the data can be collected from controlled experiments, observational studies, and simulations based on engineering models.Most MFC model works were performed based on controlled experiments with studying the effect of parameters on system performance [125][126][127][128][129][130][131][132][133].More details on data collection for controlled experiments will be introduced in Section 3.2.Very limited observational studies were reported [134,135], where the data collection was performed directly during the real production of the system.Simulation data can be generated from the engineering models based on the first principles (see Section 2 for more details).In these simulation studies, the data are generated from the computer program by setting the program inputs and initial and/or boundary conditions.
Depending on the data structure, a variety of model formats can be chosen from.In general, one can select from linear models, nonlinear models and more complex data mining models [136].A discussion on selecting the appropriate type of model is provided in Supplementary Materials (Model Selection and Figure S1).Most models used for MFC modeling are based on linear regression (LR), and these models usually have good interpretability and simple model structures [125][126][127][129][130][131][132][133].In recent years, data mining methods were also introduced for modeling the MFC performance such as artificial neural network (ANN), genetic programming (GP), support vector machine (SVM) and relevance vector machine (RVM) [128,137,138].These data mining methods can yield good prediction performance.However, they usually do not have an analytical form thus are difficult to be interpreted [19].
In the following, the data generation approaches are first briefly reviewed.Then, different statistical models used in the MFC studies are summarized (Table 1).The motivations, advantages and limitations of these models are discussed.Finally, model diagnostic and variable selection, which are the missing part in the literature, are also discussed.ANOVA, RSM FD, CCD Two-chamber MFC The interaction effect was significant [125] ANOVA, RSM FD Two-chamber MFC Two-factor and three-factor interaction effects were less significant [127] MLM FD with covariates Two-chamber MFC Cathode moisture was significant on energy recovery but insignificant on organic removal efficiency [142] ANOVA, RSM CCD Two-chamber MFC Quadratic effects can be significant, but interaction effects were not significant [129] ANOVA, RSM CCD Mediator-less single-chamber MFC Interaction and quadratic effects can be significant [126] ANOVA, RSM CCD Two-chamber MFC All the linear, square and interaction effects were significant [132] ANOVA, RSM CCD Membrane-less MFC Interaction and quadratic effects can be significant [133] ANOVA PBD Submersible MFC Temperature, initial pH, and conductivity were significant [130] ANOVA, RSM PBD Two-chamber MFC Glucose, NaHCO 3 and KCl were significant [131] GP, ANN, SVM [143] Two-chamber MFC GP performed best among the three models [138] ANN Controlled Experiment Membrane-less MFC R 2 value can be as high as 0.99 [144] RVM UD Two-chamber MFC RVM performed better than SVM [128] [130] uses both OFAT and PBD.OFAT: one-factor-at-a-time; FD: factorial design; CCD: central composite design; PBD: Placket-Burman design; UD: uniform design.

Data Generation
MFCs have been studied via controlled experiments intensively [125][126][127][128][129][130][131][132][133]141,145].The controlled experiments are performed in OFAT or design of experiments (DOE) fashion.In OFAT, only one factor is varied while the others are remained constant.The system performance, such as power generation and organic removal, can be measured by varying the factor under study.The effect of the varied factor is analyzed visually or by statistical models [130,[139][140][141][146][147][148][149][150][151].Note that OFAT can only study the effect of a certain parameter on the MFC performance once at a time, resulting in interpretation problems of interactions or joint effects of multiple factors on the MFC performance.Misleading results can be yielded in such circumstances [125,126,133,152].For instance, it was found that the interaction effect of pH and buffer concentration was very significant, while a false conclusion could be made if the interaction effect was ignored [125].In DOE, each factor will be studied with selected levels.The combinations of these levels over different factors consist of the design table.Based on the design table and measured output, the main effects and interaction effects can be studied simultaneously.Here, main effects refer to the effect of factors themselves, and interaction effects characterize the phenomenon that the effect of one factor on output changes over the level of other factor(s).Moreover, Energies 2016, 9, 111 14 of 27 the number of experimental runs required can be significantly reduced in DOE compared with all possible combinations, or random selections of factor levels.The past MFC studies have used factorial design (FD), central composite design (CCD), box-Behnken design (BBD), Placket-Burman design (PBD), and uniform design (UD) [125,127,128,[130][131][132][133]142,153].More general designs can be found in [152].
Figure 6 shows some of the designs for a three-factor problem, where the red points represent the settings of experiment runs and the cube represents the design space.The factors in FD usually take two levels of values, where only linear effect can be estimated (Figure 6a).In CCD and BBD, the factors will take more levels of values and the quadratic effect can be studied without adding a lot of additional experimental runs (Figure 6b,c).PBD uses a limited amount of experimental runs for screening a relatively large amount of factors (Figure 6d).In the screening experiments, the interactions between the factors are usually neglected [154].Finally, UD uniformly assigns experimental points in the domain, and is used when the underlying model structure is unknown.See details in constructing a UD in [155].Note that along with the experiments in DOE, other related but uncontrolled factors can be recorded.These factors are usually called covariates [152].
resulting in interpretation problems of interactions or joint effects of multiple factors on the MFC performance.Misleading results can be yielded in such circumstances [125,126,133,152].For instance, it was found that the interaction effect of pH and buffer concentration was very significant, while a false conclusion could be made if the interaction effect was ignored [125].In DOE, each factor will be studied with selected levels.The combinations of these levels over different factors consist of the design table.Based on the design table and measured output, the main effects and interaction effects can be studied simultaneously.Here, main effects refer to the effect of factors themselves, and interaction effects characterize the phenomenon that the effect of one factor on output changes over the level of other factor(s).Moreover, the number of experimental runs required can be significantly reduced in DOE compared with all possible combinations, or random selections of factor levels.The past MFC studies have used factorial design (FD), central composite design (CCD), box-Behnken design (BBD), Placket-Burman design (PBD), and uniform design (UD) [125,127,128,[130][131][132][133]142,153].More general designs can be found in [152].
Figure 6 shows some of the designs for a three-factor problem, where the red points represent the settings of experiment runs and the cube represents the design space.The factors in FD usually take two levels of values, where only linear effect can be estimated (Figure 6a).In CCD and BBD, the factors will take more levels of values and the quadratic effect can be studied without adding a lot of additional experimental runs (Figure 6b,c).PBD uses a limited amount of experimental runs for screening a relatively large amount of factors (Figure 6d).In the screening experiments, the interactions between the factors are usually neglected [154].Finally, UD uniformly assigns experimental points in the domain, and is used when the underlying model structure is unknown.See details in constructing a UD in [155].Note that along with the experiments in DOE, other related but uncontrolled factors can be recorded.These factors are usually called covariates [152].The observational data, which are usually collected after the system enters the application phase, are rarely reported in the literature [134,135].These studies did not use statistical models for the observation data collected from the field operation.The simulation studies are also widely performed in MFC literature (see Section 2 for details).In such studies, the focus was on the engineering models themselves.We would like to point out that the engineering models can generate useful simulation data, and statistical surrogate models can then be constructed based on the simulation data and real measurement data.These statistical models can be used to enhance the simulation models' prediction performance.
It is worth noting that experimental procedures vary a lot in different studies, even for the same BES configuration.The statistical models generated from one system may not be directly used for another study.Standard experimental and data collection procedures are therefore required for guaranteeing the data uniformity and quality.In BES, available data to be measured are mainly divided into two categories: electrochemistry and wastewater treatment.Cell voltage and electrode The observational data, which are usually collected after the system enters the application phase, are rarely reported in the literature [134,135].These studies did not use statistical models for the observation data collected from the field operation.The simulation studies are also widely performed in MFC literature (see Section 2 for details).In such studies, the focus was on the engineering models themselves.We would like to point out that the engineering models can generate useful simulation data, and statistical surrogate models can then be constructed based on the simulation data and real measurement data.These statistical models can be used to enhance the simulation models' prediction performance.
It is worth noting that experimental procedures vary a lot in different studies, even for the same BES configuration.The statistical models generated from one system may not be directly used for another study.Standard experimental and data collection procedures are therefore required for guaranteeing the data uniformity and quality.In BES, available data to be measured are mainly divided into two categories: electrochemistry and wastewater treatment.Cell voltage and electrode potential are significant data about electricity generation in BES.The collection and reporting procedures of these data have been previously discussed, and voltage meters, multimeter, and potentialstat are common methods to collect the data and observe the electrochemical behavior of the system under specific situation [8].In wastewater treatment, removal efficiency of organic compounds (e.g., acetate, glucose, and COD), and other indexes (e.g., pH and total nitrogen) are significant data to be collected [76,78].High performance liquid chromatography (HPLC) is a common method to qualitatively and quantitatively measure the organic concentration [156][157][158][159].Chemical analysis such as COD kits (Hach Co., Loveland, CO, USA) can be applied to measure COD, ammonium, nitrate and other ion concentration in water [160][161][162].Some calculation data such as Coulombic efficiency (CE), power density, and energy efficiency are also needed to report the efficiency of electricity generation and assess the BES performance [8,163].

Linear Regression
To model the MFC, a LR, with the form shown in Equation ( 4) can be used.The LR uses an additive form of inputs and an error term for the prediction of the output shown in Equation ( 4) [164]: where y i is the ith observation of the output such as power density, x i,j is the ith observation of the jth input such as substrate concentration, β 0 is the intercept term, β j is the model parameter for the jth input, and ε i is the error term for the ith observation.The basic assumptions for any LR are that the input-output relationships can be represented in a linear form, and the error terms ε i , i " 1, ¨¨¨, n are normal, identically and independently distributed with mean zero and variance σ 2 .These model assumptions need to be checked via model diagnostic after the estimation steps.However, the model diagnostic process is rarely reported in the MFC literature, which should be emphasized.
The data generated from OFAT can be studied via simple linear regression (SLR), where only one factor is used as input.For instance, the relationship between current density and microorganisms concentration was investigated [130].It was found that the current density increased with the increase of active microorganism concentration when the active microorganism concentration was below 6.52 nmol-ATP/L (ATP: adenosine-triphosphate), and the system saturated at higher active microorganism concentrations.In another example, the association of current generation and bulk phase acetate concentration was found to be positive association at acetate concentration value less than 2.3 mM [139].SLR was also used for studying the species that were responsible for the transport of positive charge [141], and the effect of organic loading rate, flow rate and pH on current density, voltage and volumetric power [140].
The analysis of variance (ANOVA) and response surface methods (RSM) are also based on LR analysis, which are widely used for the analysis of DOE data of MFCs.Taking a system with two factors A and B, with a and b levels respectively for example, a typical ANOVA has the form shown in Equation ( 5): y i,j,k " µ `αi `βj `pαβq i,j `εi,j,k , i " 1, ¨¨¨, a, j " 1, ¨¨¨, b, k " 1, ¨¨¨, n where y i,j,k is the kth observation at the level i and j for A and B respectively, µ is the overall mean, α i and β j are main effects of A and B, pαβq i,j is pair-wise interaction effect between A and B, and ε i,j,k " N `0, σ 2 ˘is the error term.ANOVA can be used to identify significant effects.RSM uses a polynomial model, with the form in Equation ( 4), to approximate the input-output relationships.RSM can help determine the optimal settings for a response at the specified design region.The effects of pH and catholyte buffer concentration on power output, columbic efficiency, COD change and COD removal efficiency were studied for a two-chamber MFC via ANOVA and RSM based on FD and CCD [125].It was reported that the increase in buffer concentration would lead to an increase in power generation at a high pH level; while the opposite impact was observed at low pH level.In another word, the interaction between pH and catholyte buffer concentration was significant, which cannot be discovered by the OFAT approach [125].The factor setting for maximal power generation was obtained and validated by physical experiments [125].In a sediment MFC, the effects of pH, distance between electrodes and external resistance on organic removal, nitrogen removal and power generation were studied via ANOVA and RSM [127].It was reported that organic removal, nitrogen removal and power density were optimized if controlling pH in 7. 6-8.5, distance between electrode in 90-100 cm and external resistance in 0-52 ohm.The authors also concluded that two-factor and three-factor interactions were less significant, and might be negligible in the final identified mode.The ANOVA or RSM were used in several other works studying the effects of parameters on MFC performance (as summarized in Table 1) [126,[129][130][131][132][133]153].
It is worth mentioning that the MFC systems are becoming more complex with successively added functions.New factors are of interest and the number of factors increases quickly for the MFC stack.For instance, new parameters such as ammonia and sulfate were detected via bipolar bioelectrodialysis, and a MFC with multiple modules was studied [116,142].Under such circumstances, variable selection is needed.Recently, a multi-task Lasso model (MLM) was proposed for modeling the multi-module MFC system, and showed its great modeling power (Figure 7) [142].
Energies 2016, 9, 111 the effects of pH, distance between electrodes and external resistance on organic removal, nitrogen removal and power generation were studied via ANOVA and RSM [127].It was reported that organic removal, nitrogen removal and power density were optimized if controlling pH in 7.6-8.5,distance between electrode in 90-100 cm and external resistance in 0-52 ohm.The authors also concluded that two-factor and three-factor interactions were less significant, and might be negligible in the final identified mode.The ANOVA or RSM were used in several other works studying the effects of parameters on MFC performance (as summarized in Table 1) [126,[129][130][131][132][133]153].
It is worth mentioning that the MFC systems are becoming more complex with successively added functions.New factors are of interest and the number of factors increases quickly for the MFC stack.For instance, new parameters such as ammonia and sulfate were detected via bipolar bioelectrodialysis, and a MFC with multiple modules was studied [116,142].Under such circumstances, variable selection is needed.Recently, a multi-task Lasso model (MLM) was proposed for modeling the multi-module MFC system, and showed its great modeling power (Figure 7) [142].

Data Mining Methods
Some data mining methods were applied for MFC modeling, though there were no clearly stated reasons on why these data mining methods were preferred over regression models.In general, these data mining methods can have good prediction performance, but usually do not have good engineering interpretations [19].These data mining methods included ANN, GP, SVM and RVM.
ANN is inspired by biological neural networks and uses a number of connected "neurons" for the modeling of input-output relationships.The "neurons" can be characterized by different layers, as shown in Figure 8, where the hidden layers are composed of some function of input neurons [144].More details can be found [165].

Data Mining Methods
Some data mining methods were applied for MFC modeling, though there were no clearly stated reasons on why these data mining methods were preferred over regression models.In general, these data mining methods can have good prediction performance, but usually do not have good engineering interpretations [19].These data mining methods included ANN, GP, SVM and RVM.
ANN is inspired by biological neural networks and uses a number of connected "neurons" for the modeling of input-output relationships.The "neurons" can be characterized by different layers, as shown in Figure 8, where the hidden layers are composed of some function of input neurons [144].More details can be found [165].GP mimics the evolution process, and uses crossover and mutation operations for learning [138].In GP, the solutions to a problem are represented by tree structures, and an initial structure is generated randomly.The optimal structure is learned via heuristic searching over all structures created by crossover and mutation operations until some termination criterion is met.For more details, please refer to [138].GP mimics the evolution process, and uses crossover and mutation operations for learning [138].In GP, the solutions to a problem are represented by tree structures, and an initial structure is generated randomly.The optimal structure is learned via heuristic searching over all structures created by crossover and mutation operations until some termination criterion is met.For more details, please refer to [138].
SVM and RVM are also effective methods for predicting MFC performance.SVM maps the finite-dimensional input space to infinite-dimensional space via kernel functions, and performs the regression in the infinite-dimensional space.The learning in infinite-dimensional space is usually easier than in finite-dimensional input space [136].RVM has very similar formulation as SVM, except that the RVM solves the problem in Bayesian framework [128].The RVM can provide probabilistic classification, and typically use fewer kernel functions than SVM.RVM usually uses expectation maximization algorithm for learning, and may fall into local minima [166].On the other hand, SVM uses sequential minimal optimization and is guaranteed to have a global optimum [167].The effects of temperature and ferrous sulfate concentrations on the voltage were studied via GP, ANN and SVM [138].The researchers adopted the data for a two-chambered MFC, and the model inputs included temperature or ferrous sulfate concentrations and time.They concluded that GP performed best among the three models based on R 2 values.However, their models contained complex transformations such as tan, tanh, sin, log, etc., which are hard to be interpreted.At the same time, R 2 values can be used to describe the model fitting performance, but not the prediction performance [164].In another example, the effects of temperature, pH and electron acceptor concentration on power density were studied via ANN for a membrane-less MFC [144].It was reported that the R 2 value could be as high as 0.99; however, the specific variable effects and the analytical model structure were not provided.A RVM was adopted for investigating the effects of ionic concentration, pH, medium nitrogen concentration and temperature on Coulombic efficiency and power density [128].The R 2 values were larger than 0.99, and the optimized factor setting was obtained via genetic algorithm (GA), and validated by validation experiments [128].

Model Diagnostic
The model assumptions must be checked before applications.If the model assumptions are severely violated, the model inference and interpretation may have some problem [164].However, the model diagnostic was rarely reported in MFC literature.In the following, the model diagnostic for LR will be briefly reviewed as an example.
Model diagnostic for LR is based on residuals, which are calculated as the difference between the observed value and fitted value.The common model diagnostic for LR includes: (1) whether the input-output relationship is linear; (2) whether the error terms have homogeneous variance ("identically distributed"); (3) whether the error terms are independent; (4) whether the error terms are normally distributed; and (5) whether there are outliers [164].The model diagnostic methods include graphical methods and statistical tests.The graphic methods are more vivid and interactive, but the judgment made is subjective.On the other hand, statistical tests provide quantitative tools to evaluate the model assumptions [164].If a certain model assumption is violated according to the model diagnostic in LR, several approaches can be used to relief the problem.For instance, the weighted least square can be used if the variance is heterogeneous.The nonlinear regression can be used if the linear model structure does not work well.Box-cox transformation can be used if the normal assumption is violated [164].If the above remedy approaches do not work, one can use data mining methods as described in Section 3.4.

Variable Selection
Many model candidates are usually available within a given modeling framework.Given the model candidates, a certain evaluation criterion is needed to select the best candidates.These criteria include: R 2 , adjusted R 2 , Mallow's C p , Akaike information criterion (AIC), Bayesian information criterion (BIC), RMSE of cross validation (CV), etc. [136,164].R 2 is usually used in the MFC study to characterize how well the model fits, and p-value is used to judge the significance of variables [125][126][127]129,130,132,133].As discussed before, R 2 may not be a good criterion to characterize the prediction performance.Moreover, the p-value is subjective and there is no consensus on how to select a threshold on variable significance [168].More systematic variable selection methods are required.
The variable selection methods, such as subset selection, stepwise selection and model regularization, can be applied for selecting the best model.Subset selection considers all possible candidate models and select the best based on certain criterion.Such combinatorial analysis is infeasible for high dimensional problems.Stepwise selection starts from a null (empty) model or a full model (model with all the inputs), and systematically add or drop variables [169].The stepwise selection is computationally inefficient for high dimensional problems, where regularized regression can be used.A regularized regression model is estimated by solving the objective function shown in Equation (6): where x i " `1, x i,1 , ¨¨¨, x i,p ˘T, β " ´β0 , β 1 , ¨¨¨, β p ¯T, and Ω p¨q is a measure of the model complexity which can take the form such as |β 1 | `¨¨¨`ˇˇβ p ˇˇ.The x i,j , and β j are defined as before.

˘2
is a loss function to characterize the model prediction performance.λ is a tuning parameter that controls the balance between how well the model fits and how complex the model is.A family of models will be yielded by varying the tuning parameter λ.The best model candidate with the smallest MSE can be selected from a regularization path [170].

Perspective
The BES modeling may be further improved in the aspects of variable selection, modeling with functional data, data fusion and hybrid models.With the development of more BES applications, demand for BES modeling is enhanced.

Variable Selection
Many parameters can potentially have strong correlation with the MFC performance.The number of parameters of interest could increase rapidly for a MFC system consisting of multiple modules [171].However, due to the experimental cost, limited samples can be collected.The number of parameters of interest could be certainly larger than the number of samples.In this case, ANOVA and RSM cannot be directly applied.Variable selection method can be applied for investigating the important parameters for such systems when there are a lot of parameters but limited samples.Currently, there is very limited application of variable selection in MFC studies.In a recent work, a multi-task Lasso was adopted for identifying the important parameters for system performance characterization in a serially connected five-module MFC [142].More works on systematic variable selection are required for understanding, monitoring and control the MFCs.

Modeling with Functional Data
The MFCs are essentially complex nonlinear dynamic systems whose performance would evolve over time [172].The evolving history of key parameters such as temperature, pH and moisture are important for the system characterization.These measurements over time are called functional data.Generally, the dynamics of the system could be characterized by ODEs or PDEs, and validated with sampled points during operation over time.But these ODEs or PDEs are time-consuming to be constructed, and may not be accurate in the prediction.Given the measurements over time, some functional data analysis techniques could be adopted so that the important features of the system can be identified.For instance, the varying-coefficient model can be adopted for modeling a dynamic system [173].

Data Fusion
Because the MFCs, though different configurations, could contain many similar features [8], ensemble modeling strategy could integrate knowledge from an existing configuration to new systems development [174].In this way, fewer samples will be needed for new system modeling.There will also be various differences between the field condition and lab testing condition.Therefore, the model learned from the lab measurement data may not be accurate for applications for field production.Under such a situation, ensemble modeling strategy can be used to tune the model so that it fits the need for field production [174].The basic idea is to adopt the existing knowledge on the model format and significance of variables to the newly developed systems, and most details can be found in a previous publication [174].

Hybrid Models
Hybrid models integrate the engineering models and statistical models systematically.One example is to keep the engineering models as the mean part, but add statistical adjustments to the final model to account for the errors incurred by simplification assumptions in the engineering models.It can therefore take advantage of the strength of each type of models for system characterization, and will yield better modeling performance.Such an approach has been used for manufacturing systems and show their advantages [49,50], and may also be applied for MFC studies.

Conclusions
Mathematical modeling is a powerful tool to help understand and guide BES development for energy recovery from waste and wastewater.Both engineering models and statistical models have their merits and drawbacks.Engineering models are developed based on the biochemical mechanisms to reflect the relationships among versatile variables in MFC systems, which require knowledge in the field of physical-chemistry, biology and electrochemistry.Engineering models have advantage in quantitative analysis and explanation of the impact of certain variables that are hard to be measured.On the other hand, the statistical models have higher flexibility in the model format and are capable of dealing with the system uncertainty.The statistical models are driven by the measurement data, and easier to develop for an unknown complex system.However, the statistical models need to be interpreted by domain expert before applications.

Figure 2 .
Figure 2. Conceptual MFC model showing carbon source conversion and biofilm activity.Reproduced with permission from reference [54].

Figure 2 .
Figure 2. Conceptual MFC model showing carbon source conversion and biofilm activity.Reproduced with permission from reference [54].

Figure 2 .
Figure 2. Conceptual MFC model showing carbon source conversion and biofilm activity.Reproduced with permission from reference [54].

Figure 3 .
Figure 3. System configuration and model output of a MDC.Reproduced with permission from reference [73].

Figure 3 .
Figure 3. System configuration and model output of a MDC.Reproduced with permission from reference [73].

Figure 4 .
Figure 4. Proton concentration profile and concentration gradient within the biofilm.Reproduced with permission from reference [44].

Figure 4 .
Figure 4. Proton concentration profile and concentration gradient within the biofilm.Reproduced with permission from reference [44].

Figure 7 .
Figure 7. MLM for a two chamber multi-module MFC.Reproduced with permission from reference [142].

Figure 7 .
Figure 7. MLM for a two chamber multi-module MFC.Reproduced with permission from reference [142].

Table 1 .
Summary of statistical models.