Modeling Biofilms: from Genes to Communities

Biofilms are spatially-structured communities of different microbes, which have a huge impact on both ecosystems and human life. Mathematical models are powerful tools for understanding the function and evolution of biofilms as diverse communities. In this article, we give a review of some recently-developed models focusing on the interactions of different species within a biofilm, the evolution of biofilm due to genetic and environmental causes and factors that affect the structure of a biofilm.


Introduction
Despite the common view of microbes in their free state, pure culture planktonic growth is rarely how microbes exist in nature.Instead, most microbial species in nature live in the form of biofilms, which are described as a multicellular consortium of microbial cells that are attached to a surface and encased in a self-secreted, extracellular polymeric matrix [1,2].Biofilms are usually heterogeneous in both their spatial structures and component species and interact with the surrounding environment in a complicated way.
Microbial biofilms are ubiquitous in both natural and industrial settings, and bacteria living in a biofilm often behave very differently from their planktonic counterpart.Bacteria inside a biofilm are usually more resistant to antimicrobial agents [3] and usually possess big competitive advantage over bacteria growing in suspension.This means that it is often difficult to remove biofilms efficiently.
Biofilms can cause many severe problems, such as chronic infections, food contamination and equipment damage due to bio-fouling.Biofilms can also be used for good and constructive purposes, such as waste water treatment, heavy metal removal from hazardous waste sites, biofuel production and microbial fuel cells.From a neutral point of view, since much of the microbial biomass appears in the form of biofilm and due to their ability to produce and consume organic materials, the biofilm communities also have a big impact on the global ecosystem and geochemical system.
In order to promote good biofilms and prevent bad biofilms, it is important to understand the mechanisms for biofilm formation, growth and its removal.The development of a biofilm is a complicated process affected by many biological, physical and chemical factors, and understanding it requires both experimental and modeling efforts.Experiments provide directly-measured qualitative or quantitative data of biofilm properties that are of interest, such as cell counts, cell viability, biofilm morphology and EPS structure, nutrient profile, as well as genetic information.A mathematical model translates the conceptual understanding of the biofilm system into mathematical terms, usually by combining the important processes involved, but omitting the less important ones, and the solutions (either analytical or numerical) are obtained by using available mathematical or statistical tools.Since a model can connect different processes and assess their relative importance, modeling results can help us to understand the biofilm system, facilitate experimental design and make predictions that can be tested by experiments.In this sense, progresses made in experimental and modeling research always promote the development of each other.
Mathematical modeling of biofilm started in the 1970s with models studying substrate utilization and mass transport in a homogeneous slab of biofilm [4,5].In the 1980s, models including multispecies and the non-uniform distribution of different biomass types started to emerge [6,7], but they were still primarily for one space dimension and steady-state growth dynamics.Starting in the 1990s and up to today, aided by the fast advancement in computing power and better understanding of biofilms through experimental data, multidimensional, multispecies, multisubstrate models are being developed to incorporate realistic biofilm morphology, biofilm mechanics, interactions between biofilm and the environment and interactions between different species within a biofilm, as well as various time scales involved in biofilm-related processes [8][9][10][11][12].
There is a rich literature on the review of mathematical models for biofilms.A few representative ones are listed below.
The IWAtask group gives an excellent review of mathematical modeling of biofilms [13].The book explains the basic steps in creating a mathematical model, emphasizes that the "golden rule" of modeling is that "a model should be as simple as possible, and only as complex as needed" and presents the model derivation based on mass conservation in detail.Models are classified into analytical models (A), Pseudo-Analytic models (PA), Numerical one-dimensional (N1) and multi-dimensional Numerical models (N2 and N3).The features, definitions and equations, as well as the application of each type of model are discussed.The performances of all models for solving three characteristic benchmark problems are compared, which help identify the trade-offs inherent to using different types of models.The book also points out the significance of the definitions and units of model parameters.
Klapper and Dockery [14] discuss how macroscale physical factors might influence the composition, structure and function of ecosystems within microbial communities from the modeling perspective and emphasizes that despite its difficulty and complexity, it is important to include the physical, chemical and biological processes at a variety of time and length scales in the model to fully understand the physiology and ecology of the microbial communities.Specific modeling aspects discussed include quorum sensing, growth, mechanics and antimicrobial tolerance mechanisms.
Wang and Zhang [15] give a chronological review of some biofilm models developed from the 1980s to the early 2000s.Based on their dimensionality, the way in which diffusion is treated and the complexity in terms of the incorporation of the physics, chemistry and biological effects, models are classified into four main categories: one-dimensional continuum models, diffusion-limited aggregate models, continuum-discrete diffusion models and biofilm-fluid coupled models.
In recent years, fast advancement in experimental technologies, such as microscopy and high-throughput sequencing, has provided an abundance of data at both the genetic and community level and helped researchers to understand biofilms much better.In particular, it is a common belief that biofilms should be viewed as spatially-structured communities of microbes, and the structure and function of the communities are determined by both the surrounding environment and the local interactions between different species within biofilms via complex metabolic networks [16][17][18][19][20].
To understand biofilms as a diverse community and the evolution of different species and its genetic causes, mathematical modeling again serves as a powerful tool.Experimental data with a high level of detail provide both opportunities and challenges for researchers working on biofilm modeling.On the one hand, there are more data available to improve conceptual understanding of biofilm and to compare with model predictions; on the other hand, more sophisticated models are demanded in order to accommodate the data.Song et al. [21] give a methodological review on mathematical modeling of microbial community dynamics.Widder et al. [22] address the challenges in building predictive models for understanding the function and dynamics of Microbial Communities (MCs).
Several specific examples where model-experiment integration has already resulted in important insights into MC function and structure are discussed.These include inferring species interactions from proximal data, predicting species interactions using stoichiometric models and kinetic models for community dynamics.The conclusion is that addressing this challenge requires close coordination of experimental data collection and method development with mathematical model building.
In this article, we review some recent mathematical models that focus on studying biofilms as a diverse community.The rest of the article is organized as follows.Section 2 discusses the genetic basis for biofilm development based on experimental results and its mathematical modeling with emphasis on the models based on the idea of Quorum Sensing (QS); Section 3 discusses models based on Flux Balance Analysis (FBA) and stoichiometry; Section 4 discusses models based on statistical inference; Section 5 discusses models with novel growth kinetics and the ability to resolve the complex spatial structure of biofilm.Table 1 gives a brief overview of the models discussed in the article.

Modeling the Genetic Basis of Biofilm Development
Costerton et al. [3] pointed out that biofilms consist of microcolonies on a surface and that within these microcolonies, the bacteria have developed into organized communities with functional heterogeneity.Clinical characteristics of biofilm infections are discussed, and multiple mechanisms of biofilm resistance to antimicrobial agents are proposed.Furthermore, P. aeruginosa and the chronic lung infections it causes in most patients afflicted with the recessive genetic disease Cystic Fibrosis (CF) are used as a model to reveal information about the molecular and genetic basis of biofilm development.There is evidence [23,24] showing that during the attachment phase of biofilm development, the transcription of specific genes (such as the genes required for the synthesis of the extracellular polysaccharide) is activated.Research on quorum sensing in Gram-negative bacteria [25,26] has shown that acyl homoserine lactone signals are produced by individual bacterial cells.At a critical cell density, these signals can accumulate and trigger the expression of specific sets of genes.Detachment and dispersal of planktonic cells from biofilms could also have a genetic basis.It has been suggested that increased expression of the alginate lyase in the mucoid strain of P. aeruginosa led to alginate degradation and increased cell detachment [27,28].Antibiotic therapy in patients colonized with P. aeruginosa often gives a measure of relief from symptoms, but fails to cure the basic ongoing infection [29,30].One interpretation of this is that the antibiotics act on the planktonic cells that are shed by the biofilms, but cannot eliminate the antibiotic-resistant sessile biofilm communities, and the microcolonies of sessile bacteria in the lung act as niduses for the spread of the infection [31,32].The conclusion from [3] is that the effective control of biofilm infections will require a concerted effort to develop therapeutic agents that target the biofilm phenotype and community signaling-based agents that prevent the formation, or promote the detachment, of biofilms.
Monds and O'Toole [33] give a critical review of the causal basis of biofilm formation and its molecular underpinnings.It discusses the concept of biofilm formation as a developmental process by evaluating experimental data and concludes that the developmental model of biofilm formation must be approached as a model in need of further validation, rather than coveted as a robust platform on which to base scientific inference.Here, the definition of developmental process is a "series of stable and meta-stable changes in the form and function of a cell, where those changes are part of the normal life cycle of the cell" according to [34].Furthermore, the molecular requirement implicit in a developmental process is that a series of hierarchically-ordered genetic elements control temporal transition through the developmental pathway in response to specific cues.The review starts with the origin of the developmental model of biofilm formation as an analogy with Myxococcus xanthus fruiting-body formation [35], then provides some unequivocal support for the developmental model, including both structural transitions occurring during biofilm formation [36] and various phenotypes with biofilm-specific properties, such as increased antibiotic tolerance [37].After that, two specific case studies are presented as evidence that bacteria have evolved genetic pathways that serve to directly link environmental cues to the regulation of stage-specific transitions in biofilm formation.One case is that low extracellular phosphate blocks microcolony formation by Pseudomonas fluorescens [38,39]; the other case is intracellular iron as a signal for biofilm maturation [40,41].However, there is still lack of success in uncovering comprehensive genetic programs specific for the regulation of biofilm development [42,43].Moreover, the development model also requires that groups of biofilm pathways are connected in a hierarchically-ordered genetic network, and there are causal links between form and function.Folkesson et al. [44] provide valuable evidence in this direction by showing that Escherichia coli biofilms formed by F-plasmid-containing cells were more structured and had increased tolerance to colistin relative to cells without the F-plasmids, but it is still not possible to say that pathways controlling the structural development of an E. coli biofilm are directly coupled with pathways for the formation of a subpopulation of cells with increased tolerance to colistin.Next, two examples are used to demonstrate biofilms as multicellular organisms with functional differentiation and alternative cell fates.The first one by Klausen et al. [45] investigated differential roles for motile and non-motile subpopulations in determining the topology (mushroom-like or flat) of P. aeruginosa biofilms.Burrows [46] gave a thorough review on the twitching motility of P. aeruginosa through Type IV pili (T4P) structure and function.The second one [47] examined the spatiotemporal patterns of cell-specific gene expression in B. subtilis biofilms and indicated that different cell types vary in abundance and location in the biofilm over time.Finally, an alternative experimental model for biofilm formation based on the ecological adaptation of individuals was proposed by Klausen et al. [48].In this experiment, deterministic responses are integrated with stochastic interactions with the environment to shape biofilm form and function, where biofilm evolution has been driven by the selection for individual competitiveness in complex and dynamic environments.
Many models have been proposed to describe the genetic processes that regulate biofilm development, and a few representative ones are discussed below.
Reed et al. [49] proposed the gene-centric approach for integrating environmental genomics and biogeochemical models.In this model, the production rate or j-th gene is given by: where Γ j is gene abundance (genes per unit volume), F T is the thermodynamic potential factor accounting for the chemical energy available to drive the metabolism, µ j is the specific growth rate, C s is the concentration of a reactant or nutrient s, K s is the half-saturation constant of the reactant or nutrient s, C x is the concentration of inhibitor x and K x is the half-saturation constant of inhibitor x.Furthermore, metabolic plasticity, whereby growth via one metabolism can lead to the propagation of functional genes associated with other metabolisms, is incorporated into the model by introducing the following governing equation for the gene abundance: where n i is the number of the i-th gene per unit mass of cells that contains this gene, σ ij is a probabilistic measure of the co-occurrence of genes i and j within a genome and λ is the mortality rate constant of a gene.Equations ( 1)-( 2) describing the microbial community are coupled to chemical dynamics by usual reaction equations.There are several advantages of this model: most of its parameters are either directly measurable (µ j , K s ) or easily obtained by calculation (F T ); numerical solutions from the model give gene abundances and chemical concentrations that allow direct comparisons between model predictions and experimental results; and the metabolic plasticity could be important for understanding the complex microbial community dynamics.Zhang et al. [50] developed a theory for the analysis and prediction of the spatial and temporal patterns of gene and protein expression within microbial biofilms based on similar ideas.The theory integrates the phenomena of solute reaction and diffusion, microbial growth, mRNA or protein synthesis, biomass advection and gene transcript or protein turnover.Case studies illustrate the capacity of the theory to simulate heterogeneous spatial patterns and predict microbial activities in biofilms that are qualitatively different from those of planktonic cells.Genetic changes via horizontal gene transfer often make taxonomic distinction among species obscure; thus, sometimes, it is convenient to characterize the dynamics of microbial communities by different traits.Trait-based models were developed based on this idea and applied for analyzing the diversity of ecosystems.Shipley et al. [51] developed the maximum entropy (MaxEnt) model, and Laughlin et al. [52] developed the the Traitspace model; both have been applied to predict the relative abundance of species for plant communities.Both approaches are based on statistical methods and are composed of three key elements: an underlying trait distribution, a performance filter defining the fitness of traits in different environments and a projection of the performance filter along some environmental gradient.The objective of the modeling is to estimate the relative abundance of the some species in a given environment by incorporating information about individual-level functional traits.The MaxEnt model tends to overestimate the relative abundance of species since it maximizes the evenness of their distribution.On the other hand, the Traitspace model tends to underestimate the relative abundance of species since it is based on Bayesian theory and predicts a low probability of abundances for functional groups that do not pass through environmental filters.Though the trait-based models have not be widely adopted for modeling of the biofilm yet, it is certainly promising to apply the methodology to study biofilm as a microbial community.For example, Lennon et al. [53] have found that certain traits are related to the biofilm-producing capability of strains and identified functional groups of microorganisms that will help predict the structure and functioning of microbial communities under contrasting soil moisture regimes.Furthermore, current biofilm models can provide accurate predictions of the environmental gradients inside the biofilm, which can be used as the input of the trait-based models.
Quorum sensing is the regulation of gene expression in response to fluctuations in cell-population density [54][55][56].Quorum sensing bacteria produce and release chemical signal molecules called autoinducers, and the concentrations of autoinducers increase as the cell density increases.Once the concentration of an autoinducer reaches a threshold, an alteration in gene expression is triggered.Recent research on many different bacterial species has shown that quorum sensing systems play an important role in regulating the expression of genes involved in biofilm formation, biofilm maturation, biofilm dispersal and detachment [57][58][59].Naturally, modeling of QS is an important part of the general effort in modeling the genetic processes involved in biofilm development.Ward [60] gives a good review of early mathematical modeling of QS.
The modeling of QS starts with a circuit describing the gene regulation involved in the QS system, which is usually given by a schematic diagram showing all of the genes, autoinducers and the corresponding positive and negative interactions.For example, the las and rhl systems in P. aeruginosa [61] are extensively studied.James et al. [62] and Dockery and Keener [63] pioneered the work on modeling QS at the molecular level.These early QS models use a system of coupled Ordinary Differential Equations (ODEs) to describe the dynamics of the intracellular concentrations of genes (or proteins), autoinducers and substrates, where the reaction kinetics are carefully designed to reflect the interactions within the QS system.Due to their simple forms, these models can be investigated by both numerical and analytical tools, and results suggest that QS works as a biochemical switch between two stable steady states of the system, one with low levels of autoinducer and one with high levels of autoinducer.
The signal production in QS in biofilm can be affected by many physical, chemical and biological factors [64].Examples include diffusion of nutrients and QS molecules inside the biofilm and mass transfer affected by the hydrodynamics of the bulk fluid and biofilm structure.For example, Kirisits et al. [65] studied the influence of the hydrodynamic environment on QS in a P. aeruginosa biofilm and concluded that the amount of biofilm biomass required for full QS induction of the population increases as the flow rate increases.
These more advanced QS models use either the continuum approach with Partial Differential Equations (PDEs) or the individual-based approach, both capable of capturing the spatial structure of the biofilm and its interaction with the surrounding environment, to study the effect of QS on either the biofilm structure or interactions among species within a biofilm.A continuum model involving QS will be discussed in Section 5, and here, we describe the work of Nadell et al. [66], which implemented detailed simulations using individual-based modeling methods [67][68][69] to investigate evolutionary competitions between strains that differ in their polymer production and quorum-sensing phenotypes.It is known that EPS secretion in the process of biofilm formation is under quorum-sensing control in a number of bacterial model systems in very different ways.For example, P. aeruginosa activates EPS production at high cell density [70].In contrast, V. cholerae initiates EPS secretion after attaching to a surface and losing flagellar activity, but halts EPS secretion once it reaches its high cell density quorum-sensing threshold [71].The model presented in [66] focuses on three strains with the following behavior: (1) no polymer secretion and no quorum sensing (EPS − ); (2) constitutive polymer secretion and no quorum sensing (EPS + ); and (3) polymer secretion under negative quorum-sensing control, such that EPS secretion stops at high cell density (QS + ).Cells consume substrate according to their strain-specific metabolism kinetics and produce additional biomass; all cells secrete an autoinducer without cost and at a constant rate, and QS + cells synthesize EPS only when the local autoinducer concentration is below the quorum-sensing threshold concentration, which is represented by a single dimensionless parameter.Results from Nadell et al. [66] suggest that QS + cells have a competitive advantage over EPS + , but only for a limited time window.In contrast, QS + cells suffer an initial disadvantage due to a lower growth rate when competing with EPS − cells, then rapidly ascend to a majority in the biofilm and remain there indefinitely.In addition, the QS + strain can invade populations composed mostly of either EPS + or EPS − cells, but not vice versa.The importance of the work in [66] is that it provides an evolutionary model that can be used to make predictions on the evolution of specific biological outcomes based on the biological constraints, and these predictions can be tested by experiments.In particular, it predicts that pathogenic strains, such as V. cholerae, selected for rapid colonization of, and efficient dispersal from, human hosts or other temporary environments, will exhibit negative quorum-sensing-regulated EPS production.In contrast, upregulation of EPS secretion at high cell density, which focuses resource investment into sustained local competitive ability, is more likely to be favored for organisms occupying specific niches long term, such as P. aeruginosa in chronic infections.
Mathematical tools used for modeling genetic processes related to biofilm include differential equations, statistical methods and individual-based approaches.These models usually enjoy success, but face challenges at the same time.For example, it is challenging to apply the gene-centric model to complex ecological systems since it is not easy to obtain associations between functional genes and reactions.Among many models for studying QS in biofilm systems, most of them focus on upregulation and downregulation of certain genes, and only a few emphasize the effect of QS on biofilm structure, function and its interaction with the environment, which leaves much room for model improvement.

Models Based on Flux Balance Analysis and Stoichiometry
Interactions of different species within a biofilm are closely related to the substrate consumption and metabolite exchange, and mathematical models based on FBA are excellent tools for predicting these interactions.
Early work in this direction involves the synthesis of metabolic pathways.Seressiotis and Bailey [72] developed a computer software system for metabolic pathway synthesis, which can be used to identify biochemical pathways, to predict on a qualitative basis the effects of adding or deleting enzymatic activities to or from the cellular environment, to classify pathways with respect to cellular objectives and to extract information about metabolic regulation.Mavrovouniotis et al. [73] extended the work in [72] by including stoichiometric constraints.Schilling et al. [74] gave a review on the development of computer-aided algorithms for the synthesis of metabolic pathways and explained the important algebraic concepts used in pathway analysis, such as null space and convex cone.
Orth et al. [75] covered the theoretical basis of FBA and provide several practical examples and a software toolbox for performing the calculations.Figure 1 from [75] explains the conceptual basis of FBA as a constraint optimization problem.In FBA, metabolic reactions are represented as a stoichiometric matrix (S) of size m × n.Each row of S represents one unique compound (for a system with m compounds), and each column represents one reaction (n reactions).The entry S ij of S is the stoichiometric coefficient denoting the number of moles of the i-th compound formed in the j-th reaction.The coefficient is positive if the metabolite is produced, negative if the metabolite is consumed and zero if the metabolite does not participate in a particular reaction.The flux through all of the reactions in a network is represented by the vector v of length n, and the concentrations of all metabolites are represented by the vector x of length m.The system of mass balance equations at steady state (dx/dt = 0) gives a set of equality constraints Sv = 0.Each reaction also has upper and lower bounds, which gives a set of inequality constraints on the flux components, namely a i < v i < b i , 1 ≤ i ≤ n.FBA seeks to maximize or minimize an objective function Z = c T v, which can be any linear combination of fluxes, where c is a vector of weights indicating how much each reaction (such as the biomass reaction when simulating maximum growth) contributes to the objective function.The constraint optimization problem is usually solved by linear programming.It is important to note that the stoichiometric matrix S can be directly constructed from knowledge of an organism's metabolic genotype, which in turn can be efficiently determined from the results of genome annotation.Models based on FBA are used to study the role of interspecies exchange of metabolites in determining the spatiotemporal dynamics of microbial communities.Harcombe et al. [76] developed a model that integrates dynamic Flux Balance Analysis (dFBA) [77] with diffusion on a lattice and applied it to engineered communities.Simulations from the model predict the species ratio to which a two-species (E.coli/S.enterica) mutualistic consortium converges, the equilibrium composition of an engineered three-member (E.coli/S.enterica/M.extorquens AM1) community and the beneficial effect of a competitor in spatially-structured mutualism.All predictions are confirmed by experimental results.The strength of the model is highlighted by the fact that it requires very few free parameters and no a priori assumptions on whether or how species would interact.
Phalak et al. [78] developed a model to investigate the multispecies metabolism of a biofilm consortium comprised of two common chronic wound isolates: the aerobe P. aeruginosa and the facultative anaerobe S. aureus.The model combines genome-scale metabolic reconstructions for growth rates via FBA and partial differential equations for metabolite diffusion and provides both temporal and spatial predictions with genome-scale resolution.In particular, the two-species system was predicted to support a maximum biofilm thickness much greater than P. aeruginosa alone, but slightly less than S. aureus alone, suggesting an antagonistic metabolic effect of P. aeruginosa on S. aureus.
Sigurdsson et al. [79] used a systems biology approach to identify candidate drug targets for biofilm-associated P. aeruginosa.This study employed the published reconstruction of P. aeruginosa iMO1056 [80] and used FBA to simulate different medium and oxygen conditions.The effect of single and double gene deletion on bacterial growth in planktonic and biofilm-like environmental conditions was investigated.Condition-dependent genes were found that could be used to slow growth specifically in biofilm-associated P. aeruginosa.In particular, eight gene pairs were found to be synthetically lethal in oxygen-limited environments, and these gene sets may serve as metabolic drug targets to combat biofilm-associated P. aeruginosa.Results from [79] show that FBA can be used to determine key metabolic differences between planktonic and biofilm colonies and shed light on searching for novel drug targets.
The application of models based on FBA in studying biofilm as a microbial community is very promising, but also challenging.In particular, efficient and standardized methods are necessary for generating reliable stoichiometric models when a large number of species is involved.Furthermore, it is important to develop mathematical tools that can effectively incorporate omics-based metabolic pathway information into kinetic functions, which can be used directly in kinetic growth models.The cybernetic approach developed by Song et al. sheds some light on future research in this direction, and a review of this approach is given by [81].

Models Based on Statistical Inference
Species within a biofilm rarely live in isolation; instead, they often coexist and have complex interactions that affect the community structure and function [82][83][84].The types of interactions include win-win (mutualism), win-zero (commensalism), win-lose (predation , parasitism), zero-lose (amensalism) and lose-lose (competition).Community-wide information on microbial interactions can be obtained using statistical inference based on correlations between taxon abundances from high-throughput sequence data [85,86].
Faust and Raes [87] reviewed strategies to construct community models from abundance data and use the models to predict the outcome of community alterations and the effects of perturbations.The prediction of microbial association networks from abundance data is known as the network inference problem [88].The network inference methods can be classified into two categories: the similarity-based methods, which predict pairwise relationships, and the regression-and rule-based methods, which predict complex relationships.Figure 2 from [87] explains the principle of similarity-and regression-based network inference.Network inference starts from an incidence or an abundance matrix.Pairwise scores between two taxa are then computed using a suitable similarity or distance measure, and relationships involving more than two taxa are detected by either multiple regression or association rule mining [89].Then, a random score distribution is generated by repeating the scoring step, and the P-value is computed to measure the significance of the predicted relationship.Finally, taxon pairs with P-values below a given threshold are visualized as a network.Inferred networks can be considered as static models of microbial communities, which describes the community status at a particular time.However, time series data obtained by network inference methods can provide important input (such as growth rates or interaction strengths) for dynamic models of microbial interactions; see [90] for modeling of cheese fermentation community interactions with generalized Lotka-Volterra equations.Network inference has several strengths.It is generic; it can integrate different data types; and it can identify community properties that are encoded in the network structure.However, network inference also suffers from several pitfalls, such as normalization, similarity measure biases, the choice of appropriate null models and multiple testing issues.Despite these pitfalls, network inference is a versatile tool for studying microbial interactions, can be used to build dynamic models that can predict community stability, alternative stable states and microbial succession and ultimately shed light on the manipulation of microbial communities to enhance the abundances of beneficial species and to suppress harmful ones.Faust et al. [91] applied an ensemble method based on multiple similarity measures [92] in combination with Generalized Boosted Linear Models (GBLMs) [93] to taxonomic marker (16S rRNA gene) profiles of the Human Microbiome Project (HMP) cohort [94], resulting in a global network of 3005 significant co-occurrence and co-exclusion relationships between 197 clades occurring throughout the human microbiome.Analysis of the network revealed strong organization of the human microbiota into body area niches, mostly among closely-related individual body sites representing microbial habitats.For example, Fusobacterium species can bridge organisms in the development and maturation of oral biofilms by co-aggregation through physical contact, allowing a more complex use of resources, such as sugars and proteins.The approach in [91] provides a starting point for future mechanistic studies of the microbial ecology of the human microbiome.
Widder et al. [95] investigated the effect of features inherent to fluvial networks on the structure and function of biofilm communities in these ecosystems by combining co-occurrence analyses of biofilms based on pyrosequencing profiling and a probabilistic hydrological model.Co-occurrence networks were constructed using 454 pyrosequencing data of the 16S rRNA gene from benthic biofilms from 114 streams of the pre-alpine Ybbs catchment.Results suggested that hydrological disturbance and metacommunity dynamics affect the co-occurrence patterns of benthic biofilm communities in fluvial networks.In particular, the removal of gatekeepers disproportionately contributed to network fragmentation.This study provides a linkage between the biofilm communities and flow dynamics across fluvial networks, which are important for understanding the whole ecosystem processes.
Although network inference methods can uncover previously-unknown interactions, they still require validation by experimental data, which could be challenging.Some of these interactions could be indirect, such as bacteria modifying their environments via the secretion of metabolically-costly proteins and metabolites [96].Since these indirect interactions usually happen on larger spatial scales than direct interactions, it is very important to develop non-invasive spatially-resolved experimental techniques to collect structure and population data.

Kinetic Growth Models and Spatial Heterogeneity
Kinetic models predict growth rates of different species within a biofilm based on the concentrations of growth-limiting nutrients and species-dependent parameters, such as maximum growth rate.Probably the most widely-used kinetic model is the Michaelis-Menten (or Monod) kinetic model [97] with the growth rate µ given by: where µ max is the maximum growth rate, S is the concentration of the growth-limiting nutrient and K S is the half-saturation constant.The formula given by ( 3) can be readily generalized to cases with more than one limiting nutrient [13] by multiplying the contribution from each nutrient together (S i /(K S i + S i ) for the i-th nutrient), but usually, µ max and K S i are considered constants.Under the nutrient-saturation condition, (3) can be approximated by the zero order kinetics, where µ is independent of S (µ = µ max ).Under the very low nutrient condition, (3) can be approximated by the first order kinetics, where µ is proportional to S (µ = µ max • S).These two approximations provide convenient mathematical bounds on the Monod kinetic forms and have the advantages of allowing analytic solutions [50] to the model equations for simple scenarios (ODE model or 1D in space).
The success of the kinetic models depends crucially both on their particular formula and parameter values.Since the formula is often empirical and the parameter values are usually measured from pure and mixed cultures growing in laboratory reactors, they may miss important factors of the growth kinetics in the biofilm community developed in the natural environment, and their application may require validation.Recently, new kinetic models have been developed to address this problem.Quéméner and Bouchez [98] and Jin et al. [99] developed kinetic models with thermodynamics included.The model proposed in [98] is based on the theory of Boltzmann statistics and builds a relationship between microbial growth rate and available energy, thus connecting microbial population dynamics to the thermodynamic driving forces of the surrounding ecosystem.The work in [99] modified the Monod kinetics by a thermodynamic potential factor, which accounts for the chemical energy available from the reaction (acetate oxidation and sulfate reduction in this case) and evaluated the feasibility of applying experimentally-obtained parameters to the natural environment.The results suggest that some parameters, such as maximum growth rate, can be applied directly to the environment; but others, such as half-saturation constants, should be determined using data from the environment of interest.Bonachela et al. [100] relaxed the requirement that the maximum nutrient uptake rate be a constant; instead, the maximum uptake rate was assumed to increase monotonically as the external nutrient concentration decreases.The model predicts larger uptake and growth rates than the standard Monod kinetic, which explains the ability of marine microbes to persist under extreme nutrient limitation.
Biofilms usually have a complex spatial structure, which contributes to their distinctive properties, such as strong antibiotic resistance and diverse population (niches for different species).Mathematical models designated for characterizing the spatial structure of biofilms include Individual-based Models (IbMs) [8,67], continuum models [101,102] and hybrid models [9], which combine both.For IbMs, biofilms are represented as a collection of individual microbes (usually hard spheres) whose growth and movement are determined by a set of rules depending on the local environment, such as the availability of nutrients and space.Results from IbMs usually provide more detailed information and are generally considered superior for studying interactions at the microbe level.Continuum models represent the biomass by functions depending continuously on time and space, and these functions are governed by differential equations derived by using physical, chemical and biological principles.Continuum models allow both numerical solutions obtained using readily-available numerical methods and theoretical analysis of the qualitative behavior of the solutions, such as stability [103], and they are often considered more applicable at larger spatial scales.
Recently, models incorporating novel features have been successfully applied to study biofilms as a microbial community.Below, we discuss an individual-based model and a continuum model, respectively.
Storck et al. [104] developed an individual-based, mass-spring modeling framework to study the effect of cell properties on the structure of biofilms.In this model, cells are represented by a collection of particles connected by springs, which allows variable morphology (e.g., cocci, bacilli and filaments).Three types of structures are considered: the primary structure, which defines the shape of individual cells; the secondary structure, which defines microbial assemblies related by filial links between immediate siblings; and the tertiary structure, which defines non-filial cell-cell and cell-substratum links, such as sticking and anchoring connections.Forces acted on the cells include both elastic force from the springs they are attached to and DLVOforce (combination of the van der Waals and electrostatic force).Simulation results of the growth of rod-shaped cells on a planar surface suggest that the biofilm may grow as a monolayer if there are no anchoring and filial links; the biofilm can be much thicker and much less spread if there are cell-substratum anchoring; and with filial links, but no anchoring to the substratum, a biofilm with an irregular shape (less circularity) is more likely to develop.Simulation results of the activated sludge floc structure suggest that in the floc with filament branching, the filaments are shorter than that of a floc made of straight filaments growing at a similar rate, therefore resulting in greater floc density and attenuating the bulking tendency of filamentous sludge.Furthermore, simulated flocs with spherical floc formers (in contrast to rod-shaped floc formers) were less dense, since the denser packing of spherical cells in a colony leads to a smaller cluster volume, which lowers the chance to encounter a filament former.These simulations demonstrate the close relationship between the fundamental controlling mechanisms, such as the intracellular, intercellular and cell-substratum links, and the diverse biofilm structures.
Emerenini et al. [105] developed a continuum model that includes biofilm growth, production of quorum sensing molecules, cell dispersal triggered by quorum sensing molecules and reattachment of cells.In this model, two distinct cell types are considered: the sessile cells in the biofilm and the motile cells, which can move into and in the liquid phase.The volume fraction of sessile cells and EPS is denoted by M; the motile cell density is denoted by N; and the concentrations of growth-controlling nutrients and autoinducers are denoted by C and A, respectively.Dispersal of cells from the biofilm is controlled by the local autoinducer concentration through Hill kinetics with switching threshold τ and maximum dispersal rate η 1 .Re-attachment of cells in the biofilm is controlled by the local biofilm density M through Monod kinetics with maximum rate η 2 .The autoinducer production rate is controlled by the local autoinducer concentration, which implicitly represents the switch between down-and up-regulated cells.The governing equations for M, N, C, A are reaction-diffusion-type equations with density-dependent (M) diffusion coefficients.Simulation results suggest that single quorum sensing-based mechanism can explain both periodic dispersal in discrete events and continuous dispersal, depending on the value of switching threshold parameter τ.For smaller values of τ, the switching threshold is reached quickly, leading to a rapid dispersal of the biomass before the biofilm can grow into a large size.After the first dispersal event, the biofilm population starts growing again, and the autoinducer concentration increases again, resulting in an almost periodic pattern of discrete dispersal events.For bigger values of τ, it takes a much longer time to reach the switching threshold, and the biofilm develops into a stronger colony before the onset of dispersal.Release of cells from the biofilm into the liquid phase appears continuous, and the biofilm population reaches a plateau.Simulation also suggests that re-attachment of dispersed cells is negligible.The study in [105] indicated that important properties, such as biofilm mass and thickness, can be modified by changing the QS threshold and dispersal rates, and the systems can change between continuous and oscillating behavior.The findings can also help to optimize treatment strategies.For example, promoting quorum sensing can enhance cell dispersal and limit biofilm thickness, which could increase the efficacy of antibiotic treatment, since planktonic cells are generally assumed to be more vulnerable to antibiotic treatment.
IbMs have the obvious strength of describing the detailed biofilm structure and interactions at the cell level, and the forces between individual particles can be derived from first principles.On the other hand, IbMs also suffer the drawback of high computational cost, especially when the goal is to model a biofilm containing a very large number of cells.Therefore, it is important to use efficient numerical methods in the implementation of IbMs, and parallel computing is often the choice [106].The continuum models can often be analyzed using well-established differential equation theory, and there are many numerical packages available for solving the corresponding discretized system of algebraic equations.However, continuum models often depend on some empirical formula; examples include the effective diffusion coefficient of nutrients inside the biofilm and the constitutive equation for the stress-strain relation when modeling the biofilm as a viscoelastic fluid, and derivation of such a formula is often nontrivial.

Conclusions
We present a review of some recently-developed mathematical models that focus on studying biofilms as diverse communities.Despite obvious overlapping, these models are categorized based on their principal methodologies, such as trait-based models, QS, FBA, statistical inference and spatially-resolved models with specific growth kinetics.There are some important topics that have been left out, such as models incorporating stochasticity and evolutionary game theory.
Even though currently-available models can describe many aspects of biofilms accurately, it still remains a challenge to build models that can predict the overall behavior of biofilms as complex and evolving communities.First, time scales involved in biofilm-related processes can vary in as many as ten orders of magnitude, ranging from the fast scale for fluid dynamics to the slow scale for biofilm growth, which is an obvious challenge for modeling.To address this, it is often necessary to assume equilibrium in the fast processes or a quasi-static biofilm profile, depending on the problem of interest.Second, fast advancement in the experimental technologies has provided an abundance of omics data at both the genetic and community level, and many community-scale models have been proposed to describe the interaction between biology, chemistry and physics inside biofilms.However, there is still the lack of a systematic approach to link the observational data to the community-level understanding, namely to tie system kinetics to omics data in a tractable and general way by translating omics to rate functions at the cellular level.The method based on the FBA approach is very promising in this direction.Third, long-term challenges for modeling biofilm as an MC include the necessity to incorporated evolutionary processes, social evolution and bacterial strategies, community assembly and historical contingency, as well as the importance of spatial structure.Addressing these challenges would inevitably require an integrated approach that not only selectively combines multiple relevant models by adding their strengths, but also combines modeling effort and experimental findings.

Figure 1 .
Figure 1.The conceptual basis of FBA as constraint-based modeling.Reprinted from [75] with permission from Nature Publishing Group.

Figure 2 .
Figure 2. Principle of similarity-and regression-based network inference.Reprinted from [87] with permission from Nature Publishing Group.

Table 1 .
Summary of biofilm models discussed in the article.QS, Quorum Sensing; FBA, Flux Balance Analysis; IbM, Individual-based Model.