Investigating the Ability of Growth Models to Predict In Situ Vibrio spp. Abundances

Vibrio spp. have an important role in biogeochemical cycles; some species are disease agents for aquatic animals and/or humans. Predicting population dynamics of Vibrio spp. in natural environments is crucial to predicting how the future conditions will affect the dynamics of these bacteria. The majority of existing Vibrio spp. population growth models were developed in controlled environments, and their applicability to natural environments is unknown. We collected all available functional models from the literature, and distilled them into 28 variants using unified nomenclature. Next, we assessed their ability to predict Vibrio spp. abundance using two new and five already published longitudinal datasets on Vibrio abundance in four different habitat types. Results demonstrate that, while the models were able to predict Vibrio spp. abundance to an extent, the predictions were not reliable. Models often underperformed, especially in environments under significant anthropogenic influence such as aquaculture and urban coastal habitats. We discuss implications and limitations of our analysis, and suggest research priorities; in particular, we advocate for measuring and modeling organic matter.


Introduction
Vibrio spp. are naturally occurring aquatic bacteria, highly adaptive and freely associated with a variety of biotic and abiotic surfaces including water, sediment, fish, shellfish, algae, and zooplankton. Vibrio spp. comprise a minor portion of the total microbial population, and around 1 percent of the total bacterioplankton in coastal waters [1]. Despite their relatively low abundance, Vibrio species are one of key constituents of aquatic heterotrophic bacterial groups [2].
Aquatic heterotrophic microorganisms have an important role in the mineralization of organic matter, and the variations in abundance, community structure, and activities of heterotrophic microbial communities affect both the biotic and the abiotic components of aquatic environments. Vibrio spp. participate in biogeochemical processes by utilizing a variety of substrates and mineralization of organic matter, thus directly contributing to the recycling of carbon, nitrogen, and organic matter in the aquatic environment [1,3,4]. Alongside their role in the abiotic cycles, some >140 described species from the genus Vibrio can have a strong biotic impact, and consequently pose severe health risks and economic losses. A well-known example is Vibrio cholerae, which causes cholera-a global threat to public health with about four million cases of infection every year, leading to over 100,000 deaths [5]. The rise of noncholera Vibrio species (V. parahaemolyticus, V. alginolyticus, and V. vulnificus) can cause other potentially lethal infections (vibriosis) in humans [2], We aimed to (i) identify and systematize primary and secondary Vibrio spp. growth models, and (ii) to analyze and validate the models by applying them to different sets of available data. We reviewed growth models of Vibrio spp. using examples from research in food safety and water research, and validated them on several sets of field data. Then, we analyzed whether the model(s) can be used to capture in situ Vibrio spp. abundances, with an emphasis on differences between the marine habitat types.

Materials and Methods
Our methodological approach can be divided into four main steps: (1) a literature search, (2) data preparation, (3) model simulations, and (4) performance analysis. The methodology overview is graphically presented in Figure 1. The methodological approach to model analysis. We performed a literature search to find all Vibrio spp. growth models and all available datasets suitable for comparison. We found five published datasets, and included two of our own collected through projects AqADAPT and AQUAHEALTH of the Croatian Science Foundation (HRZZ).

Literature Search and Model Synthesis
The literature search was conducted using the Web of Science (WoS) advanced search in April 2022. The search string was defined based on keywords and Boolean and adjacency operators, and was searched for in Abstracts (Field Tag "AB"). The search string was as follows: AB = (((vibrio*) AND (growth OR abundance) AND (temperature OR salinity OR "pH" OR "COD" OR "organic matter" OR nutrient*) AND (model*))). We obtained 189 results based on our search, which was restricted to the English language. For a detailed description of our literature review, please see Appendix A. We selected 16 papers with Vibrio spp. growth models based on clearly defined functional dependencies for using environmental conditions (e.g., temperature, salinity, pH). We did not include models derived purely by regression or similar statistical means because those are not modular, and cannot be differentiated into primary and secondary. From the 16 papers containing growth models, we extracted, systematized, and classified explicit formulations for primary (Table 1) and secondary growth models (Table 2), using parameters listed in (Table 3) and thus arriving at 28 unique Vibrio spp. growth models for further analysis (Table 4). Table 1. Systematized equations for Vibrio spp. primary growth models. Of the 12 models listed in this table, one (new logistic model) did not have parameters listed, so only the remaining 11 were used in further analysis. The reference in the column "Model" is the original paper containing the equation. The column "Article" lists all published articles that used the given primary models.

Data Preparation
An additional literature search identified five datasets suitable for model validation; hence, a total of seven datasets were available for analysis (Table 5) once our two previously unpublished datasets were added.
The previously unpublished datasets, AqADAPT [54] and AQUAHEALTH [55], contain observed values for Vibrio spp. abundance and environmental parameters from the Adriatic Sea. Sampling was conducted in three floating-cage fish farms in the northern, middle, and southern Adriatic Sea (Croatia), where European sea bass (Dicentrarchus labrax) and sea bream (Sparus aurata) are cultured. The farms in the northern and central Adriatic are located in the semi-open sea at depths of about 49 m and 22 m, respectively. The fish farm in the southern Adriatic is located in the outer part of the Mali Ston Bay at a depth of 18 m. The Mali Ston Bay is occasionally strongly influenced by the (freshwater) Neretva River. Periodic use of antibiotics is possible on all three fish farms, and this would have affected the Vibrio spp. abundance. However, no specific data on antibiotics use are available. We classified these datasets into aquaculture habitat type. Dataset AqADAPT [54] was labeled as AQC1 and dataset AQUAHEALTH [55] was labeled as AQC2. Table 5. The seven datasets used for model validation. AQC1 and AQC2 were previously unpublished; the other datasets are publicly available and can be accessed through the provided reference. Information for each dataset contains reported values (i.e., the number of entries in a dataset), values used for validation (i.e., the number of observations after the missing values were removed from the dataset), temperature, salinity, and pH range. Seven datasets used for model validation were classified into four habitat types based on the characteristics of the collection sites. Methods used for determining Vibrio spp. abundance are listed in Appendix C.1. Bullington et al. [56] and Steward et al. [57] published Vibrio spp. abundance and environmental parameters from the Ala Wai Canal in urban Honolulu, Hawaii, on the island of O'ahu. The 3.1 km-long engineered waterway operates as a tidally influenced estuary with freshwater input from a watershed that covers 42.4 km 2 via the Manoa and Palolo streams, which merge to form the Manoa-Palolo Stream prior to entering the canal, and the Makiki Stream, all of which run through urban areas before reaching the canal. Consequently, the streams are contaminated with a variety of anthropogenic substances, and their convergence in the Ala Wai Canal has contributed to its pollution and eutrophication [57]. We classified datasets from this area as the urban estuary habitat type due to the strong anthropogenic influence. The dataset by Bullington et al. [56] was labeled as URB1 and the dataset by Steward et al. [57] was labeled as URB2.

Dataset
Froelich et al. [58] gathered data from the Neuse River Estuary in Eastern North Carolina (USA). The Neuse River Estuary, located in Eastern NC (USA), is a well-described, lagoonal estuary, with wind-driven mixing characteristics and minimal tidal influence due to the protection offered by the proximal Pamlico Sound. Being broad and shallow (generally less than 3 m in depth), the estuary flow and mixing are dominated by wind and river input. This dataset was classified as an estuary habitat and labeled as EST1.
Urquhart et al. [59] collected data from the Great Bay Estuary, New Hampshire (USA). The Great Bay Estuary (GBE) extends inland from the mouth of the Piscataqua River near Kittery, ME, through Little Bay and eventually into the Great Bay (25 km). The GBE has deep, narrow channels with strong tidal currents, and wide, shallow mudflats. The physical transport regime of the GBE follows the classical estuarine circulation model for drowned river valley estuaries [59]. This dataset was also classified as an estuary habitat and labeled EST2.
Williams et al. [60] presented data from five sites along the Eastern North Carolina coast (USA). Locations were as follows: Harlowe Creek, South River, North River, Hoop Pole Creek, and Jumping Run Creek. These sites were chosen to represent the range of high-and low-salinity environments, some of which experience large salinity fluctuations, while others have very small salinity fluctuations (for more information, please refer to the original manuscript [60]). We classified this dataset as a coastal area, and labeled it COAST.
From the given datasets, we selected variable Vibrio abundance and the following environmental parameters: temperature, salinity, and pH. We excluded all of the missing values from datasets and logarithmically transformed Vibrio abundance greater than 0 (log10 + 1 in datasets AQC1, AQC2, and COAST, log10 in datasets URB1 and URB2). Datasets EST1 and EST2 already contained logarithmically transformed values. The analytical code can be found in the R script named data_preparation. Results of the dataset preparation are summarized in Table 5.

Model Simulations
Herein, we showcase a model simulation approach aiming to calculate Vibrio spp. abundance based on primary and secondary models accounting for environmental parameters (temperature, salinity, and pH). All collected models were developed for controlled environments (i.e., laboratories), where Vibrio spp. growth was monitored at regular, short time intervals. Such data lend themselves to time series modeling, where abundance is plotted against time. In contrast, in situ data are typically irregularly collected from variable abiotic microenvironments, at longer time intervals, and with the noise typically inherent to field measurements. Such data could not be modeled as a time series, but had to be modeled as independent abundances-each data point was considered to be a result of bacterial growth that started some time ago.
The model had to be simulated for the time of growth, but determining how long ago the growth started was a challenge which we needed to overcome in order to select the (optimal) model run time for a dataset. Note that having an independent run time for each data point would create an option to fit each data point exactly by choosing the perfect time, thus defeating the whole point of modeling the bacterial dynamics. To minimize the bias introduced by our choice of the model run time, we determined a (run) time that gives the best result for each model-and-dataset combination. Optimal run time is then the simulation duration that produces the best match between the model predictions and the observations.
We used default parameter values (Appendix B, Table A1) listed in their respective references for each of the 28 models (Table 4). First, the values of the specific growth rate and lag time were calculated, which were then used in the primary models: modified logistic, Barayni, Gompertz, modified Gompertz, three-phase linear, Huang, no-lag phase and met exponential models. These primary models adjust the specific growth rate and lag time using one or more of the secondary models, as described in the original literature and summarized in Table 4. The environmental parameters considered in the literature for modeling Vibrio spp. growth in dynamic conditions were: temperature, salinity, and pH (Table 4). Secondary models sometimes use water activity or NaCl concentration instead of salinity. As all datasets contained information on salinity, water activity or NaCl concentration were calculated from salinity (Table 5) (for more details, please refer to Appendix B). To determine the optimal run time, we simulated Vibrio spp. growth in a selected time range from 1 to 600 hours, and selected the run time that produced the best fit to the data.

Model Performance
We used the coefficient of determination (R 2 ) to evaluate the ability of models to describe the observed experimental data: where y i is the actual value in the dataset,ŷ i is the corresponding predicted value,ȳ is the mean value of the dataset, and n is the sample size.
To determine applicability (i.e., model generality) of a specific model to a specific dataset, we compared R 2 values calculated for all models for that specific dataset, and then selected those models for which the dataset-specific R 2 value was higher than the overall median of all R 2 values for all models and all datasets. For example, if the median overall goodness of fit of all models to all datasets was R 2 = 0.30, all models with R 2 > 0.30 for dataset 1 would be marked as capturing dataset 1. We then tested the robustness of the results by looking at a more stringent requirement, where we marked a certain model as capturing a dataset only if its R 2 for that dataset was in the top 25% (1st quartile) of the values for all models and all datasets.
Robust ANOVA based on trimmed means [61,62] was used to test the difference in model performance (obtained R 2 ) between different habitat types (aquaculture, urban estuary, estuary, and coastal area). Robust ANOVA was used to overcome the problems associated with deviations from homoscedasticity/normality and to reduce the influence of outliers observed in the data. Post hoc tests were also performed in the robust WRS2 environment [61], where p-values were adjusted for multiple testing using the Benjamini-Hochberg (BH) method.

Vibrio spp. Growth Models
Vibrio spp. growth models identified by the literature review could be partitioned into 12 primary (Table 1) and 8 secondary ( Table 2) models, using the parameters listed in Table 3. In total, we identified 29 models for Vibrio spp. growth in a dynamic environment, but we could not find parameter values for Fujikawa et al. [50]. Therefore, we further analyzed only the remaining 28 models (Table 4), using the parameters listed in Table A1.
The model summaries (Table 4) provide an overview of the models used when studying Vibrio spp. growth in a dynamic environment. The main findings were as follows:

•
Baranyi and modified Gompertz are the most commonly used primary models for describing Vibrio spp. growth over time. • Square root and Arrhenius-based models are the most frequently applied secondary models for Vibrio spp. growth in dynamic conditions. • V. cholerae, V. parahaemolyticus, V. harveyi, and V. vulnificus are the species used most often as modeling organisms. • Vibrio growth was monitored in/on various substrates (free water column, within organisms, in broth substrates, etc.), under different temperature, salinity, and pH conditions. This implies that the aquatic environments and organisms (marine and freshwater), as well as food and water health and safety, are the key areas of research and concern. • Temperature was the prevailing environmental parameter used in secondary models, implying a strong effect of temperature on Vibrio spp. abundance. The effect of temperature on the primary model parameters (growth rate and lag time) was most often modeled by the square root or the Arrhenius-based model.

Vibrio In Situ Datasets
We summarize the dataset classification and characteristics of the seven datasets in Table 5, including the number of Vibrio spp. abundance data, and range and type of environmental variables used in model validation (temperature, salinity, pH). Datasets URB2 and COAST do not contain information on pH, so simulations of Model 6 ( Table 4) were not possible for those datasets.

Model Performance
The ability of models to describe the data greatly varied between models and datasets ( Figure 2). The calculated R 2 values ranged from <0.001 (Model 10 for dataset AQC1) to 0.40 (Model 28, dataset URB2), with the overall median value of all models for all datasets R 2 = 0.13. R 2 values also significantly differed between habitat types (Figure 3, Table A2) in all pairwise comparisons (post hoc tests p < 0.017; see details in Table A3). Therefore, R 2 values suggest the models performed best for coastal areas, followed by estuaries, urban estuaries, and aquaculture habitats. Comparing R 2 values provides performance estimates of a particular model on a particular dataset, but does not provide information on generality.   Table 5). Model performance significantly differed between habitat types (robust ANOVA, F(3,75.561) = 49.9, p < 0.001, effect size ξ = 0.77, confidence interval CI(ξ) = [0.68, 0.84]; Table A2). Significance codes are as follows: p ≤ 0 '****', p < 0.01 '**', and p < 0.5 '*'.
Model generality analysis ( Figure 4A) shows rankings above the median (R 2 > 0.13) where: Figure 4. Model generality. Stacked bar chart used for graphical representation of model's applicability on the dataset from a specific habitat: colors represent habitats (see the legend), and stars (*) denote models that exhibit the evaluation issue with some of the data points in some of the datasets (details in Appendix D). Values on y axis denote the frequency of occurrence of a particular model whose R 2 value is above the median (R 2 > 0.13; Panel A), and in the first quartile (R 2 > 0.22; Panel B). Primary models are labeled as follows: ML-modified logistic, Baranyi, Gompertz, modified Gompertz, TPL-three-phase linear, HPM-Huang primary, NL-no-lag and NE-net exponential. Star (*) signifies models that had an evaluation issue with some of the data points in some of the datasets (details in Appendix D). Note that there is only one coastal area dataset, while other habitats have two datasets each. Hence, scoring a single occurrence of the coastal area habitat represents a 100% success rate, while scoring the same in any other habitat represents a success rate of 50%.

•
All models except Model 6 (Baranyi with polinomial pH and salinity secondary models) were able to score above average at least in some habitats, i.e., capture those habitats. Incidentally, Model 6 was the only one not applying a temperature correction Selecting for better capability by scoring only performances within the first quartile (R 2 > 0.22, Figure 4B) shows that: • ≈A total of 72% of the analyzed models had an exceptional ability to capture datasets from the coastal area habitat. • The ability of models to capture/perform well for estuarine habitats was severely diminished, with only 16/28 (57% of the models) capturing one of the two estuary datasets, and only 10 models (36%) capturing both. • None of the models were able to capture for the aquaculture habitat datasets, and only three (Models 8, 9, and 28) captured the urban estuary habitat. • Model 28 seemed even more specialized, as it captured a single urban estuary (URB2) • Most prolific Baranyi models (Models 8 and 9) remained so by capturing datasets from three habitat types, albeit only a single dataset from each.

Discussion
Our work adds to the growing body of knowledge from the past few decades that helped refine a general understanding of the ecology of Vibrio spp. We (i) summarized dynamics and unified nomenclature for all published functional models we could find (12 primary and 8 secondary), (ii) added two datasets to the existing five longitudinal datasets on Vibrio spp. and their environments, and (iii) used the datasets to asses the ability of existing models to capture Vibrio spp. abundances in four different habitat types.
There are no clear winners between the 28 investigated models. Generally, the R 2 values were not particularly large (≤0.40), but the values can be considered acceptable given the difficulty of the task, in particular the multitude of potential factors affecting both the environment and the Vibrio populations.
Baranyi-based models (Models 2 to 10 in Table 4) seemed to capture the widest variety of habitats well, with Models 8 and 9 leading the way in diversity. Although both models had a similar R 2 for the AQC1 dataset (0.14 and 0.12, respectively), Model 8 was the only one that crossed the median threshold and therefore captured an aquaculture dataset as an above-average performer. Notably, Models 8 and 9 are the only Baranyi-based models that included both temperature and salinity secondary models. Model 28 (net exponential) also included both factors and performed quite well, giving the highest overall R 2 (0.40 for the URB2 dataset).
Inclusion of temperature and salinity, however, does not guarantee success. All three Gompertz-based models included both factors, but all three underperformed. This would signal Gompertz-based models should be used with caution. Modified Gompertz-based models (Models 14-24), however, seem to lag only slightly behind the best.
While it may be tempting to proclaim Baranyi-based models as the most versatile, they do have significant drawbacks. First, both Models 8 and 9 had issues with simulations: their fast growth rate sometimes caused very large predictions (see Appendix D). While this has not been a problem for the optimal run time of Model 8, Model 9 did lose up to 3% of data in the evaluation. These issues may not be consequential in the current assessment, but may become so in new datasets.
Second, the secondary models accounting for salinity in Models 8 and 9 yielded unexpected growth rate patterns. For example, at moderate and high salinities, growth rates could be extremely high at the ends of the environmental temperature range (e.g., 3.18 ln(CFU/g)/h for salinity of 20 and temperature of 27°C, Figure 5, Panel B). Likewise, at 10°C, the growth rate of Model 8 was moderate and slightly increased with salinity; at 30°C, however, the rate was extremely high for low salinity, and rapidly decreased with salinity. This may be plausible as abundance of Vibrio spp. increased with water temperatures during periods of reduced salinity [73], but we recommend caution when utilizing Baranyi-based models in highly variable environments.
Overall, predictions for the coastal area and estuary datasets seemed to be more robust than those for urban estuaries and aquaculture (Figure 4, Panel B). We hypothesize this may be due to higher levels of anthropogenic disturbance in aquaculture and urban estuaries, in particular potentially high inputs of organic matter. Vibrio spp., as a prototypical copiotroph, dominates in nutrient-rich environments [74]. They exhibit a feast-and-famine lifestyle and swim to colonize sporadic, nutrient-rich patches and particles [75]. Therefore, we think that organic matter has an important role in determining Vibrio spp. abundance, especially in these types of habitats.
Our study has some limitations. First, we treated all Vibrio spp. the same. While justified in the context of a wide assessment such as ours, Vibrio species clearly differ and could potentially exhibit significantly different dynamics, especially since different species favor different environmental conditions. These differences could become important as environmental conditions change, and the Vibrio community could shift towards diseasecausing species. Unfortunately, current datasets and available models do not allow for a deeper investigation of the issue.
Second, we chose the simulation time that minimized R 2 , but relied on a single simulation time for each dataset. In principle, a different simulation time could be chosen for each data point. Such an approach would, however, in effect result in fitting each value by choosing simulation time that gives a desired result, thus defeating the purpose of modeling. There could, nonetheless, be some value in exploring functional dependencies of the simulation time on various environmental factors (temperature in particular), but additional research would be required to suggest a particular form of such a function.
Third, we set out to investigate published parameters and models with well-defined functional forms. Perhaps a different set of parameters and/or combinations of models could have described the datasets better. We have made a first step towards such research by systematizing primary and secondary models, parameters, and available-including two previously unpublished-datasets. Alternatively, statistical models could possibly be re-fitted to better capture the datasets. Such statistical approaches may be appropriate in some cases, e.g., when aiming to inter-or extrapolate data from a single area.
Clearly, additional research is needed for developing a growth model capable of predicting in situ Vibrio spp. abundances in natural environments. We suggest further development of secondary models should be one priority; dynamics of secondary models should be well-defined for the whole range of expected environmental factors, especially temperature and salinity. Modeling could also benefit from research on additional environmental factors such as organic matter, which has been shown to affect Vibrio growth [56]. Given the role of Vibrio spp. in aquatic environments, it is surprising that less than half datasets include organic matter measurements. We suggest that future modeling development should include organic matter, especially when sampling to ensure measurement of organic matter.
Perhaps a development of a third generation of models based on big data and deep learning could also work in synergy with mechanistic modeling to improve our ability to predict Vibrio spp. dynamics in a changing environment. Improved models could then enhance predictive frameworks, e.g., by replacing the basic ecological niche approach to estimating Vibrio presence used in exploration of future risk scenarios by Trinanes and Martinez-Urtaza [76].
In conclusion, none of the investigated models provide a complete solution: Baranyibased models might be the most versatile, but other models (e.g., net exponential) may provide a better fit for a particular cause. Therefore, the choice of the model should be at least in part guided by the type of the environment and expected ranges of environmental factors; if many data lie outside of the well-described range of a particular secondary model (see Figure 5), perhaps a different model should be tried instead. Our summary and systematization (Tables 1-5) provide currently available primary and secondary models, and data, which can be used as a toolbox for model creation and testing.

Appendix A. Literature Search
The literature search was conducted using the Web of Science (WoS) advanced search in April 2022. We accessed all databases in the Web of Science: Web of Science core collection (Arts & Humanities Citation Index, Book Citation Index-Science, Book Citation Index-Social Sciences & Humanities, Conference Proceedings Citation Index-Science, Conference Proceedings Citation Index-Social Science & Humanities, Current Chemical Reactions, Emerging Sources Citation Index, ESCI Backfiles, Chemicus Index, Science Citation Index Expanded, Social Sciences Citation Index), BIOSIS citation index, Medline, Zoological record, Current contents connect, Derwent innovations index, Data citation index, SciELO citation index, BIOSIS previews, CABI-CAB abstracts and global health, Inspec, KCI-Korean journal database, journal citation reports, essential science indicators, EndNote online. The search string was defined based on keywords and Boolean and adjacency operators, and was searched for in Abstracts (Field Tag "AB"). The search string was: AB = (((vibrio*) AND (growth OR abundance) AND (temperature OR salinity OR "pH" OR "COD" OR "organic matter" OR nutrient*) AND (model*))). We obtained 189 results based on our search, which was restricted to the English language. We detected 9 potentially duplicate articles before the screening, which were resolved, and 180 remained. A primary search resulted in 27 articles for the screening. Furthermore, we performed a backward and a forward search based on identified relevant articles. A backward search was performed by searching a list of references at the end of the articles, and a forward search was conducted using Google Scholar. We performed this procedure two times until no new relevant article was identified. Finally, we included six more articles. The screening was conducted in Rayyan collaborative review application [77]. The first screening was performed by a reviewer, MP, and the final screening was carried out by reviewers TK and MP. We selected 16 papers with the Vibrio spp. growth models for analysis after the full screening of the papers from the primary search and additional search. Please see the PRISMA diagram for a breakdown of the overall procedure ( Figure A1). In the analysis, we included articles with Vibrio spp. growth model equations depending on environmental parameters. Data extraction led to 28 Vibrio growth models for validation. hemolysin gene (vvhA). In dataset EST1 [58], Vibrio spp. concentrations were determined by counting the total number of visible yellow and green colonies that exhibited relief from the plate surface from TCBS, adjusting for dilution, and expressing the results as colonyforming units (CFUs) per 100 mL. In the dataset EST2 [59], oyster tissue was processed for enumeration of V. parahaemolyticus via a three-tube MPN enrichment method following the FDA Bacteriological Analytical Manual coupled with culture-based and polymerase chain reaction (PCR) methods used to confirm the presence of V. parahaemolyticus. In the dataset COAST [60], the authors quantified V. parahaemolyticus from water samples by counting the total number of visible colonies using the CHROMagar Vibrio medium (CHROMagar, Paris, France). Figure A2. Optimal run time for the simulation duration that produces the best match between the model prediction and the observation (data point of a dataset) based on R 2 median value. Models based on Table 4 are specified on the x axis. Primary models are labeled as follows: ML-modified logistic, Baranyi, Gompertz, modified Gompertz, TPL-three-phase linear, HPM-Huang primary, NL-no-lag, and NE-net exponential. Star (*) signifies models that had an evaluation issue with some of the data points in some of the datasets (details in Appendix D).
Model performance and its applicability to specific datasets based on a comparison of R 2 median values (R 2 > 0.13) resulted in a total of 27 applicable models on datasets from four habitat types (aquaculture, urban estuary, estuary, and coastal area) (Figure 2). Model 6, based on the Baranyi equation and polinomial model for the effect of pH and NaCl, exhibited poor performance. Model performance and its applicability to specific datasets based on a comparison of Q1 (upper 25%) R 2 values (R 2 > 0.22) resulted in a total of 21 applicable models on datasets from three areas (urban estuary, estuary, and coastal area) ( Figure 2). Here, Model 3, Model 6, Model 10, Model 12, Model 17, Model 22, and Model 23 displayed poor performance. Models 2, 6, and 10 are Baranyi's type with square root, polynomial, and modified Ratkowsky models for the effect of temperature or pH and NaCl, respectively. Model 12 is Gompertz's type with modified Ratkovsky and Arrhenius-based models, respectively. Model 22 and Model 23 were based on the modified Gompertz model and used square root and Arrhenius-based models for describing the effect of temperature.

Appendix D. Model Evaluation Issues
Some models had evaluation issues, where a proportion of the data had to be disregarded. The issues for each model are described below, and Table A4 summarizes the issues and proportion of data disregarded at the optimum simulation time.
Model 3 had negative specific growth rates generated by the secondary model (the parameters used for a four-parameter polynomial model by [32]). This phenomenon was observed for temperatures below 15°C. Model 3 also generated Inf values for higher simulation times and higher specific growth rates. Model 7 generated negative specific growth rates for temperatures between 11.1 and 11.7°C. A secondary model was polynomial, as described in [34]. Additionally, in this secondary model, lag time had lower values at 17.6°C and the highest at 27.5°C. Model 8 resulted in Inf values in cases where maximum specific growth rate was between 1.630371 (12.3°C) and 2.476982 (27.5°C) and time range was between 289 and 600 h. This large value of the specific growth rate was generated by the response surface secondary model from [37]. Some of the predicted values were equal to the parameter maximum bacterial count (Y max ) (we obtained NA R 2 ). Model 9 resulted in Inf values in cases where mumax was between 1.491716 (15.9°C) and 2.459676 (27.5°C) and the time range was between 291 and 600. This large value of the specific growth rate was generated by Arrhenius-Davey secondary model from [37]. Some of the predicted values were equal to the parameter maximum bacterial count (Y max ) (we obtained NA R 2 ). Model 13 resulted in NAN predictions for values that had salinity, i.e., water activity higher than 0.998, and Model 27 had NAN predictions for temperatures below 10.5°C. COAST [60]