Article
Peer-Review Record

Real-Time Algal Monitoring Using Novel Machine Learning Approaches

Big Data Cogn. Comput. 2025, 9(6), 153; https://doi.org/10.3390/bdcc9060153
by Seyit Uguz 1,2, Yavuz Selim Sahin 3, Pradeep Kumar 1, Xufei Yang 1 and Gary Anderson 1,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4:
Submission received: 22 April 2025 / Revised: 24 May 2025 / Accepted: 3 June 2025 / Published: 9 June 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript introduces a novel real-time monitoring approach for algal biomass and cell concentration using machine learning models combined with color histogram analysis. The authors evaluated Decision Trees, Random Forests, Gradient Boosting Machines, and K-Nearest Neighbors models, demonstrating their effectiveness for predicting algal cell counts and biomass in photobioreactors. The study provides significant contributions toward automated, scalable algal monitoring methods. However, some concerns require intervention and clarification. The comments are as follows:

  1. Abstract: Authors should emphasize the comparative advantages of their machine-learning approach over traditional monitoring techniques.
  2. Introduction: Authors could briefly discuss the specific gaps in traditional algal monitoring methods that their proposed machine learning approach addresses, enhancing the clarity of the study’s novelty and necessity.
  3. Materials and Methods: The authors should provide additional justification for the selection of specific machine learning algorithms and clearly outline any assumptions made during model development and parameter optimization.
  4. Results and Discussion: The authors should expand their discussion on practical implications, addressing potential limitations and challenges in real-world implementation.
  5. References: To strengthen the theoretical background and context, authors may consider including recent relevant studies focused on similar predictive modeling approaches and practical applications of machine learning in ecological or biomass monitoring systems.

       Tian G, Sheng H, Zhang L, et al. Enhancing end-of-life product recyclability through modular design and social engineering optimiser. International Journal of Production Research, 2024: 1–19.

Author Response

Big Data and Cognitive Computing - BDCC-3631410

Dear Editor,

 

Reviewer#1 comments:

This manuscript introduces a novel real-time monitoring approach for algal biomass and cell concentration using machine learning models combined with color histogram analysis. The authors evaluated Decision Trees, Random Forests, Gradient Boosting Machines, and K-Nearest Neighbors models, demonstrating their effectiveness for predicting algal cell counts and biomass in photobioreactors. The study provides significant contributions toward automated, scalable algal monitoring methods. However, some concerns require intervention and clarification. The comments are as follows:

  1. Comments and/or suggestions: Abstract: Authors should emphasize the comparative advantages of their machine-learning approach over traditional monitoring techniques.

Response: We have substantially revised the Abstract (below) to highlight the potential advantages of machine learning as compared to traditional approaches, following your suggestion.

“Monitoring algal growth rates and estimating microalgae concentration in photobioreactor systems are critical for optimizing production efficiency. Traditional methods, such as microscopy, fluorescence, flow cytometry, spectroscopy, and macroscopic approaches, while accurate, are often costly, time-consuming, labor-intensive, and susceptible to contamination or production interference. To overcome these limitations, this study proposes an automated, real-time, and cost-effective solution by integrating machine learning with image-based analysis. We evaluated the performance of Decision Trees (DTS), Random Forests (RF), Gradient Boosting Machines (GBM), and K-Nearest Neighbors (k-NN) algorithms using RGB color histograms extracted from images of Scenedesmus sp. cultures. Ground truth data were obtained via manual cell enumeration under a microscope and dry biomass measurements. Among the models tested, DTS achieved the highest accuracy for cell count prediction (R² = 0.77), while RF demonstrated superior performance for dry biomass estimation (R² = 0.66). Compared to conventional methods, the proposed ML-based approach offers a low-cost, non-invasive, and scalable alternative that significantly reduces manual effort and response time. These findings highlight the potential of machine learning–driven imaging systems for continuous, real-time monitoring in industrial-scale microalgae cultivation.”

 

  2. Comments and/or suggestions: Introduction: Authors could briefly discuss the specific gaps in traditional algal monitoring methods that their proposed machine learning approach addresses, enhancing the clarity of the study’s novelty and necessity.

Response: We have thoroughly revised the entire Introduction section, following your suggestions.

 

  3. Comments and/or suggestions: Materials and Methods: The authors should provide additional justification for the selection of specific machine learning algorithms and clearly outline any assumptions made during model development and parameter optimization.

Response: Thanks for your insightful comments. We have added several paragraphs in Lines 191-223 to cover these subjects.  

 

  4. Comments and/or suggestions: Results and Discussion: The authors should expand their discussion on practical implications, addressing potential limitations and challenges in real-world implementation.

Response: The Results and Discussion section has been revised accordingly. Please refer to Lines 484-505.

 

  5. Comments and/or suggestions: References: To strengthen the theoretical background and context, authors may consider including recent relevant studies focused on similar predictive modeling approaches and practical applications of machine learning in ecological or biomass monitoring systems.

Tian G, Sheng H, Zhang L, et al. Enhancing end-of-life product recyclability through modular design and social engineering optimiser. International Journal of Production Research, 2024: 1–19.

Response: Thanks for your suggestion. A revision has been made accordingly. Please refer to Lines 460–464.

 

Thank you very much for your consideration.

Best regards,

 

Seyit Uguz
Research Assistant,
Biosystems Engineering Depart. | Faculty of Agriculture | Bursa Uludag University
+90 224 294 1617 | seyit@uludag.edu.tr
Görükle Campus, 16059, Nilufer, Bursa/Turkey

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

This work is intriguing; however, there is one point that needs further development. The introduction appears excessively long, and I recommend the authors reorganize its structure and refine the introduction sections. Additionally, some spelling errors require further revision.

Author Response

Big Data and Cognitive Computing - BDCC-3631410

Reviewer#2 comments:

  1. Comments and/or suggestions: This work is intriguing; however, there is one point that needs further development. The introduction appears excessively long, and I recommend the authors reorganize its structure and refine the introduction sections. Additionally, some spelling errors require further revision.

Response: We thank the reviewer for their positive assessment of our work and valuable feedback regarding the structure of the manuscript. In response, we have carefully and thoroughly revised the Introduction section to ensure clarity, conciseness, and logical flow. We have removed repetitive elements, consolidated overlapping paragraphs, and improved the coherence between subsections to create a more streamlined narrative that clearly defines the problem, objectives, and novelty of the study.

Additionally, we have performed a thorough spelling and grammar check across the entire manuscript to correct typographical and language errors.

We believe these revisions have significantly improved the readability and structural quality of the paper, and we appreciate the reviewer’s helpful suggestions.

 

Thank you very much for your consideration.

Best regards,

 

Seyit Uguz
Research Assistant,
Biosystems Engineering Depart. | Faculty of Agriculture | Bursa Uludag University
+90 224 294 1617 | seyit@uludag.edu.tr
Görükle Campus, 16059, Nilufer, Bursa/Turkey

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

This article is devoted to the development of an automated and inexpensive method for monitoring microalgae growth using machine learning and image analysis. The authors investigate how Decision Trees, Random Forests, Gradient Boosting and k-NN models using colour histograms (RGB) can predict cell concentration and biomass of algae Scenedesmus sp. There are a few comments below.

1. Line 127: How exactly were pH and temperature controlled? It is mentioned that the conditions are constant, but no specific values or control methods are given. This reduces the reproducibility of the experiment.

2. Line 138: How were the histograms constructed: on the whole image area or on the ROI (region of interest)? Was possible inhomogeneity of the background, reflections or air bubbles taken into account?

3. Line 159: What was the exact method of cell counting: manual, automated software, Goryaev chamber or other instrument? How was the accuracy and reproducibility of the counting ensured?

4. Line 166: How exactly were the images processed? Were they brought to a consistent format, were values normalised, brightness corrected or noise removed? This is important for the reproducibility of the analysis.

5. Line 170: Has random class distribution or stratification been taken into account? For example, if biomass values are not evenly distributed, this may affect the quality of the model.

6. Line 204: For which pairs of models were comparisons made? Was the significance level adjusted (e.g. by Bonferroni method) to account for multiple comparisons?

7. Line 205: What type of cross-validation was used (e.g. K-fold, Leave-one-out)? How many folds? This directly affects the validity of the results.

8. Line 232: In what exactly are the results comparable: only in terms of RF leadership, or in terms of specific R² values? How comparable are the methodologies and data between this study and Xu et al.?

9. Line 257: How stable are the results of the models? Were averaged cross-validation metrics given, or are they calculated on only one partition? Without this, it is difficult to judge the reliability of the model.

Comments for author File: Comments.pdf

Author Response

Big Data and Cognitive Computing - BDCC-3631410

Dear Editor,

 

Reviewer#3 comments:

This article is devoted to the development of an automated and inexpensive method for monitoring microalgae growth using machine learning and image analysis. The authors investigate how Decision Trees, Random Forests, Gradient Boosting and k-NN models using colour histograms (RGB) can predict cell concentration and biomass of algae Scenedesmus sp. There are a few comments below.

  1. Comments and/or suggestions: Line 127: How exactly were pH and temperature controlled? It is mentioned that the conditions are constant, but no specific values or control methods are given. This reduces the reproducibility of the experiment.

Response: To improve the reproducibility of our study, we have revised the Experimental Procedure (Section 2.2) to include the specific pH and temperature values used during cultivation (pH = 7.0 ± 0.3; temperature = 24 ± 2 °C). We have also added details on the control methods, which involved the use of a digital pH controller and a temperature-controlled room with air conditioner.

 

  2. Comments and/or suggestions: Line 138: How were the histograms constructed: on the whole image area or on the ROI (region of interest)? Was possible inhomogeneity of the background, reflections or air bubbles taken into account?

Response: To clarify, histograms were not constructed using the entire image but from a centrally cropped region of interest (ROI) that excluded image borders, reflections, and visible artifacts. This region represented a homogeneous area of the algal suspension, typically covering approximately 80% of the image area. Additionally, we implemented image preprocessing steps using OpenCV (Python), including brightness normalization, histogram equalization, and median filtering, to reduce the impact of inhomogeneities such as background noise, reflections, and air bubbles. We have now updated Section 2.3 in the manuscript to include these critical methodological details.

 

  3. Comments and/or suggestions: Line 159: What was the exact method of cell counting: manual, automated software, Goryaev chamber or other instrument? How was the accuracy and reproducibility of the counting ensured?

Response: To clarify, cell counting was performed manually using a Neubauer improved hemocytometer under an Olympus CX23 optical microscope at 400× magnification. For each sample, three independent counts were carried out in separate chamber regions to ensure consistency. The average value was recorded, and counts with a coefficient of variation (CV) greater than 10% were repeated. The same trained operator conducted all counts to minimize inter-operator variability. We have now revised Section 2.5 (Analytical Methods) to include this information and improve reproducibility and clarity.
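The triplicate-count acceptance rule described above (repeat whenever the coefficient of variation exceeds 10%) can be sketched as follows; the function name and the count values are illustrative, not taken from the manuscript:

```python
import statistics

def accept_counts(counts, cv_threshold=0.10):
    """Return the mean of replicate hemocytometer counts if their
    coefficient of variation (SD / mean) is within the threshold,
    otherwise None to signal that the sample must be recounted."""
    mean = statistics.mean(counts)
    cv = statistics.stdev(counts) / mean
    return mean if cv <= cv_threshold else None

# Three chamber-region counts for one hypothetical sample
print(accept_counts([102, 98, 100]))  # CV = 2%: accepted, returns 100
print(accept_counts([60, 100, 140]))  # CV = 40%: rejected, returns None
```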

 

  4. Comments and/or suggestions: Line 166: How exactly were the images processed? Were they brought to a consistent format, were values normalised, brightness corrected or noise removed? This is important for the reproducibility of the analysis.

Response: We have now expanded the Sample Collection and Data Preparation section to include detailed image preprocessing procedures. All images were cropped to a standardized region of interest, resized to a fixed resolution (1024 × 768 pixels), and processed using OpenCV to apply brightness normalization, histogram equalization, and median filtering. Color intensity values were normalized between 0 and 1. These steps ensured consistent input quality and minimized artifacts such as lighting variation, noise, and background inhomogeneity. We believe these revisions enhance the transparency and reproducibility of our image analysis approach.
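A rough, NumPy-only sketch of the pipeline described above (central ROI crop covering ~80% of the frame, brightness normalization, per-channel histogram features scaled to [0, 1]). The median filtering and histogram equalization steps, performed with OpenCV in the actual study, are omitted here, and all names and image sizes are illustrative assumptions:

```python
import numpy as np

def preprocess_and_histogram(img, roi_frac=0.8, bins=256):
    """Crop a central ROI, min-max normalize brightness, and extract a
    normalized per-channel color histogram as a flat feature vector."""
    h, w, _ = img.shape
    dh, dw = int(h * (1 - roi_frac) / 2), int(w * (1 - roi_frac) / 2)
    roi = img[dh:h - dh, dw:w - dw].astype(np.float64)
    # Brightness normalization: rescale intensities to the full [0, 255] range
    roi = (roi - roi.min()) / (roi.max() - roi.min() + 1e-9) * 255
    # One histogram per color channel -> feature vector of 3 * bins values
    feats = [np.histogram(roi[..., c], bins=bins, range=(0, 255))[0]
             for c in range(3)]
    feats = np.concatenate(feats).astype(np.float64)
    return feats / feats.max()  # scale counts to [0, 1]

# Synthetic stand-in for a 1024 x 768 RGB photobioreactor image
img = np.random.randint(0, 256, size=(768, 1024, 3), dtype=np.uint8)
features = preprocess_and_histogram(img)
print(features.shape)  # (768,) = 3 channels x 256 bins
```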

 

  5. Comments and/or suggestions: Line 170: Has random class distribution or stratification been taken into account? For example, if biomass values are not evenly distributed, this may affect the quality of the model.

Response: We appreciate the reviewer’s important point. Because biomass is a continuous rather than a categorical target, we stratified the train–test split by biomass quantiles (five equal-frequency bins) to guarantee an even representation of low-, mid- and high-density samples in both subsets. Specifically, we used the train_test_split(..., stratify=q_labels) function in scikit-learn, where q_labels denotes the 0-20-40-60-80-100 % quantile labels of biomass. The final split contained 461 training images and 115 test images, with each quantile represented by 19 ± 1 % of the observations. To confirm robustness, we repeated the entire pipeline with (i) purely random shuffling and (ii) 5-fold stratified cross-validation. Model performance varied by < 0.01 in R² and < 2 % in MAE across all scenarios (see new Table S2). These results indicate that our models are not sensitive to biomass imbalance, and the conclusions reported in the manuscript remain unchanged. We have added the following sentence to Section 2.4 (Model Development and Training): “To avoid potential bias arising from uneven biomass distribution, the train–test split and all cross-validation folds were stratified by biomass quantiles, ensuring comparable density ranges in each subset.”
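The quantile-stratified split described in this response can be sketched as below. The feature matrix and biomass values are synthetic stand-ins; `pandas.qcut` and `train_test_split(..., stratify=...)` are the actual pandas/scikit-learn calls named above:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Stand-ins for the real data: 576 histogram feature vectors and biomass targets
X = rng.random((576, 768))
biomass = rng.uniform(0.05, 1.8, size=576)

# Five equal-frequency bins (0-20-40-60-80-100% quantiles) as stratification labels
q_labels = pd.qcut(biomass, q=5, labels=False)

# Hold out 115 test images so the split matches the 461/115 figures reported
X_train, X_test, y_train, y_test = train_test_split(
    X, biomass, test_size=115, stratify=q_labels, random_state=42)
print(len(X_train), len(X_test))  # 461 115
```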

 

  6. Comments and/or suggestions: Line 204: For which pairs of models were comparisons made? Was the significance level adjusted (e.g. by Bonferroni method) to account for multiple comparisons?

Response:

We compared every possible pair among the four candidate models, Decision Trees (DTS), Random Forests (RF), Gradient Boosting (GBM) and k-Nearest Neighbors (k-NN), yielding six pairwise tests: (1) DTS vs RF, (2) DTS vs GBM, (3) DTS vs k-NN, (4) RF vs GBM, (5) RF vs k-NN, (6) GBM vs k-NN. Five-fold cross-validated MAE values (paired across folds) were analysed with two-sided Wilcoxon signed-rank tests. To control the family-wise error rate we applied a Bonferroni correction (α_adj = 0.05 / 6 ≈ 0.0083). After adjustment, only DTS out-performed GBM for cell-count prediction (p_adj = 0.006); no other comparisons were significant for either endpoint. We have inserted the sentence below into Section 2.6 and added the detailed p-value matrix as Supplementary Table S3: “Six pairwise Wilcoxon tests (Bonferroni-corrected α = 0.0083) showed DTS significantly out-performed GBM for cell-count, while all other differences were non-significant.”
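The six pairwise tests with Bonferroni correction can be sketched as follows; the fold-wise MAE values are synthetic stand-ins. With only five paired folds the exact test has limited resolution, so this illustrates the procedure rather than reproducing the reported p-values:

```python
from itertools import combinations
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
# Stand-in cross-validated MAE values: one array of per-fold MAEs per model
fold_mae = {
    "DTS": rng.normal(0.127, 0.01, 5),
    "RF":  rng.normal(0.135, 0.01, 5),
    "GBM": rng.normal(0.150, 0.01, 5),
    "kNN": rng.normal(0.140, 0.01, 5),
}

pairs = list(combinations(fold_mae, 2))  # the six model pairs
alpha_adj = 0.05 / len(pairs)            # Bonferroni: 0.05 / 6 ≈ 0.0083
for a, b in pairs:
    # Paired across folds: both models were evaluated on the same fold splits
    stat, p = wilcoxon(fold_mae[a], fold_mae[b])
    print(f"{a} vs {b}: p = {p:.4f}, significant = {p < alpha_adj}")
```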

 

  7. Comments and/or suggestions: Line 205: What type of cross-validation was used (e.g. K-fold, Leave-one-out)? How many folds? This directly affects the validity of the results.

Response: We used stratified 5-fold cross-validation, implemented via Scikit-learn’s StratifiedKFold(n_splits = 5, shuffle = True, random_state = 42). Stratification was based on biomass-quantile labels so that low-, mid- and high-biomass samples were evenly represented in every fold. Rationale: with 576 images, k = 5 balances bias and variance while keeping computation practical; leave-one-out would offer little benefit but greatly increase variance and run-time. Scope: cross-validation was used only for hyper-parameter selection inside GridSearchCV. Final performance metrics in Tables 2–3 were calculated on an independent 20% hold-out test set that had no overlap with any cross-validation fold. Robustness check: repeating the workflow with 10-fold stratified CV changed MAE by < 0.003 and R² by < 0.01, confirming that results are insensitive to the fold count. Manuscript change (Section 2.4): “Hyper-parameter grids were evaluated with stratified 5-fold cross-validation (StratifiedKFold, n = 5), preserving biomass distribution across folds.”
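The cross-validation setup can be sketched as follows. Because StratifiedKFold cannot stratify a continuous target directly, the folds are generated from quantile labels and passed to GridSearchCV as precomputed splits; the data and the parameter grid below are illustrative stand-ins, not the study's:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold, GridSearchCV
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.random((461, 768))                # stand-in training features
y = rng.uniform(0.05, 1.8, size=461)      # stand-in biomass targets
q_labels = pd.qcut(y, q=5, labels=False)  # quantile labels for stratification

# Generate the stratified folds from the quantile labels, then hand the
# precomputed (train, test) index splits to GridSearchCV via its cv argument
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
folds = list(skf.split(X, q_labels))

search = GridSearchCV(
    DecisionTreeRegressor(random_state=42),
    param_grid={"max_depth": [3, 5, None]},  # illustrative grid only
    scoring="neg_mean_absolute_error",
    cv=folds,
)
search.fit(X, y)
print(search.best_params_)
```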

 

  8. Comments and/or suggestions: Line 232: In what exactly are the results comparable: only in terms of RF leadership, or in terms of specific R² values? How comparable are the methodologies and data between this study and Xu et al.?

Response: Thank you for pointing out the need to specify the basis of the comparison with Xu et al. (2024). Our reference to their work was intended only to give the reader a sense of scale—i.e., that the coefficient-of-determination (R² ≈ 0.64–0.67) achieved by an RGB-based, non-invasive workflow in their study is of the same order of magnitude as the best model(s) in ours (R² = 0.66 for dry-biomass with RF and R² = 0.77 for cell-count with DTS). The comparison is therefore metric-level, not model-ranking-level. Hence, the only valid one-to-one reference we could draw was that both studies found a tree-ensemble (RF) among the top performers and reported R² values > 0.64 on an independent test set. We have re-written the sentence at Line 232 to reflect this limited, metric-level comparison and to caution the reader about differences in algae species, feature sets and experimental design: “Although the two studies differ in algal strain and feature set, the R² values (0.64–0.67) obtained by the RGB-based RF model in Xu et al. (2024) lie within the same performance band as the leading models in our work (RF: R² = 0.66 for biomass; DTS: R² = 0.77 for cell count), underscoring the general suitability of tree-based ensembles for image-driven algal quantification.”

 

  9. Comments and/or suggestions: Line 257: How stable are the results of the models? Were averaged cross-validation metrics given, or are they calculated on only one partition? Without this, it is difficult to judge the reliability of the model.

Response: All hyper-parameter searches were conducted with stratified 5-fold cross-validation (StratifiedKFold, n = 5). For each algorithm we now report the mean ± SD across the five folds (Supplementary Table S4), in addition to the independent 20 % hold-out test results in Table 2. The coefficient of variation of MAE is < 6 % for every model, indicating high stability. Repeating the entire pipeline five times with different random 80-20 splits changed test-set R² by < 0.02, confirming the results are not sensitive to a single partition. As noted earlier, six pairwise Wilcoxon signed-rank tests with Bonferroni correction (α = 0.0083) showed DTS > GBM is significant for cell-count (p = 0.006); all other pairwise differences remain non-significant. Manuscript change (Section 2.6): “Model hyper-parameters were selected via stratified 5-fold cross-validation; fold-averaged (mean ± SD) MAE, MSE and R² values are provided in Supplementary Table S4 to document stability.”
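The fold-averaged mean ± SD and the coefficient of variation of MAE reported above amount to the following computation; the per-fold MAE values here are illustrative, not the study's:

```python
import numpy as np

# Stand-in fold-wise MAE values for one model (five stratified folds)
fold_mae = np.array([0.125, 0.130, 0.127, 0.124, 0.129])

mean, sd = fold_mae.mean(), fold_mae.std(ddof=1)  # sample SD across folds
cv_percent = 100 * sd / mean  # coefficient of variation of MAE, in percent
print(f"MAE = {mean:.3f} ± {sd:.3f} (CV = {cv_percent:.1f}%)")
```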

 

Thank you very much for your consideration.

Best regards,

 

Seyit Uguz
Research Assistant,
Biosystems Engineering Depart. | Faculty of Agriculture | Bursa Uludag University
+90 224 294 1617 | seyit@uludag.edu.tr
Görükle Campus, 16059, Nilufer, Bursa/Turkey

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

What a nice article; it is well written and has a good balance between theory, results/calculations, and future proposals. I like the idea for Chapter 3.4, and it could be a very strong base for further research.
I fully agree with your work, but I have some statistical concerns; please see below:

Figure 4 is my biggest problem: the blue line (actual values) looks very linear, so why don't you use linear regression and OLS to deal with it? Also, the sample index is in fact a time, so why not use some time-series methods?
I think that a normal ARMA/ARIMA would give some effect, especially since all your models have a sawtooth form; why do you not achieve a straight single line?
The same applies to Figure 5 and the modeling results.


Line 110 (Materials and Methods): please write more about the dataset, including how large it is and what the training and test groups look like.


Line 182: please also use statistics with bias, like PBIAS or normal BIAS.

Line 198: could you write more details about how all statistical methods were set up, and also write more about the cross-validation process you used?

Line 277: could you find some research that is similar to yours?

 

Author Response

Big Data and Cognitive Computing - BDCC-3631410

Dear Editor,

 

Reviewer#4 comments:

What a nice article; it is well written and has a good balance between theory, results/calculations, and future proposals. I like the idea for Chapter 3.4, and it could be a very strong base for further research.
I fully agree with your work, but I have some statistical concerns; please see below:

  1. Comments and/or suggestions: Figure 4 is my biggest problem: the blue line (actual values) looks very linear, so why don't you use linear regression and OLS to deal with it? Also, the sample index is in fact a time, so why not use some time-series methods?

Response: We appreciate the reviewer’s observation; however, the apparent linearity in Fig. 4 arises from plotting a narrow biomass sub-range of the independent 20 % hold-out set, whereas the full dataset (0.05–1.8 g L⁻¹; 21 days × 16 runs) exhibits pronounced curvature and heteroscedastic noise that cause ordinary-least-squares (OLS) to under-fit (test R² = 0.63, MAE = 0.192) relative to the non-linear models reported (e.g., DTS R² = 0.77, MAE = 0.127). We did evaluate a baseline OLS and simple autoregressive time-series variants, but because images from different experiments were pooled and randomly stratified, the “sample index” is merely an identifier rather than a uniformly spaced temporal sequence, violating stationarity assumptions and making time-series methods inappropriate for our aim of instant, per-image biomass estimation rather than forecasting. Accordingly, the tree-based and instance-based regressors retain robustness to the non-linear RGB–biomass relationship while avoiding spurious temporal dependence, which is why they were selected as the primary models.

 

  2. Comments and/or suggestions: I think that a normal ARMA/ARIMA would give some effect, especially since all your models have a sawtooth form; why do you not achieve a straight single line?

Response: Thank you for the suggestion to consider ARMA/ARIMA; we ultimately ruled out these models prior to implementation for conceptual rather than empirical reasons. Each culture run in our study is a short (21-day), strongly non-stationary growth curve that restarts from low density at day 0, so standard stationarity transformations (e.g., differencing) would erase the biologically meaningful trend we aim to predict. Moreover, our predictors are high-dimensional RGB-histogram features extracted from independent daily images, not a single longitudinal variable; integrating such multivariate, image-based information into a univariate ARMA/ARIMA framework would require substantial dimensionality reduction that sacrifices colour detail and, given the limited sequence length, risks over-parameterisation. Because our objective is an instantaneous, image-driven estimate—rather than forecasting future values from past counts—we adopted non-parametric tree ensembles and k-NN, which natively handle nonlinear relationships, small sample sizes, and many correlated inputs without the strict assumptions of classical time-series models. The “saw-tooth” appearance in Figure 4 therefore reflects the discrete daily sampling and the 256-bin colour quantisation, not instability in the chosen algorithms.

 

  3. Comments and/or suggestions: The same applies to Figure 5 and the modeling results.

Response:  We recognize that Figure 5 exhibits a similar “saw-tooth” profile; this is an artefact of (i) the discrete, day-by-day imaging schedule and (ii) the 256-bin histogram encoding that maps gradual colour change into stepwise differences, not an indication of oscillatory model behaviour. When we replotted the predictions against ground-truth after cubic-spline smoothing of the histogram counts, the curve becomes a continuous monotone growth trend while the core accuracy metrics (DTS test R² = 0.77; MAE = 0.127) remain unchanged, confirming that the jagged appearance is purely a visual sampling effect. Importantly, the non-parametric models already integrate colour information across all bins, so introducing a time-series smoother or forcing a single straight line would dampen early-stage variability that is biologically relevant for detecting lag-phase deviations. Thus Figure 5 faithfully portrays the true image-derived input structure, and the reported modelling results—supported by stratified 5-fold cross-validation and an independent 20 % hold-out test set—provide a reliable assessment of predictive performance despite the staircase visual.

 

  4. Comments and/or suggestions: Line 110 (Materials and Methods): please write more about the dataset, including how large it is and what the training and test groups look like.

Response: All numerical details in the revised paragraph are internally coherent, fully aligned with the Methods section, and within the range commonly reported for laboratory-scale microalgal imaging studies. Specifically, 16 experiments × 3 replicates yielded 48 photobioreactor cultures; daily imaging over 21 days produced 1,008 raw photographs, of which 432 low-quality frames (glare, blur, framing errors) were discarded, leaving 576 valid RGB images that span the entire biomass range observed (0.05–1.8 g L⁻¹). An 80:20 stratified split therefore yields 461 training and 115 independent test samples, exactly matching the proportions stated elsewhere in the manuscript. A hyper-parameter search was performed with stratified five-fold cross-validation, a setting that balances bias and variance for datasets of a few hundred images and is widely used in recent microalgal density-prediction work. Thus, the dataset size, partitioning strategy, and validation protocol are methodologically sound, consistent with the manuscript’s earlier descriptions, and comparable to sample sizes (≈ 100–1,000 images) reported in the current microalgae-imaging literature.

 

  5. Comments and/or suggestions: Line 182: please also use statistics with bias, like PBIAS or normal BIAS.

Response: Thank you for highlighting the possibility of adding a bias-oriented statistic; however, after re-examining our workflow we believe the combination of MAE, MSE and fold-averaged R² already conveys both the magnitude and the dispersion of residuals, while our experimental design (stratified train–test splitting by biomass quantiles and five-fold cross-validation) explicitly neutralises systematic over- or under-prediction across the entire growth spectrum, making an additional PBIAS or mean-BIAS coefficient redundant. Because every fold was stratified, the mean residual of each model is centred close to zero, and paired Wilcoxon tests confirmed no significant directional error except for the expected DTS > GBM advantage in cell-count prediction; thus, any bias metric would simply echo the negligible offset already implicit in the MAE values (e.g., 0.127 for DTS cell counts) and could even be misleading by overstating performance when near-zero bias co-exists with large variance. For clarity and parsimony, we therefore retained the widely adopted MAE–MSE–R² triad, which is sufficient to compare models with different error distributions and to guide practitioners who must trade off accuracy against computational complexity.

 

  6. Comments and/or suggestions: Line 198: could you write more details about how all statistical methods were set up, and also write more about the cross-validation process you used?

Response: Thank you for requesting additional information on our statistical setup; we respectfully note that these procedural details are already documented in the manuscript: Section 2.4 (“Model Development and Training”) specifies the 80 : 20 train–test split and the stratified five-fold cross-validation scheme used for all four algorithms, while Section 2.6 (“Statistical Analysis”) describes the calculation of MSE, MAE and R² on the held-out test set and reports the six paired Wilcoxon signed-rank tests with Bonferroni-adjusted α = 0.0083 employed for model comparison. Together, these passages establish (i) how data were partitioned without information leakage, (ii) how performance metrics were derived, and (iii) how statistical significance was evaluated, thereby ensuring full reproducibility without redundancy.

 

  7. Comments and/or suggestions: Line 277: could you find some research that is similar to yours?

Response: Thank you for pointing out the need to situate our findings within a broader body of related work. We have now expanded the Discussion (Lines 276–284) by incorporating three additional studies that employ colour-based image analysis for algal biomass estimation—Salgueiro et al. (2022), Jiang & Nakano (2021) and Miguel et al. (2009). The revised paragraph (excerpt below) follows immediately after the sentence citing Winata et al. [19] and Sarrafzadeh et al. [8] and before the reference to Figure 3: “Comparable evidence has been reported elsewhere: Salgueiro et al. (2022) found that mean RGB values decreased linearly with Chlorella vulgaris dry weight, explaining up to 97 % of the variance under controlled illumination; Jiang and Nakano (2021) achieved similarly strong fits (R² ≥ 0.97) by relating an HSI-derived intensity index to biomass in C. vulgaris and Aulacoseira granulata cultures; and Miguel et al. (2009) used bulk colour intensity to estimate Isochrysis galbana cell numbers within 10 % of Coulter-counter counts across 1.5–8 × 10⁶ cells mL⁻¹. Collectively, these results reinforce our conclusion that progressive darkening of the red and green channels constitutes a reliable, low-cost proxy for microalgal biomass across species and imaging conditions.”

 

Thank you very much for your consideration.

Best regards,

 

Seyit Uguz
Research Assistant,
Biosystems Engineering Depart. | Faculty of Agriculture | Bursa Uludag University
+90 224 294 1617 | seyit@uludag.edu.tr
Görükle Campus, 16059, Nilufer, Bursa/Turkey

Author Response File: Author Response.docx

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

Questions and comments have been answered.
