Peer-Review Record

Artificial Intelligence and Extraction of Bioactive Compounds: The Case of Rosemary and Pressurized Liquid Extraction

Processes 2025, 13(6), 1879; https://doi.org/10.3390/pr13061879
by Martha Mantiniotou 1, Vassilis Athanasiadis 1, Konstantinos G. Liakos 2, Eleni Bozinou 1 and Stavros I. Lalas 1,*
Reviewer 1:
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 19 May 2025 / Revised: 10 June 2025 / Accepted: 12 June 2025 / Published: 13 June 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The research article “Artificial Intelligence and Extraction of Bioactive Compounds: The Case of Rosemary and Pressurized Liquid Extraction” is a wonderful effort by Mantiniotou and the co-authors.

  • Title: The title seems ok in its current form; however, it might be replaced with a better one, such as “Optimizing Bioactive Compounds using Artificial Intelligence in Plant Base Extraction”.
  • Abstract: This section needs rephrasing, particularly the concluding sentence as it sounds quite directive, not prospective.

Introduction:

  • The literature cited is old; apart from the three most recent references, the rest are quite dated.
  • Paragraph 3 should be replaced with paragraph 4.
  • Lines 87-90 read like the Results section and should be rephrased.

Materials & Methods:

  • Specifications of the deionized column are missing.
  • To avoid verbosity, each chemical/reagent should be written once in the form “Chemical A (97% Purity, Analytical Grade, Company XYZ)”, with the statement of purchase.
  • Aren’t Table 1 and Table 2 part of the Results section?
  • I guess the figures of the Experimental Design are part of the “Results” section.

Results & Discussion:

  • The captions of the figures should be revised, such as:

Figure 5. TPC: (A) covariation of X1 (ethanol concentration, C, % v/v) and X2 (liquid-to-solid ratio, R, mL/g); (B) covariation of X1 and X4 (extraction time, t, min); (C) covariation of X2 and X3 (extraction temperature, T, °C); (D) covariation of X2 and X4. FRAP: (E) covariation of X1 and X2; …….

  • The discussion is weak. The results should be compared with some recent studies.

Conclusion:

  • The section is similar to the Results section. It lacks an overall conclusion of the study and future prospects.

References:

  • Old citations should be replaced with newer ones, and the changes should be highlighted.
Comments on the Quality of English Language

Revision required.

Author Response

The research article “Artificial Intelligence and Extraction of Bioactive Compounds: The Case of Rosemary and Pressurized Liquid Extraction” is a wonderful effort by Mantiniotou and the co-authors.

We would like to thank the reviewer for their insightful comments.

  • Title: The title seems ok in its current form; however, it might be replaced with a better one, such as “Optimizing Bioactive Compounds using Artificial Intelligence in Plant Base Extraction”.

The authors prefer to keep the title as is. The suggested title is nice but does not precisely reflect the content of the study, as Artificial Intelligence (AI) was not used to optimize the extraction. The optimization was carried out through statistical models and then AI optimization was conducted to compare the results of the two models.

  • Abstract: This section needs rephrasing, particularly the concluding sentence as it sounds quite directive, not prospective.

The abstract has been revised, as suggested.

Introduction:

  • The literature cited is old; apart from the three most recent references, the rest are quite dated.

Older references have been replaced with references from the past five years (2020-2025).

  • Paragraph 3 should be replaced with paragraph 4.

The paragraphs have been reordered, as requested.

  • Lines 87-90 read like the Results section and should be rephrased.

The sentence has been removed to avoid future misconceptions.

Materials & Methods:

  • Specifications of the deionized column are missing.

The deionized column contains mixed-bed ion exchange resin, ensuring conductivity below 1 µS/cm, with a standard flow rate and operating pressure.

  • To avoid verbosity, each chemical/reagent should be written once in the form “Chemical A (97% Purity, Analytical Grade, Company XYZ)”, with the statement of purchase.

Chemicals and reagents are listed according to the company that provided them to ensure clarity. Listing them according to solvents, standards, etc., would cause redundancy, as company names would be repeated throughout the paragraph. Furthermore, it is widely understood which chemicals are solvents and which are standards.

  • Aren’t Table 1 and Table 2 part of the Results section?

Tables 1 & 2 have been moved to the Results section, as requested.

  • I guess the figures of the Experimental Design are part of the “Results” section.

We thank the reviewer for this observation. Figures related to the Experimental Design have been relocated to the Results and Discussion section, as suggested.

Results & Discussion:

  • The captions of the figures should be revised, such as:

Figure 5. TPC: (A) covariation of X1 (ethanol concentration, C, % v/v) and X2 (liquid-to-solid ratio, R, mL/g); (B) covariation of X1 and X4 (extraction time, t, min); (C) covariation of X2 and X3 (extraction temperature, T, °C); (D) covariation of X2 and X4. FRAP: (E) covariation of X1 and X2; …….

Captions have been rewritten properly, as requested.

  • The discussion is weak. The results should be compared with some recent studies.

Latest studies have been incorporated into the discussion section. Unfortunately, the antioxidant assays could not be compared with recent studies, since other researchers present their results in different units, making a comparison infeasible and inaccurate.

Conclusion:

  • The section is similar to the Results section. It lacks an overall conclusion of the study and future prospects.

We thank the reviewer for this valuable comment. The Conclusion section has been revised to avoid overlap with the Results section. It now provides a more concise summary of the study and emphasizes future prospects and potential applications.

References:

  • Old citations should be replaced with newer ones, and the changes should be highlighted.

We thank the reviewer for this suggestion. We have updated the References section accordingly: older references were substituted with more recent publications (2020–2025), where applicable. This ensures that the literature cited is relevant and reflects the current state of research.

Comments on the Quality of English Language

Revision required.

English language revision has been conducted, as recommended by the reviewer.

Reviewer 2 Report

Comments and Suggestions for Authors

Refer attachment.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Requires rewriting to trim overly long sentences.

Author Response

Artificial Intelligence and Extraction of Bioactive Compounds: The Case of Rosemary and Pressurized Liquid Extraction

Overview

This study optimized PLE for bioactive compounds from rosemary, identifying optimal conditions as 25% ethanol, 160 °C, 25 minutes, and a 10 mL/g liquid-to-solid ratio, yielding high polyphenol and antioxidant content. ML models, particularly RF, were employed to predict extraction outcomes, with synthetic data augmentation improving model performance. The findings highlight PLE's efficiency for sustainable extraction, with potential applications in food, pharmaceutical, and cosmetic industries.

We would like to thank the reviewer for their valuable comments.

Comments/Suggestions

  1. The abstract is too detailed: exact quantitative results (e.g., polyphenol content values, ML model specifics) may overwhelm readers. Please summarize outcomes instead of listing all numeric results. Please emphasize the novelty more clearly: what sets this study apart?

The abstract has been revised to better summarize the outcomes and highlight the study's novelty.

  2. The introduction is overly descriptive and repetitive (e.g., traditional uses of rosemary are mentioned multiple times), so please reduce redundant information on rosemary's uses. The justification for using ML is not sufficiently developed or clearly linked to extraction optimization, so please clearly articulate the research gap: why exactly is AI needed for this kind of extraction? Further, explain how this work builds on or diverges from previous efforts using ML in extraction.

We thank the reviewer for this very helpful comment. The Introduction section has been revised to reduce redundancy regarding the traditional uses of rosemary. We also improved the justification for using ML techniques in this context by clarifying the research gap. Specifically, while previous studies have applied ML to bioactive compound prediction and extraction modeling, there is limited work on combining ML with data augmentation techniques to overcome small dataset limitations in green extraction processes such as PLE. Our study explores this novel approach, aiming to improve predictive performance and support more robust extraction optimization.

  3. Methods: The sample size (n = 17) for ML is critically small, even for regression; this should be acknowledged more openly. The data augmentation methodology is buried; the rationale behind using RF predictions + Gaussian noise could be challenged for generating "realistic" data. The authors should therefore consider providing confidence intervals or variability analysis on synthetic data. Separate statistical methods (RSM/ANOVA) more clearly from ML procedures.

We thank the reviewer for this valuable comment. We have revised the manuscript to more clearly acknowledge the limitation imposed by the small sample size (n = 17) and the rationale for using Random Forest-based predictions with Gaussian noise for data augmentation. This choice was motivated by the need to generate realistic synthetic data while preserving the nonlinear relationships captured by the RF model and introducing controlled variability. We have also clarified in the text that the noise was parametrized to reflect 5% of the standard deviation of each target variable, providing a meaningful level of variability. Additionally, we have ensured that the separation between statistical methods (RSM/ANOVA) and ML procedures is clearly delineated in the Methods section.
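
As a rough illustration of the augmentation scheme described in this response (model predictions plus Gaussian noise scaled to 5% of the target's standard deviation), a minimal Python sketch follows. It is not the authors' code. The function names, the uniform sampling of synthetic inputs within the observed factor ranges, and the nearest-neighbour stand-in for the fitted Random Forest are all assumptions made for illustration; any fitted regressor could be passed in as `predict`.

```python
import random
import statistics


def make_knn_predictor(X, y, k=3):
    """Hypothetical stand-in for a fitted model (e.g., a Random Forest).

    Predicts the mean target of the k nearest observed runs.
    """
    def predict(x):
        # Rank observed runs by squared Euclidean distance to x.
        ranked = sorted(
            range(len(X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)),
        )
        return statistics.mean(y[i] for i in ranked[:k])
    return predict


def augment(X, y, predict, n_synthetic=100, noise_frac=0.05, seed=0):
    """Return (X_syn, y_syn): synthetic inputs and noisy model predictions."""
    rng = random.Random(seed)
    # Assumption: synthetic inputs stay inside the observed range of each
    # factor, so the model is never asked to extrapolate.
    lows = [min(col) for col in zip(*X)]
    highs = [max(col) for col in zip(*X)]
    # Noise scale: a fixed fraction (5% by default) of the target's
    # sample standard deviation.
    sigma = noise_frac * statistics.stdev(y)
    X_syn, y_syn = [], []
    for _ in range(n_synthetic):
        x = [rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
        X_syn.append(x)
        y_syn.append(predict(x) + rng.gauss(0.0, sigma))
    return X_syn, y_syn
```

Calling `augment(X, y, make_knn_predictor(X, y))` with the 17 experimental runs would yield 100 synthetic rows per target. Because every synthetic target is a model prediction, the synthetic set inherits whatever the model learned from the original runs, including its errors, which is the concern about realism that the reviewer raises.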

  4. Results: PCA and heatmaps are used for descriptive purposes but not integrated into interpretation beyond superficial statements. Include learning curves or regularization discussion to address overfitting. Feature importance interpretation depends heavily on synthetic data, which risks biasing conclusions.

We thank the reviewer for these insightful suggestions. We have revised the Results section accordingly. We added a brief discussion in Section 3.2 to better integrate the PCA and heatmap findings with the interpretation of the extraction and ML modeling results. We also clarified that regularization (e.g., Ridge, Lasso penalties and cross-validation) was applied during ML model training to mitigate overfitting, and we added a corresponding comment in the Methods and Results sections. Additionally, we explicitly acknowledged the potential bias introduced by relying on feature importance derived from synthetic data, and this limitation is now clearly stated in Section 3.8 and discussed further in the limitations paragraph.

  5. Conclusions: The claims regarding the utility of AI and PLE synergy are overstated despite evident overfitting and weak test set performance. Furthermore, a balanced view on study limitations and potential biases introduced by data augmentation is lacking. Provide more actionable recommendations for future research (e.g., real-world testing, model transferability, industrial validation).

We thank the reviewer for this important comment. The Conclusions section was revised to adopt a more balanced tone regarding the integration of AI and PLE. We explicitly acknowledged the limitations related to model performance and potential bias introduced by data augmentation. In addition, we added more actionable recommendations for future research, including real-world testing, model transferability, and industrial validation.

  6. Not all figures and tables are well integrated into the text; some are not interpreted thoroughly.

We thank the reviewer for this useful comment. We have carefully revised the Results section to ensure that all Figures and Tables are better integrated into the narrative and thoroughly interpreted. Additional clarifying statements were added where appropriate, including further discussion of the scatter plots in Section 3.9, to improve the interpretability and contextualization of the visual results.

 

Remark

Please consider above suggestions when revising the manuscript.

Reviewer 3 Report

Comments and Suggestions for Authors

The search for more ‘green’, and equally or more efficient extractions, is certainly a major challenge. Therefore, the research carried out is interesting and valuable, but in some places its description is chaotic and needs to be completed. The following is a list of comments:

Lines 55-64 - as the authors mention a number of extraction methods and then describe the disadvantages that arise, they should indicate which disadvantages arise in which methods. Similarly advantages.

Lines 65-77 - the authors describe PLE extraction and mention the temperatures used. In the previous paragraph, temperature was indicated as one of the disadvantages, which can destroy samples. Can this situation also occur in this case? Perhaps it is also worth describing in a few sentences the inconveniences that can occur when using PLE.

Line 152 - the statement that ‘methodology described elsewhere’ is misplaced? What does it mean described elsewhere? Is it in another publication or in this manuscript, but we won't tell you the secret where?

Similarly, subsections 2.5 and 2.6 - should at least include a brief description of the sample preparation for analysis

Subsection 2.7 - there is no information on the stationary phase used and the type of detector used.

Chapter 2.13 - line 312 - ‘A total of 100 synthetic input samples were created’ - originally the authors had data from 17 samples. 100 were generated, which is about 5 times as many. Why was it assumed that the range of data obtained from so few samples was valid and could be extended?

Chapter 3.1 is entitled optimisation, but does not actually contain any results. Only Chapters 3.2 and 3.2. deal with the process. Therefore, they should not be separate chapters, but should be subchapters of 3.1.

The entire chapter 3 should present a description of the results and a discussion of them. But the entire chapter lacks any reference to the literature. And also a comparison of the PLE method to the results obtained with other types of extraction. In which lies the advantage of this method?

Author Response

The search for more ‘green’, and equally or more efficient extractions, is certainly a major challenge. Therefore, the research carried out is interesting and valuable, but in some places its description is chaotic and needs to be completed. The following is a list of comments:

We would like to thank the reviewer for their valuable comments.

Lines 55-64 - as the authors mention a number of extraction methods and then describe the disadvantages that arise, they should indicate which disadvantages arise in which methods. Similarly advantages.

These lines were removed from the manuscript following another reviewer’s suggestion. Additionally, while such a comparison could be useful, it would add extra detail to an already lengthy paper and might overwhelm readers. Moreover, it extends beyond the scope of this study.

Lines 65-77 - the authors describe PLE extraction and mention the temperatures used. In the previous paragraph, temperature was indicated as one of the disadvantages, which can destroy samples. Can this situation also occur in this case? Perhaps it is also worth describing in a few sentences the inconveniences that can occur when using PLE.

PLE integrates high pressure and temperature to enhance extraction efficiency. To prevent misunderstandings, this paragraph was removed, and a revised section was added in the next paragraph clarifying the effects of PLE on extracts.

Line 152 - the statement that ‘methodology described elsewhere’ is misplaced? What does it mean described elsewhere? Is it in another publication or in this manuscript, but we won't tell you the secret where?

We apologize for this typographical error, as the specific citation was misplaced.

Similarly, subsections 2.5 and 2.6 - should at least include a brief description of the sample preparation for analysis

Additional details were added, as requested.

Subsection 2.7 - there is no information on the stationary phase used and the type of detector used.

The relevant information has now been included.

Chapter 2.13 - line 312 - ‘A total of 100 synthetic input samples were created’ - originally the authors had data from 17 samples. 100 were generated, which is about 5 times as many. Why was it assumed that the range of data obtained from so few samples was valid and could be extended?

We thank the reviewer for this important comment. Indeed, we fully acknowledge that the size of the original dataset (n=17) imposes limitations on the generalizability of the synthetic data generation process. In this study, our intention was not to perform large-scale synthetic data augmentation or to expand the parameter space, but rather to conduct an initial small-scale proof-of-concept experiment using 100 synthetic samples. Due to limited computational resources, it was not feasible to generate and optimize a larger synthetic dataset at this stage. Our goal was to test whether the concept of augmenting the data with RF-based synthetic samples combined with controlled noise could improve model generalization, which was confirmed to some extent. We plan to further explore this approach in future work, using more powerful computational resources and more sophisticated generative techniques.

Chapter 3.1 is entitled optimization, but does not actually contain any results. Only Chapters 3.2 and 3.2. deal with the process. Therefore, they should not be separate chapters, but should be subchapters of 3.1.

The suggested modifications were added into the manuscript.

The entire chapter 3 should present a description of the results and a discussion of them. But the entire chapter lacks any reference to the literature. And also a comparison of the PLE method to the results obtained with other types of extraction. In which lies the advantage of this method?

Comparisons with literature have been added, as requested, detailing how PLE compares to other extraction techniques and its advantages.

Reviewer 4 Report

Comments and Suggestions for Authors

Brief summary:

The paper has potential but is too broad and technically complex for the target audience. The large number of models and figures should be reduced and focused on key findings. There is too much text for the scope of the research, and it is recommended to shorten it, as quantity does not necessarily equate to quality.

 

General comments:

The abstract should be concise and clearly state the aim of the research. It currently includes too much theoretical background. According to the journal's author guidelines, the abstract should not exceed 200 words.

The keywords are mostly aligned with MeSH, which is good, but not all of them (e.g., HPLC-DAD; response surface methodology; regression models; generative models), so they should be revised/adjusted accordingly.

The introduction is overly extensive for a scientific article, containing too many details (e.g., rosemary description, extraction method overviews, PLE parameters, etc.). It should be shortened by briefly introducing the topic, the plant and its relevance, highlighting the importance of extracting bioactive compounds and the challenges (both conventional and innovative technologies), mentioning the role of AI in process optimization, and ending with a clearly formulated aim of the study.

In the methodology, section 2.1 reads like a random list of chemicals with no clear categorization, making it unclear to readers which chemicals are solvents, standards, etc. It is recommended to categorize them.

In section 2.2, the plant material and its sample preparation are described, so the subsection title should reflect that—e.g., Plant Material and Sample Preparation or Rosemary Leaves Raw Material and Pre-treatment.

In section 2.3 (Experimental Design), results are included, which is inappropriate for the methodology section—they should be moved to Results and Discussion. Only the experimental design description should remain in this section, and the results tables should be relocated.

In section 2.7, only basic chromatographic information is provided. Details should be added: the type of detector used (including manufacturer), detection wavelength, column description (type, dimensions, particle size, manufacturer), whether the column was thermostated, injection volume, and number of replicate analyses (duplicate, triplicate, etc.).

From section 2.11 to section 3 (Results and Discussion), the content becomes too expansive, which is not typical for experimental studies on extraction of bioactive compounds. This part should be shortened to the essentials.

The conclusion is too long. It should be shortened and focus on the conclusions of this study. Anything not proven by this study should not be included in the conclusion. Also, study limitations should be removed from this section.

 

Out of 52 references (which is too many for an original scientific article, unless it's a review), more than half are over 5 years old. Some are over 20 years old and should be removed or replaced with more relevant references.

 

Specific comments:

    Lines 37–48: Please, shorten the section on rosemary.

    Lines 65–78: The explanation of PLE is too broad—please shorten it to the basics.

    Lines 84–85: Please remove methodology from the introduction.

    Line 95: RF is mentioned here for the first time and must be explained (regardless of its mention in the abstract).

    Line 97: TPC is introduced here—define the abbreviation in brackets.

    Line 124: Please use only the abbreviation TPC.

    Line 152: In the sentence “A Folin-Ciocalteu methodology described elsewhere...”—a reference must be provided.

    Line 189: RSM is not mentioned here for the first time—use only the abbreviation.

    Lines 190–191: This sentence is unnecessary; the information is already mentioned in 2.3 and 2.4.

    Line 193: ANOVA is not mentioned here for the first time; use only the abbreviation.

    Lines 203–205: Unnecessary, please remove.

    Lines 206–211: Unnecessary, please remove.

    Lines 336–337: This sentence does not belong here—it should be in the methodology.

    Lines 341–348: This entire part doesn’t belong in the results; it is too descriptive and already well known. Please remove it from this section.

    Lines 349–350: Please move this sentence to the methodology.

    Lines 390–393: Unnecessary here, please remove.

    Lines 418–419: Please add reference(s) for this statement.

    Lines 784–788: Please move limitations to the end of the discussion (before the conclusion).

    Line 790: Please remove references from the conclusion.

 

Author Response

Brief summary:

The paper has potential but is too broad and technically complex for the target audience. The large number of models and figures should be reduced and focused on key findings. There is too much text for the scope of the research, and it is recommended to shorten it, as quantity does not necessarily equate to quality.

We would like to thank the reviewer for their valuable comments.

General comments:

The abstract should be concise and clearly state the aim of the research. It currently includes too much theoretical background. According to the journal's author guidelines, the abstract should not exceed 200 words.

The abstract has been shortened, as suggested.

The keywords are mostly aligned with MeSH, which is good, but not all of them (e.g., HPLC-DAD; response surface methodology; regression models; generative models), so they should be revised/adjusted accordingly.

Keywords follow MDPI policies, which require relevance to the manuscript. However, not all of the keywords correspond to MeSH terms.

The introduction is overly extensive for a scientific article, containing too many details (e.g., rosemary description, extraction method overviews, PLE parameters, etc.). It should be shortened by briefly introducing the topic, the plant and its relevance, highlighting the importance of extracting bioactive compounds and the challenges (both conventional and innovative technologies), mentioning the role of AI in process optimization, and ending with a clearly formulated aim of the study.

The introduction has been revised accordingly.

In the methodology, section 2.1 reads like a random list of chemicals with no clear categorization, making it unclear to readers which chemicals are solvents, standards, etc. It is recommended to categorize them.

Chemicals and reagents are listed according to the company that supplied them. Listing them by type would be redundant, as company names would be repeated throughout.

In section 2.2, the plant material and its sample preparation are described, so the subsection title should reflect that—e.g., Plant Material and Sample Preparation or Rosemary Leaves Raw Material and Pre-treatment.

The title has been modified, as suggested.

In section 2.3 (Experimental Design), results are included, which is inappropriate for the methodology section—they should be moved to Results and Discussion. Only the experimental design description should remain in this section, and the results tables should be relocated.

Results were transferred to the Results & Discussion section, as requested.

In section 2.7, only basic chromatographic information is provided. Details should be added: the type of detector used (including manufacturer), detection wavelength, column description (type, dimensions, particle size, manufacturer), whether the column was thermostated, injection volume, and number of replicate analyses (duplicate, triplicate, etc.).

These details have been included, as requested.

From section 2.11 to section 3 (Results and Discussion), the content becomes too expansive, which is not typical for experimental studies on extraction of bioactive compounds. This part should be shortened to the essentials.

We thank the reviewer for this important observation. We revised the Results and Discussion section to ensure that the content remains focused on the most essential findings. Redundant descriptions were removed, and the interpretation of the ML results was streamlined to emphasize practical implications rather than technical details. The goal was to maintain clarity and relevance for readers interested in bioactive compound extraction while still presenting the added value of the ML-based modeling.

The conclusion is too long. It should be shortened and focus on the conclusions of this study. Anything not proven by this study should not be included in the conclusion. Also, study limitations should be removed from this section.

We thank the reviewer for this comment. The Conclusions section was revised to be more concise and focused on the actual findings of this study. No explicit study limitations are included in the Conclusion, and statements not directly supported by the study results have been removed.

Out of 52 references (which is too many for an original scientific article, unless it's a review), more than half are over 5 years old. Some are over 20 years old and should be removed or replaced with more relevant references.

Older references have been replaced with newer ones, and the total number of references was reduced, as suggested.

 

Specific comments:

Lines 37–48: Please, shorten the section on rosemary.

Section has been shortened.

Lines 65–78: The explanation of PLE is too broad—please shorten it to the basics.

Modifications have been made.

Lines 84–85: Please remove methodology from the introduction.

Sentence removed.

Line 95: RF is mentioned here for the first time and must be explained (regardless of its mention in the abstract).

Explanation added.

Line 97: TPC is introduced here—define the abbreviation in brackets.

Definition added.

Line 124: Please use only the abbreviation TPC.

The redundant phrase was removed.

Line 152: In the sentence “A Folin-Ciocalteu methodology described elsewhere...”—a reference must be provided.

Reference added.

Line 189: RSM is not mentioned here for the first time—use only the abbreviation.

The redundant phrase was removed.

Lines 190–191: This sentence is unnecessary; the information is already mentioned in 2.3 and 2.4.

Sentence removed.

Line 193: ANOVA is not mentioned here for the first time; use only the abbreviation.

The redundant phrase was removed.

Lines 203–205: Unnecessary, please remove.

Sentence removed.

Lines 206–211: Unnecessary, please remove.

Sentence removed, but the Figure 1 explanation was relocated since figure descriptions must be included.

Lines 336–337: This sentence does not belong here—it should be in the methodology.

Sentence moved to Section 2.3.

Lines 341–348: This entire part doesn’t belong in the results; it is too descriptive and already well known. Please remove it from this section.

Section removed.

Lines 349–350: Please move this sentence to the methodology.

Sentence was inconsistent after preceding removals and was eliminated entirely.

Lines 390–393: Unnecessary here, please remove.

Removed.

Lines 418–419: Please add reference(s) for this statement.

Reference added.

Lines 784–788: Please move limitations to the end of the discussion (before the conclusion).

Moved accordingly.

Line 790: Please remove references from the conclusion.

References removed.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Great Job.

Reviewer 3 Report

Comments and Suggestions for Authors

The authors responded to all comments. I have no further comments.
