Review Reports - Prediction of Combustion Parameters and Pollutant Emissions of a Dual-Fuel Engine Based on Recurrent Neural Networks

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Add a header to Table 1.
Improve the organization of Tables 2 and 3.
Ensure that all variables and parameters in the equations are correctly named and defined in the text.
There are some typos, for example, on line 347. Please correct.
The entire dataset (274 cases → 1370 after interpolation) is generated with a single CFD model validated only with cylinder pressure and at a single operating point (2000 rpm, 25% CH₃OH, 63°) with an error ≤ 15%. Extend the experimental validation to temperature, NOx, soot, and various load/speed regimes before using the results to train the network.
The increment from 274 → 1370 sequences is based on linear interpolation between crank angles. This introduces strongly correlated samples and smooths out key nonlinearities in the combustion process. Employ more realistic synthesis methods (e.g., stochastic perturbations consistent with physics) or generate new CFD cases.
The described "K-fold" always maintains a 20% testing threshold but progressively increases the training threshold (20–80%), mixing past and future values and creating information leakage in time series. Replace this scheme with strictly chronological walk-forward validation or blocked cross-validation.
Only six crankshaft steps (θL = 6) are used as the sequence length. Given combustion phenomena (flame lift, heat release), this window may not capture the complete dynamics.
The network has 256 hidden neurons and a dropout of 0.2, but the search space and stopping criterion are not detailed. Describe the hyperparameter optimization method (grid, random, Bayesian), the ranges explored, and the total number of evaluations.
Normalization uses the global mean and standard deviation; calculating these before splitting the data leads to information leakage.
Total MAPEs of 0.025% and R ≈ 0.998 are reported. With a linearly generated and scaled dataset, these values may reflect overfitting to smoothed patterns. Display a plot of residuals vs. the actual value to detect systematic errors and add validation with unseen CFD data (different methanol percentages, rpm).
BiGRU/LSTM networks are trained "with the same hyperparameters," which penalizes models whose optimum differs.

Author Response

Comments 1: Add a header to Table 1

Response 1: The header of Table 1 has been added to clearly specify the meaning of each column, including the variables considered and their associated values.

Comments 2: Improve the organization of Tables 2 and 3

Response 2: All tables have been reorganized to enhance readability. Units have also been added.

Comments 3: Ensure that all variables and parameters in the equations are properly named and defined in the text

Response 3: All variables present in the equations have been reviewed. The symbols used have been defined immediately after their first appearance in the text to avoid any ambiguity.

Comments 4: There are some typos, for example on line 347. Please correct them

Response 4: A thorough proofreading has been conducted and all typos, including the one on line 347, have been corrected.

Comments 5: The entire dataset (274 cases → 1370 after interpolation) was generated using a single CFD model validated only with in-cylinder pressure at a single operating point (2000 rpm, 25% CH₃OH, 63°), with an error ≤ 15%. Please extend the experimental validation to temperature, NOx, soot emissions, and various load and speed conditions before using the results to train the network

Response 5: We acknowledge that the current validation of our CFD model relies solely on two experimental variables: in-cylinder pressure and heat release rate (HRR). This choice is scientifically and practically justified for several key reasons, detailed below.

In-cylinder pressure and heat release rate: fundamental combustion indicators

In-cylinder pressure is one of the most representative parameters of the combustion process. It reflects the combined thermodynamic, chemical, and kinetic phenomena taking place during engine operation. Accurate reproduction of this parameter by a CFD model indicates that the model effectively captures:

Air-fuel mixing dynamics,
Ignition timing,
Energy release rate,
Flame propagation behavior.

The heat release rate (HRR), derived from the pressure trace, provides valuable insights into the temporal distribution of combustion. It characterizes:

The phasing of combustion (start, peak, end),
The combustion mode (premixed vs diffusion),
And the overall stability of the combustion process.

These two parameters are widely accepted in the internal combustion engine modeling literature as primary validation criteria ([Heywood, 1988], [Patterson et al., 2006]). They serve as a solid foundation to ensure the physical fidelity of a combustion model.

Targeted validation to ensure reliable data for predictive modeling

Our main objective is to develop predictive models using neural networks, which require input data that is both consistent and physically realistic. Validating the CFD model using global and robust parameters such as in-cylinder pressure and HRR ensures that no major physical inconsistencies exist in the simulation data used for training.

Given the lack of reliable experimental data for pollutant emissions (e.g., NOx, soot) in dual-fuel configurations, we opted for a strict validation on a single, well-controlled operating point rather than a generalized but uncertain validation. This strategy aligns with standard CFD modeling practices, which recommend starting with a well-understood reference case before expanding to more complex scenarios.

Experimental data constraints in dual-fuel engine research

In the context of methanol/diesel dual-fuel engines, experimental datasets, particularly those including emissions and temperature under varying conditions, are scarce. The data used in our study were obtained from Sandia National Laboratories, a highly respected institution, and are considered accurate and reliable within the scientific community.

Thus, building and validating our CFD model based on the most trustworthy available data is a scientifically sound and justified decision.

Partial validation as a logical step in a phased methodology

Our work follows an incremental methodological approach. The current phase focuses on developing a robust dual-fuel combustion model based on trustworthy data. In future work, we plan to:

Extend validation to additional experimental parameters (temperature, NOx, soot),
Incorporate multiple engine operating conditions (speed/load),
And expand the dataset diversity for more generalizable neural network training.

This stepwise progression is consistent with good scientific practice, emphasizing local reliability before global generalization.

Conclusion

In summary, validation based on in-cylinder pressure and heat release rate provides a scientifically grounded and widely accepted foundation in the CFD modeling of internal combustion engines. While we recognize the current limitations, this validation step is both necessary and sufficient for generating trustworthy CFD data to support the development of accurate and robust AI-based predictive models.

Comments 6: The increase from 274 to 1370 sequences relies on linear interpolation between crankshaft angles. This introduces highly correlated samples and smooths out key nonlinearities of the combustion process. Use more realistic synthesis methods (e.g., physics-based stochastic perturbations) or generate new CFD cases

Response 6: Aware of the limitations associated with linear interpolation, namely the introduction of artificially correlated samples and excessive smoothing of combustion process nonlinearities, we adopted a more rigorous approach. Instead of interpolating existing data, we generated new CFD cases by reducing the crankshaft angle increment from 1° to 0.2° CA. This allowed us to produce a denser dataset that is physically consistent and more accurately reflects the real combustion dynamics. These new cases are discussed in the “Dataset” section, where their integration and role in forming the final dataset are clearly explained. Consequently, we have removed the “Data Augmentation” section.

Comments 7: The described "K-fold" model always maintains a 20% test threshold but progressively increases the training threshold (from 20% to 80%), mixing past and future values and creating data leakage in time series. Replace this scheme with strictly chronological progressive validation or blocked cross-validation.

Response 7: Following this comment, we replaced the standard K-fold validation with a strictly chronological progressive validation (train-test split with no overlap between past and future data). This change is detailed in the "Data Partitioning" section.
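
As an illustration of what a strictly chronological progressive split can look like, the following minimal sketch builds expanding training blocks that always precede their test block. It is not the authors' implementation: the function name `walk_forward_splits`, the fold count, and the sample count of 1000 are assumptions chosen only for the example.

```python
import numpy as np

def walk_forward_splits(n_samples, n_folds=4, test_fraction=0.2):
    """Yield (train_idx, test_idx) pairs in which every test block
    lies strictly after its training block in time."""
    test_size = int(n_samples * test_fraction)
    # End of the earliest training block; it grows by test_size per fold.
    first_train_end = n_samples - n_folds * test_size
    if test_size < 1 or first_train_end < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = first_train_end + k * test_size
        train_idx = np.arange(0, train_end)
        test_idx = np.arange(train_end, train_end + test_size)
        yield train_idx, test_idx

# Hypothetical usage on 1000 time-ordered samples.
for train_idx, test_idx in walk_forward_splits(1000):
    print(len(train_idx), test_idx[0], test_idx[-1])
```

With these assumed settings the training share grows from 20% to 80% while each test block keeps 20% of the samples and never contains indices earlier than the training data.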

Comments 8: Only six crank angle steps (θL = 6) are used as the sequence length. Considering combustion phenomena (flame development, heat release), this window may not capture the full dynamics.

Response 8: We acknowledge the reviewer’s concern regarding the sequence length (θL = 6) used in our GRU model. Indeed, such a relatively short temporal window might not fully encompass the complex dynamics of combustion phenomena, including flame development and heat release progression. However, this choice was carefully considered as part of a broader optimization strategy. All other hyperparameters, including the number of hidden units, dropout rate, and learning rate, were fine-tuned based on this specific sequence length to strike a balance between predictive accuracy, temporal resolution, and computational efficiency.

Furthermore, shorter sequence lengths reduce model latency and training complexity, which is especially beneficial when deploying such models for real-time monitoring or embedded applications. We also conducted preliminary tests with longer sequences (not shown in the paper) and observed marginal improvements in accuracy, at the cost of significantly increased computational time and risk of overfitting due to redundant information in highly correlated crank angle steps.

Nonetheless, we agree that this aspect deserves further investigation and have noted it as a potential area for future work. This is now mentioned in the “Limitations and Perspectives” section of the manuscript.
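
To make the role of the sequence length concrete, the sketch below cuts a time-ordered series into overlapping windows of six crank-angle steps, each paired with the step that follows it. The next-step target and the helper name `make_windows` are assumptions for illustration; the response does not state how the authors pair inputs and targets.

```python
import numpy as np

def make_windows(series, seq_len=6):
    """Turn a (T, F) time-ordered array into inputs of shape
    (T - seq_len, seq_len, F) and next-step targets of shape (T - seq_len, F)."""
    series = np.asarray(series)
    n_windows = series.shape[0] - seq_len
    if n_windows < 1:
        raise ValueError("series is shorter than seq_len + 1")
    # Window i covers steps i .. i + seq_len - 1; its target is step i + seq_len.
    x = np.stack([series[i:i + seq_len] for i in range(n_windows)])
    y = series[seq_len:]
    return x, y

# Hypothetical usage: 10 crank-angle steps, 2 features per step.
demo = np.arange(20, dtype=float).reshape(10, 2)
x, y = make_windows(demo, seq_len=6)
print(x.shape, y.shape)
```

A longer `seq_len` gives each sample more combustion history but also fewer windows and more overlap between neighbouring samples, which is the trade-off discussed in the response.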

Comments 9: The network includes 256 hidden neurons and a dropout rate of 0.2, but the search space and stopping criteria are not detailed. Please describe the hyperparameter optimization method (grid, random, Bayesian), the ranges explored, and the total number of evaluations

Response 9: The optimization strategy used was Random Search via Keras Tuner. A paragraph has been added to describe the explored parameter ranges, the stopping criterion, and the total number of configurations tested. These details are now included in the subsection “Hyperparameter Tuning”.

Comments 10: Normalization uses the global mean and standard deviation; computing them before splitting the data leads to information leakage

Response 10: The normalization method has been corrected. The mean and standard deviation are now computed exclusively from the training set.
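
A minimal sketch of leakage-free standardization is given below, assuming a chronological 80/20 split and synthetic data; the helper names `fit_standardizer` and `apply_standardizer` are hypothetical and not taken from the manuscript.

```python
import numpy as np

def fit_standardizer(train):
    """Compute per-feature mean and standard deviation from training data only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    # Guard against constant features to avoid division by zero.
    std = np.where(std == 0.0, 1.0, std)
    return mean, std

def apply_standardizer(x, mean, std):
    """Scale any split with statistics that were fitted on the training set."""
    return (x - mean) / std

# Synthetic stand-in for a time-ordered dataset with 4 features.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(100, 4))

split = 80  # chronological split: first 80 rows train, last 20 rows test
train, test = data[:split], data[split:]

mean, std = fit_standardizer(train)
train_z = apply_standardizer(train, mean, std)
test_z = apply_standardizer(test, mean, std)
```

The test rows never contribute to `mean` or `std`, so their scaled values carry no information back into training.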

Comments 11: Reported overall MAPE values of 0.025% and R ≈ 0.998 may reflect overfitting to smoothed models, especially with linearly generated and scaled datasets. Display a residual plot against the true values to detect systematic errors and validate the model with unseen CFD data (e.g., different methanol percentages or engine speeds).

Response 11: To better assess the model’s generalization capability, we complemented our study with a residual analysis and a validation using unseen CFD data featuring a methanol percentage of 25%, which differs from that used in the training set. This combined approach helps to detect any systematic bias and evaluate the model’s robustness under new conditions.

Figure 18 illustrates the distribution of residual values for the main output parameters. The preceding paragraph provides a detailed analysis of these residuals, revealing specific behaviors for each variable: a bimodal distribution for pressure, slight asymmetry for temperature, marked negative dispersion for NOx, and a tight clustering around zero for soot. These results confirm that, although some parameters exhibit increased sensitivity, the model overall does not show any apparent systematic bias and maintains good predictive performance, especially for temperature and soot.

Comments 12: The BiGRU/LSTM networks are trained "with the same hyperparameters," which may disadvantage models whose optimal configurations differ

Response 12: We acknowledge that using identical hyperparameters for the BiGRU and LSTM networks may have disadvantaged some architectures with different optimal configurations. To address this limitation, independent hyperparameter tuning has now been performed for each model. The updated results are presented in Table 7, allowing for a fairer evaluation of their respective performances, and are supported by the corresponding visualizations (Figures 18 to 21), which provide a more detailed and comparative analysis.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

This paper investigates the application of a Gated Recurrent Unit (GRU) network to predict combustion and emission parameters of a dual-fuel diesel engine. The model is trained on data from Computational Fluid Dynamics (CFD) simulations and evaluated on various metrics. However, the authors did not avoid mistakes. Below is a list of my questions and comments:

1. Formatting should be improved according to the journal's requirements. Articles are cited incorrectly, numbering is used incorrectly, and table formatting is inconsistent and unfortunate.
2. In the abstract, the phrase "second escape stratum of 04 neurons" is unclear; it should be explained.
3. In the Materials and Methods section, the description of the CFD simulation framework is sufficient, but additional details about the validation process with experimental data would be beneficial.
4. Also in the Materials and Methods section, explanations need to be added for the turbulence models (k-ε and RNG k-ε), the G-equation, and the NOx and soot model equations. These formulas and equations, and their context, need to be explained.
5. Figures 11, 12, 13 and 14 are mentioned but not directly shown.
6. Typos: "formular" (Table 2), "sitting" instead of "setting" in the diagram.
7. The reference list is not consistently formatted (e.g., some entries have complete author lists, while others use "et al."). There is an obvious discrepancy in the formatting of the reference section.
8. The presented data should be supplemented with information on emissions generated by the engine. This is an obvious gap in the presented data.
9. Are the 274 CFD cases representative of the operation of a dual-fuel engine? How was the representativeness of the dataset assessed?
10. Is the GRU model fast enough for real-time engine monitoring? What are the implementation costs?

Author Response

Comments 1: The formatting should be improved to meet the journal's requirements. References are incorrectly cited, numbering is inconsistent, and the table formatting is irregular and problematic

Response 1: The manuscript formatting has been fully revised in accordance with the journal’s guidelines. Section, figure, and table numbering has been standardized, and citations have been corrected to follow the required style. Table formatting has also been harmonized for consistency and clarity.

Comments 2: In the abstract, the phrase “second exhaust layer of 04 neurons” is unclear and should be explained

Response 2: The phrase “second exhaust layer of 04 neurons” no longer appears in the current version of the abstract. However, we understand that an earlier version may have caused confusion. Accordingly, we have revised the abstract entirely, including a clearer description of the GRU model architecture to improve both readability and precision.

Comments 3: In the "Materials and Methods" section, the CFD simulation framework is sufficiently described, but further details on the validation process with experimental data would be helpful

Response 3: The “Materials and Methods” section has been expanded with a more detailed explanation of the CFD model validation protocol, including the comparison with available experimental data and the quality indicators used.

Comments 4: In the "Materials and Methods" section, you should also add explanations for: the turbulence models used (k-ε and RNG k-ε), the G-equation, and the equations used for NOx and soot modeling. You should also explain these formulas and their context

Response 4: A more comprehensive description of the k-ε and RNG k-ε turbulence models, including their governing equations (such as the G-equation), has been added. We have also included detailed explanations of the NOx and soot emission models used, along with their mathematical formulations and the context of their application.

Comments 5: Figures 11, 12, 13, and 14 are referenced but not directly shown

Response 5: Following updates to the manuscript, figure numbering has been revised, which led to changes in the original references to Figures 11–14. These figures have now been properly inserted and repositioned in the appropriate sections of the document.

Comments 6: Typographical errors: "formular" (Table 2), "sitting" instead of "setting" in the diagram.

Response 6: The identified typos have been corrected: "formular" was removed from Table 2, and "sitting" has been replaced with "setting" in the diagram.

Comments 7: The reference list is not formatted consistently (e.g., some entries include full author lists, others use "et al."). There is a clear inconsistency in formatting within the references section.

Response 7: The bibliography has been revised and standardized to comply with a consistent format, using either full author lists or “et al.” uniformly, as per the journal’s referencing style.

Comments 8: The presented data should include information on engine emissions. This is a clear gap in the dataset.

Response 8: In the revised version of the “Dataset” section, we have clarified that the dataset used for model development and validation includes not only combustion parameters such as cylinder pressure and temperature but also detailed local concentrations of pollutants such as NOx and soot. These were captured with high angular resolution (0.2°) across the full engine cycle, from intake (-151°) to exhaust (121°).

Comments 9: Are the 274 CFD cases representative of dual-fuel engine operation? How was dataset representativeness assessed?

Response 9: The 274 cases correspond to the various crank angle values simulated, ranging from -151° to 121°, thereby covering the key phases of the engine cycle, including intake, compression, combustion, and exhaust. This broad angular range provides a detailed and continuous description of combustion and emission parameters throughout the cycle. As a result, the CFD dataset is representative of the essential operating conditions of the dual-fuel engine and serves as a reliable basis for training and validating the predictive models.

Comments 10: Is the GRU model fast enough for real-time engine monitoring? What are the implementation costs?

Response 10: The GRU model is well-suited for real-time monitoring due to its lightweight and fast architecture. Inference time measured on a standard CPU (no GPU) is under 2 ms per time step, meeting real-time requirements at 2000 RPM. With only one GRU layer and 256 hidden units, it can be deployed on embedded systems or industrial PCs without high-end hardware. A dedicated section has been added to the manuscript to detail performance and implementation aspects.
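
For orientation, the sketch below converts engine speed into time budgets so that an inference time can be compared against them. It assumes a four-stroke cycle of 720° crank angle and reuses the 0.2° step mentioned for the dataset; the response does not state which interval is taken as the real-time deadline, so the function `crank_timing` is only a hypothetical aid for that comparison.

```python
def crank_timing(rpm, step_deg, cycle_deg=720.0):
    """Return (time per crank-angle step, time per engine cycle) in milliseconds.

    cycle_deg=720 assumes a four-stroke engine (two revolutions per cycle).
    """
    deg_per_second = rpm * 360.0 / 60.0
    step_time_ms = step_deg / deg_per_second * 1000.0
    cycle_time_ms = cycle_deg / deg_per_second * 1000.0
    return step_time_ms, cycle_time_ms

# Hypothetical usage at the engine speed quoted in the response.
step_ms, cycle_ms = crank_timing(rpm=2000, step_deg=0.2)
print(f"{step_ms:.4f} ms per 0.2 deg step, {cycle_ms:.1f} ms per cycle")
```

At 2000 rpm the crankshaft turns 12000° per second, so one 0.2° step lasts about 0.017 ms and one full cycle about 60 ms; an inference time can then be judged against whichever of these budgets the application requires.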

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

The authors of this paper use a neural network approach to predict the combustion and emission characteristics of a dual-fuel engine, which is an interesting topic. However, the overall writing of the manuscript is quite disorganized, so I recommend major revisions.

Please pay attention to the usage of “Fig.” and “Figure” in the paper. Generally, “Figure” should be used when it appears as the first word in a sentence, while “Fig.” is appropriate when used within a sentence.
Units in Table 2 are enclosed in square brackets [ ], while those in Table 1 are not. Please explain the reason for this inconsistency and unify the unit formatting across all tables.
A space should be added between numbers and units.
The formatting of equations in the manuscript is somewhat inconsistent. Please carefully check and correct them.
Please check the punctuation throughout the manuscript carefully. For example, there are multiple instances of “;;” in lines 215–216.
Please carefully proofread the manuscript to avoid typographical errors. For instance, “Nox” in line 220 is incorrectly written and should be “NOx”; “eq.” in line 228 should be capitalized as “Eq.”.
The indentation format at the beginning of line 336 is incorrect.
The equation in line 345 is missing a number.
The numbering of all subheadings in the manuscript is incorrect, as the same numbers are repeated throughout.
Some papers related to spray and combustion should be cited, such as "Threshold sensitivity study on spray–spray impingement under flexible injection strategy for fuel/air mixture evaluation."

Author Response

Comments 1: Please pay attention to the use of "Fig." and "Figure" in this article. In general, "Figure" should be used at the beginning of a sentence, while "Fig." is appropriate within a sentence

Response 1: We have reviewed the entire manuscript to ensure consistent usage: "Figure" is used at the beginning of sentences and "Fig." within sentences, in accordance with editorial conventions.

Comments 2: The units in Table 2 appear in square brackets [ ], unlike those in Table 1. Please explain this inconsistency and standardize the formatting of units across all tables.

Response 2: We have harmonized the presentation of units across all tables by removing the square brackets [ ] in Table 2 to align with Table 1. An explanatory note regarding units has also been added to the table captions.

Comments 3: A space should be added between numbers and units

Response 3: A non-breaking space has been systematically added between numerical values and their units, in accordance with international typographic standards.

Comments 4: The formatting of the equations in the manuscript is inconsistent. Please review and correct them carefully.

Response 4: The numbering, alignment, and formatting of all equations have been reviewed and corrected to ensure full consistency throughout the manuscript.

Comments 5: Please carefully check the punctuation throughout the manuscript. For example, “;;” appears several times on lines 215 and 216.

Response 5: The punctuation issues have been addressed throughout the entire manuscript.

Comments 6: Please proofread the manuscript carefully to avoid typographical errors. For example, “Nox” on line 220 is incorrect and should be “NOx”; “eq.” on line 228 should be written in uppercase, as “Eq.”

Response 6: We have corrected “Nox” to “NOx” and changed “eq.” to “Eq.” as suggested. A thorough proofreading was also performed to identify and fix other typos.

Comments 7: The indentation format at the beginning of line 336 is incorrect.

Response 7: The incorrect indentation on line 336 has been corrected to match the style of the rest of the manuscript.

Comments 8: A number is missing from the equation on line 345.

Response 8: The missing equation number on line 345 has been added.

Comments 9: The numbering of all subheadings in the manuscript is incorrect, as the same numbers are repeated throughout

Response 9: The subheading numbering has been completely revised and corrected to avoid repetition and to follow a clear, logical hierarchy.

Comments 10: Some relevant articles on spray and combustion should be cited, such as "Threshold sensitivity study on spray–spray impingement under flexible injection strategy for fuel/air mixture evaluation."

Response 10: The suggested references on spray and combustion have been incorporated, including the article “Threshold sensitivity study on spray–spray impingement under flexible injection strategy for fuel/air mixture evaluation,” as well as other relevant sources to enrich the scientific background.

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

This study investigates the prediction of combustion characteristics and pollutant emissions from a single-cylinder, direct-injection (DI) diesel engine modified to operate on a dual-fuel (diesel-methanol) mixture at various engine speeds, with methanol serving as the secondary fuel.

Why is the word 'and' added at the end of the author names line? This conjunction should appear before the last author's name.
In the abstract, the authors did not provide the background of the study.
The Introduction section is too wordy and lacks logical flow, with some paragraphs containing only 2-3 sentences.
Figure 2 clearly contains two subfigures; they should be labeled (a) and (b) respectively.
The equation should be centered.

Author Response

Comments 1: Why is the word “and” added at the end of the author list? This conjunction should appear before the last author's name.

Response 1: Thank you for this formatting remark. The conjunction “and” has been moved to immediately precede the last author's name, in accordance with scientific author listing conventions.

Comments 2: In the abstract, the authors did not provide the context of the study.

Response 2: The abstract has been revised to include a clear introduction to the scientific and industrial context of the study, specifically highlighting the challenges related to reducing pollutant emissions in diesel engines modified to operate in dual-fuel mode.

Comments 3: The Introduction section is too verbose and lacks logical flow; some paragraphs contain only 2 to 3 sentences.

Response 3: The Introduction section has been entirely reorganized to improve logical flow. Paragraphs have been rewritten or merged to strengthen argument coherence and avoid excessively short segments.

Comments 4: Figure 2 clearly contains two subfigures; they should be labeled (a) and (b), respectively.

Response 4: The subfigures in Figure 2 have been labeled as (a) and (b), in accordance with graphical presentation standards. The figure caption has also been adjusted accordingly.

Comments 5: The equation must be centered.

Response 5: All equations have been reviewed, and those concerned have been properly centered to comply with the typographic standards expected in scientific publications.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript can be accepted in its current form

Reviewer 2 Report

Comments and Suggestions for Authors

The authors answered all questions and corrected comments in the text of the article.

Comments on the Quality of English Language

The authors answered all questions and corrected comments in the text of the article.

Reviewer 3 Report

Comments and Suggestions for Authors

It can be accepted

Reviewer 4 Report

Comments and Suggestions for Authors

The author has revised the paper according to the reviewers' comments, and we recommend accepting this article.