Mapping Existing Modelling Approaches to Maritime Decarbonisation Using Latent Dirichlet Allocation
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Comments to the Authors:
This is a well-written and well-structured manuscript that introduces a novel way to review maritime decarbonisation modelling using topic modelling (LDA). The language is fluent, the structure is logical, and the figures effectively communicate the main results.
Major strengths:
- Innovative methodological contribution with LDA applied to an underexplored field.
- The results are well interpreted and aligned with known decarbonisation strategies (operational, fuel, policy, system design).
- The link between topic clusters and industry implications is convincingly argued.
Suggestions for improvement:
- Methods clarity: briefly explain why full texts were not included in the corpus (e.g., to avoid bias from unrelated literature citations).
- Validation: mention if any domain experts besides the authors validated the topic interpretations.
- Figures: ensure uniform style and add a note about software used (pyLDAvis version).
- Discussion: expand on how topic modelling could integrate commercial datasets (as mentioned in the conclusion).
- Formatting: ensure figure and table numbering comply with MDPI guidelines.
Minor language issues:
- Replace “decarbonise” → “decarbonize” for consistency with American English.
- Review a few long sentences in the introduction for smoother flow.
Author Response
Thank you very much for your comments. Please find our responses attached.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
In total, the paper tackles an important and interesting topic. This is the reason why research on the subject should be encouraged. However, it reveals many weaknesses and shortcomings that would need to be addressed in order to qualify for eventual publication.
Specifically, the introduction provides a solid rationale for using text mining to analyze the marine industry's challenges in developing and implementing decarbonization models. The transition from the discussion of these challenges to the selection of a topic modelling approach (i.e., LDA) could be improved by including a brief comparative justification for selecting LDA over the other topic modelling approaches discussed in the manuscript. In addition, while the theoretical description of topic modelling is accurate, additional detail about how the pre-processing decisions were made, what types of token filtering were applied, and what parameters were chosen for LDA would help readers understand how these decisions affected the stability and interpretability of the topics.
The results section is clear and provides valuable insights through the visual representation of the proximity of the topics. Nevertheless, some of the results could be further contextualized to enhance their interpretive value. For example, when discussing the overlap of topics related to different types of energy systems and alternative fuels, the authors should include a brief overview of the relevant policy or technological trends to add explanatory depth to the results.
The discussion of limitations acknowledges the potential for author bias and the limitations of the study; however, the authors could draw clearer links between the identified limitations (e.g., using default parameters for the model, approximating language) and their implications for the conclusions reached by the study.
Author Response
Thank you very much for your comments. Please find our responses attached (section: Responses to Reviewer 4 Round 1).
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This paper focuses on utilizing the Latent Dirichlet Allocation topic model to analyze trends in maritime decarbonization modeling research. The topic is both cutting-edge and of significant practical importance, providing a quantitative overview that offers new insights for complex and heterogeneous shipping decarbonization modeling research. The research methodology is well-designed, data sources are clearly stated, the sample size is adequate, and the text preprocessing and parameter selection processes are transparent. Overall, this study makes a meaningful attempt to merge methodological exploration with research on maritime decarbonization, possessing both academic value and implications for policy-making.
1. The determination of the number of topics in LDA primarily relies on semantic judgment. It is recommended that the authors specify the exact criteria for semantic evaluation or the expert validation process to enhance the reproducibility of the results.
2. The paper employs the default Dirichlet prior parameters from Gensim. Have the authors tested the impact of varying α and β values on topic clarity? A sensitivity analysis could be beneficial.
3. The handling of stop words involves the removal of general and domain-specific terms. It would be prudent to explain why dual-word or part-of-speech filtering was not implemented, to prevent the disassembly of important compound concepts.
4. In the comparison of keyword weights, three scenarios (λ = 0, 0.5, 1) are mentioned. Has any evaluation metric, such as UMass coherence, been calculated to assist in determining the optimal λ?
5. The clustering results from the t-SNE dimensionality reduction could benefit from a detailed explanation of the specific similarity metrics or parameter settings used (e.g., perplexity, learning rate) to enhance reproducibility.
6. In the results discussion section, interpretations of associations between topics rely heavily on subjective inference. Quantitative validation using co-occurrence or cosine similarity between topics may provide a more objective analysis.
7. Although the literature review is comprehensive, it lacks a systematic comparison between existing review methods and this study. Including a brief section to contrast these approaches would be valuable.
8. The discussion section could be further refined to extract technological or policy insights corresponding to the identified themes, thereby deepening the connection between theory and practice.
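Comment 4 refers to UMass coherence. As a rough illustration of what such a score measures, the sketch below computes it for the top keywords of a single topic, using the common document co-occurrence formulation with a smoothing constant of 1. The function name, the toy corpus, and the keyword lists are hypothetical and are not taken from the manuscript; a real analysis would use the authors' pre-processed abstracts and the top keywords obtained at each λ.

```python
import math
from itertools import combinations

def umass_coherence(top_words, documents, eps=1.0):
    """UMass-style coherence for one topic (hypothetical helper).

    top_words: topic keywords ordered from most to least relevant.
    documents: tokenised documents (lists of strings).
    Sums log((D(w_i, w_j) + eps) / D(w_j)) over pairs with w_j ranked
    before w_i, where D counts the documents containing the word(s).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for j, i in combinations(range(len(top_words)), 2):  # j < i
        d_j = doc_freq(top_words[j])
        if d_j == 0:
            continue  # keyword never occurs in the corpus; skip the pair
        score += math.log((doc_freq(top_words[i], top_words[j]) + eps) / d_j)
    return score

# Toy corpus (an assumption for illustration only)
docs = [
    ["fuel", "ammonia", "engine"],
    ["fuel", "ammonia", "cost"],
    ["speed", "route", "fuel"],
    ["policy", "carbon", "tax"],
]

coherent = umass_coherence(["fuel", "ammonia"], docs)  # words co-occur
incoherent = umass_coherence(["fuel", "tax"], docs)    # words never co-occur
print(coherent > incoherent)
```

Scores closer to zero indicate keywords that co-occur more often, so comparing the average score across topics for different keyword rankings is one way to put the λ comparison on a quantitative footing.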
Author Response
Thank you very much for your comments. Please find our responses attached.
Author Response File:
Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
This article explores the application of topic modeling, specifically Latent Dirichlet Allocation (LDA), to analyze existing approaches to modeling maritime decarbonization. The authors highlight the fragmented and heterogeneous nature of research in this area and propose using LDA to identify key thematic trends based on an analysis of the abstracts of 200 scientific articles from the Web of Science. Four key themes emerged: operational emission reduction measures, alternative fuel deployment strategies, emission reduction policy development, and ship energy system design. Visualization of the results using t-SNE and pyLDAvis allows for an assessment of the degree of interrelationships and combinations of themes across studies. The article demonstrates that LDA can serve as an effective tool for analyzing complex and interdisciplinary areas such as shipping decarbonization and identifying gaps for future research.
However, the following comments on the article need to be addressed:
- Using article abstracts alone can lead to a loss of context and depth in thematic analysis, especially for complex models.
- LDA's performance was not compared with other topic modeling methods (e.g., BERTopic, Top2Vec), which undermines the justification for choosing LDA.
- Using default parameters for α and β without justification may impact the quality of topic clustering.
- Coherence or perplexity values for the selected number of topics (K = 4) are not specified, which calls into question the validity of the choice.
- Column charts of keyword weights do not allow the semantic coherence of topics to be assessed without additional metrics (Figure 2). An approach to integrating multi-scale models of ship systems (micro-, macro-, meta-, and mega-levels) could be considered as a logical extension of the work on topic modeling (https://doi.org/10.1016/j.oceaneng.2025.122539, https://doi.org/10.3390/en16248101). While the article under review examines thematic clusters of research, the cited works propose a methodology for integrating the models themselves, which may correspond to these clusters. For example, the topics "alternative fuels" and "energy system design" could be combined within a mega-level model that takes into account both technological and management aspects. Thus, these works complement the article by offering a tool for implementing the interdisciplinary research identified using LDA.
- t-SNE parameters (e.g., perplexity) are not specified (Figure 4).
- Keyword changes at λ=0, 0.5, and 1 are not accompanied by a statistical assessment of the significance of these changes.
- Equations or a formal description of the LDA generative process are not presented.
- The dynamics of topic changes over the years, which could reveal the evolution of research interests, are not examined.
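For reference, the formal description requested in the comments above usually takes the following shape. This is the standard smoothed LDA generative process written with the α and β priors named in the comments; the notation (K topics, D documents, N_d tokens in document d, θ_d for document-topic proportions, φ_k for topic-word distributions) is an assumption and may differ from the formulation the authors adopt.

```latex
\begin{align*}
\varphi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1,\dots,N_d \\
w_{d,n} \mid z_{d,n}, \varphi &\sim \mathrm{Categorical}(\varphi_{z_{d,n}}) \\
p(w, z, \theta, \varphi \mid \alpha, \beta) &= \prod_{k=1}^{K} p(\varphi_k \mid \beta)
  \prod_{d=1}^{D} p(\theta_d \mid \alpha)
  \prod_{n=1}^{N_d} p(z_{d,n} \mid \theta_d)\, p(w_{d,n} \mid \varphi_{z_{d,n}})
\end{align*}
```

Here z_{d,n} is the topic assigned to the n-th token of document d and w_{d,n} is the observed word; smaller α favours documents dominated by few topics, and smaller β favours topics concentrated on few words.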
Author Response
Thank you very much for your comments. Please find our responses attached.
Author Response File:
Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
The revised manuscript can be accepted.
Author Response
With many thanks for your consideration and time,
Lily Reece
Reviewer 4 Report
Comments and Suggestions for Authors
The authors took into account the reviewers' comments on the previous version of the article, which significantly improved the quality of the work. An assessment of the main corrections and remaining issues is provided below.
The authors justified their choice of abstracts over full texts by arguing that the introductory and overview sections of the articles may contain topics unrelated to the main content of the model, which could bias the results. This is a reasonable explanation; the limitation is acknowledged, and further work with the full texts is planned.
The authors rightly note that their goal is not to compare methods, but to demonstrate the applicability of topic modeling in general. LDA was chosen as a well-established method. At the same time, the conclusion indicates that comparisons with BERTopic, Top2Vec, and other methods are planned for future work.
The authors explained that the lack of a priori knowledge about the distribution of topics and words in the corpus prevented them from making informed choices about the values of α and β. This limitation is acknowledged, and a sensitivity analysis is planned for the future.
In the revised version of the article, the authors added an explanation for the choice of the number of topics (K = 4), based on a tradeoff between semantic interpretability and metrics. Independent experts were also engaged to validate the clusters.
The authors expanded the analysis using pyLDAvis and similarity metrics (Hellinger, Jaccard), which improved topic interpretation. The proposed connection with mega-level modeling holds promise for future research.
t-SNE parameters have been added to the text, increasing the reproducibility of the results.
The authors acknowledge that a statistical evaluation of the effect of λ was not conducted and cite the need for further research using the methodology of Sievert and Shirley.
A formal description of the generative process of LDA has been added to the article, improving the understanding of the method.
The authors explained that analyzing temporal trends is difficult due to the uneven distribution of publications across years. This direction is reserved for future research.
Recommendations:
- In future work, it is recommended to conduct a comparative analysis of LDA with more modern methods (e.g., BERTopic) to evaluate their effectiveness on this corpus.
- It is recommended that further analysis include determining the optimal λ value based on user studies.
- Including commercial solutions and patents will enrich the analysis and identify trends in industrial implementation.
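The report above mentions Hellinger and Jaccard measures for comparing topics. As a minimal sketch of what these measures compute, assuming topics are represented as word-probability vectors over a shared vocabulary (Hellinger) or as sets of top keywords (Jaccard), the two distances can be written as below. The function names and toy values are hypothetical and do not reproduce the authors' implementation or results.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))) / math.sqrt(2)

def jaccard_distance(words_a, words_b):
    """Jaccard distance between two keyword sets (0 = same set, 1 = no overlap)."""
    a, b = set(words_a), set(words_b)
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)

# Toy topic-word distributions over a shared 4-word vocabulary (assumed values)
topic_fuel = [0.6, 0.3, 0.1, 0.0]
topic_policy = [0.1, 0.0, 0.3, 0.6]

print(hellinger(topic_fuel, topic_policy))
print(jaccard_distance(["fuel", "ammonia", "engine"], ["fuel", "policy", "tax"]))
```

Hellinger distance uses the full probability vectors, whereas Jaccard distance only looks at which keywords appear in both lists, so the two can rank topic pairs differently.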
Author Response
Thank you very much for your consideration and time. We are very grateful for your recommendations for future work.
