IndoGovBERT: A Domain-Specific Language Model for Processing Indonesian Government SDG Documents
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
- While the paper claims IndoGovBERT outperforms other models, the evaluation is confined to a specific use case (Indonesian government SDG document processing). Broader generalization across different domains or languages is not demonstrated.
- The presented work discusses the usefulness of IndoGovBERT for the Indonesian government, but does not elaborate on issues of implementation, upgrading, and maintenance, among other things.
- The model is based on the BERT architecture, which is indeed strong, but the paper does not consider other transformer architectures that may be stronger in low-resource conditions.
- The authors listed the contributions of the manuscript in the introduction section. However, these points need to be highlighted with more technical detail.
- The paper lacks an explanation of the performance difference between IndoGovBERT and comparable models that can be trained with less effort on the relatively small datasets used for fine-tuning.
- The model addresses only Indonesian documents; however, the government's multilingual needs for processing documents in multiple languages, such as documents in the context of NSA, are not fully dealt with. This may reduce generality in multi-level governmental environments.
Comments on the Quality of English Language
Dear Editor,
All comments should be carefully addressed before resubmitting the revised version of this manuscript.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript introduces IndoGovBERT, a domain-specific BERT model that addresses the lack of resources in Indonesian NLP, which is critical for government document processing related to the Sustainable Development Goals (SDGs). It provides a unique methodology for processing government documents, which is highly applicable to other non-English-speaking countries facing similar challenges in NLP and SDG documentation.
Developing a specialized Pre-Trained Language Model (PTLM) tailored to the Indonesian government context fills an important gap in domain-specific NLP models. Comprehensive evaluation and comparison with other models demonstrate thorough experimentation, including general-purpose Indonesian language models, the Multilabel Topic Model (MLTM), and the Multilingual BERT.
The paper is well-structured and clearly articulates its methodology and experimental results. The discussion on different approaches to PTLM development is insightful and relevant.
Comparative experiments with well-known transformer-based models could improve the validation of IndoGovBERT's superiority. Overall, this article strongly contributes to domain-specific NLP and offers valuable practical implications for government SDG document processing.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
Comments
The research background and introduction of this paper are relatively comprehensive, the dataset sources are authentic and reliable, the workload meets the requirements, and the results have value and practicality. However, there are still some areas that need improvement, as follows:
1. There are some obvious data or grammatical errors in the paper due to human mistakes. For example, in Section 4.1.1, the corpus sizes in Table 5 are miscalculated (the size of C1, 1.9, plus the size of C2, 4.3, should equal 6.2, not 5.2).
2. The model development in the paper lacks theoretical support. You used existing methodologies and algorithms when building the model. Why did you choose these methods? Is there corresponding theoretical support? Before the experiment, could you conclude, based on theory, that your model's performance would be better if built in this way?
3. There are some problems with the experimental design. Wirawan's model performs better than SC-C2-FT-C1 and SC-C1-FT-C2 on the C3 dataset. However, you did not test Wirawan's model on the SDG multi-label classification task in Sections 5.1 and 5.2. Although your IndoGovBERT performs better than MLTM on this type of task, might the original base models perform better?
Suggestions
1. It is recommended to introduce the relevant theory when presenting the model development, theoretically showing that your model has advantages over the original base models, and then use subsequent experiments to prove that your model does indeed perform better.
2. We suggest that you use Wirawan's model for the SDG multi-label classification task in Sections 5.1 and 5.2. If your model performs better than Wirawan's model, this further highlights the advantages of your model. If not, you can explain the shortcomings of your model and what needs to be improved.
3. Please read your paper carefully several more times to identify and correct any grammatical, logical, and calculation errors. Thanks.
Comments for author File: Comments.pdf
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have made the required modifications. The paper has been improved and can be accepted for publication in this journal.
Comments on the Quality of English Language
- The manuscript needs proofreading.