Article
Peer-Review Record

BERT- and BiLSTM-Based Sentiment Analysis of Online Chinese Buzzwords

Future Internet 2022, 14(11), 332; https://doi.org/10.3390/fi14110332
by Xinlu Li, Yuanyuan Lei and Shengwei Ji *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Submission received: 30 September 2022 / Revised: 7 November 2022 / Accepted: 10 November 2022 / Published: 14 November 2022
(This article belongs to the Special Issue Security and Community Detection in Social Network)

Round 1

Reviewer 1 Report

The authors proposed a hybrid deep learning model combining BERT and BiLSTM for sentiment analysis of Online Chinese Buzzwords (OCBs). I think that this topic is interesting. They also give a short review of previous work and perform an experimental comparison between the proposed model and other models using several metrics (precision, recall, and F1 score).

However, I think that there is a minor issue: The authors should clarify the limitations of the proposed model.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper is titled "BERT- and BiLSTM-Based Sentiment Analysis of Online Chinese Buzzwords". A deep learning model combining BERT and BiLSTM is proposed in this work to generate dynamic representations of OCB vectors in downstream tasks by fine-tuning the BERT model, and to capture textual information at the embedding layer, thereby solving the problem of static word-vector representations. The experimental results indicate that the model performs well on the comprehensive evaluation metric F1. The work seems novel. However, the presentation of the paper needs improvement. It is suggested that the authors make the necessary changes/updates to their paper as per the following comments:

1. The authors have cited [35] and [36], stating that the dataset was developed based on these sources. These sources are two websites that do not have any dataset available for direct use or re-use. If the authors used any form of web scraper or similar tool to collect data from these websites, this should be clearly stated, and the methodology should also be presented and discussed.

2. The authors have not mentioned anything about the ethical nature of data mining. It seems they collected the data from both websites (references 35 and 36). However, do the websites define published information as “public”? At the same time, do the privacy and security policies allow any form of data mining, such as the approach that was used by the authors?

3. The LSTM update formulas (Equations 1 to 6) are not clear; please elaborate. The definitions of some of the variables used in these equations are missing (a standard formulation is reproduced after this list for reference).

4. At the beginning of the literature review section, the papers that are cited are not recent. Consider replacing at least two of them with recent papers in this field, such as https://doi.org/10.3390/covid2080076 and https://doi.org/10.3390/app12178765

5. As the accuracy seems very high, please explain what measures were taken to avoid false positives, overfitting, and similar issues so as to ensure optimal model performance.


6. There are several grammatical errors and sentence construction errors in the paper. 
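
For reference on comment 3, a standard formulation of the LSTM update that Equations 1 to 6 presumably correspond to (the notation below is the common textbook one and may differ from the authors': x_t is the input at step t, h_t the hidden state, c_t the cell state, W and b the learned weights and biases, \sigma the sigmoid function, and \odot element-wise multiplication):

f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)
\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)
h_t = o_t \odot \tanh(c_t)

The authors should define each of these symbols, or their equivalents in the paper's notation, when presenting the equations.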

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Sentiment analysis of Online Chinese Buzzwords (OCBs) is important for the healthy development of platforms such as games and social networking, since predicting users' sentiment tendencies can help avoid the spread of negative emotions. However, traditional sentiment analysis models have difficulty extracting accurate sentiment characteristics from OCBs: the text has an irregular structure and semantics, so OCBs show no obvious emotional tendency. In addition, the current mainstream text sentiment analysis models are all based on short text, and when the existing datasets are large, the convergence time of the model is often too long.

To solve the above problems, the authors introduce the BERT pre-training model and BiLSTM to learn deep sentiment features from context despite the irregular and incomplete semantic structure of OCBs. Furthermore, the BERT pre-training model is fine-tuned to accelerate the convergence rate of the model. The BERT-BiLSTM model does not require word segmentation during sentiment analysis and is able to capture deep contextual information about word order structure.
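
As a minimal sketch of this kind of architecture (not the authors' exact implementation: the bert-base-chinese checkpoint, layer sizes, pooling choice, and the example sentence are assumptions), a BERT-plus-BiLSTM classifier in PyTorch with the HuggingFace transformers library could look like the following:

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertBiLSTMClassifier(nn.Module):
    """Illustrative BERT + BiLSTM sentiment classifier (hidden size and pooling are assumptions)."""

    def __init__(self, bert_name="bert-base-chinese", lstm_hidden=128, num_classes=2):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)  # fine-tuned jointly with the layers below
        self.bilstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        # Character-level contextual embeddings from BERT; no explicit word segmentation is needed.
        bert_out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        lstm_out, _ = self.bilstm(bert_out.last_hidden_state)  # (batch, seq_len, 2 * lstm_hidden)
        # Take the last time step of the BiLSTM output as the sentence representation (one simple choice).
        return self.classifier(lstm_out[:, -1, :])             # (batch, num_classes)

# Hypothetical usage on a single buzzword-style sentence.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertBiLSTMClassifier()
batch = tokenizer(["这个游戏真上头"], return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])

Here "fine-tuning" means the BERT weights are updated together with the BiLSTM and classifier rather than kept frozen, which is consistent with the convergence behaviour described above.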

Experimental results show that the recall and the F1-score of the BERT-BiLSTM model are the highest among the eight sets of comparative experiments.

The article can be published without changes.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

The paper deals with an interesting subject, that of performing sentiment analysis on so-called Online Chinese Buzzwords, or OCBs. This topic is rather unusual but quite interesting because of the peculiarities these texts seem to have. I am not much of an expert in the field, but the approach seems methodologically sound, and it is definitely interesting that the authors are able to use BERT for the upstream operations and BiLSTM for the downstream ones.


My main concern is that the submission needs extensive proofreading: there are typos, parts that make little sense, and parts where a sentence ends abruptly when one would expect it to continue and connect to the next sentence.


Some more specific comments:

-       As Twitter posts are also very short, do you think that tweets pose similar difficulties for sentiment analysis? Discuss this and, if you find something relevant, add it to the Related Work section.

-       In the Related Work section, I would suggest that the authors include the following reference, https://doi.org/10.1016/j.future.2020.08.005, as it seems to use a similar approach, in the sense that it invokes bidirectional models and also targets sentiment analysis. I would propose that the authors comment on whether the approach in the aforementioned paper could be used for OCBs as well. It might be that the special nature of OCBs, as well as the nature of the Chinese language, does not allow that paper's approach to be applied, but please comment on that.

-       Line 333: What do you mean by "irregular text"? Are OCBs irregular because of their nature, or is your dataset unbalanced between positive and negative sentiment, or is it something else?

-       Figure 8: Use the same model order as in Table 4; otherwise the reader gets confused and cannot easily see how each model performs.

-       Figures 9-10: In the text preceding the figures you write: "Figures 9 and 10 show the performance of the proposed BERT-BiLSTM model in the training set and validation set, respectively." Perhaps something is missing here; how do you measure the performance in the training set and not in the validation set?

You continue your text, writing: "In Figures 9, it can be seen from the curves of Train Acc, Train Loss". What can be seen from Figure 9?

"In Figure 10, the Val Acc, Val Loss of the proposed model on the validation set tends to stabilize. The final Test Loss and Test Acc on the test set are 0.15 and 93.74%, respectively."

I see Val Acc is ~93% and Val Loss is ~0.15. Do you represent Test Loss/Acc in Figure 10? Please correct accordingly.

-       4.4.1 Ablation experiment: Which ablation experiments did you perform? It would generally be better to expand this section by describing, even briefly, your experiments. I am not very sure which example to recommend, but I think that arXiv:1901.08644v2, Section 3.2, gives a good example of how to present ablation experiments.

On lines 395-396: it would be better to change your phrase to: "As can be seen from Figure 11, the BiLSTM model converges much more slowly than the other two models."

Please also comment on your finding. What are the consequences of the slower convergence speed for your experiment and for your article?  

-       L. 411-412: Do you mean “We even plan to analyze positive and negative emotions”?


Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The authors have revised the paper as per all my comments and suggestions. I do not have any additional comments at this point. I recommend the publication of the paper in its current form. 
