Article
Peer-Review Record

Temporal Relationship between Daily Reports of COVID-19 Infections and Related GDELT and Tweet Mentions

Geographies 2023, 3(3), 584-609; https://doi.org/10.3390/geographies3030031
by Innocensia Owuor * and Hartwig H. Hochmair
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 21 June 2023 / Revised: 3 September 2023 / Accepted: 9 September 2023 / Published: 16 September 2023

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

see the attached file

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors

Thank you for the paper. I find it very interesting. Generally, it is well written. I have several concerns, however.

First, I would really appreciate it if the novelty of this work could be further clarified. The authors did explain it, but the contribution, especially from methodological and theoretical perspectives, is crucial.

I think that, in terms of the literature review, the authors should expand on how and why Twitter's response, or any social response, is correlated with the infection rate. Currently it is insufficient... Perhaps this is due to the lack of a clear concept or theoretical framework to build on. I believe that with this component the paper will be more robust. Kindly also strengthen this idea in the discussion section.

The time lag (positive or negative) aspect needs to be expanded, particularly on why the series behaved as such. E.g., 2 or 3 weeks before the new cases arise? What is the implication of it? More of prediction for better precautionary measures for the government?

In the discussion, the authors mention Ebola. That particular statement is uncanny. Please revise. I think the authors want to draw some connection with the previous study, but that phrase, perhaps due to English wording, is misleading.

Lastly, I believe the following paper is beneficial for your study.

https://www.sciencedirect.com/science/article/pii/S235277142300071X?via%3Dihub

Author Response

Dear editor,

Thank you for providing us with the opportunity to revise the paper manuscript based on reviewer comments. Please see below how we addressed each of the comments. We highlighted relevant changes in the manuscript in yellow.

Best regards,

Innocensia Owuor

Hartwig Hochmair

Reviewer 2 comments

Thank you for the paper. I find it very interesting. Generally, it is well written. I have several concerns, however.

  1. First, I would really appreciate it if the novelty of this work could be further clarified. The authors did explain it, but the contribution, especially from methodological and theoretical perspectives, is crucial.

Response: We clarified this in the Introduction section from a methodological perspective (exploring the limitations of various time series analysis techniques) and a practical perspective (exploring changes in public attention in response to health emergencies and exploring alternative datasets for cross-correlation analysis).

  2. I think that, in terms of the literature review, the authors should expand on how and why Twitter's response, or any social response, is correlated with the infection rate. Currently it is insufficient... Perhaps this is due to the lack of a clear concept or theoretical framework to build on. I believe that with this component the paper will be more robust.

Response: We added some related explanations and references to the last paragraph in the literature review section.

  3. Kindly also strengthen this idea in the discussion section.

Response: We added this aspect to the third paragraph of the discussion section, expressing the role of social media applications and news media for communication during the pandemic.

  4. The time lag (positive or negative) aspect needs to be expanded, particularly on why the series behaved as such. E.g., 2 or 3 weeks before the new cases arise? What is the implication of it? More of prediction for better precautionary measures for the government?

Response: We addressed this in the discussion section by explaining that the lead-lag relationships observed between COVID-19 related GDELT and Twitter responses and COVID-19 cases show whether these data sources anticipate, follow, or mirror a real-world event (the COVID-19 pandemic). These patterns can guide the design of health communication messaging that coincides with the increase in activity in these datasets.

  5. In the discussion, the authors mention Ebola. That particular statement is uncanny. Please revise. I think the authors want to draw some connection with the previous study, but that phrase, perhaps due to English wording, is misleading.

Response: We added this detail about Ebola to the second sentence in the discussion section for clarification.

  6. Lastly, I believe the following paper is beneficial for your study.

https://www.sciencedirect.com/science/article/pii/S235277142300071X?via%3Dihub

Response: We added this reference to the discussion section (third paragraph).
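
To illustrate the lead-lag reading described in the response on time lags above, the following is a minimal sketch of a lagged cross-correlation on toy data. It is not the authors' implementation: the function names, the toy series, and the sign convention (a positive lag means the first series leads the second) are assumptions made for this example, and the manuscript's pre-processing steps are not reproduced here.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / (var_a * var_b) ** 0.5


def cross_correlation(x, y, max_lag):
    """Correlate x[t] with y[t + k] for every lag k in [-max_lag, max_lag].

    Assumed convention: k > 0 means x leads y by k time steps.
    """
    n = len(x)
    result = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            a, b = x[:n - k], y[k:]
        else:
            a, b = x[-k:], y[:n + k]
        result[k] = pearson(a, b)
    return result


def peak_lag(ccf):
    """Lag with the largest absolute correlation."""
    return max(ccf, key=lambda k: abs(ccf[k]))


# Toy data (hypothetical): daily mentions, and daily cases that repeat the
# same pattern three days later.
mentions = [0, 1, 3, 7, 12, 9, 5, 2, 1, 0, 0, 0, 0, 0]
cases = [0, 0, 0] + mentions[:-3]

ccf = cross_correlation(mentions, cases, max_lag=5)
print(peak_lag(ccf))  # 3: in this toy series, mentions lead cases by 3 days
```

Under this convention, a peak at a positive lag would suggest that the media series anticipates case counts, a peak at a negative lag that it follows them, and a peak at zero that the two move together.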

Reviewer 3 Report

Comments and Suggestions for Authors

The following are my suggestions for the improvement of parts of this paper: 

1. The authors state – “The full archive search endpoint in the Twitter API was used to retrieve geotagged COVID-19 related tweets…” Rewrite this section to clearly highlight how this data was extracted. Was the Standard Search API or the Advanced Search API used? How long did the data collection take? How did the authors design their data collection to comply with the rate limits of accessing the Twitter API?

2. In the context of working with Tweets, how was bot-generated content detected and eliminated? For instance, if a bot account on Twitter tweets a certain news item several times, this could create the perception that there was a degree of public attention/engagement related to that news topic (whereas in reality, it is a bot account Tweeting the same news multiple times). How were such scenarios addressed?

3. The authors state – “Cross-correlation analysis has been used to study the temporal relationship between factual information and misinformation related to COVID-19 on Twitter as well as the effect of various climate variables, such as solar exposure, on the spread of COVID-19” in the context of discussing works in this field that used Tweets. In addition to these two areas, several works in this field have focused on sentiment analysis and fake Tweet detection, which are not mentioned in this review. Consider citing these recent papers - https://doi.org/10.3390/bdcc7020116 (focuses on sentiment analysis of Tweets about COVID-19) and https://doi.org/10.1080/03772063.2023.2220710 (focuses on the detection of fake Tweets about COVID-19)

4. A comparison with prior works is missing: Please include a comparative study (qualitative and quantitative) with prior works in this field to highlight the novelty of this work

5. Please double-check the references for minor typos and/or missing information. For instance, in [44], the name of the publisher is provided, but the name of the journal/conference is missing. 

Author Response

Dear editor,

Thank you for providing us with the opportunity to revise the paper manuscript based on reviewer comments. Please see below how we addressed each of the comments. We highlighted relevant changes in the manuscript in yellow.

Best regards,

Innocensia Owuor

Hartwig Hochmair

Reviewer 3

The following are my suggestions for the improvement of parts of this paper: 

  1. The authors state – “The full archive search endpoint in the Twitter API was used to retrieve geotagged COVID-19 related tweets…” Rewrite this section to clearly highlight how this data was extracted. Was the Standard Search API or the Advanced Search API used? How long did the data collection take? How did the authors design their data collection to comply with the rate limits of accessing the Twitter API?

Response: We added more details to section 3.1.2, detailing the use of the academic research product track on the Twitter API to retrieve tweets, the download process duration, and handling API download limitations.

  2. In the context of working with Tweets, how was bot-generated content detected and eliminated? For instance, if a bot account on Twitter tweets a certain news item several times, this could create the perception that there was a degree of public attention/engagement related to that news topic (whereas in reality, it is a bot account Tweeting the same news multiple times). How were such scenarios addressed?

Response: We added to Section 3.1.2 that we used only tweets from mobile applications (Twitter for iPhone, Twitter for Android, and Instagram), which are typically not used by bots. In addition, we excluded duplicate tweets shared by the same user on the same day, as they are characteristic of tweets sent by bots.

  3. The authors state – “Cross-correlation analysis has been used to study the temporal relationship between factual information and misinformation related to COVID-19 on Twitter as well as the effect of various climate variables, such as solar exposure, on the spread of COVID-19” in the context of discussing works in this field that used Tweets.

In addition to these two areas, several works in this field have focused on sentiment analysis and fake Tweet detection, which are not mentioned in this review. Consider citing these recent papers - https://doi.org/10.3390/bdcc7020116 (focuses on sentiment analysis of Tweets about COVID-19) and https://doi.org/10.1080/03772063.2023.2220710 (focuses on the detection of fake Tweets about COVID-19)

Response: We added these references to the last paragraph of the literature review section.

  4. A comparison with prior works is missing: Please include a comparative study (qualitative and quantitative) with prior works in this field to highlight the novelty of this work.

Response: This has been addressed in section 3.3.4. We interpreted the qualitative (e.g., highest peak magnitude) and quantitative (e.g., statistical significance of the lag of the peak magnitude) aspects of the cross-correlation plots by drawing comparisons with earlier studies.

  5. Please double-check the references for minor typos and/or missing information. For instance, in [44], the name of the publisher is provided, but the name of the journal/conference is missing.

Response: We checked and updated references accordingly.
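
The bot-filtering heuristic described in the response on bot-generated content above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the record layout (`id`, `user_id`, `created_at`, `source`, `text`) and the sample tweets are hypothetical, and the choice to keep the first copy of a repeated tweet, rather than dropping all copies, is an assumption, since the response does not specify it.

```python
from datetime import datetime

# Client applications named in the authors' response.
MOBILE_SOURCES = {"Twitter for iPhone", "Twitter for Android", "Instagram"}


def filter_tweets(tweets):
    """Keep tweets posted from mobile clients and drop repeated text
    posted by the same user on the same calendar day.

    Assumption: the first copy of a repeated tweet is kept.
    """
    seen = set()
    kept = []
    for tweet in tweets:
        if tweet["source"] not in MOBILE_SOURCES:
            continue  # non-mobile client, treated as more likely automated
        day = datetime.fromisoformat(tweet["created_at"]).date()
        key = (tweet["user_id"], day, tweet["text"].strip().lower())
        if key in seen:
            continue  # same user, same day, same text
        seen.add(key)
        kept.append(tweet)
    return kept


# Hypothetical sample records.
sample = [
    {"id": 1, "user_id": "u1", "created_at": "2020-03-01T08:00:00",
     "source": "Twitter for iPhone", "text": "Stay safe"},
    {"id": 2, "user_id": "u1", "created_at": "2020-03-01T09:30:00",
     "source": "Twitter for iPhone", "text": "Stay safe"},
    {"id": 3, "user_id": "u2", "created_at": "2020-03-01T10:00:00",
     "source": "Twitter Web App", "text": "New case counts"},
    {"id": 4, "user_id": "u1", "created_at": "2020-03-02T08:00:00",
     "source": "Twitter for Android", "text": "Stay safe"},
]

print([t["id"] for t in filter_tweets(sample)])  # [1, 4]
```

Tweet 2 is dropped as a same-day repeat, tweet 3 because of its client, and tweet 4 is kept because it was posted on a different day.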

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

• There are various errors concerning the automatic references to formulas and tables.

• The notation is still not consistent.

Lines 275-283: the notation of model (2) should be consistent with formulas (3) and (4).

Please use the same symbols for denoting the white noise and the AR and MA coefficients in the mentioned formulas.

In other words, if in formulas (3) and (4) you denote the AR operator with ϕ(B), then the AR coefficients of formula (2) should be ϕ1, ϕ2, …, ϕp.

Similar consideration applies to the other terms of the model.

If you denote the time series in formula (2) with a capital letter Xt, then the time series in formula (3) and (4) should be denoted with a capital letter.

Check carefully the formulas.

Lines 335-341 formulas (5) and (6)

Equation (6) should be consistent with equation (5)

In (5) the coefficient matrices are denoted with Ai (capital alpha); moreover, the white noise is denoted with ε and the constant vector with c.

In order to be consistent when the VAR(1) is illustrated in formula (6), you should use: c1 and c2 for the constant terms; ε1 and ε2 for the white noise terms; α11, α12, α21, α22 as coefficients.

Comments for author File: Comments.pdf

Author Response

Dear editor,

Thank you for providing us with the opportunity to revise the paper manuscript based on reviewer comments.

Please see below how we addressed each of the comments. We highlighted relevant changes in the manuscript in yellow.

Best regards,

Innocensia Owuor

Hartwig Hochmair

Reviewer 1 comments

  1. There are various errors concerning the automatic references to formulas and tables

Response: We checked and fixed cross references where necessary.

  2. The notation is still not consistent

Lines 275-283: the notation of model (2) should be consistent with formulas (3) and (4).

Please use the same symbols for denoting the white noise and the AR and MA coefficients in the mentioned formulas.

In other words, if in formulas (3) and (4) you denote the AR operator with ϕ(B), then the AR coefficients of formula (2) should be ϕ1, ϕ2, …, ϕp.

Similar consideration applies to the other terms of the model.

If you denote the time series in formula (2) with a capital letter Xt, then the time series in formula (3) and (4) should be denoted with a capital letter.

Check carefully the formulas.

Response: We have now changed Equations 3 and 5 (the latter being the previous Eq. 4) to include the symbols used in Equation 2 (Xt, Zt). We added Equation 4 to clarify the meaning of  in Equation 3.

  3. Lines 335-341 formulas (5) and (6)

Equation (6) should be consistent with equation (5)

In (5) the coefficient matrices are denoted with Ai (capital alpha); moreover, the white noise is denoted with ε and the constant vector with c.

In order to be consistent when the VAR(1) is illustrated in formula (6), you should use: c1 and c2 for the constant terms; ε1 and ε2 for the white noise terms; α11, α12, α21, α22 as coefficients.

Response: Equation 7 has been changed to reflect the symbols in Equation 6, following the notation in the Chatfield and Xing (2019) book.
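
For readers without access to the manuscript, the notation the reviewer asks for can be written out as below. This is a sketch in the standard textbook form using the symbols named in the comments above (Xt, Zt, ϕ(B), Ai, c, ε, αij); it is not a reproduction of the manuscript's equations, and the symbol θ for the MA coefficients is an assumption, since the record does not name it.

```latex
% ARMA(p, q) written with one set of symbols throughout
% (theta for the MA coefficients is an assumed symbol).
\begin{align}
  X_t &= \phi_1 X_{t-1} + \dots + \phi_p X_{t-p}
         + Z_t + \theta_1 Z_{t-1} + \dots + \theta_q Z_{t-q}, \\
  \phi(B)\, X_t &= \theta(B)\, Z_t, \\
  \phi(B) &= 1 - \phi_1 B - \dots - \phi_p B^p, \qquad
  \theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q, \qquad
  B X_t = X_{t-1}.
\end{align}

% VAR(p) in matrix form, followed by the bivariate VAR(1) written out
% with the same constant, coefficient, and white noise symbols.
\begin{align}
  \mathbf{X}_t &= \mathbf{c} + \sum_{i=1}^{p} A_i \mathbf{X}_{t-i}
                  + \boldsymbol{\varepsilon}_t, \\
  X_{1,t} &= c_1 + \alpha_{11} X_{1,t-1} + \alpha_{12} X_{2,t-1}
             + \varepsilon_{1,t}, \\
  X_{2,t} &= c_2 + \alpha_{21} X_{1,t-1} + \alpha_{22} X_{2,t-1}
             + \varepsilon_{2,t}.
\end{align}
```

Expanding the operator form term by term recovers the first line, which is the kind of consistency check the reviewer requests: the same series symbol, white noise symbol, and coefficient symbols appear in every formula.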

Reviewer 3 Report

Comments and Suggestions for Authors

The authors have revised their paper as per all my comments and feedback. I do not have any additional comments at this point. I recommend the publication of the paper in its current form. 

Author Response

The reviewer was satisfied with the responses to the initial comments.
