Article
Peer-Review Record

Analysis of the Spatiotemporal Spread of COVID-19 in Bahia, Brazil: A Cluster-Based Study, 2020–2022

by Ramon da Costa Saavedra 1,*,†, Rita Carvalho-Sauer 1,†, Maria Yury Travassos Ichihara 2, Maria da Conceição Nascimento Costa 3, Enio Silva Soares 1 and Maria Gloria Teixeira 3
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 11 June 2025 / Revised: 6 July 2025 / Accepted: 11 July 2025 / Published: 13 July 2025
(This article belongs to the Special Issue Airborne Transmission of Diseases in Outdoors and Indoors)

Round 1

Reviewer 1 Report

The article proposes a detailed spatiotemporal analysis of the spread of COVID-19 in the state of Bahia, Brazil, over an extended period (2020–2022), using robust statistical methods (spatiotemporal scanning with Poisson model in SaTScan). The topic is relevant for epidemiology, public health and health planning, even post-pandemic.
To improve the quality of the article, I recommend:
The abstract is overloaded with figures and details. It should be condensed to highlight only 2–3 essential ideas.
Introduction:
The introductory part requires a broader development of the topic.
Although the originality given the 3-year duration is asserted, a comparative positioning with similar studies from other states or countries would be useful.
The research niche and the relevance of the topic for the scientific community should be debated more clearly.
At the end of the first chapter, the purpose of the research is mentioned, but I recommend adding the objectives, research questions and possibly research hypotheses.
Materials and methods:
A broader description of the analyzed datasets.
It is not explained clearly enough why the limit of 50% of the population and 6 months was chosen. A justification based on the specialized literature would be useful.
Some methodological limitations (e.g., the assumption of uniform risk in the Poisson model) are mentioned only late, and should be discussed earlier, in the Methodology or Discussion section.
Discussions:
The influence of socioeconomic factors is asserted, but they are not empirically analyzed. It would be useful to include additional data on SVI/HDI or even a multivariate analysis.
Concrete recommendations for authorities on the use of these data in future policies are missing.
Conclusions:
It should be a little more consistent and synthetic, focused on practical implications.
It should be clearly explained why the topic is relevant for the scientific community, how they can use the data for the pandemics that exist at the moment.
The added value compared to previous studies in Bahia is only vaguely mentioned – this aspect should be better argued in the introduction and conclusions.

-

Author Response

Comment #1: The article proposes a detailed spatiotemporal analysis of the spread of COVID-19 in the state of Bahia, Brazil, over an extended period (2020–2022), using robust statistical methods (spatiotemporal scanning with Poisson model in SaTScan). The topic is relevant for epidemiology, public health and health planning, even post-pandemic. To improve the quality of the article, I recommend: The abstract is overloaded with figures and details. It should be condensed to highlight only 2–3 essential ideas.

Response #1: Thank you for the comments and suggestions! We have removed all the excessive information from the abstract and kept only the main ideas and results of the study. We chose not to remove the reported results, as we consider them essential for understanding the study. We appreciate your understanding.

 

Comment #2: Introduction: The introductory part requires a broader development of the topic. Although the originality given the 3-year duration is asserted, a comparative positioning with similar studies from other states or countries would be useful. The research niche and the relevance of the topic for the scientific community should be debated more clearly. At the end of the first chapter, the purpose of the research is mentioned, but I recommend adding the objectives, research questions and possibly research hypotheses.

Response #2: As suggested, we have revised the entire Introduction to highlight the relevance of the study for the scientific community, and we have reformulated the objectives to state our purpose more clearly.

 

Comment #3: Materials and methods: A broader description of the analyzed datasets. It is not explained clearly enough why the limit of 50% of the population and 6 months was chosen. A justification based on the specialized literature would be useful. Some methodological limitations (e.g., the assumption of uniform risk in the Poisson model) are mentioned only late, and should be discussed earlier, in the Methodology or Discussion section.

 

Response #3: Thank you for noting this. We have expanded the “Data sources” subsection of Materials and Methods to describe the datasets more fully. We chose a maximum spatial window of 50% of the underlying population, as this is the software default and the setting adopted by most empirical and simulation studies using the Poisson scan statistic. This choice prevents the algorithm from returning clusters so large that they become uninformative. For the temporal aspect, the maximum window is set at 6 months (50% of an epidemiological year) to capture seasonal peaks without masking short-lived outbreaks. Several applications have shown that a window of 3 to 6 months balances sensitivity and specificity for pathogens with clear seasonality (References: 1) Martínez Avilés M, Montes F, Sacristán I, de la Torre A, Iglesias I. Spatial and temporal analysis of African swine fever front-wave velocity in wild boar: implications for surveillance and control strategies. Front Vet Sci. 2024 Mar 24;11:1353983. doi: 10.3389/fvets.2024.1353983; 2) Rau A, Munoz-Zanzi C, Schotthoefer AM, Oliver JD, Berman JD. Spatio-Temporal Dynamics of Tick-Borne Diseases in North-Central Wisconsin from 2000-2016. Int J Environ Res Public Health. 2020 Jul 15;17(14):5105. doi: 10.3390/ijerph17145105. PMID: 32679849; PMCID: PMC7400118; 3) Lee S, Moon J, Jung I. Optimizing the maximum reported cluster size in the spatial scan statistic for survival data. Int J Health Geogr 20, 33 (2021). https://doi.org/10.1186/s12942-021-00286-w).

Regarding the methodological limitations of the Poisson model, we agree with your suggestion. We have added a dedicated paragraph addressing them at the end of the Methods (new lines 114-118).
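
To make the window settings discussed in this response concrete, the following is a minimal sketch of how a discrete Poisson space-time scan evaluates a single candidate cluster. It is not the SaTScan implementation and not the authors' code. The function names and every number in it are hypothetical, and the expected count assumes a constant population over the study period with no covariate adjustment.

```python
import math


def expected_cases(window_pop, total_pop, window_months, total_months, total_cases):
    """Expected cases in a space-time cylinder under the null hypothesis.

    Assumption (illustrative): risk is uniform, so expected cases are
    proportional to the cylinder's share of total population-time.
    """
    share = (window_pop * window_months) / (total_pop * total_months)
    return total_cases * share


def within_caps(window_pop, total_pop, window_months,
                max_pop_fraction=0.5, max_months=6):
    """Check the candidate against the maximum spatial and temporal windows
    mentioned in the response (50% of the population, 6 months)."""
    return (window_pop <= max_pop_fraction * total_pop
            and window_months <= max_months)


def poisson_llr(observed, expected, total_cases):
    """Log-likelihood ratio of the Poisson scan statistic when scanning
    for high-rate clusters; zero when the window has no excess of cases."""
    if observed <= expected:
        return 0.0
    outside_obs = total_cases - observed
    outside_exp = total_cases - expected
    llr = observed * math.log(observed / expected)
    if outside_obs > 0:
        llr += outside_obs * math.log(outside_obs / outside_exp)
    return llr


def relative_risk(observed, expected, total_cases):
    """Risk inside the window divided by risk outside it."""
    inside = observed / expected
    outside = (total_cases - observed) / (total_cases - expected)
    return inside / outside


# Hypothetical inputs, not taken from the study.
TOTAL_POP, TOTAL_MONTHS, TOTAL_CASES = 100_000, 36, 1_000
window_pop, window_months, observed = 10_000, 6, 50

if within_caps(window_pop, TOTAL_POP, window_months):
    e = expected_cases(window_pop, TOTAL_POP, window_months, TOTAL_MONTHS, TOTAL_CASES)
    print("expected:", e)
    print("LLR:", poisson_llr(observed, e, TOTAL_CASES))
    print("RR:", relative_risk(observed, e, TOTAL_CASES))
```

Because the expected count scales with population-time, municipalities of very different sizes are compared on the same footing. A full scan would repeat this calculation over many candidate cylinders and assess the maximum ratio by Monte Carlo replication, which this sketch omits.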

 

Comment #4: Discussions: The influence of socioeconomic factors is asserted, but they are not empirically analyzed. It would be useful to include additional data on SVI/HDI or even a multivariate analysis. Concrete recommendations for authorities on the use of these data in future policies are missing.

 

Response #4: We appreciate and agree with the suggestion. Unfortunately, we were unable to obtain the social vulnerability index data for the municipalities of Bahia because the website of the Institute of Applied Economic Research (IPEA) was under maintenance during the period available for the article review. We have added concrete public-policy recommendations at the end of the Discussion section (lines 317–324).

 

Comment #5: Conclusions: It should be a little more consistent and synthetic, focused on practical implications. It should be clearly explained why the topic is relevant for the scientific community, how they can use the data for the pandemics that exist at the moment. The added value compared to previous studies in Bahia is only vaguely mentioned – this aspect should be better argued in the introduction and conclusions.

 

Response #5: Thank you very much for the suggestion. We have reworded the entire conclusion section to make it more objective and purposeful (new lines 328-348).

All your suggestions have helped to greatly improve our article. Thank you very much.

Reviewer 2 Report

The conducted study is relevant, specific, and describes the clustering of municipal areas in Bahia, Brazil, during the COVID-19 pandemic. The advantage of the work is well-chosen research methods. They allowed us to clearly identify a number of cluster separations that differ in some of the features that accompanied the pandemic. Important information is descriptions of the characteristics of the territory and the population, explaining the accelerated spread of infection.

In general, the work is interesting and deserves publication. However, in my opinion, such a study should solve the problem of not only describing, but also warning against possible future shocks.

1. The authors name a number of reasons for the formation of the identified clusters. For the most part, these are social and geographical reasons. At the same time, the reasons related to the nature of anti-epidemic measures are not considered. The identification of such prerequisites could influence subsequent preventive measures in the event of new waves of the pandemic. Examples of studies showing different socio-political attitudes of countries towards the spread of the pandemic may include https://doi.org/10.3390/economies10110278 , https://doi.org/10.3390/covid1030045 and others .

2. It should be shown what is the probability of maintaining the identified cluster division in case of repeated epidemic manifestations, or it should be noted that this clustering is characteristic only for this period of action of pandemic factors.

3. Complete the list of keywords, in particular: Brazil, clusters, pandemic, etc.

4. The supplementary materials at https://www.mdpi.com/article/doi/s1 do not load. The data at https://dados.ba.gov.br/dataset/dados_covid do not load either. In addition, the address given for the demographic and geographical data (https://www.ibge.gov.br/) should be specified in more detail.

Author Response

Comment #1: The conducted study is relevant, specific, and describes the clustering of municipal areas in Bahia, Brazil, during the COVID-19 pandemic. The advantage of the work is well-chosen research methods. They allowed us to clearly identify a number of cluster separations that differ in some of the features that accompanied the pandemic. Important information is descriptions of the characteristics of the territory and the population, explaining the accelerated spread of infection.

In general, the work is interesting and deserves publication. However, in my opinion, such a study should solve the problem of not only describing, but also warning against possible future shocks.

  1. The authors name a number of reasons for the formation of the identified clusters. For the most part, these are social and geographical reasons. At the same time, the reasons related to the nature of anti-epidemic measures are not considered. The identification of such prerequisites could influence subsequent preventive measures in the event of new waves of the pandemic. Examples of studies showing different socio-political attitudes of countries towards the spread of the pandemic may include https://doi.org/10.3390/economies10110278 , https://doi.org/10.3390/covid1030045 and others .

Response #1: Thank you for highlighting these important points. We agree that linking the observed clusters to the timing, stringency and political acceptance of non-pharmaceutical interventions and anti-epidemic measures strengthens the practical value of our study. We have added new arguments to the Discussion section to address these issues (lines 282-294).

 

Comment #2: It should be shown what is the probability of maintaining the identified cluster division in case of repeated epidemic manifestations, or it should be noted that this clustering is characteristic only for this period of action of pandemic factors.

 

Response #2: Thank you for raising this point. We have added a brief discussion paragraph (lines 307-316) that addresses the possibility of clusters occurring in new epidemic events.

 

Comment #3: Complete the list of keywords, in particular: Brazil, clusters, pandemic, etc.

 

Response #3: Done!

 

Comment #4: The supplementary materials at https://www.mdpi.com/article/doi/s1 do not load. The data at https://dados.ba.gov.br/dataset/dados_covid do not load either. In addition, the address given for the demographic and geographical data (https://www.ibge.gov.br/) should be specified in more detail.

 

Response #4: We have updated the access URLs for the databases (lines 374–380): "Data Availability Statement: The COVID-19 data used for this research are available for download on the website of the Bahia State Health Department, at <https://dados.ba.gov.br/dataset/38f0f19d-fd41-4213-8bd5-c62baffb1ec9/resource/62eee774-8dab-49c9-ae54-c250c6eab25d/download/onedrive_1_27-06-2023.zip> and population at <http://tabnet.datasus.gov.br/cgi/deftohtm.exe?ibge/cnv/popsvs2024br.def>. The geographical data (latitude and longitude) were obtained using the geobr package <https://ipeagit.github.io/geobr/> with the R programming language, but can also be found on the website <astro.if.ufrgs.br/br.htm>."

 

We would like to thank you for the time you dedicated to reviewing our article. Your suggestions have helped to strengthen it and make it much more useful for the scientific community. Thank you!

Reviewer 3 Report

Overall

This paper is a detailed analysis of the spatiotemporal spread dynamics of the COVID-19 pandemic in the state of Bahia, Brazil, at the municipal level, and the findings are of great value in considering future public health policies and infectious disease control. Clarifying the dynamics of the pandemic in an area with a mixture of densely populated and remote areas and with problems of medical disparities will also provide important insights for Asian countries with similar challenges. As you point out, it is expected that delving deeper into the relationship between these socio-economic factors and demographic issues such as aging and infection risk will further increase the value of this study.

Question

(1) Problems with data on the number of cases and deaths

Understanding the accurate number of cases and deaths during a pandemic is a global challenge. In the Discussion section of this paper, the possibility of "underreporting, delays in data registration, and variation in data quality depending on the medical infrastructure of each region" is clearly mentioned as "limitations inherent in secondary data analysis." This indicates that the authors recognize that some of the number of cases may be unknown or inaccurate, and your comment accurately captures the issue of this paper. In the introduction, the "heterogeneity of its territory" of Bahia and the "deep socioeconomic inequalities" of Brazil as a whole are mentioned, but there is a lack of detailed description of how this directly translates to the difficulties of collecting data.

To make the background and importance of this study clearer, I would like to suggest that the difficulties of collecting data be discussed more specifically in the introduction. For example, "In areas such as Bahia, which have a mixture of densely populated cities and remote rural areas with difficult access to medical care, there are differences in the capacity of testing and reporting systems, and the number of cases may be underestimated, especially in the early stages of the pandemic and when medical care is under strain. This problem is not unique to Brazil, but is a common issue in many countries." In this way, the constraint of incomplete data itself is positioned as the starting point of the research. Furthermore, by linking this to the above, the significance of this study becomes clearer when we add that "even in such circumstances, spatiotemporal cluster analysis can be a powerful tool for identifying regions and periods when relative risk was particularly high."

(2) Discussion of virus types and regional differences
This paper mentions the gamma strain, which was first identified in Brazil, and the association between the cluster and the Omicron strain epidemic in early 2022, and recognizes the impact that virus variants have had on the dynamics of the pandemic. In particular, the discussion points out that clusters 2 and 3 may reflect the high transmissibility of the Omicron strain. As you point out, the resolution of the analysis would be further improved by deepening the discussion of which variants were rampant in which regions and to what extent, and how this relates to the data quality issues discussed in (1).

We propose expanding the Discussion section to more directly link the timing of the occurrence of major clusters to the characteristics of the variants (transmissibility, immune evasion, etc.) that were said to be dominant in the state of Bahia during that period. For example, one hypothesis could be that "the highest local relative risk (RR = 3.37) observed in Cluster 5 suggests that a specific variant may have invaded a small community and caused an explosive infection." In addition, by linking the characteristics of the variants with data issues, such as "the number of infected people during the Omicron strain epidemic (Cluster 2, 3) increased sharply compared to the previous epidemic (Cluster 1), while there is insufficient data on the severity rate, so there are further challenges in understanding the number of deaths," a more multi-layered consideration can be made.

(3) High-risk patterns and the validity of analytical methods

This study reveals the most risky spatiotemporal patterns. In the discussion, the factors behind the large-scale Cluster 1 are cited as people's movements associated with social events such as festivals and elections, inter-city connections through agribusiness and the federal highway network, and high population density in metropolitan areas. These are important high-risk patterns revealed by this study.

Regarding the analytical method: In the Materials and Methods section, it is described that space-time scan statistics using a discrete Poisson model using SaTScan were used. However, as you pointed out, there is a lack of specific explanations regarding data preprocessing. For example, how the data from two different databases, e-SUS Notifica and Sivep-Gripe, were integrated and organized, and how records with unknown addresses and month of occurrence were handled are important information for ensuring the reliability of the analysis, but there is no detailed description.

I would like to see a discussion on the clarification of high-risk patterns. In the discussion or conclusion section, I propose that the "high-risk patterns" revealed by this study be summarized and organized in bullet points or other formats and presented. For example, by clearly stating the patterns in the form of "① Wide-area spread due to people's movements during festivals and election periods," "② Virus entry into inland areas via cities along major highways," and "③ Sustained infection risk in socio-economically vulnerable areas," the main findings of this study will be clearer to readers. We strongly recommend adding a subsection on "Data Cleaning and Preprocessing" to the Materials and Methods. This should specifically describe the procedure for integrating the two data sources, how missing values and duplicate records were handled, and the process for determining the final dataset used for analysis. In addition, a brief explanation of why the discrete Poisson model scan statistics are more suitable for the purpose of this study than other spatiotemporal analysis methods (e.g., because they are suitable for comparing municipalities with different population sizes) will make the analysis method more persuasive.

Overall

This paper is the result of an investigation into the spatiotemporal spread of the pandemic in Brazil by virus type, cluster, and municipality. The findings are very interesting and useful for control and public health.

In particular, COVID-19 is assumed to be very risky in a country with densely populated areas similar to those in Asia and with problems of medical disparities depending on the region.

It is also assumed that it is linked to the aging problem, and I look forward to more in-depth discussion of these issues.

1. What are the main questions that this research is mainly addressing?

I will first list the questions for this research.

(1) The number of cases and deaths described in the introduction of this paper are partially unknown, and if there is a little more discussion in the introduction, such as the regional characteristics, it will be very important knowledge.

Similar areas exist in other countries, and even in countries that are said to be developed countries, there are actually areas where some are unclear. (Especially in areas with medical facilities under strain)

(2) If there are differences in the types of viruses that were most prevalent and their regional characteristics, I would like to see more discussion on (1).

(3) If there are any patterns that have emerged in this paper, such as the most risky spatiotemporal behaviors, zones where the regions stand out, and behavioral patterns, I would appreciate it if you could discuss them.

2. What parts do you think are original or relevant to the field?

It is an analysis of COVID-19 outbreaks in Brazil in areas with large urban disparities from the perspective of medical professionals.

3. How does this paper contribute to the field compared to other published materials?

It attempts a spatiotemporal approach to discuss the actual virus outbreak risk by converting the size of clusters and risk spans and risk populations to gain insight into cases.

4. What specific improvements should the authors consider in terms of methodology?

(4) I would like more specific explanation regarding the validity of the spatiotemporal analysis method and the pre-processing of the dataset.

5. Are the conclusions consistent with the evidence and logic presented?

Yes.

6. Are the references adequate?

As I wrote in the question, I would like to see some non-medical discussion.

7. Do you have any additional comments on the quality of the tables, figures, and data?

None in particular.

Author Response

Comment #1: Overall. This paper is a detailed analysis of the spatiotemporal spread dynamics of the COVID-19 pandemic in the state of Bahia, Brazil, at the municipal level, and the findings are of great value in considering future public health policies and infectious disease control. Clarifying the dynamics of the pandemic in an area with a mixture of densely populated and remote areas and with problems of medical disparities will also provide important insights for Asian countries with similar challenges. As you point out, it is expected that delving deeper into the relationship between these socio-economic factors and demographic issues such as aging and infection risk will further increase the value of this study.

Question (1) Problems with data on the number of cases and deaths. Understanding the accurate number of cases and deaths during a pandemic is a global challenge. In the Discussion section of this paper, the possibility of "underreporting, delays in data registration, and variation in data quality depending on the medical infrastructure of each region" is clearly mentioned as "limitations inherent in secondary data analysis." This indicates that the authors recognize that some of the number of cases may be unknown or inaccurate, and your comment accurately captures the issue of this paper. In the introduction, the "heterogeneity of its territory" of Bahia and the "deep socioeconomic inequalities" of Brazil as a whole are mentioned, but there is a lack of detailed description of how this directly translates to the difficulties of collecting data. To make the background and importance of this study clearer, I would like to suggest that the difficulties of collecting data be discussed more specifically in the introduction. For example, "In areas such as Bahia, which have a mixture of densely populated cities and remote rural areas with difficult access to medical care, there are differences in the capacity of testing and reporting systems, and the number of cases may be underestimated, especially in the early stages of the pandemic and when medical care is under strain. This problem is not unique to Brazil, but is a common issue in many countries." In this way, the constraint of incomplete data itself is positioned as the starting point of the research. Furthermore, by linking this to the above, the significance of this study becomes clearer when we add that "even in such circumstances, spatiotemporal cluster analysis can be a powerful tool for identifying regions and periods when relative risk was particularly high."

Response #1: We thank the reviewer for highlighting the need to clarify data-collection challenges in heterogeneous settings such as Bahia. We agree that under-reporting and delayed case registration are central issues and now acknowledge them briefly in the Introduction (lines 47-50). We have, however, kept the fuller discussion of these limitations in the Discussion section for two reasons. First, the opening section is intended to frame the epidemiological context and research gap; a detailed exposition of surveillance weaknesses there would dilute that focus. Second, the Discussion already examines how reporting gaps and health-system inequities may have influenced our findings, a placement that allows us to link those caveats directly to the study’s results. We appreciate your understanding.

 

Comment #2: Discussion of virus types and regional differences. This paper mentions the gamma strain, which was first identified in Brazil, and the association between the cluster and the Omicron strain epidemic in early 2022, and recognizes the impact that virus variants have had on the dynamics of the pandemic. In particular, the discussion points out that clusters 2 and 3 may reflect the high transmissibility of the Omicron strain. As you point out, the resolution of the analysis would be further improved by deepening the discussion of which variants were rampant in which regions and to what extent, and how this relates to the data quality issues discussed in (1).

We propose expanding the Discussion section to more directly link the timing of the occurrence of major clusters to the characteristics of the variants (transmissibility, immune evasion, etc.) that were said to be dominant in the state of Bahia during that period. For example, one hypothesis could be that "the highest local relative risk (RR = 3.37) observed in Cluster 5 suggests that a specific variant may have invaded a small community and caused an explosive infection." In addition, by linking the characteristics of the variants with data issues, such as "the number of infected people during the Omicron strain epidemic (Cluster 2, 3) increased sharply compared to the previous epidemic (Cluster 1), while there is insufficient data on the severity rate, so there are further challenges in understanding the number of deaths," a more multi-layered consideration can be made.

Response #2: We appreciate your valuable consideration, which will significantly contribute to the improvement of the manuscript. The suggestion was carefully analyzed and incorporated into the discussion (lines 257-268).

 

Comment #3: High-risk patterns and the validity of analytical methods. This study reveals the most risky spatiotemporal patterns. In the discussion, the factors behind the large-scale Cluster 1 are cited as people's movements associated with social events such as festivals and elections, inter-city connections through agribusiness and the federal highway network, and high population density in metropolitan areas. These are important high-risk patterns revealed by this study. Regarding the analytical method: In the Materials and Methods section, it is described that space-time scan statistics using a discrete Poisson model using SaTScan were used. However, as you pointed out, there is a lack of specific explanations regarding data preprocessing. For example, how the data from two different databases, e-SUS Notifica and Sivep-Gripe, were integrated and organized, and how records with unknown addresses and month of occurrence were handled are important information for ensuring the reliability of the analysis, but there is no detailed description.

I would like to see a discussion on the clarification of high-risk patterns. In the discussion or conclusion section, I propose that the "high-risk patterns" revealed by this study be summarized and organized in bullet points or other formats and presented. For example, by clearly stating the patterns in the form of "① Wide-area spread due to people's movements during festivals and election periods," "② Virus entry into inland areas via cities along major highways," and "③ Sustained infection risk in socio-economically vulnerable areas," the main findings of this study will be clearer to readers. We strongly recommend adding a subsection on "Data Cleaning and Preprocessing" to the Materials and Methods. This should specifically describe the procedure for integrating the two data sources, how missing values and duplicate records were handled, and the process for determining the final dataset used for analysis. In addition, a brief explanation of why the discrete Poisson model scan statistics are more suitable for the purpose of this study than other spatiotemporal analysis methods (e.g., because they are suitable for comparing municipalities with different population sizes) will make the analysis method more persuasive.

 

Response #3: Thank you for these suggestions. We agree with them and have revised the entire article to incorporate them, from a better explanation of data collection and processing to a more detailed description of the statistical method employed (new lines 93-108; 116-118; 257-268; 282-294; 307-316). We hope that these changes have sufficiently improved our scientific writing. We take this opportunity to express our sincere thanks for the time you spent reviewing our article. Your suggestions were fundamental to strengthening it.

 

 

Round 2

Reviewer 1 Report

-

-

Reviewer 2 Report

As a result of the revision of the manuscript, its presentation, design, description of the results and discussion points were improved. The work began to look more systematic with a clear understanding of promising research areas.

The work in the presented edition does not require improvement, it can be published in its present form.
