Article
Peer-Review Record

SOFA Score Trends in Predicting Mortality in Critically Ill COVID-19 Patients

by Fadhilah Abdul Munim 1, Aliza Mohamad Yusof 1, Saw Kian Cheah 1, Mohd Khazrul Nizar Abd Kader 2, Wan Rahiza Wan Mat 1, Normahaini Abdul Hamid 3 and Muhammad Maaya 1,*
Submission received: 11 June 2025 / Revised: 29 August 2025 / Accepted: 10 September 2025 / Published: 12 September 2025
(This article belongs to the Section COVID Clinical Manifestations and Management)

Round 1

Reviewer 1 Report

1. The study did not fully extract important factors known to affect mortality outcomes in critically ill COVID-19 patients, such as therapeutic interventions, time of treatment initiation, and biomarkers. Missing variables may affect the accuracy and generalizability of model predictions.

2. SOFA was evaluated only at fixed time points (days 1, 3, and 5), and there is no strong literature to support this choice. This approach may not capture key dynamic changes on days 2 or 4.

3. The cutoff value of SOFA ≥ 9 was used throughout the analysis, but the basis for this threshold was not clearly explained, nor was it shown why 9 was selected for specific clinical relevance over other values (e.g., 8 or 10). The clinical interpretability of this choice is lacking.

4. The paper only studies the SOFA score, without benchmarking it against other widely used ICU scoring systems (such as APACHE II, NEWS2, or qSOFA). Comparative analysis will help determine whether SOFA has superior or complementary predictive value, which is critical to justify its independent use in clinical decision making.

5. The AUC of the SOFA-based model was reported to be 0.796, but validation is lacking, which is critical to assess the accuracy of predicted probability in a clinical setting.

Author Response

Comment 1: The study did not fully extract important factors known to affect mortality outcomes in critically ill COVID-19 patients, such as therapeutic interventions, time of treatment initiation, and biomarkers. Missing variables may affect the accuracy and generalizability of model predictions.

Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have added these missing variables as limitations [Lines 353-358].

 

Comment 2. SOFA was evaluated only at fixed time points (days 1, 3, and 5), and there is no strong literature to support this choice. This approach may not capture key dynamic changes on days 2 or 4.

Response 2: The three time points were based on references 11 and 12 [Lines 98-99]. However, we note this comment and have also added it to the limitations [Lines 353-358].

 

Comment 3. The cutoff value of SOFA ≥ 9 was used throughout the analysis, but the basis for this threshold was not clearly explained, nor was it shown why 9 was selected for specific clinical relevance over other values (e.g., 8 or 10). The clinical interpretability of this choice is lacking.

Response 3: The cutoff value of 9 was derived from the analysed data of our population. A phrase has been added to clarify this [line 221].
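
For readers unfamiliar with how a cutoff can be derived from study data, the following is a minimal sketch and not the authors' method; the response does not state which criterion was used. It assumes Youden's J statistic (sensitivity + specificity - 1) as the selection rule, and the function name, scores, and outcomes are all invented for illustration.

```python
def youden_cutoff(scores, died):
    """Return (cutoff, J) maximising Youden's J for the rule 'score >= cutoff'.

    Assumes both outcomes (died = 1 and died = 0) are present in the data.
    """
    positives = sum(died)
    negatives = len(died) - positives
    best = None
    for cutoff in sorted(set(scores)):
        # Count non-survivors and survivors flagged by this cutoff.
        tp = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, died) if not d and s >= cutoff)
        sensitivity = tp / positives
        specificity = 1 - fp / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Hypothetical scores and outcomes (1 = died, 0 = survived); not study data.
sofa = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
died = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

cutoff, j = youden_cutoff(sofa, died)
print(f"cutoff: score >= {cutoff}, Youden's J = {j:.2f}")
```

A cutoff chosen this way is tuned to the sample it was derived from, which is why comparing it against neighbouring values and validating it in another cohort is informative.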

 

Comment 4. The paper only studies the SOFA score, without benchmarking it against other widely used ICU scoring systems (such as APACHE II, NEWS2, or qSOFA). Comparative analysis will help determine whether SOFA has superior or complementary predictive value, which is critical to justify its independent use in clinical decision making.

Response 4: Thank you for pointing this out. We agree with this comment. However, the APACHE II data were not collected. Therefore, we have added it as a limitation [Lines 353-358].

 

Comment 5. The AUC of the SOFA-based model was reported to be 0.796, but validation is lacking, which is critical to assess the accuracy of predicted probability in a clinical setting.

Response 5: The value was based on the data from our population. As rightly noted, further validation is required, and this has been added to the discussion [Lines 351-352].
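
As an illustration of what an AUC measures and how its stability within a single dataset might be gauged, here is a minimal sketch under stated assumptions. It is not the authors' analysis: the helper names, the percentile bootstrap, and the data are hypothetical, and resampling the same cohort does not replace external validation in an independent population.

```python
import random


def auc(scores, died):
    """Probability that a random non-survivor scores higher than a random
    survivor, counting ties as one half (equivalent to the ROC area)."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(scores, died, n_boot=1000, seed=0):
    """Approximate 95% percentile interval from resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_died = [died[i] for i in idx]
        # Skip resamples containing only one outcome; AUC is undefined there.
        if 0 < sum(sample_died) < n:
            estimates.append(auc([scores[i] for i in idx], sample_died))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Hypothetical scores and outcomes (1 = died, 0 = survived); not study data.
sofa = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
died = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

point = auc(sofa, died)
low, high = bootstrap_auc_interval(sofa, died)
print(f"AUC = {point:.3f}, bootstrap interval = ({low:.3f}, {high:.3f})")
```

Note that this only describes discrimination. The reviewer's concern about the accuracy of predicted probabilities also involves calibration, which this sketch does not address.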

 

4. Response to Comments on the Quality of English Language

Point 1:

Response: Thank you for this observation. We have submitted the manuscript for English editing and received the edited version (changes shown in blue and green text).

 

5. Additional clarifications

Statements on the number of patients analysed have been made in various parts of the document as these were not clear in the original version.

 

Reviewer 2 Report

The study enrolled 400 COVID-19 ICU patients from the real world and conducted a retrospective analysis. The analyzed data included age, gender, length of ICU stay, underlying conditions, SOFA score, comorbidities, and endotracheal intubation. The statistical results indicate that the trend in SOFA scores can predict the prognosis of severely ill COVID-19 patients, potentially providing valuable reference for patients, clinicians, and public health policy-making departments, particularly in terms of resource management in intensive care units.

 

The paper is logically clear, well-argued, and reliable in its conclusions. Although it might be outdated in terms of timeliness, it still provides valuable data for summarizing regional public health events.

The following points do not affect my evaluation of the paper, but may be of interest to readers.

 

1. Lines 336–349 discuss the allocation of ICU resources, which often presents ethical dilemmas in the real world.

If a patient has a high SOFA score, indicating a very poor prognosis, and the patient and their family strongly wish to continue treatment, should the doctor continue treatment or refuse their request for ICU use? What is the author's opinion?

In other words, can the conclusions of this retrospective analysis be applied to prospective studies or even clinical practice?

Furthermore, what was the actual ICU utilization pattern for patients in the non-survival group included in this study? Is the SOFA score referenced in real-world ICU resource allocation? After all, as shown in Table 4 of the paper, even after adjusting for confounding factors, tracheal intubation remains significantly associated with mortality.

 

2. How many participants were enrolled in the survival group and non-survival group in Table 2? Similarly, in Table 4, how many enrolled participants were used for statistical analysis?

In other words, Tables 2 and 4 also need to indicate the number of enrolled participants, like Tables 1 and 3, to accurately present the data.

 

3. Line 354, whether due to system issues or a clerical error, the supplementary materials cannot be downloaded.

This means that the data between Lines 222-227 and Lines 237-243 are merely textual descriptions without supporting figures or tables.

Please provide the supplementary materials.

Author Response

Comment 1: Lines 336–349 discuss the allocation of ICU resources, which often presents ethical dilemmas in the real world.

If a patient has a high SOFA score, indicating a very poor prognosis, and the patient and their family strongly wish to continue treatment, should the doctor continue treatment or refuse their request for ICU use? What is the author's opinion?

In other words, can the conclusions of this retrospective analysis be applied to prospective studies or even clinical practice?

Furthermore, what was the actual ICU utilization pattern for patients in the non-survival group included in this study? Is the SOFA score referenced in real-world ICU resource allocation? After all, as shown in Table 4 of the paper, even after adjusting for confounding factors, tracheal intubation remains significantly associated with mortality.

Response 1: Thank you for the poignant observation. These predictors could be applied to these situations as mentioned in the conclusion. However, during the bed shortage crisis, the admission was based on a combination of age, co-morbidities and frailty. [Lines 342-344].

 

Comment 2. How many participants were enrolled in the survival group and non-survival group in Table 2? Similarly, in Table 4, how many enrolled participants were used for statistical analysis?

In other words, Tables 2 and 4 also need to indicate the number of enrolled participants, like Tables 1 and 3, to accurately present the data.

Response 2: Thank you for pointing out this omission. We have included the number of patients for each column in both Table 2 and Table 4. Whilst all were initially included for Table 2, only data of the non-survivors were analysed further.

 

Comment 3.  Line 354, whether due to system issues or a clerical error, the supplementary materials cannot be downloaded.

This means that the data between Lines 222-227 and Lines 237-243 are merely textual descriptions without supporting figures or tables.

Please provide the supplementary materials.

Response 3: We apologise for the confusion: there is no supplementary material. The statement was left over from the template during the editing process and has now been removed.

The text in lines 222-227 and 237-243 refers to Figure 2 and Figures 3-4, respectively.

 

4. Response to Comments on the Quality of English Language

Point 1:

Response: Thank you for this observation. We have submitted the manuscript for English editing and received the edited version (changes shown in blue and green text).

 

5. Additional clarifications

Statements on the number of patients analysed have been made in various parts of the document as these were not clear in the original version.

Author Response File: Author Response.pdf

Reviewer 3 Report

The manuscript presents a retrospective study evaluating the utility of the SOFA score, including its trends over time, in predicting mortality among critically ill COVID-19 patients in ICU settings. The topic is relevant and timely, particularly in the context of critical care triaging during pandemic surges. The study design is methodologically sound and data presentation is mostly clear.

I do have some comments:

  1. I suggest adding a limitations section, or at least a paragraph. 
  2. The classification into increased/maintained/decreased SOFA trends, although operationally defined, could be influenced by inter-observer variability and missing data. 
  3. My main concern regards the 27 patients with data only for day 1. Specifically, this may create a bias toward survivors, affecting the generalizability of the SOFA trend findings.
  4. I do suggest another look at the English language, there are many other awkward, informal or repetitive phrasings.

Detailed comments:

  1. Some common limitations include: the retrospective nature, the generalizability of data from only two centers, potential unmeasured confounding variables (lab values, type of treatments)
  2. Were these values confirmed by multiple raters or validated against a gold standard? If this is not applicable, please add a limitation in this regard.
  3. The 27 patients should have been excluded from the start, as this may inflate the SOFA score for day 1. It can be said that by default, the remaining 357 patients would have a lower score on day 3. Although they are few, there is a risk of underestimating the prognostic value of an early high SOFA score, as the most fatal trajectories may occur within the first 48 hours, and of overestimating the benefit of decreasing SOFA scores, since only those who survive longer are eligible to show improvement. At this point, you can either exclude these 27 patients altogether and redo the calculations, or perform a comparative analysis of the patients included in the SOFA trend analysis (survived past Day 3) versus those excluded (died before Day 3), using the Mann-Whitney test to check whether their day 1 values differ significantly. If there are no differences, great, but if there are significant differences you will also have to address this in the limitations section. For future reference, a Cox regression can be used.
  4. Some English edits: line 32: "stongly"; "evaluation point times" sounds awkward, I suggest "were recorded at three time points"; "An increased SOFA score was taken" sounds too informal, please use "defined as"; "the rise of SOFA score from Day 1", please use "increase" instead of "rise". Please check throughout the document.
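
The comparison suggested in detailed comment 3 could be sketched roughly as follows. This is an illustrative, simplified implementation rather than a prescribed analysis: it uses the normal approximation with a tie correction and no continuity correction, the function name is hypothetical, and the day 1 scores are invented. In practice a statistics package would normally be used.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p). Assumes the pooled values are not all identical.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    start = 1
    for value in sorted(counts):
        c = counts[value]
        rank_of[value] = start + (c - 1) / 2
        start += c

    r1 = sum(rank_of[v] for v in x)
    u1 = r1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    tie_term = sum(c ** 3 - c for c in counts.values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (u - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p


# Hypothetical day 1 SOFA scores; not study data.
died_before_day3 = [11, 12, 9, 13, 10]
in_trend_analysis = [4, 6, 5, 7, 8, 3, 6]

u, p = mann_whitney_u(died_before_day3, in_trend_analysis)
print(f"U = {u}, p = {p:.4f}")
```

A small p-value here would indicate that the excluded patients had systematically different day 1 scores, which is the situation the comment asks to be reported as a limitation.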

Author Response

Major Comment 1: I suggest adding a limitations section, or at least a paragraph. 

Response: Thank you for pointing this out. We have added the limitations [Lines 353-358].

 

Major Comment 2. The classification into increased/maintained/decreased SOFA trends, although operationally defined, could be influenced by inter-observer variability and missing data. 

Response: We relied on retrospective data. After screening, 16 patients who had incomplete or missing data were excluded [changes as reflected in Figure 1. Consort diagram & lines 154-155]

 

Major Comment 3. My main concern regards the 27 patients with data only for day 1. Specifically, this may create a bias toward survivors, affecting the generalizability of the SOFA trend findings.

Response: Thank you for your concern. We have removed these 27 from analysis. [changes as reflected in Figure 1. Consort diagram, lines 156-157 & Table 1].

 

Major Comment 4. I do suggest another look at the English language, there are many other awkward, informal or repetitive phrasings.

Response: Thank you for this observation. We will submit the amended draft for English-language editing.

 

Detailed comment 1. Some common limitations include: the retrospective nature, the generalizability of data from only two centers, potential unmeasured confounding variables (lab values, type of treatments)

Response: Thank you for pointing this out. We have added these as the limitations [Lines 353-358].

 

Detailed comment 2. Were these values confirmed by multiple raters or validated against a gold standard? If this is not applicable, please add a limitation in this regard.

Response: Thank you for pointing this out. The values were derived from our findings and rightly require validation. This has therefore been stated in the discussion [Lines 351-352].

 

Detailed comment 3. The 27 patients should have been excluded from the start, as this may inflate the SOFA score for day 1. It can be said that by default, the remaining 357 patients would have a lower score on day 3. Although they are few, there is a risk of underestimating the prognostic value of an early high SOFA score, as the most fatal trajectories may occur within the first 48 hours, and of overestimating the benefit of decreasing SOFA scores, since only those who survive longer are eligible to show improvement. At this point, you can either exclude these 27 patients altogether and redo the calculations, or perform a comparative analysis of the patients included in the SOFA trend analysis (survived past Day 3) versus those excluded (died before Day 3), using the Mann-Whitney test to check whether their day 1 values differ significantly. If there are no differences, great, but if there are significant differences you will also have to address this in the limitations section. For future reference, a Cox regression can be used.

Response: Thank you for your concern. We have removed these 27 from analysis. [changes as reflected in Figure 1. Consort diagram, lines 156-157 & Table 1].

 

Detailed comment 4. Some English edits: line 32: "stongly"; "evaluation point times" sounds awkward, I suggest "were recorded at three time points"; "An increased SOFA score was taken" sounds too informal, please use "defined as"; "the rise of SOFA score from Day 1", please use "increase" instead of "rise". Please check throughout the document.

Response: Thank you for the observations. We have amended as suggested.

 

 

4. Response to Comments on the Quality of English Language

Point 1:

Response: Thank you for this observation. We have submitted the manuscript for English editing and received the edited version (changes shown in blue and green text).

 

5. Additional clarifications

Statements on the number of patients analysed have been made in various parts of the document as these were not clear in the original version.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have addressed all of my previous comments. I have no further questions at this time.

Reviewer 2 Report

The statistical results of the paper indicate that the trend in SOFA scores can predict the prognosis of severely ill COVID-19 patients, potentially providing valuable reference for patients, clinicians, and public health policy-making departments, particularly in terms of resource management in intensive care units.

I thank the authors for addressing each point of the previous review comments with the necessary revisions. Their ability to revise the statistical analysis and content of the paper within a short timeframe demonstrates their thorough understanding of the research data and background.

1. The limitations of SOFA scoring in real-world applications have been rigorously revised in the Discussion and Conclusion sections.

2. Key information such as enrollment numbers in the tables has been revised.

Reviewer 3 Report

No further comments. Thank you for addressing the previous ones.
