Improved Prognostic Accuracy of NEWS2 Score with Triage Data in Adults with Bacterial Sepsis: A Retrospective Cohort Study
Hendrik Napierala
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This is a retrospective cohort study of septic patients that examines whether combining the NEWS2 score with lactate and age significantly improves predictive power for septic patients at triage. The described study is well executed and has aspects of interest. There are, however, some questions that need to be answered.
- For Table 2 and Table 3 (i.e. after "rebalancing of groups"), how many of the "deceased" group were "septic shock" patients? How many of the surviving group were septic shock patients? Patients categorized as "septic shock" have the highest mortality and could therefore, in theory, influence your result.
- For NEWS2 scoring - are all the various parameters measured i.e. respiratory rate, oxygen saturation etc of equal weight? Could the authors provide more detail on the "weighting" of these parameters?
- SOFA scores appear to have comparable predictive power with NEWS2, so I am a little confused at why the authors did not examine SOFA scores together with age and lactate? I think this needs a little clarification.
- Minor points:
- Line 63 should be "defined:"
- Line 139 should be "assessed"
Author Response
Dear reviewer,
thank you.
Reviewer 2 Report
Comments and Suggestions for Authors
Brief summary
The authors present an analysis of a model utilizing data acquired at triage to assess mortality at 28 and 90 days after admission to the emergency department due to suspected infection with a bacterial origin. The analysis is based on a retrospective single center cohort.
While the research objective is sound and the underlying cohort seems like a good basis for further analysis, there are some problems that need to be addressed.
The following comments should be taken into account before handing in a revised version.
General concept comments
1) The objective is to determine the best predictor. While a number of scores are mentioned in the data collection section, only one is used in the analysis. Why did you not analyse and compare with the other scores? At the moment, the research question would rather be about the predictive value of NEWS2 than about the best predictor. On the other hand, you wanted to improve existing triage criteria. I did not find information on that aspect in the manuscript.
2) Please refer to relevant reporting guidelines (i.e. TRIPOD).
3) I am missing several important aspects for the analysis:
a) how did you compare the different models and construct the nomogram?
b) how did you validate the nomogram (i.e. internal validation using Bootstrap)?
c) how did you analyse the performance: discrimination (concordance, i.e. Harrell's C; ...)?
d) Optimism?
e) Calibration?
If necessary, consult a statistician with experience in developing prognostic models.
4) Missing data and how you dealt with it should be mentioned in the methods and results section.
5) Prognostic factors should be chosen by clinical relevance and not by significance, especially in the multivariate models (L. 265)
6) The results should be shown in a more concise way by moving several tables to the supplement section and focussing on the most important results (i.e. Table 1, nomogram(s), calibration plots...).
7) The limitation section should be more specific.
Specific comments
1) The different scores should be compared in a table (Line 136 following). Currently, it is difficult to assess their differences.
2) While line 97 mentions that it only used pre-2020 data, line 157 states that it was mainly before the pandemic.
3) The argumentation concerning the use of logistic regression in favor of Cox regression is not stringent. The same predictors could be used in survival analysis. Is it rather that censored data and day of death are not available? Otherwise Cox regression would yield higher power.
4) Do you know why the mortality is comparably low in your sample (line 188 f.)? In other studies (for example 10.3238/arztebl.2020.0775, eTable5) it seems higher.
5) What is the definition of sepsis and septic shock (line 222)?
6) Why did you categorize the lactate level instead of using it as a metric variable?
7) Sepsis can also have fungal origin (l. 58)
8) If there are 11 Mio. sepsis-related deaths it is not the leading cause of death (https://www.who.int/health-topics/cardiovascular-diseases#tab=tab_1). Please search the literature to better embed this information.
9) The Inclusion criterion was 18 years and older, but min age in Table 1 was 16. How do you explain this difference?
Summary
Summarizing the above-mentioned comments, you should completely rework the analysis to answer your research hypotheses. TRIPOD should serve as a basis for communicating the results in a transparent but also concise manner. The rest of the comments should be addressed simultaneously to avoid confusing the reader with conflicting information (i.e. cohort, age).
I believe that your manuscript could greatly benefit from addressing these comments. I am looking forward to reading a revised version.
Comments on the Quality of English Language
The manuscript should be more concise.
The results section should not contain parts of the discussion.
In addition, there are several typos and grammatical errors that should be reworked (i.e. line 103).
Author Response
In reply to “The objective is to determine the best predictor. While a number of scores are mentioned in the data collection section, only one is used in the analysis. Why did you not analyse and compare with the other scores? At the moment, the research question would rather be about the predictive value of NEWS2 than about the best predictor. On the other hand, you wanted to improve existing triage criteria. I did not find information on that aspect in the manuscript.” Thank you for your question. In the data analysis, we included all the scores among the candidate prognostic factors (see Tables 2 through 5), which were then subjected to a logit procedure to determine which of them independently influenced the survival of septic patients (lines 260–265). However, not all scores can be applied in triage, and among those that can, NEWS2 was found to be the most prognostically valid. The introduction (lines 70-90) explains why the NEWS2 score was chosen over the other scores.
In reply to “2) Please refer to relevant reporting guidelines (i.e. TRIPOD).” Thank you for the suggestion. The authors have adopted it and tried to follow the proposed guidelines.
In reply to “3) I am missing several important aspects for the analysis:”
- “a) how did you compare the different models and construct the nomogram?”
We first adopted the Cox model; however, it requires a prior check of the proportional hazards assumption in all the subgroups studied. As we could not verify this assumption, we opted for another model suitable for evaluating the binary event of survival at the pre-set thresholds. We used a logistic regression model (or "logit regression") with a dichotomous response, which inherently allows the study of both categorical factors (nominal or, at most, ordinal) and quantitative factors, in a similar way to the Cox model.
See also lines 200-204: “Several logit regression models were applied to the two re-proportioned subsamples, using the stepwise elimination method to remove less significant variables, with a significance level of α=0.05. The goodness-of-fit of those models was assessed using both Nagelkerke R² and Cox-Snell R², as well as the models' predictive power, which was provided by the confusion matrices (standard output of the procedure).”
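For readers unfamiliar with the two goodness-of-fit measures named above, the following is a minimal sketch of how Cox-Snell R² and Nagelkerke R² are conventionally derived from the log-likelihoods of a null (intercept-only) model and a fitted logistic model. The function names and the illustrative numbers are assumptions made for this sketch; they are not taken from the authors' analysis or software.

```python
import math


def null_log_likelihood(n_events, n):
    """Log-likelihood of an intercept-only logistic model (assumes 0 < n_events < n)."""
    p = n_events / n
    return n_events * math.log(p) + (n - n_events) * math.log(1.0 - p)


def cox_snell_r2(ll_null, ll_model, n):
    """Cox-Snell R^2 = 1 - (L_null / L_model)^(2/n), written with log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke R^2 rescales Cox-Snell R^2 by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


# Hypothetical illustration: 100 patients, 50 deaths, and an invented
# fitted-model log-likelihood of -55.0.
ll0 = null_log_likelihood(50, 100)
example_cox_snell = cox_snell_r2(ll0, -55.0, 100)
example_nagelkerke = nagelkerke_r2(ll0, -55.0, 100)
```

Because Cox-Snell R² cannot reach 1 for a binary outcome, the Nagelkerke value is always at least as large as the Cox-Snell value for the same model. Neither measure says anything about discrimination or calibration.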
In reply to “b) how did you validate the nomogram (i.e. internal validation using Bootstrap)?” As recommended by the literature for small datasets, our analyses were validated with internal bootstrap validation:
- Harrell, F. E. (2015). Regression modeling strategies: With applications to linear models, logistic and ordinal regression, and survival analysis (2nd ed.). Springer. https://doi.org/10.1007/978-3-319-19425-7
- Steyerberg, E. W. (2019). Clinical prediction models: A practical approach to development, validation, and updating (2nd ed.). Springer. https://doi.org/10.1007/978-3-030-21836-2
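The internal bootstrap validation described in the references above is usually carried out as an optimism correction: the model is refitted on each bootstrap resample, and the average drop in performance between the resample and the original data is subtracted from the apparent performance. The sketch below illustrates that general scheme only. The `fit` and `score` callables, the toy data, and the toy "model" are hypothetical placeholders, and nothing here reproduces the authors' actual procedure.

```python
import random


def optimism_corrected(data, fit, score, n_boot=200, seed=0):
    """Return (apparent, optimism-corrected) performance.

    fit(sample) -> model; score(model, sample) -> float, higher is better.
    """
    rng = random.Random(seed)
    apparent = score(fit(data), data)
    total_optimism = 0.0
    for _ in range(n_boot):
        # Resample patients with replacement, same size as the original data.
        boot = [rng.choice(data) for _ in data]
        model = fit(boot)
        # Optimism: performance on the resample minus performance on the original data.
        total_optimism += score(model, boot) - score(model, data)
    return apparent, apparent - total_optimism / n_boot


# Toy illustration (hypothetical): the "model" is just the observed event rate,
# and the score is the negative mean squared error of that constant prediction.
def fit_mean(sample):
    return sum(sample) / len(sample)


def neg_brier(model, sample):
    return -sum((y - model) ** 2 for y in sample) / len(sample)


toy_outcomes = [0, 1] * 10
apparent, corrected = optimism_corrected(toy_outcomes, fit_mean, neg_brier, n_boot=100, seed=1)
```

In a real analysis, `fit` would repeat every modelling step, including any stepwise variable selection, inside each resample. Otherwise the optimism is underestimated.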
In reply to “c) how did you analyse the performance: discrimination (concordance, i.e. Harrell's C; ...)? d) Optimism? e) Calibration? If necessary, consult a statistician with experience in developing prognostic models.” Thank you for the comment. The authors wish to point to lines 198-202: “Several logit regression models were applied to the two re-proportioned subsamples, using the stepwise elimination method to remove less significant variables, with a significance level of α=0.05. The goodness-of-fit of those models was assessed using both Nagelkerke R² and Cox-Snell R², as well as the models' predictive power, which was provided by the confusion matrices (standard output of the procedure).”
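For context, the discrimination measure the reviewer asks about (the concordance or C-statistic) can be computed for a binary outcome as the proportion of patient pairs, one who died and one who survived, in which the patient who died was assigned the higher predicted risk. The following is a minimal sketch under that definition. It assumes predictions are expressed as risk of death, and the function name is chosen for illustration only.

```python
def c_statistic(predicted_risk, died):
    """Concordance for a binary outcome; ties in predicted risk count as one half."""
    events = [p for p, d in zip(predicted_risk, died) if d]
    non_events = [p for p, d in zip(predicted_risk, died) if not d]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation. Unlike a confusion matrix, the measure does not depend on choosing a classification cut-off.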
In reply to “4) Missing data and how you dealt with it should be mentioned in the methods and results section.” Thank you for the suggestion; we have added the requested information (lines 162-163): “All variables considered (p < 0.05; p < 0.01; p < 0.005; p < 0.001) were selected for subsequent analysis, excluding those presenting more than 10% of missing data.”
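The exclusion rule quoted above (dropping variables with more than 10% missing data) can be sketched as follows. The record layout, the variable names, and the use of `None` to mark a missing value are assumptions made for this illustration, not a description of the authors' dataset.

```python
def missing_fraction(records, variable):
    """Share of records in which `variable` is absent or None."""
    if not records:
        raise ValueError("no records supplied")
    missing = sum(1 for record in records if record.get(variable) is None)
    return missing / len(records)


def variables_to_keep(records, variables, max_missing=0.10):
    """Keep variables whose missing share does not exceed max_missing."""
    return [v for v in variables if missing_fraction(records, v) <= max_missing]


# Hypothetical data: 10 patients, lactate missing once (10%), CRP missing twice (20%).
example_records = [{"age": 60 + i, "lactate": 2.0, "crp": 10.0} for i in range(10)]
example_records[0]["lactate"] = None
example_records[1]["crp"] = None
example_records[2]["crp"] = None
kept = variables_to_keep(example_records, ["age", "lactate", "crp"])
```

In this toy example `kept` holds `age` and `lactate`, since exactly 10% missing is not "more than 10%". A rule of this kind removes variables; it does not by itself state how patients with remaining missing values were handled.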
5) In reply to “Prognostic factors should be chosen by clinical relevance and not by significance, especially in the multivariate models (L. 265)” The authors highlight that the goal of our analysis was to consider all factors that can be recorded in triage and then identify those that significantly influence the prognosis of patients with infections of varying severity. For this reason, we considered statistical significance rather than clinical relevance.
In reply to “6) The results should be shown in a more concise way by moving several tables to the supplement section and focussing on the most important results (i.e. Table 1, nomogram(s), calibration plots...).” Thank you for the suggestion; we have moved Tables 4 and 5 with their explanations to the supplements section.
7) In reply to “The limitation section should be more specific.” The authors highlight that the high percentage of elderly patients may increase comorbidities in our population and could impact data analysis. To avoid this, we analyzed the data using logit analysis. For this reason, we excluded patients with missing data and those with infections of undetermined origin from the analysis. This, however, reduced the amount of data available for analysis, which is why we advocate for a multicenter study with a larger population. This was reported in the limitations section of the manuscript.
Specific comments
1) In reply to “The different scores should be compared in a table (Line 136 following). Currently, it is difficult to assess their differences.” The authors note that the different scores are compared in Tables 2 and 3. To avoid creating many tables, we chose to make only two, based on survival at 28 and 90 days, with all the variables compared in a single table.
2) In reply to “While line 97 mentions that it only used pre-2020 data, line 157 states that it was mainly before the pandemic.” The authors note that the COVID pandemic began in 2020, so the two statements are consistent, as both take 2020 as the beginning of the pandemic.
In reply to “3) The argumentation concerning the use of logistic regression in favor of Cox regression is not stringent. The same predictors could be used in survival analysis. Is it rather that censored data and day of death are not available? Otherwise Cox regression would yield higher power.” The authors want to highlight that we first adopted the Cox model; however, it requires a prior check of the proportional hazards assumption in all the subgroups studied. As we could not verify this assumption, we opted for another model suitable for evaluating the binary event of survival at the pre-set thresholds. We used a logistic regression model (or "logit regression") with a dichotomous response, which inherently allows the study of both categorical factors (nominal or, at most, ordinal) and quantitative factors, in a similar way to the Cox model, as reported in the literature: Agresti A (2013). Categorical Data Analysis, 3rd Edition. John Wiley & Sons, Inc., New Jersey; and Hosmer DW, Lemeshow S (1989). Applied Logistic Regression. Wiley, New York.
4) In reply to “Do you know why the mortality is comparably low in your sample (line 188 f.)? In other studies (for example 10.3238/arztebl.2020.0775, eTable5) it seems higher.” The authors note that the data we presented are the result of timely management and a therapeutic approach that was as faithful as possible to the latest SSC guidelines. The limited number of patients with septic shock has probably also had a favourable effect on the mortality of our sample.
5) In reply to “What is the definition of sepsis and septic shock (line 222)?” The authors note that the definitions of sepsis and septic shock are given in lines 110-118.
6) In reply to “Why did you categorize the lactate level instead of using it as a metric variable?” The authors note that, to utilize as much information as possible, lactate was categorized following an exploratory analysis which confirmed the relevance of the established categorization threshold. This is also supported by previous studies (lines 411-419): “A previous study reported that the optimal cut-off values for predicting short-term survival are <3.7 mmol/L for the second lactate measurement and ≥32% for lactate clearance [36]. According to another study, lactate ≥3.5 mmol/L at 6 hours and its clearance at 6 hours <24.4% are useful for predicting 30-day mortality [37].
Our study, which aimed to evaluate the importance of the various parameters in triage, was necessarily limited to the initial lactate value; in our cases, an initial value ≥ 2 is associated with a 28-day mortality of 49.2% and a 90-day mortality of 62.9% (89/154 pts) among patients with bacterial sepsis.”
7) In reply to “Sepsis can also have fungal origin (l. 58)” The authors have changed this in the text.
8) In reply to “If there are 11 Mio. sepsis-related deaths it is not the leading cause of death (https://www.who.int/health-topics/cardiovascular-diseases#tab=tab_1). Please search the literature to better embed this information.” The authors have integrated this in the text.
9) In reply to “The Inclusion criterion was 18 years and older, but min age in Table 1 was 16. How do you explain this difference?” The authors thank the reviewer and have corrected the highlighted inaccuracy. Indeed, there is a 16-year-old patient and a 17-year-old patient, both with non-septic infections. We will modify the inclusion criteria by lowering the minimum age from 18 to 16.
Summary
In reply to “Summarizing the above-mentioned comments, you should completely rework the analysis to answer your research hypotheses. TRIPOD should serve as a basis for communicating the results in a transparent but also concise manner. The rest of the comments should be addressed simultaneously to avoid confusing the reader with conflicting information (i.e. cohort, age).
I believe that your manuscript could greatly benefit from addressing these comments. I am looking forward to reading a revised version.”
Thank you for your suggestions. We hope the revised manuscript is in line with them.
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
Thank you for answering all my comments on the first version.
I think that the argument for using NEWS2 is now clearer.
Please add Odds Ratios for the prognostic factors, including 95% Confidence intervals and interpret accordingly.
Comments on the Quality of English Language
Please rework the manuscript to be more concise, especially in the methods and results sections.
l. 246: prospective. There are more typos in the manuscript. Please go through the document carefully.
Author Response
In reply to “Please add Odds Ratios for the prognostic factors, including 95% Confidence intervals and interpret accordingly.” The authors thank you for your feedback. The odds ratios for the prognostic factors, including 95% confidence intervals, are reported in the manuscript, in particular in Tables 4.a and 4.b (survival at the 28-day threshold) and Tables 5.a and 5.b (survival at the 90-day threshold). Furthermore, the authors have detailed the results obtained in the results paragraph:
“The 28-day threshold model
For the first model without the terminal state (Table 4.a), for each additional year of age, the odds of survival decrease by approximately 4.9%. Lactate ≥2.5 mmol/L significantly reduces the odds of survival (approximately a 76% reduction), and each additional point in NEWS2 reduces the odds of survival by approximately 20%. For the second model that includes the terminal state (Table 4.b), terminally ill patients have survival odds reduced by approximately 70% compared to non-terminal patients. Each additional year reduces the odds of survival by approximately 5.4%. Lactate ≥2.5 mmol/L significantly reduces the odds of survival (approximately a 76% reduction), and each additional point in NEWS2 reduces the odds of survival by approximately 18.6%.
The 90-day threshold model
For the model without the terminal state (Table 5.a), for each additional year of age, the odds of survival decrease by approximately 4.1%. Lactate levels ≥2.5 mmol/L significantly reduce the odds of survival (approximately a 58% reduction), and each additional point in NEWS2 reduces the odds of survival by approximately 17.9%. For the model that includes the terminal state (Table 5.b), terminally ill patients have survival odds reduced by approximately 74.9% compared to non-terminal patients. Each additional year reduces the odds of survival by approximately 4.3%. Lactate levels ≥2.5 mmol/L significantly reduce the odds of survival (approximately a 56.2% reduction), and each additional point in NEWS2 reduces the odds of survival by approximately 16.4%.”
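The percentage statements above follow from the usual reading of a logistic regression coefficient: the odds ratio is exp(β), and the percentage change in odds per unit is (OR − 1) × 100. The sketch below illustrates that arithmetic. The odds ratio of 0.951 is back-calculated here from the quoted "approximately 4.9%" reduction per year of age rather than read from the authors' tables, and the coefficient and standard error passed to the confidence-interval helper are hypothetical.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Percentage change in odds per one-unit increase (negative means reduction)."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% confidence interval from a coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def odds_ratio_over_units(odds_ratio, units):
    """Per-unit odds ratios compound multiplicatively across several units."""
    return odds_ratio ** units


# Assumed OR of 0.951 per year of age (back-calculated from the quoted 4.9%).
per_year = percent_change_in_odds(0.951)
per_decade = percent_change_in_odds(odds_ratio_over_units(0.951, 10))

# Hypothetical coefficient and standard error, for illustration only.
example_or, example_lower, example_upper = odds_ratio_with_ci(-0.05, 0.01)
```

Under this assumption, ten additional years of age correspond to roughly a 39.5% reduction in the odds of survival, not 49%, because the per-year effect multiplies rather than adds. A 95% interval that excludes 1 corresponds to significance at the α=0.05 level used by the authors.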
In reply to “Comments on the Quality of English Language” The authors have used "Rapid English Editing" to improve the English language, as suggested by the editor. The final version will be edited by the program provided by the journal.
In reply to “Please rework the manuscript to be more concise, especially in the methods and results sections.” The authors better integrated the results and tried to synthesize them, without altering the objectives of the study in accordance with what was requested by the other reviewer.
In reply to “l. 246: prospective. There are more typos in the manuscript. Please go through the document carefully.” The authors state that this word has been removed from the submitted version (highlighted in red). They have also carefully reread the manuscript and improved the writing.
The figures are extracted from the statistics program and therefore it is not possible to improve the image quality.
