Predictive Scores for Late-Onset Neonatal Sepsis as an Early Diagnostic and Antimicrobial Stewardship Tool: What Have We Done So Far?

Background: Late-onset neonatal sepsis (LOS) represents a significant cause of morbidity and mortality worldwide, and early diagnosis remains a challenge. Various ‘sepsis scores’ have been developed to improve early identification. The aim of the current review is to summarize the current knowledge on the utility of predictive scores in LOS as a tool for early sepsis recognition, as well as an antimicrobial stewardship tool. Methods: The following research question was developed: Can we diagnose LOS with accuracy in neonates using a predictive score? A systematic search was performed in the PubMed database from 1982 (first predictive score published) to December 2021. Results: A total of 1352 articles were identified, of which 16 were included in the review. Eight were original scores, five were validations of already existing scores and two were mixed. Predictive models were developed by combining a variety of clinical, laboratory and other variables. The majority were found to assist in early diagnosis, but almost all had limited diagnostic accuracy. Conclusions: There is an increasing need worldwide for a simple and accurate score to promptly predict LOS. Combinations of the selected parameters may be helpful, but until now, no single score has been proven to be comprehensive.


Introduction
The term 'neonatal sepsis' (NS) is used to describe the systemic condition caused by bacteria, viruses or fungi in the first four weeks of life and is a clinical syndrome that may include signs of systemic infection, circulatory shock and multisystem organ failure [1]. Depending on the age of onset and timing of the sepsis episode, NS has been classified as either early-onset (EOS) (<72 h of life) or late-onset (LOS) (>72 h of life) [2].
Neonatal sepsis remains one of the most common causes of neonatal morbidity and mortality in developed and developing countries [3,4]. Widespread antimicrobial use for neonatal sepsis is associated with the development of resistant microorganisms, as well as adverse clinical outcomes [5][6][7]. It is estimated that 31% of global annual neonatal deaths related to sepsis could be attributed to antimicrobial resistance [8]. Simultaneously, the inability of neonates, and especially the premature ones, to moderate an inflammatory response makes them more susceptible to infectious diseases than older children or adults [9]. Notwithstanding major advances in neonatal care and increasing research, the early diagnosis of neonatal septicemia is a vexing problem and remains a great challenge to pediatricians due to multiple reasons [10]. First of all, the clinical picture is nonspecific [11,12], the signs of NS may be absent or minimal and hard to detect [13] and many noninfectious syndromes have initial clinical presentations similar to severe infections [14]. Undoubtedly, the gold standard for the diagnosis of a systemic infection is the isolation of pathogens from the

Information Sources
This review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (PRISMA) [25,26]. We performed a systematic search of all available literature in the PubMed database from 1982 to December 2021.

Search Strategy
The following research question was developed: Can we diagnose late-onset sepsis with accuracy in neonates using a predictive score?

Inclusion and Eligibility Criteria
We limited our search results to articles written in English and articles published from 1982 onwards, because the first predictive score (Töllner et al.) was published in 1982. No other limits were applied. All publications were eligible for review, with particular emphasis on research and observational studies.

Selection Process
We screened 1352 articles for relevance. We focused on studies creating or validating a predictive score for the diagnosis of LOS. Therefore, we excluded studies on EOS (<72 h). Furthermore, we set the presence of a score as a prerequisite for study inclusion. As a result, we excluded articles with models or algorithms that provide a prediction for LOS but not a score or a stratification of the probability of LOS. References from the screened articles were further reviewed for additional articles. Ultimately, 16 articles were included in this review, as they met all the above-mentioned criteria (Figure 1).

Results
For this systematic review, 1352 articles were identified, 33 were retrieved and 16 were included. Of them, fifteen were original studies, and one was a review/meta-analysis that included nine studies focusing predominantly on clinical parameters that might predict  [14]. Of the 16 studies included in the present review, 5 were conducted in India; 2 in Belgium; 2 in Thailand and 1 from each of the following countries: the USA, Germany, Australia, Canada, Turkey and Bangladesh. The review/meta-analysis was conducted in Belgium. Nine were prospective, four retrospective and two mixed (retrospective and prospective). Eight studies referred to original scores (new scores), five were validations of already existing scores and two were mixed (new scores and validations of previous scores). A total of 2664 neonates were included in the above studies.
To facilitate the presentation of the studies included in this review, we categorized them according to the type of variables used for each score. For example, if a predictive model used solely clinical parameters to predict LOS, this was categorized as clinical. On the other hand, if a predictive model used only laboratory parameters to predict LOS, this was categorized as laboratory. If a combination of parameters was used, the model was classified in the clinical/laboratory group, etc. We therefore created six different categories as follows: clinical; laboratory; clinical and laboratory; risk factors; clinical, laboratory and risk factors; and clinical, laboratory and management. The scores are presented in Table 1 (scores with exclusively clinical variables), Table 2 (scores with exclusively laboratory variables) and Table 3 (scores with combined variables). The diagnostic accuracy of each score is presented in Table 4.
First bedside clinical score for very premature neonates in a low-resource setting. This external validation showed significantly lower sensitivity than the original study.
As the number of signs present within 48 h of the sepsis evaluation increased, Se and NPV decreased, while Sp and PPV increased. Sensitivity fell when more than one sign was present.
The analysis was divided into three phases: before the onset, at the beginning and at the peak of the illness. Each phase gave different results: as the illness evolved, the scores rose. Changes in skin coloration were the most frequent sign of NS. Septic neonates had high scores (47% at the beginning of the illness and 92% in seriously ill infants), in contrast with non-septic neonates.

Scores with Clinical Variables
The predictive scores based on Clinical Variables are shown in Table 1. In 2003, Singh et al. created the first scoring system based exclusively on clinical variables [27]. This score was validated twice; the first validation took place in 2008 by Kudawla et al. (same team) [29]. The investigators also attempted to estimate whether repeating the assessment at different time points would be useful. The clinical score was calculated at 0 h and at 24 h after the onset of the illness. An additional validation was performed in 2010 by Rosenberg et al., with the exception that the signs of chest retraction and pre-feed aspirates were replaced by respiratory distress and poor feeding, respectively [30]. The team also sought to create a new score. The study resulted in the first bedside clinical score for nosocomial sepsis in preterm neonates, aimed mainly at low-resource settings. Therefore, the score originally created by Singh et al. was further validated prospectively, which may increase any future diagnostic utility. In 2005, two years after the original Singh et al. score, Dalgic et al. presented a comparative study in which they contrasted the Nosocomial Sepsis Predictive Score (NOSEP) (clinical, laboratory and risk factors) with a clinical score made by their NICU [28]. All patients were evaluated for sepsis both by the NOSEP score and the team's clinical score. This was also a very useful approach, as it directly compared a score with actual clinical practice.

Scores with Laboratory Variables
The predictive scores based on Laboratory Variables are shown in Table 2. In 1988, Rodwell et al. created the first hematologic scoring system (HSS) for neonatal sepsis (both EOS and LOS), consisting exclusively of laboratory variables [23]. The HSS was then prospectively evaluated in 2011 by Narasimha and Harendra and again in 2013 by Makkar et al. [31,32]. Of note, the score created by Rodwell et al., which relies on laboratory values, was found to have high sensitivity but low specificity in the diagnosis of LOS (Table 4).
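The accuracy measures quoted throughout this review (sensitivity, specificity, PPV and NPV) all derive from a 2x2 table of score result against the reference diagnosis. The following is a minimal sketch of that arithmetic; the class name, function name and the cohort counts are hypothetical and are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of score result versus the reference diagnosis."""
    tp: int  # score positive, sepsis confirmed
    fp: int  # score positive, no sepsis
    fn: int  # score negative, sepsis confirmed
    tn: int  # score negative, no sepsis


def diagnostic_accuracy(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical cohort: 100 neonates, 20 with confirmed LOS (illustrative only).
example = ConfusionCounts(tp=18, fp=40, fn=2, tn=40)
metrics = diagnostic_accuracy(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this invented cohort, the score misses few septic neonates (high sensitivity, high NPV) but flags many non-septic ones (low specificity, low PPV), which is the pattern described above for a highly sensitive but poorly specific score.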

Scores with Clinical and Laboratory Variables
The predictive scores based on both Clinical and Laboratory Variables are shown in Table 3. In 1982, U. Töllner created the first predictive score for the early diagnosis of septicemia in newborns [24]. The analysis was divided into three phases: symptoms before (when the patient showed no changes in their clinical and hematological values), at the beginning (upon the initial presentation of symptoms of septicemia or hematological changes) and at the peak of the illness (with all clinical symptoms of septicemia/septic shock present). The score was also tested in a prospective cohort on not only septic and healthy neonates but also on neonates with other clinical conditions. This score was, at that point, the most comprehensive one in terms of variable inclusion and set the background for the development of further scores.
A more detailed approach was taken by Griffin et al. in 2007, in which the Heart Rate Characteristics (HRC) were analyzed in addition to other known clinical and laboratory findings [33]. They also created a score containing variables connected to NS. The researchers recorded signs and symptoms before, at the time of and after the blood culture (BC). The calculated HRC index, used as an adjunct to the clinical information, was found to be useful in LOS predictions.
Finally, the first predictive model for bacterial LOS was presented by Husada et al. in 2020, incorporating a variety of parameters, and was found to have high sensitivity, as well as specificity (Table 4) [34].

Scores Based on Risk Factors
In 1994, Singh et al. created the only scoring system for the prediction of NS using solely perinatal risk factors [39]. The investigators examined the interdependence of each variable, categorized them as dependent or independent factors and subsequently developed a score not only for EOS but for LOS as well. This score was found to have high sensitivity but very low specificity (Table 4). No further predictive scores were created based solely on risk factors.

Scores Based on Clinical, Laboratory and Risk Factors
In 2000, Mahieu et al. presented a bedside scoring system for NICUs named NOSEP, specifically targeting nosocomial sepsis (occurring >48 h after admission) [35]. The NOSEP-2 score was composed by adding the culture results of central vascular catheters to the NOSEP score. An internal and an external validation of the NOSEP score in six NICUs were presented by Mahieu et al. in 2002 [36]. Additionally, in order to increase the predictive performance of the score, the investigators developed the NOSEP-NEW-I and NOSEP-NEW-II scores by changing the cut-off values and including additional variables, respectively. All the above scores, shown in Table 3, were generally more comprehensive and well-validated in larger cohorts of neonates. Their sensitivity was high, and their negative predictive value was higher compared to other categories of scores (Table 4).
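Changing the cut-off of a point-based score, as was done for NOSEP-NEW-I, shifts the balance between sensitivity and specificity. The sketch below illustrates the general effect on invented data; the function name, the point totals and the cut-offs are hypothetical and do not correspond to NOSEP or to any other published score.

```python
def accuracy_at_cutoff(scores, has_sepsis, cutoff):
    """Call the score positive when score >= cutoff.

    Returns (sensitivity, specificity) as fractions.
    """
    tp = sum(1 for s, y in zip(scores, has_sepsis) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, has_sepsis) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, has_sepsis) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, has_sepsis) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical point totals for ten neonates (not taken from any study).
scores = [2, 5, 8, 9, 11, 12, 3, 6, 7, 10]
has_sepsis = [False, False, False, True, True, True, False, False, True, True]

for cutoff in (6, 8, 10):
    se, sp = accuracy_at_cutoff(scores, has_sepsis, cutoff)
    print(f"cutoff >= {cutoff}: sensitivity {se:.2f}, specificity {sp:.2f}")
```

On these invented data, raising the cut-off from 6 to 10 lowers sensitivity while raising specificity, so the choice of cut-off depends on whether missed cases or unnecessary treatment is the greater concern.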

Scores Based on Clinical, Laboratory and Management Variables
Few scores were based on clinical and laboratory variables as well as management decisions (Table 3). In particular, in 2005, Okascharoen et al. presented the first bedside scoring system for hospitalized neonates, after examining multiple variables associated with proven LOS [37]. In 2007, the same team presented an external validation of this scoring system [38]. In the same study, clinicians who were not aware of the criteria of the LOS score were asked to complete a questionnaire and rate the probability of true sepsis after obtaining basic laboratory results. The researchers concluded that clinicians may predict LOS as accurately as the scoring system but tend to overestimate the probability of LOS, and that the score performed better in prediction than the clinicians' judgment. This score was also found to have high sensitivity and a high negative predictive value (Table 4).

Discussion
Late-onset neonatal sepsis is one of the most challenging areas in neonatal medicine today. While it is crucial to diagnose NS early, it is equally important not to overuse antibiotics.

Predictive Scores: Clinical vs. Laboratory Parameters
Predictive scores are powerful tools to improve clinical decision-making; they simplify the decision-making procedure and assist clinicians in increasing the accuracy of the diagnostic assessment [40,41]. Predictive models may facilitate medical judgment through a more evidence-based procedure. In order to make predictive scores the cornerstone of the early diagnosis of NS, we need an accurate and easy-to-use model. A guide for how these prediction models should be structured was published in protocol form [42].
Scores based solely on clinical symptoms and signs are easy to use and are of major importance in low-/middle-resource settings and in primary care, where a laboratory assessment may be inaccessible or unaffordable. Additionally, these scores are quick to evaluate. However, a significant limitation of exclusively clinical scores is the subjectivity involved in assessing many clinical parameters (such as lethargy, pallor and hepatomegaly), which makes them more demanding for the inexperienced clinician.
On the other hand, scores containing exclusively laboratory variables provide a more objective, and thus potentially more accurate, tool for the decision-making process. Because they are easy for an inexperienced physician to apply, they offer a more standardized basis for clinical practice. Nonetheless, laboratory results take time to obtain, time that may be of vital importance in emergency situations, and laboratory testing may be scarce in developing countries and primary care. In addition, it must be taken into consideration that the hematologic response varies according to multiple factors (gestational and postnatal age, time between onset of infection and the blood sample, etc.). It should also be underlined that, because of the rapid changes in the course of the illness, it is crucial that the score can be repeated at short intervals.
As a result, we consider that the optimal balance for an ideal score may be a combination of clinical and laboratory variables. Clinical points will raise the suspicion of LOS without waiting for the laboratory results (so that the clinician can start empirical therapy), and the laboratory results will then confirm or refute this suspicion (so that the clinician continues or withdraws treatment). Adding extra indicators, such as risk factors for NS or management variables, helps clinicians to keep in mind children at higher risk for NS. Generally, the knowledge that hematologic parameters change rapidly in comparison with clinical signs or management factors is pivotal for the assessment of the variables when composing the score.

Prematurity and Low Birth Weight (LBW) Infants
Several of the scores were tested in preterm/very preterm and/or LBW/VLBW neonates [27,29,30,32,33,37]. Three studies pointed out that NS was more common in preterm and/or LBW patients [23,33,39]. Assessment in this population is extremely helpful, because premature and LBW patients constitute a significant proportion of neonates and, in particular, of the groups at higher risk for LOS. Nevertheless, these babies do not respond to sepsis as full-term infants do. For example, fever is rarer in premature babies, and hematologic responses vary according to age and BW. Additionally, symptoms such as apnea, chest retractions and grunting were more common in premature babies because of lung immaturity, and as a result, we should consider them less appropriate for predicting NS in neonates of all ages (these signs are highly sensitive but less specific). For example, the external validation of the Singh et al. score by Rosenberg et al. in 2010 performed significantly worse than the original study, which the investigators attributed to differences in the study populations: the original score was intended for all neonates admitted to the NICU (although with an emphasis on preterm and LBW neonates), while the validation study involved only very preterm neonates (≤33 weeks gestational age, admitted to the NICU within 72 h of life), who are generally more prone to respiratory symptoms because of more severe lung immaturity. Moreover, in the study of Okascharoen et al. in 2005, the validation set contained a significantly higher proportion of preterm (from 40% to 69%) and LBW infants (from 18% to 49%) than the derivation set. Therefore, scores tested in both groups together may give a clouded picture of their accuracy in the diagnosis of LOS. A score applicable to all newborns, or at least a score composed, tested and used in a particular group (only preterm, only term, etc.), merits discussion.

Clinical and Laboratory Parameters of High Diagnostic Value
Taking into consideration the results reported in each study of our review, overall, apnea, lethargy, tachycardia, pre-feed aspirates, changes in skin coloration, abnormal temperature and abnormal HR seemed to be the most sensitive, while grunting, hypothermia and chest retractions the most specific clinical signs. Feeding intolerance, hypotonia, lethargy and fever were clinical signs highly predictive of NS. One study and its validation indicated UVC usage as a sensitive parameter [37,38]. Some studies pointed out fever as a significant symptom of NS [34,35,37]. One should bear in mind though that neonates, especially the preterm ones, cannot develop a febrile response similar to the term ones, due to a lack of immune system development. Hence, hypothermia is also a worrying clinical sign.
As for the laboratory findings, the immature to total (I:T) neutrophil count ratio, admit immature PMN count, immature to mature (I:M) neutrophil count ratio and total PMN count appeared to be the most sensitive variables, whereas the PLT count, degenerative changes, total WBC count, I:T ratio and I:M ratio appeared to be the most specific. The I:T ratio, PLT count, I:M ratio and neutrophil fraction were found to be highly predictive laboratory signs. Three studies indicated that the I:T ratio may be the most reliable laboratory sign for prediction [31][32][33].

New Diagnostic Techniques
Besides the scores discussed above, new diagnostic techniques are being applied to the challenge of the accurate and early diagnosis of NS. Machine learning is a subfield of artificial intelligence that uses electronic data to train and validate models such as artificial neural networks in order to create diagnostic models and algorithms. A large amount of data (signs, symptoms and especially monitoring and laboratory data) is gathered from septic and non-septic infants and used to build a computer-based algorithm that, based on precise coefficients of the variables, can diagnose or predict NS. The outcome of these models can be sepsis/no sepsis and/or a stratification of the probability of NS. The main purpose of this approach is to introduce a more personalized basis for the diagnosis of neonatal sepsis, relying on precise and continuous information. A significant number of such studies reflect the interest in this new approach, with impressive results in terms of diagnostic power [43][44][45][46][47][48]. Undoubtedly, this idea provides a pioneering perspective on the field, not only for the present, when NICUs rely on continuous monitoring, but also for the near future, when computing methodologies will play a crucial role in medical decision-making generally. In our review, these studies were excluded because we set a strict limit to studies containing arithmetical scores only. However, the preexisting data presented in this manuscript can reveal trends that are missed by cursory or even more detailed analysis. The use of high-speed computing, in which multiple factors are incorporated, could determine the optimal combination of the clinical and laboratory-based criteria. The reported sensitivity and specificity values in Table 4, along with the positive/negative predictive values, could be used in this type of approach.
Instead of a single score, a two-tiered scoring system could be developed: a first score that integrates the intrinsic value of each defined measurement for early screening, and a second score for a definitive diagnosis. For instance, the early score would prioritize high sensitivity and would screen infants for NS using combined clinical and laboratory data suggested by computer analysis. Sensitivity is given priority over specificity when screening for NS so that false negatives are avoided. However, this will lead to an increased number of false positives. Consequently, the definitive score would be used to rule out false positives. This score would need to be highly specific but could be less sensitive. This approach should make the final determination of LOS more objective and lead to appropriate antimicrobial therapy.
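The combined performance of such a two-tiered system can be estimated with the standard formulas for serial testing, in which the second score is applied only to neonates flagged by the first and LOS is labelled only when both are positive. The sketch below assumes that the two scores err independently given the true disease status, which may not hold when they share variables; the function name and all input values are hypothetical.

```python
def serial_two_tier(se1, sp1, se2, sp2):
    """Net accuracy when tier 2 is applied only to tier-1 positives.

    A neonate is labelled LOS only if both tiers are positive.
    Assumption: the tiers err independently given true disease status.
    """
    net_sensitivity = se1 * se2
    net_specificity = sp1 + (1 - sp1) * sp2
    return net_sensitivity, net_specificity


# Hypothetical tier characteristics, for illustration only:
# a sensitive screening score followed by a specific confirmatory score.
net_se, net_sp = serial_two_tier(se1=0.95, sp1=0.50, se2=0.85, sp2=0.95)
print(f"net sensitivity {net_se:.3f}, net specificity {net_sp:.3f}")
```

Under these assumptions, the net specificity rises above that of either tier, but the net sensitivity falls below both, because the second tier cannot recover cases missed by the first. This is why the definitive score cannot be allowed to be too insensitive.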

Strengths and Limitations
This study is the first systematic review of all the existing scores containing all possible variables for the early diagnosis of LOS. We acknowledge, though, that our study also has limitations. The main one is that different definitions of LOS were used in the studies included. There were studies in which sepsis was defined as a positive blood or CSF culture, studies that defined sepsis as two positive blood cultures for the same organism and other studies in which a positive culture combined with clinical findings suggestive of sepsis was required for the definition. In addition, as LOS definitions require culture-proven infection, the true disease incidence might have been underestimated, which in turn may limit the diagnostic accuracy of the scores. Moreover, we performed our search only in the PubMed database, and hence, we might have missed some scores indexed in other databases. Finally, three of the scores in our review included some data on EOS; hence, their diagnostic ability may not be as accurate for LOS compared to the rest of the studies, which included only LOS cases in their predictive models.

Conclusions
Predictive scores for late-onset neonatal sepsis have the potential to be a useful tool for early diagnosis and for guiding whether an individual patient needs antimicrobials. Consequently, a future goal is to find the optimal balance between objective clinical, basic laboratory and other pivotal variables for composing the ideal score, so as to improve diagnostic accuracy and rationalize antibiotic use. Lessons learnt from the studies to date will be vital for the introduction of new diagnostic scores not only for NICUs and Emergency Centers but also for low-resource settings. As we do not yet have a suitable score or model, we must continue the efforts to determine the optimum course of action, without neglecting the principle that any prediction model should play an adjunctive and supplementary role in medical judgment and not supersede fundamental clinical opinion.