Predicting Complicated Appendicitis in Children: Pros and Cons of a New Score Combining Clinical Signs, Laboratory Values, and Ultrasound Images (CLU Score)

Background: This retrospective study aimed to combine the clinical signs, laboratory values, and ultrasound images of 199 children with acute appendicitis in order to create a new predictive score for complicated appendicitis in children. Methods: The study included children who had clinical examination of abdominal pain (description of pain, anorexia, body temperature, nausea or vomiting, duration of symptoms), laboratory findings on admission (white blood cell, platelets, neutrophils, C-reactive protein), preoperative abdominal ultrasound, and histopathological report after an operation for appendicitis in their records during the period from January 2016 to February 2022. Results: According to the statistical analysis of the values using multivariate logistic regression models, the patients with appendiceal diameter ≥ 8.45 mm, no target sign appearance, appendicolith, abscess, peritonitis, neutrophils ≥ 78.95%, C-reactive protein ≥ 1.99 mg/dL, body temperature ≥ 38 °C, pain migration to right lower quadrant, and duration of symptoms < 24 h were more likely to suffer from complicated appendicitis. The new score was comprised of the 10 variables that were found statistically significant in the multivariate logistic model. Each of these variables was assigned a score of 1 due to the values that were associated with complicated appendicitis. Conclusions: A cutoff value of ≥4 has been a good indicator of the final score. The sensitivity with the usage of this score is 81.1%, the specificity 82.4%, the PPV 73.2%, the NPV approaches 88% and finally the accuracy is 81.9%. Also, the pros and cons of this score are discussed in this study.


Introduction
Acute appendicitis (AA) is the most common cause of acute abdominal pain requiring surgical treatment in the pediatric population. AA in children remains a diagnostic challenge even for experienced emergency physicians and pediatric surgeons [1,2]. The rate of errors in the primary diagnosis of AA ranges from 28% to 57% in children aged 2-12 years and is almost 100% in children younger than 2 years, leading to an increased rate of complications, such as perforation, abscess, peritonitis, or sepsis [3,4]. In addition, the percentage of negative appendectomies is reported in the literature as 3%, 10%, and sometimes 20% [2,5].
There is a tendency in recent years to not only establish a diagnosis of the AA, but to also distinguish preoperatively the uncomplicated cases of appendicitis from the complicated ones. The goal is either an early operation to prevent complications or conservative management to reduce the risk of a negative appendectomy [6]. Although the management of different forms of AA still remains controversial, many examinations including ultrasonography (US), computed tomography (CT) scan, magnetic resonance imaging (MRI),

Materials and Methods
In this study, which is a continuation of our previous study, we retrospectively examined the records of patients aged 0-14 years who were hospitalized in the Pediatric Surgery Department of Alexandroupolis University Hospital, Democritus University of Thrace, in order to design a new predictive score for complicated appendicitis [22]. The study included children whose records, between January 2016 and February 2022, contained a clinical examination of abdominal pain (description of pain, anorexia, body temperature, nausea or vomiting, duration of symptoms), laboratory findings on admission (white blood cell (WBC), platelets (PLT), neutrophils (NEUT), C-reactive protein (CRP)), a preoperative abdominal US, and a histopathological report after an operation for AA. Exclusion criteria were the absence of data related to these parameters in the child's record, as well as cases of non-acute appendectomy, histopathologically confirmed normal appendix, and carcinoid or other pathology. In total, 52 out of 251 children were excluded from the statistical analysis. Specifically, patients were excluded due to missing data (n = 14), non-acute appendectomy (n = 5), histopathologically confirmed carcinoid or other pathology (n = 3), non-identified appendix on US (n = 25), and negative histopathology (n = 5). The methods of histopathological and US examination of our patients are described in detail in our previous study [22].
The study was conducted according to the guidelines of the Declaration of Helsinki, and the original protocol was approved by the Medical Ethics Committee of Alexandroupolis University Hospital (approval number 6809/19-02-2021).
Categorical variables were expressed as absolute and relative frequencies (n, %). Quantitative variables were presented as mean (±SD) values. All continuous variables were tested for normality using the Kolmogorov-Smirnov test. Hence, non-parametric Mann-Whitney U tests were applied to analyze differences in continuous variables between groups. A Pearson chi-square test was used for the comparison of categorical variables between groups. Receiver operating characteristic (ROC) curves, with calculation of the sensitivity and specificity of the best cut-off and of the area under the curve (AUC), were used to measure the diagnostic value of the continuous variables. Univariate and multivariate logistic regression analyses were applied to explore the sonographic findings, symptoms, and laboratory findings associated with a histopathological diagnosis of acute complicated appendicitis (ACA). Odds ratios with 95% confidence intervals were computed from the results of the logistic regression analyses. For the evaluation of the predictive value of the CLU score, ROC curves were analysed. All statistical analyses were performed using IBM SPSS Statistics v25.0. All statistical tests were performed at a 0.05 significance level.
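The text above refers to the "best cut-off" of each ROC curve without naming the selection criterion. The following is a minimal sketch of one common choice, maximising Youden's J (sensitivity + specificity - 1). This criterion is an assumption made for illustration and is not stated in the study; the function name `best_cutoff_youden` and the toy data are hypothetical.

```python
from typing import List, Tuple


def best_cutoff_youden(values: List[float], labels: List[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    A case is called positive when value >= cutoff.
    labels: 1 = complicated (ACA), 0 = uncomplicated (AUA).
    Assumption: Youden's J is used; the study does not name its criterion.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes must be present")

    best = (float("nan"), 0.0, 0.0)
    best_j = -1.0
    # Every observed value is a candidate threshold.
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1.0
        if j > best_j:  # ties keep the lowest cutoff
            best_j = j
            best = (cutoff, sens, spec)
    return best


# Hypothetical toy data, NOT taken from the study.
toy_values = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 6.0]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sens, spec = best_cutoff_youden(toy_values, toy_labels)
print(cutoff, sens, spec)
```

Other criteria (for example, the point closest to the top-left corner of the ROC plot) can give a different threshold on the same data.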

Results
Demographic and clinical characteristics are presented in Table 1, which shows that 37.2% of the patients were diagnosed with ACA and 62.8% with acute uncomplicated appendicitis (AUA). The majority of the sample were male (60.8%), with a mean (±SD) age of 9.44 (±2.69) years. Patients with ACA were significantly younger than those with AUA (8.77 vs. 9.83, p = 0.031). The mean appendiceal diameter was significantly higher in patients with ACA (10.23 vs. 8.32, p < 0.001). A usual anatomical position was detected in the majority of the sample (89.9%). There was a significant difference between ACA and AUA patients in terms of distinct appendiceal wall layers, as the ACA patients had distinct appendiceal wall layers at a higher percentage (62.2% vs. 35.2%, p < 0.001). Of the patients, 93% had a non-compressible appendix. The majority of the patients with AUA had an appearance of target sign (70.4% vs. 41.9%, p < 0.001). No hypervascularisation (96%) and no lymphadenitis (65.8%) were observed in most patients. A higher percentage of ACA patients had appendicolith (41.9% vs. 13.6%, p < 0.001). Periappendiceal fat inflammation and free abdominal fluid were observed in most patients with ACA (73% vs. 49.6%, p = 0.001; 73% vs. 59.2%, p = 0.050, respectively). Diffuse free intraperitoneal fluid (DFIF) was detected at a higher percentage in the patients with ACA (20.3% vs. 9.6%, p = 0.034). Similar results were also found for free intraperitoneal fluid in the periappendiceal region (PFIF) (51.4% vs. 36%, p = 0.034). Free intraperitoneal fluid in Douglas's pouch (DPFIF) was found in 24.1% of patients. Abscess (17.6% vs. 0.8%, p < 0.001) and peritonitis (16.2% vs. 1.6%, p < 0.001) were more common in patients with ACA. Statistically significant differences were also found in inflammatory laboratory markers, with the WBC count, NEUT, PLT, and CRP all being significantly higher in the group of ACA patients (p = 0.010, p < 0.001, p = 0.032, p < 0.001, respectively).
Right lower quadrant (RLQ) tenderness to percussion, coughing, and hopping was observed in all ACA patients (100% vs. 50.4%, p < 0.001), and pain migration to the RLQ was more frequent in ACA than in AUA patients (45.9% vs. 19.2%, p < 0.001). A significantly higher percentage of ACA patients had anorexia (73% vs. 56.8%, p = 0.023), body temperature up to 38 °C (44.6% vs. 28%, p = 0.017), and nausea/vomiting (40.5% vs. 26.4%, p = 0.038). All patients had tenderness over the right iliac fossa (RIF). Finally, 43.2% of patients had symptoms lasting 24-48 h. No statistically significant differences were found in terms of gender, anatomical position, non-compressible appendix, hypervascularisation, DPFIF, lymphadenitis, and duration of symptoms between the two groups.
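As an illustration of the group comparisons reported above, the following sketch applies an uncorrected Pearson chi-square test to the appendicolith finding (41.9% vs. 13.6%). The cell counts are an assumption: they are back-calculated from the reported percentages and group sizes (74 ACA and 125 AUA patients out of 199) and are not taken from Table 1. The function name is hypothetical.

```python
import math
from typing import Tuple


def pearson_chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p) with 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Assumed counts, reconstructed from the reported percentages:
# ACA: 31 of 74 with appendicolith (41.9%); AUA: 17 of 125 (13.6%).
chi2, p = pearson_chi_square_2x2(31, 74 - 31, 17, 125 - 17)
print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
```

Under these assumed counts the test gives p < 0.001, in line with the value reported for this comparison.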
A cutoff value of ≥8.45 proved to be a good indicator for the appendiceal diameter (mm). The corresponding values for the detection of acute complicated appendicitis were 71.6% for sensitivity, 60% for specificity, 51.5% for PPV, and 78.1% for NPV (AUC = 0.69; 95% CI: 0.609-0.769; p < 0.001) (Figure 1). A cutoff value of ≥15.96 proved to be a good indicator for the WBC count. The corresponding values for the detection of acute complicated appendicitis were 51.4% for sensitivity, 61.6% for specificity, 44.2% for PPV, and 68.1% for NPV (AUC = 0.61; 95% CI: 0.528-0.691; p = 0.010) (Figure 2). A cutoff value of ≥78.95 proved to be a good indicator for NEUT. The corresponding values for the detection of acute complicated appendicitis were 70.3% for sensitivity, 51.2% for specificity, 46% for PPV, and 74.4% for NPV (AUC = 0.64; 95% CI: 0.559-0.717; p < 0.001) (Figure 3). A cutoff value of ≥1.99 proved to be a good indicator for CRP. The corresponding values for the detection of acute complicated appendicitis were 77% for sensitivity, 58.4% for specificity, 51.8% for PPV, and 80.9% for NPV (AUC = 0.74; 95% CI: 0.666-0.811; p < 0.001) (Figure 5).
According to the multivariate logistic regression models, patients with an appendiceal diameter equal to or higher than 8.45 mm, no target sign appearance, appendicolith, abscess, peritonitis, NEUT equal to or higher than 78.95%, CRP equal to or higher than 1.99 (OR = 3.46, p = 0.018), body temperature equal to or higher than 38 °C (OR = 3.15, p = 0.038), pain migration to the RLQ (OR = 4.17, p = 0.009), and duration of symptoms lower than 24 h (OR = 7.76, p = 0.015) were more likely to suffer from ACA (Table 2). The CLU score comprised the 10 variables that were found to be statistically significant in the multivariate logistic model. To construct the score, each of these variables was assigned a value of 1 when it took the value associated with ACA. More specifically, a patient with an appendiceal diameter equal to or greater than 8.45, no appearance of target sign, appendicolith, abscess, peritonitis, NEUT equal to or greater than 78.95, CRP equal to or greater than 1.99, body temperature equal to or greater than 38 °C, pain migration to the RLQ, and symptom duration of less than 24 h was assigned a CLU score of 10.
A cutoff value of ≥4 proved to be a good indicator for the final score. The corresponding values for the detection of acute complicated appendicitis were 81.1% for sensitivity, 82.4% for specificity, 73.2% for PPV, 88% for NPV, and 81.9% for accuracy (AUC = 0.879; 95% CI: 0.830-0.928; p < 0.001) (Figure 6).
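The scoring rule described above can be sketched in a few lines. The thresholds, the one-point-per-item weighting, and the cutoff of ≥4 come from the text; the class name, field names, and the example patient are hypothetical and serve only as an illustration, not as a validated clinical tool.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Inputs to the CLU score (field names are hypothetical)."""
    appendiceal_diameter_mm: float
    target_sign: bool
    appendicolith: bool
    abscess: bool
    peritonitis: bool
    neutrophils_pct: float
    crp_mg_dl: float
    temperature_c: float
    pain_migration_rlq: bool
    symptom_duration_h: float


def clu_score(f: Findings) -> int:
    """One point for each of the 10 findings associated with ACA (range 0-10)."""
    items = [
        f.appendiceal_diameter_mm >= 8.45,  # US
        not f.target_sign,                  # US: no target sign appearance
        f.appendicolith,                    # US
        f.abscess,                          # US
        f.peritonitis,                      # US
        f.neutrophils_pct >= 78.95,         # laboratory
        f.crp_mg_dl >= 1.99,                # laboratory
        f.temperature_c >= 38.0,            # clinical
        f.pain_migration_rlq,               # clinical
        f.symptom_duration_h < 24.0,        # clinical
    ]
    return sum(items)


def suggests_complicated(f: Findings, cutoff: int = 4) -> bool:
    """Apply the reported cutoff (CLU >= 4)."""
    return clu_score(f) >= cutoff


# Hypothetical patient, not taken from the study data.
patient = Findings(9.0, True, True, False, False, 82.0, 1.2, 38.2, False, 30.0)
print(clu_score(patient), suggests_complicated(patient))
```

For this hypothetical patient, four items are positive (diameter, appendicolith, NEUT, temperature), so the score reaches the cutoff.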

Discussion
The diagnosis of AA in children is not an easy task, especially in younger children, who often cannot describe their pain or who present with nonspecific signs of abdominal pain. Physicians often have difficulty deciding on the course of treatment and the timing of surgical intervention. Delaying the diagnosis of AA may be associated with increased recovery periods and hospitalization costs, risk of in-hospital infections, and higher morbidity and mortality [23,24].
The diagnostic pathway for acute abdominal pain in the emergency department varies and mostly depends on the physician's clinical experience. Recent studies have tried to create algorithms to reach the right diagnosis. In 2015, the World Society of Emergency Surgery (WSES) organized in Jerusalem the first consensus conference producing evidence-based guidelines for the diagnosis and treatment of AA in adult patients, and in 2020 the guidelines were updated for adult and pediatric populations. These guidelines discuss the usefulness of scores for the diagnosis of AA, but recommend not making the diagnosis based on the already known scores alone, especially in children [25][26][27].
Recent studies try to differentiate preoperatively AUA from ACA, since the treatment for AUA is safe and can be non-operative, while the treatment of ACA is more complicated, especially in children younger than three years, as it is reported that the perforation rate of acute appendicitis is 80-100% for them, while it is approximately 38% in older children [28]. The accurate diagnosis of AA has been improved by using various scores [14]. However, it is still a challenge, especially in children, to predict preoperatively complicated appendicitis in order to decide the right management.
In the present retrospective study, a new score combining clinical, laboratory, and US findings is proposed to preoperatively distinguish AUA from ACA in children. This score comprised three clinical (body temperature equal to or higher than 38 °C, pain migration to the RLQ, duration of symptoms lower than 24 h), two laboratory (NEUT% equal to or higher than 78.95 and CRP equal to or higher than 1.99), and five US (appendiceal diameter equal to or higher than 8.45 mm, presence of an appendicolith, no target sign appearance, peritonitis, and abscess) findings. A CLU ≥ 4 yielded an accuracy of 81.9%, a PPV of 73.2%, and an NPV of 88% in predicting complicated appendicitis, with sensitivity and specificity reaching 81.1% and 82.4%, respectively. All of the parameters selected for the CLU score are easily accessible in daily practice. To our knowledge, this score is one of only two scores for children that use ultrasound, clinical, and laboratory findings in order to distinguish AUA from ACA. The other one was designed by Hao et al., who combined the US findings with the PAS score and found that the combination raised the specificity compared with ultrasound or the PAS score used individually [20]. There is a third study in which a scoring system was built to distinguish AA from non-appendicitis in children using imaging, laboratory, and clinical criteria, but that study does not address the differentiation of AUA from ACA [5]. We created this score in order to achieve greater accuracy in the preoperative differentiation of AUA from ACA by combining more parameters.
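The performance figures quoted above are mutually consistent, which can be checked with a short calculation. The 2x2 counts below are an assumption: they are back-calculated from the reported percentages and the group sizes (74 ACA and 125 AUA patients), and they are not reported in the study. The function name is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard diagnostic accuracy measures from a 2x2 table, as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Assumed counts for CLU >= 4, reconstructed from the reported percentages:
# 60 of 74 ACA patients score >= 4; 103 of 125 AUA patients score < 4.
metrics = diagnostic_metrics(tp=60, fn=14, fp=22, tn=103)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

With these assumed counts, all five measures round to the percentages given in the text.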
Surprisingly, WBC and PLT did not remain significant in the multivariate analysis and were therefore not included in the CLU calculation, although both were significantly higher in the ACA group in the univariate comparison and both are used as part of many appendicitis scores in the literature [2,[29][30][31]. This could be partly explained by the fact that laboratory markers have limited diagnostic utility by themselves because they are elevated in many infectious diseases, especially in children.
In other studies, it is also reported that during the progression of inflammation of AA, the WBC count decreases after an initial higher value than the normal limits, so there is a possibility many of the children in this study with ACA would have WBC within normal limits [32]. Also, in our study, the ACA group contained many younger children, and it is widely accepted that the classical laboratory findings that seem to be the rule in older children or in adolescents are missing in younger children [3,30,33].
Other laboratory markers such as CRP seem to be useful for the diagnosis of AA, as CRP levels increase rapidly in the acute phase of inflammation. CRP alone does not have a high accuracy for diagnosing AA, but in combination with other markers its accuracy is much greater. Also, a low CRP value should be interpreted with caution if symptoms have only developed recently, as CRP seems to increase 10-12 h after the initial symptoms [32,34,35]. In the present study, a CRP value equal to or higher than 1.99 remained statistically significant in the multivariate analysis and became one of the indicators that make up the CLU score. In recent years, other laboratory markers have begun to be investigated; however, the published results are ambiguous. One of these is hyponatremia as a predictive marker for ACA [28]. Although this marker is easy to measure, we did not include it in our study because inflammation, dehydration, vomiting, and diarrhea, which may be present in other diseases besides AA, may cause sodium chloride deficiency.
Many of our patients (42.8%) had symptoms lasting 24-48 h. Although one might expect the inflammatory response in AA to be progressive (the longer the duration of symptoms, the worse the severity of AA) [36], our multivariate analysis showed that patients with a duration of symptoms of less than 24 h were more likely to suffer from ACA, which is why this variable was included in our new score. This result could be explained by the fact that in small children AA is an uncommon disease with a varied presentation and complications that can develop rapidly, and in our study the patients with ACA were significantly younger than the patients with AUA [37]. Also, in our study, the ACA patients not only had perforated or gangrenous appendicitis or diffuse peritonitis, but many also had an appendicolith. Appendicolith in AA is reported as an independent prognostic risk factor and is associated with appendiceal perforation. In recent literature, appendicolith appendicitis seems to have histopathological lesions similar to those of ACA, but most children with appendicolith appendicitis had a shorter duration of symptoms [38]. In our study, the ACA group included a considerable number of patients with appendicolith appendicitis and a shorter duration of symptoms (13 of 31 patients).
Many hospitals, especially for adult patients, rely on CT to diagnose appendicitis, as CT has a sensitivity ranging from 88 to 100% for adults and about 95% for the pediatric population. However, CT has a lower sensitivity and specificity for diagnosing ACA (62% and 81%, respectively). A recent study describes that the combination of CT with laboratory and clinical findings increased the accuracy of CT for ACA [28,39]. However, the possibility of future development of malignancy due to radiation is the major problem with the use of CT as a diagnostic method in children. US is the method of choice for this condition, as it avoids radiation exposure, but its accuracy varies widely because it depends on the operator's training and experience [5]. Even in a center with well-trained radiologists, ultrasound has a low sensitivity for diagnosing complications of the appendix. According to Carpenter's research, US has a sensitivity of 44.0%, a specificity of 93.1%, a PPV of 74.8%, and an NPV of 78.1% in diagnosing perforated appendicitis [40]. In our study, the problem of operator dependence was addressed by an imaging protocol for the recognition of AA, created by a specialist pediatric radiologist, so that specific structures are sought during imaging by every operator. According to this protocol, all US operators should use classical real-time, graded-compression US [41]. The US protocol started by "looking" at the whole abdominal cavity (free fluid and/or other pathology), leaving the right lower quadrant (RLQ) as the last part for evaluation (to avoid early onset of irritability). In the RLQ, graded compression was gently applied in order to gradually displace the bowel loops and reveal the compressibility of the appendix.
This compressibility or non-compressibility of the appendix was the major direct sign of normal appendix (or even when perforation could be present) or acute appendicitis, respectively. The maximum diameter of the appendix (normal < 6 mm) and the wall thickness (normal < 3 mm) were recorded, and the presence of an appendicolith (hyperechoic area with posterior shadowing) was evaluated. It was crucial to attempt to highlight the well-known target sign (hypoechoic fluid-filled lumen, hyperechoic mucosa/submucosa, and hypoechoic muscularis layer). Finally, images using colour Doppler US were obtained (hypervascularity in early stages of acute appendicitis, hypo- to avascularity in abscess and necrosis). This step-by-step approach can be followed in all examinations in order to record the so-called direct signs of acute appendicitis. Furthermore, indirect signs were recorded. These signs were free fluid surrounding the appendix, local abscess formation, increased echogenicity and/or uncompressible local mesenteric fat, enlarged local mesenteric lymph nodes, signs of secondary small bowel obstruction, and thickening of the peritoneum [22]. However, in the present study, all cases where the appendix was not recognized were excluded.
The major strength of our study is that it is one of only two studies in the literature in which a score was designed to distinguish complicated from simple appendicitis in children by combining clinical, laboratory, and ultrasound findings. However, some limitations should be mentioned. Although our hospital has an imaging protocol for appendicitis that every radiologist must apply, there were 25 (9.9%) cases where the appendix was not recognized, and these were excluded from our retrospective study. Also, more cases and studies are needed in order to calibrate this score.

Conclusions
A new score for distinguishing ACA from AUA in pediatric patients with abdominal pain was designed. It combines clinical, laboratory, and US findings in order to increase the accuracy of US in those cases. The proposed score (named CLU) is simple to understand and remember. Its parameters are easy to obtain, and it is cost-effective. It helps not only to diagnose AA, but may also indicate the course of treatment, as it allows ACA to be distinguished from AUA. At a value equal to or greater than 4, its accuracy reaches 81.9% and its NPV reaches 88%.
Further studies based on this score are needed in order to calibrate the method and increase its accuracy.
Author Contributions: K.B. wrote the article and collected the data; K.K. designed and revised the article; A.G. analyzed the data and edited and revised the article; S.F. analyzed the data and edited the article; M.K. collected and analyzed the data; M.A. collected and analyzed the data; and S.D. designed the study and revised and edited the article. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki, and the original protocol was approved by the ethics committee of the Alexandroupolis University Hospital (approval number 6809/19-02-2021).

Informed Consent Statement:
Written informed consent has been obtained from the patient(s) to publish this paper.

Data Availability Statement:
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.