Faecal Calprotectin in Assessment of Mucosal Healing in Adults with Inflammatory Bowel Disease: A Meta-Analysis

Achieving mucosal healing in patients with inflammatory bowel disease is associated with a higher incidence of sustained clinical remission and translates into lower rates of hospitalisation and surgery. The methods for assessing disease activity and response to therapy are limited and rely mainly on colonoscopy. This meta-analysis reviews the effectiveness of faecal calprotectin as a marker of mucosal healing in inflammatory bowel disease. Two meta-analyses were conducted in parallel. The analysis of faecal calprotectin in monitoring mucosal healing in colonic Crohn’s disease is based on 16 publications (17 studies). The data set for the diagnostic value of faecal calprotectin in ulcerative colitis comprises 35 original publications (49 studies in total). The diagnostic odds ratio (DOR) for faecal calprotectin in Crohn’s disease is estimated to be 11.20 and the area under the summary receiver operating characteristic curve (AUC sROC) is 0.829. In ulcerative colitis, the DOR is 14.48 and the AUC sROC is 0.858. Heterogeneity of the studies was moderate to substantial. The collected data show overall good sensitivity and specificity of the faecal calprotectin test, as well as a good DOR. Thus, monitoring of mucosal healing with a non-invasive faecal calprotectin test may represent an attractive option for physicians and patients with inflammatory bowel disease.


Introduction
Inflammatory bowel disease (IBD) is an umbrella term for ulcerative colitis (UC) and Crohn's disease (CD), which are lifelong, severe conditions of the gastrointestinal tract, characterized by heterogeneous clinical presentation, a relapsing-remitting course and a wide spectrum of complications including extraintestinal manifestations [1][2][3].
Because of the complex and unclear pathogenesis of IBD, an effective causative treatment strategy for the disease is still missing. This, in turn, intensifies clinical trials not only of novel therapeutic agents but also of therapeutic goals, which are especially important considering the chronic and unpredictable course of IBD. A treat-to-target strategy aims to limit IBD progression and improve outcomes by adjusting therapy according to the achievement of predefined treatment response targets. The STRIDE (Selecting Therapeutic Targets in Inflammatory Bowel Disease) program identified the therapeutic targets for both UC and CD as clinical/patient-reported outcome (PRO) remission and endoscopic remission. For CD, an additional target is the resolution of inflammation-related findings on imaging in patients who cannot be adequately assessed with ileocolonoscopy [4,5]. The concept of deep remission (DR) is complex, and a standardised definition of DR is still missing [6]. Deep remission in UC refers to complete disease quiescence regarding endoscopic activity, rectal bleeding, and bowel movement [7].
Identification of mucosal healing (MH) as a new therapeutic goal, different from clinical remission, has revolutionised the approach to IBD management. One weakness of MH is the need to use endoscopy for its evaluation, which is an invasive, time-consuming, and expensive technique. Moreover, there is no commonly accepted definition of MH, although several scales or indices for objective classification of endoscopic findings have been devised. At present, the Mayo endoscopic score (MES) [8,9] is frequently used in the evaluation of treatment efficacy in clinical trials, with MH defined as MES ≤ 1. However, the guidelines of both the European Crohn's and Colitis Organisation and the Japanese Society of Gastroenterology restrict complete endoscopic remission to a score of zero (normal or completely healed mucosa) [10][11][12]. The only two indices that have received formal validation in UC are the Ulcerative Colitis Endoscopic Index of Severity (UCEIS) and the Ulcerative Colitis Colonoscopic Index of Severity (UCCIS). In the UCEIS, an index value corresponding to endoscopic remission has not been defined, although MH is most often described as 0-1 point. Endoscopic activity of Crohn's disease may be reliably scored using either the Crohn's Disease Endoscopic Index of Severity (CDEIS) [13] or the Simple Endoscopic Score for Crohn's Disease (SES-CD) [14]. Both scales have been prospectively validated and are highly reproducible, with excellent inter-observer agreement.
Indeed, a catalogue of benefits associated with achieving MH in IBD justifies repeated endoscopic examinations, as it encompasses a more favourable course of the disease, fewer surgeries and hospitalisations, and long-term clinical remission. For instance, mucosal healing in UC is accompanied by a lower risk of immunosuppression, colectomy, and colitis-associated neoplasia [15][16][17][18]. In turn, MH in CD is associated with less severe inflammation after five years, decreased risk of future steroid treatment, and lower rates of surgical resection [5,[19][20][21]. Successful control of intestinal inflammation may lead to improvement of extraintestinal manifestations of IBD, such as peripheral arthralgia [22], as well as relief of IBD-concomitant anxiety, depression, fatigue, and sleep disturbances [23]. Therefore, endoscopy, which together with pathological examination serves as the key diagnostic tool in IBD, is also useful in monitoring disease activity. As mentioned before, endoscopy for MH evaluation is an invasive, time-consuming, and expensive technique. Non-invasive diagnostic biomarkers able to at least limit the number of endoscopies are being searched for intensively. The urgent requirement for non-invasive indices in IBD is additionally augmented by the recent global growth in IBD incidence rates [24]. Furthermore, the current COVID-19 pandemic has highlighted the significance of non-invasive point-of-care tests that may be integrated with telemedicine in IBD patient care [25][26][27].
From among numerous potential biomarkers which have been evaluated in IBD, only faecal calprotectin (FC) has the potential to serve as an indicator of IBD activity [28].
Calprotectin is a cytosolic protein which is present in high concentrations in human neutrophils. Lower concentrations are found in monocytes and reactive macrophages. It has antimicrobial activity, as it sequesters zinc ions, helping to outcompete bacteria and yeast for this element and thus limiting their growth. The release of calprotectin is most likely related to the death of neutrophils at the site of inflammation [29]. In the context of IBD, calprotectin release in the intestine translates into elevated concentrations in stool. As such, FC can be regarded as a measure of neutrophil infiltration of the intestinal mucosa and a marker of the overall severity of gut inflammation.
Faecal calprotectin is stable for 4-7 days [30] and the sampling does not involve uncomfortable procedures. In addition to specialised diagnostic laboratory protocols for FC determination, there are tests designed for low-throughput point-of-care analysis, as well as self-testing that can be performed at home by a patient. The latter assay kits are supported by a mobile device application, which helps to read out, calculate and communicate the result to a physician. This could be a useful tool for disease monitoring, prediction of relapses and therapy optimisation.
Therefore, we conducted this meta-analysis to answer the questions: what is the diagnostic accuracy of the FC test as a biomarker of mucosal healing in IBD, and could it be applied in disease monitoring as well as in the assessment of the effectiveness of an ongoing therapy.

Materials and Methods
The search for studies on the use of faecal calprotectin as a biomarker of mucosal healing in ulcerative colitis and Crohn's disease utilised the following strategy: (faecal calprotectin) AND (mucosal healing). In each round, the latter term was replaced with one of the following: healing, endoscopy, colonoscopy, Crohn's disease, ulcerative colitis, IBD, or inflammatory bowel disease. Spelling variants were included. The publication dates were constrained to studies published between 1 January 2009 and 31 August 2020. The searched databases were PubMed and Scopus.
Query results were cross-searched and cleaned of duplicates. The selection process was composed of the following steps: title search, abstract screening, full text search, and eligible study data extraction. At each step, study selection was verified by a second investigator. The inclusion criteria were: original publication, publication in English language, mucosal healing was diagnosed according to specific standards (indices); studies provided sufficient data to reconstruct contingency matrices. Exclusion criteria were as follows: letters, editorials, comments, conference papers; paediatric or adolescent patients; non-IBD related; non-colonic Crohn's disease; studies performed on animals or tissue cultures; experimental studies.
We extracted the following information from eligible studies: first author, year of publication, research object, definition of mucosal healing, population size, prevalence, cut-off values, FC measurement method, and the numbers of true and false positives and negatives. In two cases, the authors were asked to rectify their data before the study was included in the data set (Table S1).
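When a publication reports only summary measures, the contingency matrix can usually be approximated from the population size, prevalence, sensitivity, and specificity. The following is a minimal sketch of that reconstruction; the function name, the `Table2x2` container, and the example numbers are hypothetical and are not taken from any included study. Rounding means the rebuilt counts are approximations of the original data.

```python
from dataclasses import dataclass


@dataclass
class Table2x2:
    tp: int  # test positive, condition present
    fp: int  # test positive, condition absent
    fn: int  # test negative, condition present
    tn: int  # test negative, condition absent


def reconstruct_table(n, prevalence, sensitivity, specificity):
    """Rebuild approximate 2x2 counts from reported summary measures."""
    with_condition = round(n * prevalence)      # e.g. patients with mucosal healing
    without_condition = n - with_condition
    tp = round(with_condition * sensitivity)
    fn = with_condition - tp
    tn = round(without_condition * specificity)
    fp = without_condition - tn
    return Table2x2(tp=tp, fp=fp, fn=fn, tn=tn)


# Hypothetical study: 100 patients, 40% with the target condition
table = reconstruct_table(n=100, prevalence=0.40, sensitivity=0.80, specificity=0.75)
print(table)
```

The four counts always sum to the population size, which provides a simple consistency check against the reported study size.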
The analysis of data was performed with R (version 4.0.2, downloaded: 22 June 2020). The packages meta (univariate random effects meta-analysis, funnel plots) and mada (sROC, bivariate effects analysis with the Reitsma model) were used. The summary statistics (sensitivity, specificity, and DOR) were analysed by performing univariate analysis with the meta package. The "metabin" function was used to calculate the DOR from contingency tables with the inverse variance method for weights of individual studies. Forest and funnel plots were generated with the meta package. The mada package for bivariate analysis was applied in the estimation of the sROC curve. For this purpose, the "reitsma" function was used, with correction for zero values in rows only. The function estimated variance components by the restricted maximum likelihood method. We calculated the pooled sensitivity, specificity, and diagnostic odds ratio with their 95% CI. Statistical heterogeneity was assessed with Cochran's Q statistic and Higgins' I² value.
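The univariate pooling step can be illustrated outside R. The sketch below is not the meta package and is not the authors' code: it pools log DORs with inverse variance weights and, as an assumption, uses the DerSimonian-Laird estimator for the between-study variance (the text does not state which estimator was used in the univariate analysis). It also returns Cochran's Q and Higgins' I². The function names, the 0.5 continuity correction, and the three example tables are hypothetical.

```python
import math


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio and its variance for one 2x2 table."""
    if 0 in (tp, fp, fn, tn):
        # Continuity correction (an assumption of this sketch) for zero cells
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    estimate = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return estimate, variance


def pool_dor(tables, z=1.96):
    """Random effects pooling of log DORs with inverse variance weights."""
    effects = [log_dor(*t) for t in tables]
    y = [est for est, _ in effects]
    v = [var for _, var in effects]
    w = [1 / var for var in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))   # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0           # DerSimonian-Laird
    w_re = [1 / (var + tau2) for var in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0       # Higgins' I² in %
    return {
        "dor": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se),
        "ci_high": math.exp(pooled + z * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical tables given as (TP, FP, FN, TN)
studies = [(32, 15, 8, 45), (50, 10, 12, 60), (20, 9, 6, 25)]
result = pool_dor(studies)
print(result)
```

Pooling is done on the log scale because the sampling distribution of the log odds ratio is closer to normal; the pooled estimate and its confidence limits are exponentiated at the end.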

Results
This study was performed and reported according to the statement on Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [31]. More details are presented in the supplementary PRISMA Checklist (Table S2). The two literature databases contained 2305 unique records which fit the search strategy. This formed a title search pool from which 562 publications were selected for abstract screening. Of these, 106 were found eligible after the screening of abstracts. Of that number, 52 publications contained the information required for the meta-analysis (Figure S1: data acquisition flow diagram). Eventually, data from 16 publications were included in the data set on the use of FC in Crohn's disease. The diagnostic test for FC in ulcerative colitis was described in 35 original publications. In several cases, more than one data set was extracted from a publication. This was the case for studies in which different mucosal healing definitions or different FC measurement kits were compared.
Figure 1 shows the effect sizes of all 17 applications. The diagnostic OR of the random effects model is 13.8 (95% CI, 9.1 to 20.9; p-value < 0.0001). With the use of FC in the diagnosis of MH in Crohn's disease, the odds of a positive result among persons with the disease in remission are approximately 14 times higher than the odds of a positive result among persons with still active disease. The Higgins' I² of all studies is 37%, and the p-value of the Cochran Q statistic is 0.07, indicating moderate heterogeneity. Table 1 shows the calculated sensitivity of the FC diagnostics of mucosal healing, with grouping according to the applied definition. The summary sensitivity with the random effects model is 0.828 (95% CI, 0.769 to 0.874). The Higgins' I² = 51.7% and the Cochran Q statistic is 36.18 (p-value = 0.0027), which suggests moderate to substantial heterogeneity. There is no significant difference in sensitivity between the SES-CD-based definitions of mucosal healing. The highest cumulative sensitivity, 0.841, was calculated for SES-CD ≤ 2 (5 studies). The summary specificity of the 17 compared applications was estimated to be 0.759 (95% CI, 0.683 to 0.821). The Higgins' I² = 80.2% and the Cochran Q statistic is 75.41 (p-value < 0.0001), which is indicative of substantial heterogeneity between the studies. Figure 2 presents the sROC curve for the application of FC in the diagnosis of MH in Crohn's disease. The summary AUC is 0.829 and the bivariate model-based estimate of the DOR is 11.20.
To check for possible publication bias, the DOR was plotted against the standard error of the effect estimate (data not shown). In the obtained funnel plot, three of the 17 applications of FC to diagnose MH included in this meta-analysis are found outside the 95% (p-value 0.05) boundaries. These were Jusué et al. [38] with a High Range test kit, Lobatón et al. [34], and Iwamoto et al. [37]. A tendency for a higher DOR in studies with a smaller tested population was observed.
Figure 3 shows the DOR sizes of 49 applications of FC in the diagnosis of mucosal healing in ulcerative colitis. The summary diagnostic OR of the univariate random effects model is 16.0 (95% CI, 12.2 to 21.1; p-value < 0.0001). The calculated cumulative OR for the use of FC in the diagnosis of MH in UC means that the odds of a positive result among persons with the disease in remission are 16 times higher than the odds of a positive result among persons with still active inflammation. The Higgins' I² of all studies is 61%, and the p-value of the Cochran Q statistic is <0.0001, indicating moderate to substantial heterogeneity in the compared studies.
Table 2 presents the sensitivity and specificity of the FC diagnostic test in UC, calculated from contingency matrix data. The summary sensitivity with the univariate random effects model is 0.804 (95% CI, 0.757 to 0.843). The Higgins' I² = 87.5% and the Cochran Q statistic is 363.28 (p-value < 0.0001), which suggests substantial heterogeneity. MES = 0 was applied as the definition of MH in 24 studies, for which the specificity was estimated to be 0.798 (95% CI 0.743; 0.843). Eleven studies applied MES ≤ 1, and for those studies the test's specificity was 0.766 (95% CI 0.697; 0.823). The summary specificity of the 49 compared applications of FC was estimated to be 0.817 (95% CI, 0.780 to 0.848). The Higgins' I² = 78.6% and the Cochran Q statistic is 209.42 (p-value < 0.0001), indicating substantial heterogeneity between the studies of FC in the detection of mucosal healing. Figure 4 shows a summary ROC curve (plot of sensitivity against 1 - specificity), which was estimated with the bivariate random effects meta-analysis model. The summary ROC curve was generated with 49 studies/applications of FC in the diagnosis of mucosal healing in ulcerative colitis. The summary test sensitivity is 0.783 (95% CI 0.738 to 0.822), while the specificity of faecal calprotectin was estimated to be 0.799 (95% CI 0.769 to 0.829). The summary AUC is 0.858 and the bivariate model-based estimate of the DOR is 14.48. This is less than the estimate based on the univariate random effects model. To study possible bias in reporting, a funnel plot of the diagnostic OR against the standard error of the effect estimate was generated (data not shown). Eleven of the 49 studies included in this meta-analysis are found outside the 95% (p-value 0.05) boundaries of the funnel plot. The distribution was skewed towards studies with a relatively small sample size and a high DOR. Four of those are still within the boundaries.
Most of the studies lie in the high significance region, suggesting that the asymmetry is not due to publication selection. Studies outside the funnel area: Schoepfers et al. [ [69], and all application studies contained in the publication of Stevens et al. [62].
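The 95% boundaries of a funnel plot are commonly drawn as pseudo confidence limits around the pooled effect, widening with the standard error. A minimal sketch of that check on the log DOR scale is given below, assuming limits of the pooled log DOR ± 1.96 × SE; the text does not specify how the boundaries were drawn, so this is an assumption. The function name and the three example studies are hypothetical; only the pooled DOR of 16.0 is taken from the UC analysis above.

```python
import math


def outside_funnel(study_dor, se, pooled_dor, z=1.96):
    """True if a study lies outside the pooled log DOR +/- z * SE limits."""
    distance = abs(math.log(study_dor) - math.log(pooled_dor))
    return distance > z * se


pooled_dor = 16.0  # univariate summary DOR reported for UC

# Hypothetical studies given as (DOR, standard error of the log DOR)
studies = [(20.0, 0.5), (40.0, 0.3), (5.0, 0.3)]
flags = [outside_funnel(dor, se, pooled_dor) for dor, se in studies]
print(flags)
```

Because the limits scale with the standard error, a precise study with a moderately different DOR can fall outside the funnel while an imprecise study with the same DOR stays inside.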

Discussion
The diagnostic odds ratio is the main measure for comparison of the diagnostic tests in this meta-analysis. Different though similar results could be observed with the use of two DOR calculation models. The univariate effects model predicts the DOR to be higher than that estimated with the use of the bivariate model based on the Reitsma's method [77]. This is true for the application of FC as the biomarker of MH in both CD and UC. A DOR estimated with either method can be found within a confidence interval of the other. Both DOR values generally represent good diagnostic accuracy of the test. Thus, results of our meta-analysis support the use of faecal calprotectin as a non-invasive biomarker of mucosal healing in IBD.
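The link between the DOR and the summary sensitivity and specificity can be made explicit with a short calculation. The sketch below applies the standard definition, DOR = [sens / (1 - sens)] × [spec / (1 - spec)], to the bivariate summary sensitivity (0.783) and specificity (0.799) reported for UC in the Results. The function name is hypothetical. The result is only an approximation of the model-based estimate, because the published values are rounded and the bivariate model derives the DOR on the logit scale.

```python
def dor_from_sens_spec(sensitivity, specificity):
    """Diagnostic odds ratio from a sensitivity/specificity pair."""
    odds_true_positive = sensitivity / (1 - sensitivity)
    odds_false_positive = (1 - specificity) / specificity
    return odds_true_positive / odds_false_positive


# Bivariate summary estimates for UC reported in the Results
uc_dor = dor_from_sens_spec(0.783, 0.799)
print(round(uc_dor, 1))  # prints 14.3
```

The value of about 14.3 is close to the reported bivariate estimate of 14.48, which suggests that the summary point and the summary DOR are mutually consistent.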
The calculated summary DOR of faecal calprotectin-based determination of a patient's mucosal healing status in CD shows that the odds of a positive result (low faecal calprotectin) among patients with the disease in remission are 13.8 (95% CI 9.1-20.9) times higher than the odds of a positive result among patients with still active inflammation (Figure 1). The Reitsma model estimates the DOR to be 11.20 (Figure 2). The DOR in the case of FC applied in UC is higher: 16.0 and 14.48 (for the univariate and bivariate models, respectively). Only in one study, by Yamaguchi et al. [65], did the lower boundary of the 95% confidence interval of the estimated DOR drop below 1 (Figure 3). This was observed when MES = 0 was used as the definition of MH. On the other hand, the summary DOR could be artificially elevated/biased due to a high DOR (DOR = 1560), as obtained from the study of Nakov et al. [59]. It is worth noting that four applications of FC as a marker of mucosal healing were presented in the publication by Stevens et al. [62]. The authors collected data from hundreds of UC patients participating in a phase 4 trial and performed a post-hoc analysis with two MH definitions (MES = 0, MES ≤ 1). These were applied to two datasets: endoscopic evaluation of mesalamine treatment at week 8, and endoscopic evaluation of mesalamine maintenance treatment effects at week 52. Based on these data from Stevens et al. [62], all four individual test DOR values were rather low (between 4.4 and 7.3). In fact, none of the DOR confidence intervals even reached the cumulative value (DOR = 16.0) estimated in this study. This can be interpreted as follows: DORs estimated for tests with larger populations might result in lower values of the summary DOR than estimated here. This notion is supported by our bias analysis; studies with a high SE tend to have a higher estimated OR.
The overall good diagnostic test accuracy can be seen from the summary ROC of the FC diagnostic test. In patients with CD the area under the summary ROC covers 82.9% of the plot area, whereas in patients with UC the area under the sROC = 85.8%.
One of the issues which limits this meta-analysis is the lack of agreement on the definition of mucosal healing in IBD patients. In our analysis there was no difference in the DOR between studies classified by their MH definition (both for UC and CD; data not shown). MH in ulcerative colitis in the compared studies was determined with the use of 6 scales/indices which took 13 different values. The most commonly used was the Mayo Endoscopic Score. It was applied in 37 studies, with the majority (24 studies) defining MH as MES = 0. In one study it was defined as MES ≤ 2 [46]. Other MH definitions used were the modified Baron Score (mBS = 0, mBS ≤ 1), the modified PICaSSO ≤ 3 [42], the Rachmilewitz Index (RI ≤ 1 [74], RI ≤ 2 [44], RI ≤ 4 [75]), the Simple Endoscopic Score for Ulcerative Colitis (SES-UC ≤ 3 [64]), and the Ulcerative Colitis Endoscopic Index of Severity (UCEIS = 0 [63], UCEIS ≤ 1 [42,60,76], UCEIS ≤ 3 [33]).
In the case of CD in the included studies, MH was defined as SES-CD = 0, SES-CD ≤ 2, SES-CD ≤ 3, as well as CDEIS < 3, CDEIS ≤ 3, and CDEIS < 6 (Table 1 and references therein). SES-CD was the most frequently used index (13 studies in total). This could be explained by the fact that SES-CD is a simplified, easier to calculate version of CDEIS. Travis et al. performed a validation test in which the SES-CD and CDEIS were compared. Both scores showed a strong positive correlation (r = 0.92) [78].
The studies included in this meta-analysis show the wide range of cut-off values (13.9 to 251 µg/g) chosen by authors studying FC as a marker of mucosal healing in ulcerative colitis. The most common thresholds lie between 150 and 250 µg/g (31/49 studies). Similarly, there was no universal cut-off value for FC in the analysed studies on CD. The reported range is very wide: 54-918 µg/g FC, with a mean cut-off of 205 µg/g. A cut-off between 150 µg/g and 250 µg/g was used in 7 out of 17 studies. The authors of the studies with the extreme cut-off values used different definitions of endoscopic mucosal healing (SES-CD = 0 and CDEIS < 6, respectively) and applied different types of tests for FC in stool samples (rapid test and ELISA, respectively) [33,38]. According to Moniuszko et al., rapid FC tests which could be performed at the point of care yielded results in high agreement with ELISA assays [79].
IBD cannot be seen as the exclusive cause of elevated faecal calprotectin. In a study by Meuccion et al., 36% of patients with normal colonoscopy had elevated FC levels, and the marker was elevated in a similar proportion of those with trivial endoscopic findings [80]. Moreover, faecal calprotectin above 50 µg/g has been found in 85% of patients with colonic cancer, 81% of those with "inflammatory conditions" (active CD or in remission, active UC, ischaemic colon), and 50% of those with UC in remission [80]. Furthermore, common medications are associated with increased FC. Lundgren et al. observed that among patients with a normal colonoscopy, FC above 50 µg/g was found in 55% of patients using acetylsalicylic acid, 24% of those using non-steroidal anti-inflammatory drugs, and 52% of those treated with proton pump inhibitors [81]. On the other hand, a recent multicentre study on pregnant women concluded that physiological changes due to pregnancy do not affect FC levels [82]. Therefore, Julsgaard et al. suggested the use of FC in the monitoring of IBD during pregnancy [82]. The lack of a universal FC cut-off is one of the biggest limitations of the use of this biomarker in either UC or CD. Several factors contribute to this issue: the methodological and technical differences between FC assays, the MH definition applied by physicians, and inter-individual variability in FC values. As for the latter, a potential explanation for the wide range of reported cut-off values might be residual inflammation that may remain at a microscopic level (histologic; neutrophils infiltrating the mucosa) despite remission and mucosal healing being reported by an endoscopist.
The following limitations of this meta-analysis should be taken into consideration. Firstly, due to the apparent variation in FC cut-off values across the analysed studies, this meta-analysis could not provide a clinical recommendation on one definite FC cut-off value which could help in the assessment of IBD activity without endoscopy. Secondly, there was evident heterogeneity across the included studies. The different MH definitions, methodological differences between FC assays, and the sizes of the tested populations might be the sources of heterogeneity in the meta-analysis.

Conclusions
In the light of the obtained results, a positive answer to the main question of the meta-analysis can be given. The DOR for the use of FC in CD is estimated to be 11.20. In the case of UC, the DOR is 14.48. The bivariate model applied to the estimation of the sensitivity of FC in the detection of mucosal healing resulted in a summary sensitivity of 0.807 and 0.783 for CD and UC, respectively. Despite the good sensitivity and specificity of the FC test in the determination of IBD activity, we suggest caution in clinical decision-making based on a single FC result. Nevertheless, FC is the most widely used and the most supported tool in the assessment of mucosal healing in UC and CD without the need for endoscopy. The odds of confirming mucosal healing with a non-invasive FC test support its further use in the management of IBD. We recommend that future studies reporting on the topic include data (e.g., contingency tables) for various cut-off values. Further research into the establishment of a universal cut-off for FC should accompany work on the interchangeability of FC diagnostic tests.

Conflicts of Interest:
The authors declare no conflict of interest.