2. Materials and Methods
2.1. Study Design and Participants
This was a prospective, single-center, observational cross-sectional study conducted at the Hepato-Gastroenterology Unit of the Naval Hospital of Athens, Greece. Screening of potentially eligible patients began in March 2025 among individuals scheduled to undergo clinically indicated colonoscopy. However, all patient enrollment, informed consent, and study-related procedures were conducted only after ethics approval was obtained in April 2025, with recruitment continuing through August 2025.
The source population consisted of adult patients with established inflammatory bowel disease (IBD) undergoing routine clinical evaluation during this period. All eligible patients meeting predefined inclusion and exclusion criteria were invited to participate following ethics approval.
Eligible participants included adult patients (≥18 years) with a confirmed diagnosis of IBD, including ulcerative colitis and Crohn’s disease, established on the basis of clinical, endoscopic, radiologic, and histopathologic criteria according to international guidelines [
19,
20,
21,
22]. Patients were eligible irrespective of disease duration, clinical disease stage, or prior treatment exposure, provided that they were clinically stable and able to undergo colonoscopy at the time of enrollment. Clinically stable was defined as absence of acute disease flare, hospitalization, or escalation of systemic corticosteroid therapy within the preceding four weeks.
Ongoing maintenance therapies, including 5-aminosalicylic acid (5-ASA), biologic therapy, and combination regimens, were recorded and incorporated into the analytical plan as covariates to account for potential immunomodulatory effects.
A control group consisted of individuals undergoing screening colonoscopy for preventive purposes, without known inflammatory, autoimmune, or malignant disease and with macroscopically and histologically normal colonic mucosa.
Exclusion criteria included pregnancy, active infection, known malignancy, severe organ failure, antibiotic use within four weeks prior to enrollment, and prior exposure to monoclonal antibodies targeting the IL-23 pathway within the preceding 12 months. Individuals with significant uncontrolled systemic comorbidities that could influence inflammatory biomarker levels were also excluded.
Participants were enrolled consecutively to reduce the risk of selection bias. The study was conducted and reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines. No missing data were observed for primary variables (histology, IL-10, IL-23, and CRP), reflecting a complete-case dataset achieved through prospective data collection and predefined data quality control procedures.
2.2. Ethical Approval
The study was conducted in accordance with the Declaration of Helsinki (1975, revised in 2013) and approved by the Institutional Review Board of the Naval Hospital of Athens (protocol code 3366, approved on 11 April 2025) and by the Institutional Review Board of Attikon University Hospital (protocol code 219, approved on 8 April 2025). Written informed consent was obtained from all participants prior to enrollment. All participant data were handled in accordance with applicable data protection and confidentiality regulations. Personal identifiers were removed prior to analysis, and data were anonymized to ensure participant privacy.
2.3. Blood Sampling and Processing
Peripheral venous blood samples were obtained immediately prior to colonoscopy and before any endoscopic manipulation. Serum was separated by centrifugation within two hours of collection and stored at −80 °C until analysis.
Samples were stored for comparable durations across study groups. The first collected sample was analyzed approximately six months after collection, and the overall median storage duration was approximately three months. All samples were analyzed in batch during the same study period to minimize inter-assay variability. Each sample underwent a single freeze–thaw cycle prior to cytokine quantification, a procedure considered acceptable for minimizing protein degradation based on prior methodological studies. Although storage durations were comparable across study groups, the potential impact of storage time on cytokine stability cannot be fully excluded and should be considered when interpreting the results.
2.4. Cytokine Quantification
Serum interleukin-10 (IL-10) and interleukin-23 (IL-23) concentrations were measured using commercially available sandwich enzyme-linked immunosorbent assay (ELISA) kits (Human IL-10 Quantikine ELISA Kit, catalog no. D1000B; Human IL-23 Quantikine ELISA Kit, catalog no. D2300B; R&D Systems, Minneapolis, MN, USA) according to the manufacturer’s instructions.
All samples and standards were measured in duplicate, and mean values were used for statistical analysis. Optical density was measured at 450 nm with wavelength correction using a Bio-Tek ELx800 microplate reader (BioTek Instruments, Winooski, VT, USA). Cytokine concentrations were calculated using four-parameter logistic (4PL) curve fitting with GEN5 software (version 3.11). Calibration curves were generated for each assay plate using manufacturer-provided standards and applied to sample quantification within the same plate to ensure consistency of measurement across plates.
The lower limits of detection (LOD) were 3.9 pg/mL for IL-10 and 16.3 pg/mL for IL-23. The dynamic quantifiable ranges were 7.8–500 pg/mL for IL-10 and 39–2500 pg/mL for IL-23.
Observed assay precision in our laboratory demonstrated intra-assay coefficients of variation (CVs) ranging from 5.3% to 7.2% and inter-assay CVs ranging from 8.2% to 10.1% for IL-10, and intra-assay CVs ranging from 5.5% to 7.8% and inter-assay CVs ranging from 7.1% to 10.5% for IL-23. These values were consistent with manufacturer-reported performance characteristics.
All cytokine measurements were performed in batch during the same study period, and laboratory personnel were blinded to clinical, endoscopic, and histologic data, minimizing the risk of measurement bias.
2.5. Handling of Detection Limits
Given the presence of left-censored cytokine measurements below the assay limit of detection (LOD), prespecified analytical strategies were applied with emphasis on minimizing overinterpretation of values below the detection threshold.
For IL-10, a substantial proportion of measurements fell below the LOD (3.9 pg/mL), particularly among patients with active histologic inflammation. Accordingly, IL-10 was analyzed primarily as a detectability-based variable (detectable ≥3.9 pg/mL vs. non-detectable <3.9 pg/mL), providing a clinically interpretable representation of the biomarker signal that does not rely on distributional assumptions.
For descriptive and exploratory purposes, values below the LOD were additionally imputed as LOD/2 (1.95 pg/mL), and IL-10 was examined as a log-transformed continuous variable. However, given the high proportion of non-detectable values, continuous analyses were interpreted cautiously and considered supportive rather than primary.
For IL-23, given the very high proportion of non-detectable values, analyses were based primarily on detectability status (non-detectable <16.3 pg/mL vs. detectable ≥16.3 pg/mL). Sensitivity analyses further examined ordinal categorization (<LOD; 16.3–38.9 pg/mL; ≥39 pg/mL) to account for the lower limit of quantification (LLOQ).
The use of LOD/2 imputation represents a pragmatic approach commonly applied in exploratory biomarker studies to retain censored observations; however, in the present study, results derived from continuous modeling are interpreted with caution due to the high degree of left-censoring. Detectability-based analyses were therefore prioritized for primary interpretation.
This approach avoids treating imputed LOD values as precise quantitative measurements in groups with near-complete non-detectability.
In the present dataset, IL-10 values were below the detection limit in 67.8% of IBD patients overall (97.1% in active histologic inflammation), whereas IL-23 values were below detection in 89.8% of the IBD cohort. These distributions informed the analytical strategy emphasizing detectability-based interpretation.
2.6. Endoscopic and Histologic Assessment
Colonoscopy was performed according to standard clinical practice. Bowel preparation quality was assessed using the Boston Bowel Preparation Scale (BBPS), and only procedures with adequate preparation were included in the analysis.
Mucosal biopsies were obtained during colonoscopy for histologic evaluation. In the control group, biopsies were collected from endoscopically normal-appearing colonic mucosa. In patients with inflammatory bowel disease, biopsies were obtained either from endoscopically inflamed mucosa or from macroscopically normal mucosa when no visible inflammatory lesions were present.
Endoscopic activity in ulcerative colitis was assessed using the Mayo endoscopic subscore, and in Crohn’s disease using the Simple Endoscopic Score for Crohn’s Disease (SES-CD).
Histologic evaluation was performed using the Geboes scoring system by two experienced gastrointestinal pathologists who were blinded to clinical, endoscopic, and cytokine data. Discrepancies were resolved by consensus.
Histologic activity was graded using the Geboes score (range 0–5.4), which evaluates architectural changes and inflammatory features, including chronic inflammatory infiltrate, lamina propria neutrophils and eosinophils, cryptitis, crypt abscesses, and epithelial damage, erosion, or ulceration. Histologic mucosal healing was defined a priori as a Geboes score < 2.0.
2.7. Treatment Classification
At the time of colonoscopy, patients were categorized according to ongoing maintenance therapy into four groups: no treatment, treatment with 5-aminosalicylic acid (5-ASA), advanced biologic therapy, or combination therapy consisting of a biologic agent plus azathioprine. No patients were receiving monoclonal antibodies targeting the IL-23 pathway at the time of enrollment.
2.8. Laboratory Measurements
C-reactive protein (CRP) was measured using standard automated laboratory methods at the hospital’s accredited clinical laboratory.
2.9. Outcomes
The primary analytical framework of the study focused on histologic disease status within the IBD cohort, while also examining cytokine patterns across the broader spectrum of disease states, including healthy controls.
Within the IBD cohort, the primary comparison evaluated cytokine levels according to histologic mucosal healing (Geboes score < 2) versus histologic activity (Geboes score ≥ 2). In parallel, analyses were performed using continuous histologic severity (Geboes score) to assess associations across the full spectrum of microscopic inflammation.
The healthy control group was included as an external reference to characterize baseline circulating cytokine levels and to contextualize cytokine patterns across health, histologic healing, and active disease. Controls were not included in primary within-cohort analyses but were incorporated in descriptive and comparative analyses across disease states.
Secondary outcomes included the association between cytokine levels or detectability and histologic status, the correlation of cytokine levels with continuous histologic severity, and the evaluation of independent associations in multivariable models. Additional exploratory analyses assessed cytokine behavior using alternative modeling approaches (continuous versus detectability-based) and according to treatment exposure.
Comparisons between ulcerative colitis and Crohn’s disease were performed as secondary exploratory analyses to assess potential disease-specific patterns.
2.10. Statistical Analysis
All analyses were conducted using R version 4.4.0 (R Foundation for Statistical Computing, Vienna, Austria). Continuous variables are presented as median (interquartile range), and categorical variables as number (percentage). All tests were two-sided, and p-values < 0.05 were considered statistically significant.
The primary analytical approach focused on describing cytokine patterns across histologic disease states and evaluating associations with both binary histologic status and continuous histologic severity.
Between-group comparisons of IL-10 concentrations were performed using Kruskal–Wallis tests followed by pairwise Mann–Whitney U tests with Holm correction. Associations with continuous Geboes scores were assessed using Spearman’s rank correlation. Group comparisons for IL-23 detectability were performed using Fisher’s exact test or Pearson’s chi-square test as appropriate, with Monte Carlo simulation applied for sparse contingency tables.
Multivariable logistic regression was used to evaluate independent associations with histologic mucosal healing. Given the limited number of outcome events and potential quasi-complete separation, penalized logistic regression using Firth’s correction (logistf package) was prespecified to mitigate small-sample bias. IL-10 was entered as a log-transformed continuous variable, and IL-23 as detectable versus non-detectable. Covariates were prespecified based on biological plausibility and clinical relevance (age, sex, CRP, treatment category), and no data-driven variable selection procedures were performed.
Receiver operating characteristic (ROC) curves were used as a secondary descriptive measure of discrimination. Areas under the curve (AUCs) with 95% confidence intervals were calculated using DeLong’s method, and optimal cut-off values were estimated using the Youden index. Internal validation was performed using bootstrap resampling (2000 iterations) to assess model stability and estimate optimism-corrected performance. Calibration was evaluated using calibration slope and intercept derived from logistic regression of observed outcomes on predicted probabilities, and overall predictive accuracy was quantified using the Brier score.
For ROC analyses, IL-23 was modeled using non-detectability as the healing-favoring direction to reflect the absence of circulating IL-23 in histologic remission, whereas detectability (≥16.3 pg/mL) was retained in regression models to preserve biological interpretability.
2.11. Sample Size Justification
The final sample comprised 59 patients with IBD, including 24 (40.7%) with histologic mucosal healing. Given the exploratory nature of the study and the absence of prior effect size estimates for circulating IL-10 in relation to histologic remission, a formal a priori sample size calculation was not feasible.
With 24 healing events and 8 parameters included in the multivariable model, the events-per-variable (EPV) ratio was 3.0. To address potential small-sample bias and quasi-complete separation, penalized logistic regression using Firth’s correction was prespecified.
For discrimination analysis, assuming a null AUC of 0.70 and an observed AUC of 0.871 for continuous IL-10, the available sample corresponds to a large apparent discrimination effect within this dataset. However, given the limited sample size and exploratory design, these estimates should be interpreted descriptively rather than as precise measures of performance.
Accordingly, the study was sufficiently powered to detect moderate-to-large effects within this cohort but should be considered hypothesis-generating. External validation in independent cohorts is required to assess generalizability.
2.12. Data Availability
The dataset generated and analyzed during the current study is available from the corresponding author upon reasonable request.
3. Results
3.1. Study Population and Histologic Classification
Potentially eligible participants were identified between March and August 2025 at the Hepato-Gastroenterology Unit of the Naval Hospital of Athens. Formal recruitment, informed consent, and all study-related procedures were conducted consecutively between April and August 2025, following institutional IRB approval.
The participant recruitment and inclusion process is illustrated in
Figure 1.
A total of 95 individuals were included, comprising 59 patients with IBD and 36 healthy controls. Among patients with IBD, 24 (40.7%) achieved histologic mucosal healing (Geboes < 2), while 35 (59.3%) had active microscopic inflammation.
Baseline demographic characteristics were comparable across study groups, including healthy controls and IBD subgroups. Age (
p = 0.18) and sex distribution (
p = 0.21) did not differ significantly, indicating no evidence of demographic imbalance. In contrast, CRP levels differed significantly across groups, with higher values observed in patients with active histologic inflammation (
p < 0.001). These data are summarized in
Table 1.
3.2. Endoscopic–Histologic Relationship
Endoscopic scores differed significantly according to histologic classification. In ulcerative colitis, the Mayo endoscopic subscore was lower in the histologic healing group (p = 0.0007). In Crohn’s disease, SES-CD scores were also significantly lower among patients with histologic healing (p = 0.0072).
Despite this association, histologic inflammation was observed in a subset of patients with low endoscopic scores, indicating incomplete concordance between endoscopic and microscopic disease activity (
Table 2).
3.3. Treatment Distribution
Treatment exposure at the time of colonoscopy showed a borderline association with histologic status (Fisher’s exact test with Monte Carlo simulation,
p = 0.0549; Cramér’s V = 0.359). Untreated patients were observed exclusively in the active histology group, whereas rates of biologic therapy were comparable between groups (
Table 3).
3.4. Serum IL-10 and Histologic Activity
Serum IL-10 concentrations showed apparent separation across disease states, primarily driven by clustering at the detection threshold in active disease (
Figure 2).
Median (interquartile range) IL-10 concentrations showed marked separation across groups, with higher values in healthy controls (12.1 [8.8–18.7] pg/mL), intermediate values in IBD patients with histologic mucosal healing (4.5 [3.5–5.4] pg/mL), and values clustered at the assay detection threshold in patients with active histology (1.95 [1.95–1.95] pg/mL) (Kruskal–Wallis H = 81.454,
p < 0.0001;
Table 4).
When examined across disease states, patients with IBD exhibited markedly lower circulating IL-10 concentrations compared with healthy controls (median 3.3 pg/mL [IQR 2.5–4.3] vs. 12.1 pg/mL [8.8–18.7]; Mann–Whitney U test, p < 0.0001), reflecting a progressive loss of IL-10 detectability from health to intestinal inflammation rather than a fully continuous quantitative gradient.
A substantial proportion of IL-10 measurements fell below the assay limit of detection. IL-10 values were below the lower limit of detection (3.9 pg/mL) in 67.8% of IBD patients overall, including 97.1% of patients with active histologic inflammation compared with 25% of those with histologic healing. The uniform values in the active group reflect clustering at the detection threshold rather than measurable biological variability, consistent with near-complete non-detectability in this subgroup.
Accordingly, IL-10 was interpreted primarily as a detectability-based marker, with detectable IL-10 predominantly observed in patients with histologic healing and non-detectable IL-10 strongly enriched among patients with active microscopic inflammation.
Continuous analyses of IL-10 were therefore considered exploratory and supportive, given the high proportion of left-censored values. Within this context, IL-10 concentrations showed a directional inverse association with histologic severity, primarily driven by detectability differences (Spearman ρ = −0.766, p = 1.61 × 10−12), reflecting directional association rather than a precise dose–response relationship.
In regression analyses, higher IL-10 levels (log-transformed, LOD-adjusted) were associated with histologic status. After multivariable Firth penalized adjustment for age, sex, CRP, IL-23 detectability, and treatment category, IL-10 remained independently associated with histologic mucosal healing (adjusted OR 47.25; 95% CI 5.79–385.46;
p = 0.00032) (
Table 5). The large effect estimates and wide confidence intervals reflect near-complete separation driven by the high prevalence of non-detectable IL-10 in active disease and should be interpreted as indicative of a strong directional signal rather than a precise quantitative effect estimate.
The consistency between detectability-based and continuous exploratory analyses supports the presence of a robust association, although this should not be interpreted as evidence of precise quantitative discrimination in the setting of high LOD-related clustering.
In analyses stratified by treatment exposure, circulating IL-10 concentrations did not differ significantly between patients receiving biologic therapy (advanced or combination regimens) and those receiving non-biologic treatment (Mann–Whitney U test, p = 0.54).
In sensitivity analyses excluding untreated patients, the association between IL-10 concentrations and histologic status remained consistent (p < 0.001), further supporting the robustness of the detectability-driven pattern across treatment strata.
3.5. Sensitivity Analysis: IL-10 Detectability
Given the high proportion of values below the assay limit of detection, IL-10 was interpreted primarily as a detectability-based biomarker, independent of continuous modeling and LOD-based imputation.
Detectable IL-10 (≥3.9 pg/mL) was present in 18/24 patients (75.0%) with histologic healing and in only 1/35 patients (2.9%) with active histology, demonstrating near-complete separation between histologic groups based on detectability status.
In univariable analysis, IL-10 detectability was strongly associated with histologic status (OR 102.0; 95% CI 11.4–913.9). In a reduced multivariable model adjusted for age, sex, and CRP, IL-10 detectability remained independently associated with histologic mucosal healing (adjusted OR 74.9; 95% CI 8.0–700.1; p = 0.00015).
The magnitude of these estimates reflects near-complete separation between groups and should be interpreted as indicative of a strong directional association rather than a precise effect size.
Continuous IL-10 analyses were therefore considered exploratory and supportive, given the high degree of left-censoring.
Discriminatory performance of IL-10 models is presented in
Section 3.6. The consistency of findings across detectability-based and continuous exploratory approaches supports the robustness of the observed association, while reinforcing detectability as the primary clinically interpretable signal.
3.6. IL-23 Detectability and Histologic Severity
A large proportion of IL-23 measurements fell below the assay limit of detection (16.3 pg/mL). Within the IBD cohort, 53 of 59 patients (89.8%) had IL-23 values below the detection threshold.
Detectable IL-23 (≥16.3 pg/mL) was observed in 6 of 35 patients (17.1%) with active histologic inflammation and in none of the patients with histologic healing or healthy controls.
Ordinal categorization of IL-23 concentrations (<LOD; detectable below the lower limit of quantification; ≥LLOQ) demonstrated a statistically detectable pattern across histologic groups (Monte Carlo exact test,
p = 0.006;
Table 6,
Figure 3), although this observation is based on a very small number of detectable cases. Notably, all detectable IL-23 values were confined to patients with active histologic inflammation.
Within the IBD cohort, IL-23 detectability was confined to patients with Geboes scores ≥ 3.1, corresponding to moderate-to-severe microscopic inflammation. No patient with Geboes < 3.1 had detectable IL-23 (Fisher’s exact p = 0.0338).
All detectable IL-23 values occurred in patients with Geboes ≥ 3.1 (6/6), whereas none were observed in those with lower histologic scores (0/25).
In direct comparisons of histologic healing versus active inflammation, IL-23 detectability did not demonstrate graded separation across disease states, consistent with its low prevalence outside more advanced histologic activity. These findings support a pattern of severity-dependent emergence of circulating IL-23 rather than a linear association across the full spectrum of microscopic inflammation. Given the limited number of detectable observations, these results should be interpreted as descriptive and hypothesis-generating rather than as evidence of independent clinical utility.
3.7. Discriminatory Performance
Receiver operating characteristic (ROC) analyses for discrimination of histologic mucosal healing (Geboes < 2) within the IBD cohort are shown in
Figure 4, with performance metrics summarized in
Table 7. These analyses were performed as a secondary descriptive assessment of cytokine behavior within the cohort.
Using the Youden index, the optimal IL-10 threshold within this dataset corresponded to 4.0 pg/mL (sensitivity 75.0%, specificity 100.0%). The combined model did not significantly improve separation between histologic groups compared with IL-10 alone (DeLong p = 0.055). Internal validation using bootstrap resampling (2000 iterations) indicated stable performance estimates across models.
3.8. Internal Validation and Calibration
In addition to discrimination, calibration characteristics of the multivariable model were evaluated. The apparent AUC was 0.954, with minimal optimism observed following bootstrap correction (optimism-corrected AUC 0.928). The Brier score was 0.093, indicating good overall model fit within the dataset.
Calibration analysis demonstrated a slope of 1.62 and an intercept of 0.16, suggesting acceptable agreement between predicted and observed outcomes, without evidence of substantial overfitting. The slope >1 may reflect shrinkage effects related to penalization (
Table 8;
Figure 5).
4. Discussion
In this prospective clinical cohort, circulating IL-10 and IL-23 exhibited distinct systemic associations with histologic mucosal status in inflammatory bowel disease. IL-10 demonstrated a detectability-driven inverse association with microscopic inflammation, whereas IL-23 detectability was confined to moderate-to-severe disease without graded discrimination across histologic states [
20,
21,
22]. Together, these findings suggest complementary cytokine patterns reflecting different dimensions of microscopic inflammation.
Previous studies have examined circulating cytokines in inflammatory bowel disease, including investigations of IL-10 and related regulatory pathways. For example, prior work has reported altered IL-10 signaling and circulating cytokine patterns in association with disease activity and immune dysregulation in IBD [
23,
24,
25,
26,
27,
28]. However, most available studies have focused on clinical or endoscopic activity rather than histologic mucosal status, and few have evaluated systemic cytokine patterns specifically in relation to microscopic healing. In this context, the present study extends existing literature by characterizing circulating cytokine behavior across the spectrum of histologic disease activity.
A key observation was the near absence of detectable IL-10 in patients with active histologic inflammation, indicating a threshold-like separation between histologic states rather than a fully continuous quantitative gradient. Although an inverse correlation with continuous Geboes score was observed, this should be interpreted as a directional association arising from detectability differences rather than as evidence of a precise dose–response relationship. In addition, IL-10 concentrations demonstrated apparent separation across clinical groups; however, this pattern is primarily driven by clustering at the assay detection threshold in active disease. These findings suggest that circulating IL-10 may reflect aspects of systemic regulatory immune tone rather than direct quantitative secretion from inflamed tissue. Given that IL-10 is produced by multiple regulatory immune populations—including regulatory T cells, macrophages, and tolerogenic dendritic cells—and functions to constrain antigen-presenting cell activation and downstream effector cytokine amplification, reduced circulating IL-10 detectability in active microscopic inflammation may be compatible with diminished systemic regulatory containment, although this interpretation remains speculative within the cross-sectional design [
29,
30].
Alternatively, reduced circulating IL-10 concentrations may also reflect assay-related sensitivity limitations or compartmentalization of cytokine production within the intestinal mucosa, reinforcing that circulating measurements should be interpreted as detectability-based signals rather than precise quantitative surrogates of mucosal activity [
31,
32]. Accordingly, these findings should not be interpreted as supporting the use of IL-10 as a quantitative clinical biomarker in its current form, but rather as a detectability-based signal requiring further validation.
The magnitude of the observed odds ratios should be interpreted with caution. The dataset demonstrated near-complete separation between histologic groups, as the vast majority of patients with active histologic inflammation had IL-10 concentrations below the assay detection limit. Under such conditions, even penalized estimation may yield large point estimates accompanied by wide confidence intervals. Consequently, the reported odds ratios should not be interpreted as reliable quantitative estimates of effect magnitude, but rather as indicators of the direction of association under conditions of near-complete separation.
An additional methodological consideration relates to the handling of IL-10 values below the assay detection limit. Because a large proportion of measurements in the active histologic inflammation group fell below the lower limit of detection, these values were imputed as LOD/2 according to the prespecified analytical strategy. Although sensitivity analyses using detectability-based modeling yielded directionally consistent results, the possibility that LOD-based imputation influenced the magnitude of statistical associations cannot be fully excluded [
33,
34,
35].
The high proportion of values below the assay detection limit, particularly for IL-23 and to a lesser extent IL-10, represents an important methodological constraint. In contrast to IL-10, circulating IL-23 did not demonstrate a graded association across the full histologic spectrum [
36,
37,
38,
39]. Detectable IL-23 was observed exclusively in patients with moderate-to-severe microscopic inflammation and was absent in histologic healing and in controls. All detectable cases occurred in individuals with Geboes ≥ 3.1, whereas none were observed below this threshold [
40,
41,
42,
43]. This pattern supports a severity-dependent threshold signal rather than a continuous biomarker across disease states. From an immunologic perspective, this may be compatible with IL-23 acting more as an amplifier of established inflammatory circuits than as a sensitive marker of early or residual microscopic activity.
The interpretation of IL-23 findings should be made in the context of the limited number of detectable cases. In the present cohort, IL-23 concentrations were below the limit of detection in the majority of patients, with only a small subset demonstrating detectable values. Consequently, IL-23 was modeled as a binary variable (detectable versus non-detectable), representing a pragmatic approach for heavily left-censored data. Importantly, the present dataset is not sufficient to establish the independent clinical or predictive value of IL-23, and findings should be interpreted as descriptive within this cohort.
Addition of IL-23 detectability to IL-10 in combined modeling resulted in a numerically higher AUC, with borderline statistical comparison (DeLong
p = 0.055). However, this difference did not reach statistical significance, and therefore the combined model should not be interpreted as providing a statistically significant improvement in discriminatory performance over IL-10 alone within this dataset. Accordingly, IL-23 should not be interpreted as providing incremental predictive value beyond IL-10 in the present analysis [
44,
45,
46,
47]. Within this dataset, IL-10 appeared to account for most of the separation between histologic states, whereas IL-23 reflected a higher inflammatory threshold rather than additive graded discrimination and should be interpreted as a threshold-dependent signal rather than a graded biomarker [
48,
49].
From a clinical perspective, histologic mucosal healing is increasingly recognized as a meaningful therapeutic endpoint, yet its assessment remains invasive and resource-intensive. In the current clinical context, fecal calprotectin remains an established non-invasive biomarker for assessing intestinal inflammation, although it does not consistently discriminate histologic healing from residual microscopic activity. In this regard, circulating cytokines such as IL-10 and IL-23 should be considered as complementary rather than competing biomarkers. The present findings do not demonstrate superiority over existing tools but instead suggest that multidimensional immune profiling may provide additional biological insight into the spectrum of histologic disease activity. Direct comparative and longitudinal studies will be required to determine their potential additive clinical value. The association between circulating IL-10 and histologic status observed here supports further investigation as a potential correlate of microscopic disease activity [
50]. These findings primarily support a mechanistic and associative interpretation rather than immediate diagnostic or prognostic application, given the cross-sectional design and lack of longitudinal validation.
A key consideration for clinical interpretation is the relationship of the present findings to established non-invasive biomarkers, particularly fecal calprotectin. In this context, the cytokine patterns observed in the present study may be viewed as complementary rather than competing biomarkers. Circulating IL-10 appeared to reflect a detectability-based dimension of systemic regulatory immune activity, whereas IL-23 detectability was associated with a higher inflammatory threshold. These features differ conceptually from fecal calprotectin, which represents a downstream marker of mucosal inflammation rather than upstream immune regulation.
Importantly, the present findings do not establish that circulating cytokines outperform or replace existing biomarkers. Rather, they suggest that multidimensional biomarker approaches integrating inflammatory and regulatory signals may provide additional insight into disease activity beyond conventional tools. However, direct comparative studies incorporating fecal calprotectin and other established markers will be required to determine the incremental clinical value of cytokine-based assessments.
The healthy control group was not included in regression modeling because the primary analytical objective focused on within-cohort associations between circulating cytokines and histologic status among patients with IBD. Inclusion of controls in regression analyses would have altered the outcome structure, effectively shifting the model toward discrimination between IBD and non-IBD states rather than between histologic healing and active microscopic inflammation. Therefore, healthy controls were used as an external reference to contextualize cytokine distributions across disease states, while regression analyses were restricted to the IBD cohort to preserve alignment with the primary biological question.
Importantly, the present findings derive from a cross-sectional analysis and therefore demonstrate associations only.
Sensitivity analyses excluding untreated patients did not materially alter the association between IL-10 and histologic status, suggesting that the findings were broadly consistent across treatment strata [
51,
52,
53]. Moreover, circulating IL-10 concentrations did not significantly differ between biologic and non-biologic treatment groups, suggesting that the observed association was not solely driven by therapeutic exposure.
Nevertheless, the potential influence of drug-specific mechanisms of action on circulating cytokine levels should be considered. Treatment exposure was explored descriptively and through sensitivity analyses; however, it was not included as a covariate in multivariable models due to the limited events-per-variable ratio and the associated risk of model overfitting and instability. Different therapeutic classes, including biologic agents and immunomodulators, may differentially modulate cytokine pathways, which could affect circulating IL-10 and IL-23 concentrations. Although no clear treatment-related differences were observed within this dataset, the relatively limited sample size and heterogeneity of treatment regimens may have reduced the ability to detect such effects. Therefore, residual confounding related to treatment-specific immunomodulation cannot be fully excluded.
Internal validation suggested reasonably stable model behavior with limited optimism after bootstrap correction. Calibration characteristics were acceptable within the present cohort. Although derived from a single prospective cohort, the present analysis addresses a distinct biological question focusing on systemic cytokine patterns in relation to histologic disease spectrum rather than previously reported clinical or endoscopic endpoints. The absence of an independent external validation cohort represents an important limitation, as the generalizability and potential clinical applicability of these findings cannot be established within a single dataset. Accordingly, the observed associations should be interpreted as cohort-specific and require confirmation in independent populations before any broader inference can be made. Taken together, these methodological constraints—including limited sample size, high proportion of values below the limit of detection, and potential model instability—should be considered jointly, as they may collectively influence the robustness and generalizability of the observed associations. In particular, the high degree of left-censoring for IL-10 and IL-23 constrains quantitative interpretation, supporting a detectability-based and threshold-oriented framework rather than continuous biomarker inference. Similarly, the limited number of detectable IL-23 observations restricts the ability to draw conclusions regarding its independent clinical or predictive value.
Strengths and Limitations
This study has several methodological strengths, although these should be interpreted in the context of its exploratory design and sample-size constraints. The study employed prospective enrollment with consecutive sampling, blinded histologic assessment using standardized Geboes scoring, and uniform pre-analytical handling of serum samples. Cytokine quantification was performed in duplicate under controlled laboratory conditions, and specimens were stored at −80 °C for limited and comparable durations across groups (median approximately three months), minimizing the likelihood of differential degradation bias. Statistical analyses incorporated detection-limit–aware modeling, Firth-penalized regression to address quasi-complete separation, and bootstrap resampling to assess model stability.
Several limitations warrant consideration. In addition, the overall sample size was relatively small, which may limit statistical power for detecting moderate associations and reduce the precision of effect estimates. First, given the cross-sectional design, causal inference and temporal relationships between circulating cytokines and histologic change cannot be established. Second, the single-center design and recruitment from a tertiary referral military hospital may limit external validity, and validation in independent multi-center cohorts with broader treatment distributions and disease phenotypes will be required to determine generalizability. No independent external validation cohort was available.
Serum cytokine concentrations may not fully reflect compartmentalized mucosal immune dynamics. In addition, circulating IL-10 should not be interpreted as a precise quantitative biomarker in this dataset, as a substantial proportion of values—particularly in active histologic inflammation—fell below the assay limit of detection, resulting in near-complete non-detectability in this subgroup.
Under these conditions, IL-10 is more appropriately interpreted as a detectability-based or threshold-like signal rather than a fully continuous variable, and statistical associations derived from continuous modeling should be interpreted as directional rather than quantitative. Accordingly, the primary interpretation of IL-10 in this study is based on detectability status rather than LOD-imputed continuous values.
The absence of systematic assessment of established non-invasive biomarkers, including fecal calprotectin, as well as additional cytokines involved in IBD pathogenesis (such as TNF-α, IL-6, IL-12, and IFN-γ), precluded direct comparison with established inflammatory markers and broader immune profiling approaches, limiting the ability to fully contextualize these findings within current clinical and biological frameworks.
The number of histologic healing events (n = 24) relative to the number of predictors included in multivariable modeling resulted in a modest events-per-variable ratio (approximately 3–4 events per predictor). To mitigate potential small-sample bias and quasi-complete separation, Firth-penalized logistic regression was prespecified. Nevertheless, residual model instability and the possibility of overfitting cannot be fully excluded, and effect estimates should therefore be interpreted with appropriate caution. In this context, model-derived performance metrics, including discrimination estimates, should be interpreted with particular caution, as they may reflect within-sample separation rather than stable predictive performance. Accordingly, these findings should be considered descriptive and hypothesis-generating rather than indicative of clinically applicable predictive accuracy.
A substantial proportion of cytokine measurements fell below the assay limit of detection, particularly for IL-23, resulting in partially censored biomarker distributions. Although pragmatic LOD-based imputation and detectability-based analyses were applied to address left-censored data, such approaches may influence variance structure and regression estimates. In particular, high degrees of left-censoring may introduce clustering at the detection threshold, limiting the ability to infer continuous biological gradients.
For IL-23 specifically, the very small number of detectable observations further limits statistical power and precludes reliable assessment of independent associations. Accordingly, IL-23 findings should be interpreted as descriptive and threshold-based within this cohort rather than as evidence of independent clinical or predictive utility.
Accordingly, the present findings should be interpreted as exploratory and hypothesis-generating, reflecting within-cohort patterns rather than established clinical biomarkers. Confirmation in larger independent cohorts will be required to determine the robustness and generalizability of these observations.
5. Conclusions
Circulating IL-10 and IL-23 exhibited distinct systemic associations with histologic mucosal status in inflammatory bowel disease. Reduced IL-10 detectability was associated with active microscopic inflammation, indicating a threshold-like separation between histologic states rather than a fully graded quantitative relationship within this cohort. In contrast, detectable IL-23 was observed only in a subset of patients with more advanced microscopic inflammation, consistent with a threshold pattern rather than a graded association and limited by the small number of detectable observations.
These findings provide exploratory evidence that circulating cytokine profiles may reflect complementary dimensions of microscopic disease activity, with IL-10 reflecting a detectability-based regulatory signal and IL-23 representing a severity-dependent inflammatory threshold rather than an independently validated biomarker within this dataset.
However, given the cross-sectional design, limited sample size, and the high proportion of values below the assay detection limit, these observations should be interpreted as hypothesis-generating. Further validation in larger independent cohorts and longitudinal studies will be required to determine the robustness, generalizability, and potential clinical relevance of these findings.
These observations may inform future biomarker development strategies by supporting the investigation of composite or multidimensional immune signatures that capture both regulatory and inflammatory axes of disease activity. In particular, approaches that integrate detectability-based and threshold-driven biomarker signals may provide improved characterization of heterogeneous inflammatory states.
These findings should be considered exploratory and cohort-specific, requiring validation in independent external populations before any clinical application and should not be interpreted as establishing independent clinical utility for IL-23 in its current form.