1. Introduction
Systemic lupus erythematosus (SLE) is a clinically and biologically heterogeneous autoimmune disease characterised by variable organ involvement, immune dysregulation, and disease course [1,2]. This heterogeneity limits the ability of conventional clinical or serological measures to fully resolve underlying molecular variation [3,4]. Peripheral-blood transcriptomic profiling has emerged as an attractive strategy for capturing disease-relevant differences, with increasing translational relevance for molecular classification, biomarker development, and disease-activity contextualisation [3,4,5,6].
Despite this promise, molecular subgroup structures identified in discovery datasets often lack stability across platforms, independent cohorts, and downstream validation layers [3,4,5,7]. This problem is particularly evident when feature selection, biological interpretation, and validation are not clearly separated. Single-cell data can improve cellular interpretability but should not automatically displace robust whole-blood discovery when the primary aim is cross-cohort stratification [8,9]. A translationally useful blood-based score therefore requires not only biological plausibility but also a study design that maintains a clear boundary between discovery, adjudication, and validation [4,6].
To address this challenge, we designed a strictly staged framework for peripheral-blood transcriptomic stratification in SLE. Discovery was restricted to public whole-blood bulk transcriptome cohorts. Candidate compression and final panel selection were adjudicated through independent public whole-blood validation cohorts. Single-cell analysis was used solely for biological anchoring, and local orthogonal validation comprised a PBMC RT-qPCR/flow-cytometric cohort and an expanded, additional whole-blood RT-qPCR validation set. Our aims were: to identify working whole-blood transcriptomic endotypes in SLE; to derive and formally lock a compact transcriptomic score; and to assess whether the locked score retained biological interpretability and local clinical relevance across distinct validation layers. The novelty of the present work lies not in the identification of previously unknown immune programmes—type I interferon-inducible and myeloid/neutrophil-associated signatures have been extensively characterised in SLE—but rather in the staged analytical design itself: final panel selection was adjudicated through public-stage validation before any local testing, single-cell data were reserved for biological anchoring only, and local validation cohorts were used for orthogonal confirmation without reverse refinement of the locked panel. This design was intended to reduce the risk of circular feature selection that commonly limits the transportability of transcriptomic biomarker studies.
2. Materials and Methods
2.1. Study Design and Cohort Architecture
This study used a staged multi-layer design integrating public transcriptomic resources with local orthogonal validation. Public datasets were retrieved from the Gene Expression Omnibus (GEO; with supplementary screening in BioStudies and ArrayExpress) and organised into: discovery whole-blood cohorts, public external validation whole-blood cohorts, a single-cell reference/resource layer, and a reserve layer. Local orthogonal validation comprised a PBMC-based RT-qPCR and flow-cytometric validation cohort and an expanded, additional whole-blood RT-qPCR validation set. The public resource set comprised three discovery cohorts, one primary validation cohort, three extended validation cohorts, and two single-cell reference/resource datasets; a reserve cohort was listed separately (Figure 1 and Table S1).
2.2. Public Cohort Preprocessing and BloodGen3 Aggregate-Level Profiling
Discovery was restricted to three public whole-blood bulk transcriptome cohorts (GSE72509, GSE112087, GSE49454), each processed separately with cohort-specific normalisation, log-transformation where appropriate, gene-symbol harmonisation, and duplicate-gene collapsing. No unified cross-cohort batch correction was applied prior to compact-score computation; instead, each gene was z-standardised to zero mean and unit variance within each cohort independently (cohort-wise z-standardisation), ensuring that the compact score reflects relative within-cohort expression variation rather than cross-cohort absolute differences (see Supplementary Methods S1 for full preprocessing details). The final discovery analysis set comprised 233 samples (170 SLE, 63 HC); GSE49454 contributed 56 samples after locked analytic-subset filtering (architecture-level display: 78 samples). Blood transcriptional activity was quantified using the BloodGen3 aggregate-level framework [10,11,12]. Single-sample aggregate scores were generated within each cohort and standardised relative to HC to produce cohort-specific healthy-control-referenced z-score matrices. Endotype discovery was performed exclusively in SLE samples using the aggregate-level matrix as clustering input. The k = 2 solution was retained as the working endotype configuration based on aggregate-level stability assessments (Figure S2).
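The healthy-control-referenced standardisation described above can be illustrated with a minimal sketch. It assumes that each aggregate score is centred on the mean of the HC samples from the same cohort and scaled by their sample standard deviation (n − 1 denominator); the text does not state which variance estimator was used, and the function name `hc_referenced_z` and the toy values are hypothetical.

```python
from statistics import mean, stdev


def hc_referenced_z(scores, groups, hc_label="HC"):
    """Standardise one aggregate's scores against the HC samples of the same cohort.

    Assumption: sample standard deviation (n - 1) of the HC group is the scale.
    """
    hc = [s for s, g in zip(scores, groups) if g == hc_label]
    if len(hc) < 2:
        raise ValueError("at least two HC samples are needed to estimate spread")
    mu, sd = mean(hc), stdev(hc)
    if sd == 0:
        raise ValueError("HC scores have zero variance")
    # Every sample (HC and SLE) is expressed in units of HC standard deviations.
    return [(s - mu) / sd for s in scores]


# Toy cohort: three HC samples followed by two SLE samples (illustrative values).
scores = [1.0, 2.0, 3.0, 5.0, 6.0]
groups = ["HC", "HC", "HC", "SLE", "SLE"]
z = hc_referenced_z(scores, groups)
```

Applied per cohort and per aggregate, a transformation of this kind yields a matrix in which HC samples are centred on zero, so SLE values read as deviations from the cohort's own healthy reference.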
2.3. Candidate Source Pool, Panel Compression, and Compact Score Derivation
Panel derivation was constrained to a locked biologically informed source pool derived from the whole-blood discovery stage (6233 genes) and filtered by cross-cohort directional consistency. Candidate modules retained biological provenance labels (interferon, lymphoid B/T/NK, myeloid/monocyte, neutrophil-inflammatory) to preserve interpretability. Redundancy control used pairwise gene-gene correlation with greedy pruning. Stepwise compression through locked intermediate shortlists (12, 10, 8 genes) followed by leave-one-gene-out analysis yielded two locked 6-gene candidates: pre6-any (MMP9, MYL9, MME, HAL, CTLA4, CD40LG) and pre6-balanced (MMP9, MYL9, HAL, CTLA4, CD40LG, VPREB3). For score construction, each locked gene was z-standardised across samples within each cohort, and predefined direction coefficients were applied so that higher sign-aligned values represented the myeloid/neutrophil-inflammatory side of the axis. For the locked pre6-balanced panel, MMP9, MYL9 and HAL were positively aligned, whereas CTLA4, CD40LG and VPREB3 were negatively aligned. The compact score was calculated as the unweighted arithmetic mean of the six sign-aligned gene-level z-scores. The transparent derivation workflow is summarised in Figure S3 and Methods S1.
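The score-construction rule above (within-cohort z-standardisation, sign alignment, unweighted mean) can be sketched as follows. The gene directions are taken from the text; the input layout (a dictionary of per-gene expression lists for a single cohort), the function names, and the use of the sample standard deviation are assumptions made for illustration and are not the authors' implementation.

```python
from statistics import mean, stdev

# Direction coefficients of the locked pre6-balanced panel as stated in the text:
# +1 = myeloid/neutrophil-inflammatory side, -1 = lymphoid side.
DIRECTIONS = {
    "MMP9": 1, "MYL9": 1, "HAL": 1,
    "CTLA4": -1, "CD40LG": -1, "VPREB3": -1,
}


def zscore(values):
    """Z-standardise one gene across the samples of one cohort (assumes n - 1 SD)."""
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        raise ValueError("gene has zero variance in this cohort")
    return [(v - mu) / sd for v in values]


def compact_score(expr):
    """expr maps gene symbol -> expression values for the samples of ONE cohort."""
    missing = sorted(set(DIRECTIONS) - set(expr))
    if missing:
        raise KeyError(f"panel genes missing from input: {missing}")
    aligned = [[d * z for z in zscore(expr[g])] for g, d in DIRECTIONS.items()]
    # Unweighted arithmetic mean of the six sign-aligned z-scores per sample.
    return [mean(per_sample) for per_sample in zip(*aligned)]


# Toy cohort of three samples (illustrative values only).
toy = {
    "MMP9": [1.0, 2.0, 3.0], "MYL9": [1.0, 2.0, 3.0], "HAL": [1.0, 2.0, 3.0],
    "CTLA4": [3.0, 2.0, 1.0], "CD40LG": [3.0, 2.0, 1.0], "VPREB3": [3.0, 2.0, 1.0],
}
scores = compact_score(toy)
```

Because standardisation is performed within each cohort, a score computed this way is relative to the cohort it was computed in and is not comparable as an absolute value across cohorts.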
2.4. Public Validation and Final Panel Locking
GSE138458 served as the prespecified primary validation cohort. At the architecture level, this cohort comprised 336 samples, including 312 SLE and 24 HC samples. The locked primary-validation compact-score analysis used an outlier-removed sample-level subset of 330 samples, including 307 SLE and 23 HC samples; within the SLE subset, the activity-related comparison included 156 high-activity and 151 low-activity samples. Six GSE138458 samples (1 HC and 5 SLE) had been designated as outliers in the source processing record and were excluded prior to normalisation; therefore, outlier removal preceded compact-score calculation and endpoint comparisons. Extended validation was performed in GSE65391 (main extended-validation adjudication cohort, one-subject-one-visit design), GSE110685 (cross-platform whole-blood RNA-seq cohort), and GSE61635 (supportive molecular cohort only; reliable sample-level activity annotations could not be restored, so it was excluded from principal final locking). Final panel locking was based on integrated public-stage validation evidence rather than discovery-stage performance alone. For the major extended-validation cohorts, retained endpoint domains were summarised as locked cohort-level adjudication outputs (total_score/rank) without additional post hoc reweighting after final locking. Additional detail is provided in Tables S2 and S3, and Methods S1. A post hoc sensitivity analysis rerunning the GSE138458 primary-validation workflow using all 336 architecture-level samples was performed after panel locking and did not participate in the locking decision or alter the final locked panel definition.
2.5. External Immune-State Comparison in GSE224705
After the final panel had been locked, the locked pre6-balanced compact score was applied to the independent longitudinal SLE/LN transcriptomic cohort GSE224705 as an external immune-state comparison. This analysis was performed without model refitting, threshold optimisation, or gene-panel modification. Predefined IFN, neutrophil/myeloid, and lymphoid T/B-cell signature scores were calculated using prespecified marker sets. The objective was to evaluate whether the locked compact score simply reproduced a canonical IFN-high/low classification or instead captured an immune-state axis more closely aligned with myeloid–lymphoid variation. Compact-score states and IFN-high/low states were defined using median splits for descriptive overlap analysis. Sample-level and patient-level sensitivity analyses were performed to assess robustness. These analyses were not used for panel derivation, public-stage adjudication, final panel selection, or local validation. Additional methodological detail is provided in Methods S3.
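The descriptive overlap analysis above (median splits followed by a 2 × 2 comparison, tested with Fisher's exact test as stated in Section 2.8) can be sketched as below. The handling of values tied at the median, the helper names, and the toy data are assumptions; the two-sided p-value follows the common convention of summing all tables that are no more probable than the observed one.

```python
from math import comb
from statistics import median


def median_split(values):
    """True = "high" state. Assumption: values equal to the median count as "low"."""
    cut = median(values)
    return [v > cut for v in values]


def overlap_table(state_a, state_b):
    """Return 2x2 counts (high/high, high/low, low/high, low/low)."""
    pairs = list(zip(state_a, state_b))
    a = sum(1 for x, y in pairs if x and y)
    b = sum(1 for x, y in pairs if x and not y)
    c = sum(1 for x, y in pairs if not x and y)
    d = sum(1 for x, y in pairs if not x and not y)
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 table via the hypergeometric law."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at most as likely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Toy data: eight samples with a compact score and an IFN score (illustrative).
compact = [1, 2, 3, 4, 5, 6, 7, 8]
ifn = [8, 1, 6, 3, 4, 5, 2, 7]
table = overlap_table(median_split(compact), median_split(ifn))
p_value = fisher_exact_two_sided(*table)
```

In this toy example the two median-split states overlap evenly, which is the pattern one would expect when two scores carry largely independent information.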
2.6. Single-Cell Biological Anchoring
Single-cell transcriptomic analysis was performed exclusively for biological anchoring and was not used for bulk discovery, public validation, or panel refinement [8,9]. GSE135779 served as the formal anchoring dataset. For each annotated cell type, panel-level scores were summarised within SLE and HC groups; the cell-type-specific panel shift was defined as Δscore = mean(SLE) − mean(HC). No cell-level inferential testing was applied. Anchoring results are presented descriptively in Section 3. Methods S2 provides additional detail.
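The cell-type-specific panel shift defined above can be written out as a short sketch. The record layout (cell type, group, panel score) is a hypothetical input format, and the text does not specify whether scores were averaged over cells or over donors, so the unit of each record is left open here.

```python
from collections import defaultdict
from statistics import mean


def delta_scores(records):
    """Compute delta_score = mean(SLE) - mean(HC) for each annotated cell type.

    records: iterable of (cell_type, group, panel_score), group in {"SLE", "HC"}.
    Cell types lacking either group are skipped rather than imputed.
    """
    by_type = defaultdict(lambda: {"SLE": [], "HC": []})
    for cell_type, group, score in records:
        by_type[cell_type][group].append(score)
    return {
        ct: mean(g["SLE"]) - mean(g["HC"])
        for ct, g in by_type.items()
        if g["SLE"] and g["HC"]
    }


# Toy records (illustrative values only).
records = [
    ("Monocyte", "SLE", 0.8), ("Monocyte", "SLE", 0.6), ("Monocyte", "HC", 0.2),
    ("T cell", "SLE", -0.1), ("T cell", "HC", 0.1),
    ("Platelet", "SLE", 0.3),  # no HC counterpart, so no shift is reported
]
shifts = delta_scores(records)
```

Consistent with the descriptive intent stated in the text, the sketch returns point differences only and performs no inferential testing.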
2.7. Local Validation Cohorts
The local PBMC validation cohort comprised 30 HC and 60 SLE patients sampled under a standardised clinical collection framework, with demographic, disease activity, serological, and treatment data recorded at PBMC sampling (Table 1). An expanded, additional local whole-blood RT-qPCR validation set included 30 HC and 60 patients with SLE. This set was used for compartment-matched orthogonal validation of the locked score, without gene replacement, model refitting, or threshold optimisation. Participants were enrolled under the same predefined inclusion criteria and were independent of the PBMC cohort, with no overlapping participants. Local RT-qPCR validation used the final locked 6-gene panel (MMP9, MYL9, HAL, CTLA4, CD40LG, VPREB3) in PBMC samples. The compact score was calculated using the same sign-aligned z-score averaging rule as in public validation. Flow-cytometric anchoring correlated the score with prespecified monocyte/myeloid readouts (classical, intermediate, nonclassical, and CD169-positive monocyte proportions) using Spearman correlation.
For the local PBMC RT-qPCR validation, PBMCs were isolated from EDTA-anticoagulated peripheral blood using human peripheral blood lymphocyte separation medium (Tianjin Haoyang Biological Manufacture Co., Ltd., Tianjin, China; Cat. No. LTS1077) according to the manufacturer’s instructions. Total RNA from PBMCs was extracted using TRIzol reagent (Life Technologies, Carlsbad, CA, USA; Cat. No. 15596018CN). Reverse transcription was performed using PrimeScript™ RT reagent Kit with gDNA Eraser (Takara Bio Inc., Shiga, Japan; Cat. No. RR047A). Quantitative PCR was performed using NovoStart SYBR qPCR SuperMix Plus (Novoprotein Scientific Inc., Suzhou, China; Cat. No. E096-01B) on a PikoReal 96 real-time PCR system (Thermo Fisher Scientific, Waltham, MA, USA). PikoReal Software v2.1 was used for qPCR data acquisition and analysis. β-actin was used as the internal reference gene, relative expression was calculated using the 2^(−ΔΔCt) method, and each reaction was run in technical duplicate. Primer sequences are available from the corresponding author upon reasonable request.
For the expanded whole-blood RT-qPCR validation set, total RNA was extracted from EDTA-anticoagulated whole blood using RNAprep Pure Hi-Blood Total RNA Kit (Tiangen Biotech Co., Ltd., Beijing, China; Cat. No. DP443). The reverse transcription kit, qPCR reagents, instrument, internal reference gene, primer sequences, and relative expression calculation method were the same as those used for the PBMC RT-qPCR validation.
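The relative-expression calculation referred to above (the 2^(−ΔΔCt) method with β-actin as reference and duplicate wells) can be illustrated with a minimal worked sketch. The choice of calibrator is not stated in the text; the sketch assumes a calibrator ΔCt supplied by the caller (for example, the mean ΔCt of the HC group), and all Ct values are invented for illustration.

```python
from statistics import mean


def delta_ct(target_wells, reference_wells):
    """Delta Ct = mean Ct(target gene) - mean Ct(reference gene) over technical wells."""
    return mean(target_wells) - mean(reference_wells)


def relative_expression(sample_delta_ct, calibrator_delta_ct):
    """Fold change by the 2^(-ddCt) method, with ddCt = dCt(sample) - dCt(calibrator)."""
    return 2.0 ** -(sample_delta_ct - calibrator_delta_ct)


# Toy duplicate wells for one target gene and the beta-actin reference.
sample_dct = delta_ct(target_wells=[24.0, 24.5], reference_wells=[18.0, 18.5])
# Assumed calibrator: a control sample (or HC-group mean) measured the same way.
calibrator_dct = delta_ct(target_wells=[25.0, 25.5], reference_wells=[18.0, 18.5])
fold_change = relative_expression(sample_dct, calibrator_dct)
```

Here the sample's ΔCt is one cycle lower than the calibrator's, which corresponds to a two-fold higher relative expression under the usual assumption of approximately 100% amplification efficiency.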
Flow cytometry was performed on whole-blood samples after red blood cell lysis. Cells were stained using a service-provider monocyte antibody panel including CD14, CD16, and CD169/SIGLEC1, together with viability staining, and incubated in the dark for 15 min. Classical monocytes were defined as CD14++CD16−, intermediate monocytes as CD14++CD16+, and nonclassical monocytes as CD14+CD16++. CD169-positive monocytes were quantified according to the predefined service-provider gating template. Flow-cytometric data were acquired on a CytoFLEX flow cytometer using CytExpert v2.4.0.28 (Beckman Coulter, Brea, CA, USA) and analysed using FlowJo v10.8.1 (BD Biosciences, Ashland, OR, USA). Because detailed antibody clone and catalogue information was not available in the archived author-side method record, flow-cytometric findings were interpreted as supportive exploratory evidence.
2.8. Statistical Analysis
Compact-score comparisons used unpaired t-tests or Mann–Whitney U tests selected according to distributional characteristics. High disease activity was defined as SLEDAI > 6, and low/moderate activity was defined as SLEDAI ≤ 6. ROC analysis provided AUC estimates with 95% confidence intervals. Spearman correlation assessed score–clinical variable associations. Treatment-adjusted sensitivity analyses were adjusted for prespecified treatment exposure variables as reported in Tables S5 and S7. Benjamini–Hochberg false discovery rate correction was applied across PBMC-based statistical tests as a supplementary transparency check (Table S4). For the external GSE224705 immune-state comparison, Spearman correlations were calculated at both sample and patient levels, and Fisher’s exact test was used to evaluate overlap between compact-score states and IFN-high/low states. Exploratory response-stratified analyses in GSE224705 were reported descriptively in Supplementary Table S10 and were not used for treatment-response prediction or panel selection. Because the compact score was locked through public validation before local testing, local analyses were interpreted as prespecified supportive assessments. All tests were two-sided; p < 0.05 was considered statistically significant.
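The activity dichotomy and the AUC estimate described above can be illustrated with a short sketch. The SLEDAI cut-off is taken from the text; the AUC is computed through its equivalence with the Mann–Whitney statistic (the probability that a high-activity sample scores above a low/moderate-activity sample, counting ties as one half). Confidence intervals are omitted because the interval method is not specified in this section, and the data are invented.

```python
def activity_group(sledai):
    """High activity: SLEDAI > 6; low/moderate activity: SLEDAI <= 6 (as in the text)."""
    return "high" if sledai > 6 else "low_moderate"


def roc_auc(pos_scores, neg_scores):
    """Empirical ROC AUC = P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy SLE samples: SLEDAI values and compact scores (illustrative only).
sledai = [2, 4, 6, 8, 10, 12]
score = [0.1, 0.5, 0.3, 0.9, 0.4, 0.7]
high = [s for s, a in zip(score, sledai) if activity_group(a) == "high"]
low = [s for s, a in zip(score, sledai) if activity_group(a) == "low_moderate"]
auc = roc_auc(high, low)
```

Note that a SLEDAI of exactly 6 falls in the low/moderate group under the stated definition.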
All statistical analyses and figure generation were performed in R v4.5.1. Key R packages included GEOquery v2.76.0, GSVA v2.2.0, ConsensusClusterPlus v1.72.0, limma v3.64.3, edgeR v4.6.3, Seurat v5.4.0, ComplexHeatmap v2.24.1, circlize v0.4.17, patchwork v1.3.2, ggpubr v0.6.3, data.table v1.18.2.1, tidyverse v2.0.0, pROC v1.19.0.1, ggplot2 v4.0.2, readxl v1.4.5, showtext v0.9-8, showtextdb v3.0, sysfonts v0.8.9, dplyr v1.2.0, RColorBrewer v1.1-3, scales v1.4.0, and tibble v3.3.1. Public code and additional analysis dependencies are available in the code repository cited in the Data Availability Statement.
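For readers less familiar with the Benjamini–Hochberg correction mentioned in the statistical analysis above, the step-up adjustment can be sketched in a few lines. This is a generic illustration of the standard procedure written in Python for compactness, not the authors' R code, and the p-values are invented.

```python
def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the original input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy family of four tests (illustrative values only).
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
```

In the toy family, the two middle p-values share the same adjusted value because the step-up rule prevents an adjusted value from exceeding that of any larger raw p-value.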
2.9. Post Hoc Benchmarking Analyses
Post hoc benchmarking analyses were performed to contextualise the locked compact score against established immune and clinical readouts. These analyses did not alter the locked final panel, score-construction formula, endpoint definitions, or public-stage adjudication framework. The locked compact score remained the pre6-balanced panel, calculated within each cohort by gene-wise standardisation across samples, locked sign alignment, and unweighted averaging of the six aligned values.
Benchmarking analyses included comparisons against standard clinical markers in the local PBMC cohort, a fixed 6-gene IFN score in GSE224705 (IFI27, IFI44L, IFIT1, ISG15, MX1, and OAS1), individual locked panel genes, and reduced myeloid- and lymphoid-component scores. These analyses were used only to assess robustness, interpretability, and non-redundancy relative to established readouts, and did not participate in panel selection, public-stage locking, threshold optimisation, or reverse refinement of the final score. No formal cross-validation procedure was applied during discovery-stage feature selection. Instead, overfitting risk was addressed by restricting the candidate source pool through cross-cohort directional consistency, locking the final panel through independent public-stage validation before local testing, and evaluating robustness through withheld local validation and post hoc reduced-model benchmarking.
2.10. Ethics Statement
The local validation samples were collected under an approved SLE clinical study protocol. The study was approved by the Medical Ethics Committee of the First Affiliated Hospital of Anhui University of Chinese Medicine (approval No. 2024AH-08; approved 31 January 2024; validity period 31 January 2024 to 31 January 2026). Participants in the local PBMC validation cohort and the expanded whole-blood RT-qPCR validation set were prospectively recruited between 1 February 2024 and 30 January 2025 at the First Affiliated Hospital of Anhui University of Chinese Medicine. All participants provided written informed consent before sample collection. Publicly available transcriptomic datasets were obtained from GEO, BioStudies, and ArrayExpress; the authors had no access to information that could directly identify individual participants in these public datasets.
4. Discussion
In this study, we established a staged framework for peripheral-blood transcriptomic stratification in SLE in which whole-blood discovery, public-stage panel adjudication, single-cell biological anchoring, and local orthogonal validation served distinct and non-interchangeable roles. We identified two working whole-blood endotypes characterised by lymphoid versus myeloid/neutrophil-inflammatory polarisation and translated that structure into a compact 6-gene score locked through public validation before any local testing [3,4,5]. The expanded, additional whole-blood RT-qPCR validation set strengthened the validation structure by providing compartment-matched support, including strong activity-related associations with SLEDAI, for a score originally derived and adjudicated in public whole-blood cohorts.
The panel-selection process illustrates why staged public locking matters. In pooled discovery data, pre6-any showed a marginal advantage over pre6-balanced (AUC 0.961 vs. 0.952; Cohen’s d 2.228 vs. 2.170). Had panel selection been driven by discovery-stage optimisation alone, pre6-any might have been carried forward as the leading signature. However, when evaluated in the predefined outlier-removed primary-validation subset of GSE138458 and across major extended-validation cohorts, the integrated public-stage adjudication favoured pre6-balanced. A post hoc all-336 sensitivity analysis in GSE138458 showed that this primary-cohort preference was not fully stable to inclusion of the six source-designated outliers, with reversal of the case–control endpoint and only weakly similar activity-related direction. Locking pre6-balanced, therefore, reflected integrated public-stage adjudication rather than a single-cohort or single-endpoint conclusion, and the final panel decision should be interpreted accordingly rather than as uniquely driven by the GSE138458 primary-validation analysis. This underscores the importance of prioritising multi-cohort external evidence when deriving compact molecular classifiers intended for downstream translational use [4,6].
A key biological interpretation is that the observed whole-blood structure reflects a myeloid–lymphoid activity axis rather than a formally interferon-defined split. This requires careful qualification. Blood transcriptomic studies in SLE have long identified type I interferon-inducible and granulopoiesis/neutrophil-associated signatures as major peripheral-blood immune programmes [13,14]. The framework rests on aggregate-level BloodGen3 profiling, and different analytical approaches may yield different interpretations. Formal testing confirmed a statistically detectable interferon difference between discovery clusters, but this was numerically smaller than myeloid/monocyte and neutrophil/inflammatory contrasts. IFN-high and myeloid-high immune states are partially co-expressed in SLE, and the present framework does not argue that interferon signals are absent or unimportant; rather, they did not emerge as the dominant axis of unsupervised endotype separation [15,16,17]. The myeloid–lymphoid interpretation was supported descriptively across validation layers: the locked score mapped most prominently to monocyte/myeloid and progenitor-like compartments in single-cell anchoring and showed concordant relationships with monocyte-related immunophenotypes in the local PBMC cohort [8,18]. This interpretation was further supported by the external GSE224705 immune-state comparison, in which the locked compact score showed only weak association with an IFN signature and near-even overlap with IFN-high/low states, while aligning strongly with neutrophil/myeloid and inversely with lymphoid T/B-cell signatures. This analysis was intended to contextualise the immune-state meaning of the locked score and should not be interpreted as treatment-response prediction. Post hoc benchmarking in GSE224705 indicated that a fixed 6-gene interferon score (IFI27, IFI44L, IFIT1, ISG15, MX1, OAS1) showed stronger association with SLEDAI-defined disease activity (ρ = 0.417) and stronger SLE/LN-versus-healthy-control discrimination (AUC = 0.783) than the locked compact score in this cohort; these results are reported transparently in Supplementary Table S11. The two scores were only weakly correlated (ρ = 0.102), supporting complementary rather than redundant biological information. The compact score should therefore not be interpreted as a replacement for canonical interferon activity measures; its value lies in capturing a myeloid–lymphoid activity axis that is largely non-redundant with IFN-high/low classification.
The local validation results support the biological and disease-context relevance of the locked compact score. At the PBMC level, the score discriminated SLE from HC, separated high from low/moderate disease activity, and showed coherent associations with SLEDAI and complement consumption; these associations remained significant after treatment adjustment. It also correlated with monocyte-related immunophenotypes, including the CD169-positive monocyte compartment [19,20,21]. These findings do not establish clinical utility or suggest the score should replace standard serological indices. Post hoc benchmarking against standard clinical markers in the local PBMC cohort (Supplementary Table S11) showed that the compact score (SLEDAI ρ = 0.281, high versus low/moderate activity AUC = 0.719) captured activity-related information numerically within a comparable range to C4 (ρ = −0.422, AUC = 0.731) and C3 (AUC = 0.754), whereas anti-dsDNA showed no meaningful activity-related association in this setting (ρ = 0.008, AUC = 0.503). These results do not establish clinical superiority or equivalence and are reported as post hoc contextualisation only. Because the compact score is calculated using cohort-wise z-standardisation, clinically interpretable individual-level thresholds cannot be derived from the current data. Future clinical translation would require prospective validation with fixed calibration references, formal evaluation of additive predictive value over standard clinical parameters, and assessment of score behaviour across distinct disease manifestations.
The expanded whole-blood RT-qPCR validation set provided compartment-matched corroboration: the locked score showed strong SLE-versus-HC discrimination and a strong positive association with SLEDAI across the expanded set. Although the C3 association observed in the PBMC layer was not reproduced in the expanded whole-blood set, the activity-related results support the view that the compact score captures a disease-context signal in the whole-blood compartment. The difference in activity-related discrimination between the PBMC layer (AUC = 0.719) and the expanded whole-blood validation set (AUC = 0.918) warrants comment. Several complementary explanations are plausible. First, the compact score was derived from whole-blood discovery cohorts, and the whole-blood compartment represents the discovery-matched measurement context; PBMC sampling introduces a cell-separation step that depletes granulocytes and alters the relative proportions of myeloid cell populations, potentially attenuating the myeloid-dominant signal. Second, the myeloid-component genes in the locked panel, particularly MMP9 and MYL9, are highly expressed in granulocytes and monocytes, which are substantially enriched in whole blood relative to PBMC fractions. Third, RT-qPCR in the whole-blood compartment may capture a broader myeloid transcriptional signal. Together, these considerations suggest that the performance difference is plausibly related to compartment-specific biological and technical factors; however, cohort-specific effects cannot be fully excluded and are acknowledged as a limitation.
In the expanded whole-blood cohort, several individual genes showed borderline or non-significant between-group differences, whereas the locked composite score showed stronger activity-related performance. This pattern should not be interpreted as local re-selection evidence for individual genes. Rather, the score was designed as a predefined sign-aligned multi-gene axis, and its interpretation depends on the composite signal rather than on each gene reaching significance in every validation compartment. Post hoc benchmarking against individual panel genes and reduced composite scores (Supplementary Table S12) showed that the strongest simplified readouts approached the performance of the full compact score in some local validation tasks. In the expanded whole-blood cohort, the full compact score retained the highest activity-related AUC and strongest SLEDAI correlation among all tested readouts (AUC = 0.918, ρ = 0.819), although MMP9 alone (AUC = 0.899, ρ = 0.617) and the myeloid half-score (AUC = 0.907, ρ = 0.711) approached its performance, indicating that the myeloid component contributes substantially to the whole-blood activity signal. In the PBMC compartment, individual gene performance was more heterogeneous, and no single gene or reduced score uniformly reproduced the cross-layer rationale of the full locked panel. These findings indicate that the full score provides an interpretable composite representation of the pre-specified myeloid–lymphoid axis. Accordingly, the added value of the full composite score should not be interpreted as uniform predictive superiority over every simplified or single-gene readout in all compartments. Rather, its value lies in providing a pre-specified, locked, and biologically interpretable composite representation of the myeloid–lymphoid axis across validation layers. Importantly, these post hoc results should not be interpreted as evidence for retrospective panel revision; the panel was locked through public-stage adjudication before any local testing. Robustness analyses further suggested that the whole-blood score was not driven by a single sample or by a single locked gene, although prospective validation remains necessary.
Several limitations should be acknowledged. First, public datasets differed in platform, sampling context, and phenotypic annotation depth, constraining clinical harmonisation across validation layers. Second, the PBMC cohort represents a different blood compartment from the whole-blood discovery setting, and included substantial treatment exposure; local PBMC findings should therefore be interpreted as biological corroboration rather than direct replication of the whole-blood discovery layer. Although treatment-adjusted sensitivity analyses confirmed that associations with SLEDAI and complement levels remained statistically significant after adjustment for prespecified treatment variables, residual confounding cannot be excluded. In particular, glucocorticoids and other immunosuppressive agents are known to modulate myeloid transcriptional programmes, and their influence on the compact score signal cannot be fully disentangled in the current cross-sectional design. Larger prospective studies with treatment-stratified designs will be required to clarify the relative contributions of disease activity and treatment exposure to the observed transcriptomic axis. Third, the single-cell layer was restricted to biological anchoring and cannot serve as an independent panel for adjudication. Fourth, serological correlations were not uniformly reproduced across compartments: the C3 association observed in the PBMC layer was not confirmed in the expanded whole-blood set, which underscores the compartment-specific nature of complement associations and the need for careful interpretation across different measurement contexts. Fifth, the GSE224705 analysis was exploratory and external to the locking framework; response-stratified analyses by treatment stratum were underpowered and did not support treatment-response prediction or drug-selection claims. Sixth, in GSE138458, the predefined primary-validation analysis used a source-designated outlier-removed subset (n = 330). A post hoc all-336 sensitivity analysis showed that the GSE138458 case–control preference for pre6-balanced was not maintained after including these six samples, although the activity-related comparison remained weakly directionally similar. This underscores that the final panel-locking decision should be interpreted as integrated public-stage evidence across multiple cohorts rather than reliance on a single primary-validation endpoint. Seventh, formal cross-validation procedures were not applied within the discovery-stage feature selection process; although the multi-cohort public-stage adjudication and withheld local validation were intended to mitigate overfitting risk, a degree of analytical flexibility inherent in the multi-step compression workflow cannot be entirely eliminated. Finally, although the compact score showed coherent associations with disease activity and monocyte-related phenotypes, its value for longitudinal immune-state monitoring and prospective clinical stratification remains to be established. Because the compact score was calculated using cohort-wise standardisation, it should be interpreted as a research-grade relative activity score rather than a ready-to-use individual clinical assay. Future prospective studies should establish fixed calibration references and clinically interpretable cut-offs. In addition, although the framework integrates multi-layer transcriptomic, single-cell, and clinical information, it should not be interpreted as a formal systems-biology model in the strict sense. The study does not model dynamic feedback loops, regulatory networks, or mechanistic causal interactions. Rather, it represents a staged, systems-oriented biomarker-reduction framework that uses public validation and biological anchoring to derive a compact, interpretable transcriptomic readout.
5. Conclusions
We established a staged whole-blood transcriptomic framework for SLE and derived a compact myeloid–lymphoid activity score (pre6-balanced: MMP9, MYL9, HAL, CTLA4, CD40LG, VPREB3) supported by public validation, single-cell biological anchoring, and local orthogonal validation. Compartment-matched whole-blood RT-qPCR validation further supported score reproducibility and activity-related relevance in the same blood compartment as discovery, including a strong positive association with SLEDAI. However, the compact score should be interpreted as a cohort-normalised, research-grade relative activity readout rather than a calibrated clinical assay. The present study did not establish fixed calibration references, clinically interpretable thresholds, or decision rules for individual patient classification. In addition, although staged public validation, withheld local validation, bootstrap resampling, leave-one-out analysis, and reduced-model benchmarking mitigate overfitting concerns, formal cross-validation was not applied during discovery-stage feature selection. Post hoc benchmarking also showed that simplified or single-gene readouts may approach the full score in selected settings; therefore, the added value of the composite score should be interpreted primarily as a locked, biologically interpretable representation of a myeloid–lymphoid transcriptomic axis rather than as uniform predictive superiority over simpler alternatives. Further prospective, multicentre, treatment-aware studies with fixed calibration references are required to determine whether this score has value for longitudinal immune-state monitoring or prospective clinical stratification in SLE.