In precision oncology research, jointly modeling tumor grading and treatment response from multimodal medical imaging and clinical data, while keeping the underlying mechanisms interpretable, remains a challenging and critical problem. From a sensing perspective, these imaging and clinical data can be regarded as heterogeneous sensor-derived signals acquired by medical imaging sensors and clinical monitoring systems, providing continuous and structured observations of tumor characteristics and patient states. Existing approaches typically rely on invasive pathological grading, and grading prediction and treatment response modeling are often conducted independently. Moreover, multimodal fusion procedures generally lack explicit structural constraints, which limits their practical utility in clinical decision-making. To address these issues, a grade-guided multimodal collaborative modeling framework is proposed. Built upon mature deep learning models, including 3D ResNet-18, MLP, and CNN–Transformer, the framework incorporates tumor grading as a weakly supervised prior into multimodal feature fusion and treatment response modeling, thereby enabling an integrated solution for non-invasive grading prediction, treatment response subtype discovery, and intrinsic mechanism interpretation. Through a grade-guided feature fusion mechanism, discriminative information that is highly correlated with tumor malignancy and treatment sensitivity is emphasized in the multimodal joint representation, while irrelevant features are suppressed to prevent interference with model learning. Within a unified framework, grading prediction and grade-conditioned treatment response modeling are jointly realized. Experimental results on real-world clinical datasets demonstrate that the proposed method achieved an accuracy of
and a kappa coefficient of
in the tumor-grading prediction task, indicating a high level of consistency with pathological grading. In the treatment response prediction task, the proposed model attained an AUC of
, a precision of
, and a recall of
, significantly outperforming single-modality models, conventional early-fusion models, and multimodal CNN–Transformer models without grading constraints. In addition, treatment-sensitive and treatment-resistant subtypes identified under grading conditions exhibited stable and significant stratification differences in clustering consistency and survival analysis, validating the potential value of the proposed approach for clinical risk assessment and individualized treatment decision-making.
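
The abstract does not specify how the grade-guided feature fusion is implemented. As a rough illustration only, the sketch below shows one way grade information could gate per-modality features: predicted grade probabilities are mixed with a per-grade weight table to produce a sigmoid gate for each feature, and the gated modality features are summed. The function name `grade_guided_fusion`, the `gate_weights` table, and the toy inputs are all hypothetical and are not taken from the article.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function mapping a real score to a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def grade_guided_fusion(modal_feats, grade_logits, gate_weights):
    """Fuse modality features with gates conditioned on predicted grade.

    modal_feats:  dict modality -> feature vector (all of length d)
    grade_logits: list of K grade logits from a grading head
    gate_weights: dict modality -> K x d table of gate parameters
                  (hypothetical; these would be learned in practice)
    Returns the fused feature vector of length d.
    """
    grade_probs = softmax(grade_logits)
    d = len(next(iter(modal_feats.values())))
    fused = [0.0] * d
    for name, feat in modal_feats.items():
        table = gate_weights[name]
        for j in range(d):
            # Expected gate score under the predicted grade distribution.
            score = sum(p * table[k][j] for k, p in enumerate(grade_probs))
            fused[j] += sigmoid(score) * feat[j]
    return fused


# Toy usage: two modalities, two grades, two features per modality.
feats = {"imaging": [1.0, 2.0], "clinical": [3.0, 4.0]}
weights = {
    "imaging": [[0.0, 0.0], [0.0, 0.0]],
    "clinical": [[0.0, 0.0], [0.0, 0.0]],
}
print(grade_guided_fusion(feats, [0.2, 1.5], weights))
```

With all gate parameters at zero every gate equals 0.5, so the fusion reduces to a plain average of the modalities; nonzero parameters let the predicted grade emphasize or suppress individual features, which is the general behavior the abstract describes.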