Search Results (156)

Search Parameters:
Keywords = catastrophic forgetting

21 pages, 4182 KB  
Article
Incremental Pavement Distress Classification in UAV-Based Remote Sensing via Analytic Geometric Alignment
by Quanziang Wang, Xin Li, Jiangjun Peng, Xixi Jia and Renzhen Wang
Remote Sens. 2026, 18(8), 1141; https://doi.org/10.3390/rs18081141 - 12 Apr 2026
Viewed by 216
Abstract
Automated pavement distress classification using high-resolution Unmanned Aerial Vehicle (UAV) imagery is pivotal for intelligent transportation systems. However, long-term UAV monitoring faces a continuous stream of evolving distress types and changing remote sensing background textures, necessitating Class-Incremental Learning (CIL) capabilities. Existing methods struggle to balance stability and plasticity, especially under the severe storage limitations typical of local edge stations in air–ground collaborative systems. This data scarcity leads to catastrophic forgetting and confusion among fine-grained distress categories. To address these challenges, we propose a data-efficient approach named Analytic Geometric Alignment (AGA). Our framework mainly consists of three key components. First, to overcome the optimization gap between the feature extractor and the fixed geometric target, we introduce a Subspace-Aware Analytic Initialization (SAI) that computes a closed-form projection to instantly align the feature subspace with the ETF manifold before each task training. Second, on this aligned basis, a Decoupled Geometric Adapter (DGA) is incorporated to facilitate continuous non-linear adaptation to complex aerial textures. Finally, for stable incremental training, we design a Memory-Prioritized Regression (MPR) loss to enforce tighter geometric constraints on replay samples, significantly enhancing model stability. Extensive experiments on the UAV-PDD2023 dataset demonstrate that AGA significantly outperforms state-of-the-art methods, showcasing excellent robustness and data efficiency. Full article
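The closed-form alignment the abstract describes (projecting features onto a fixed equiangular-tight-frame geometry before training) can be illustrated with a minimal NumPy sketch. The function names and the ridge-regression formulation below are illustrative assumptions, not the paper's actual SAI implementation:

```python
import numpy as np

def simplex_etf(num_classes, dim, seed=0):
    """Build a simplex equiangular tight frame (ETF): num_classes unit-norm
    target vectors in `dim` dimensions with equal pairwise angles.
    Requires dim >= num_classes for the orthonormal basis below."""
    rng = np.random.default_rng(seed)
    # Random orthonormal basis U (dim x num_classes) via reduced QR.
    U, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    K = num_classes
    center = np.eye(K) - np.ones((K, K)) / K   # remove the mean direction
    M = np.sqrt(K / (K - 1)) * U @ center      # columns are ETF targets
    return M.T                                 # (num_classes, dim)

def closed_form_alignment(features, labels, targets, ridge=1e-3):
    """One-shot ridge-regression projection mapping each feature onto its
    class's ETF target: W = (F^T F + rI)^{-1} F^T T. A hypothetical
    stand-in for the paper's analytic initialization."""
    T = targets[labels]                        # per-sample target rows
    F = features
    d = F.shape[1]
    return np.linalg.solve(F.T @ F + ridge * np.eye(d), F.T @ T)
```

The ETF rows have unit norm and pairwise inner product -1/(K-1), the maximally separated geometry that such methods align features to.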

41 pages, 2153 KB  
Review
A Review of Domain-Adaptive Continual Deep Learning Remaining Useful Life Estimation for Bearing Fault Prognosis Under Evolving Data Distributions
by Stamatis Apeiranthitis, Christos Drosos, Avraam Chatzopoulos, Michail Papoutsidakis and Evangellos Pallis
Machines 2026, 14(4), 412; https://doi.org/10.3390/machines14040412 - 8 Apr 2026
Viewed by 282
Abstract
Estimating remaining useful life (RUL) and predicting bearing faults based on data-driven models have become central components of modern Prognostics and Health Management (PHM) systems. Although deep learning models have demonstrated strong performance under controlled and stationary operating conditions, their reliability in real-world industrial and marine environments is limited. In practice, operating conditions, sensor properties, and degradation mechanisms evolve continuously over time, leading to non-stationary and shifting data distributions that violate the assumptions of conventional static learning approaches. To address these challenges, two research areas have gained increasing attention: Domain Adaptation (DA), which aims to mitigate distribution discrepancies across operating conditions or machines, and Continual Learning (CL), which enables models to learn sequentially while mitigating catastrophic forgetting. However, existing studies often examine these paradigms in isolation, limiting their effectiveness in long-term deployments, where domain shifts and temporal evolution coexist. This paper presents a comprehensive and systematic review of data-driven methods for bearing fault prognosis and remaining useful life (RUL) prediction under evolving data distributions, adopting the framework of Domain-Adaptive Continual Learning (DACL). By jointly examining the DA and CL methods, this review analyses how these approaches have been individually and implicitly combined to cope with non-stationarity, knowledge retention, and limited label availability in practical PHM scenarios. We categorised existing methods, highlighted their underlying assumptions and limitations, and critically assessed their applicability to long-term, real-world monitoring systems. 
Furthermore, key open challenges, including scalability, robustness under sequential domain shifts, uncertainty handling, and plasticity–stability trade-offs, are identified, and research directions are outlined based on the identified limitations and practical deployment requirements of the proposed method. This review aims to establish a structured and critical reference framework for understanding the role of domain-adaptive CL in data-driven prognostics, clarifying current research trends, limitations, and open challenges in evolving data distributions. Full article

22 pages, 7355 KB  
Article
IAE-Net: Incremental Learning-Based Attention-Enhanced DenseNet for Robust Facial Emotion Recognition
by Haseeb Ali Khan and Jong-Ha Lee
Mathematics 2026, 14(6), 1023; https://doi.org/10.3390/math14061023 - 18 Mar 2026
Viewed by 290
Abstract
Facial emotion recognition (FER) is an important component of human–computer interaction and healthcare-oriented affective computing. However, reliable deployment remains difficult in unconstrained settings due to appearance and geometric variability (e.g., pose, illumination, and occlusion), demographic imbalance, and dataset bias. In practice, two additional constraints frequently limit real-world FER systems: the computational overhead of heavy architectures and limited adaptability when data evolve over time, where sequential updates can cause catastrophic forgetting. To address these challenges, we propose the Incremental Attention-Enhanced Network (IAE-Net), a compact single-branch framework built on a DenseNet121 backbone and a cascaded refinement pipeline. The model incorporates Channel Attention (CA) to emphasize expression-relevant feature channels and suppress less informative responses, followed by a deformable attention module (DA) that reduces feature misalignment caused by non-rigid facial motion and pose shifts, thereby improving robustness under geometric variability. For continual deployment, IAE-Net supports class-incremental updates via weight transfer, exemplar replay, and knowledge distillation to improve retention during sequential learning. We evaluate IAE-Net on four widely used benchmarks, FER2013, FERPlus, KDEF, and AffectNet, covering both controlled and in-the-wild conditions under a unified training protocol. The proposed approach achieves accuracies of 79.15%, 92.03%, 99.48%, and 74.20% on FER2013, FERPlus, KDEF, and AffectNet, respectively, with balanced precision, recall, and F1-score trends. These results indicate that IAE-Net provides an efficient and extensible FER framework with potential utility in dynamic real-world and longitudinal healthcare-oriented applications. Full article
(This article belongs to the Special Issue Recent Advances and Applications of Artificial Neural Networks)
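The exemplar-replay-plus-distillation recipe mentioned for IAE-Net's incremental updates commonly rests on a temperature-scaled KL loss between teacher and student logits. Below is a minimal NumPy sketch of that standard (Hinton-style) loss, not IAE-Net's actual code; the temperature default is an assumption:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between the teacher's and student's temperature-
    softened class distributions, scaled by T^2 so gradient magnitude
    stays comparable as T varies."""
    p = softmax(teacher_logits / T)           # soft teacher targets
    q = softmax(student_logits / T)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (T ** 2) * kl.mean()
```

During a class-incremental step, this term is typically added to the ordinary cross-entropy on new-class data so the student keeps matching the old model on old classes.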

27 pages, 3391 KB  
Article
A Hybrid Federated–Incremental Learning Framework for Continuous Authentication in Zero-Trust Networks
by Jie Ji, Shi Qiu, Shengpeng Ye and Xin Liu
Future Internet 2026, 18(3), 154; https://doi.org/10.3390/fi18030154 - 16 Mar 2026
Viewed by 301
Abstract
Zero-trust architecture (ZTA) requires continuous and adaptive identity authentication to maintain security in dynamic environments. However, current federated learning (FL)-based authentication models often struggle to incorporate evolving attack patterns without experiencing catastrophic forgetting. Moreover, non-independent and identically distributed (non-IID) client data and concept drift frequently lead to degraded model robustness and personalization. To address these issues, this paper presents a hybrid learning framework that integrates federated learning with incremental learning (IL) for sustainable authentication. A Dynamic Weighted Federated Aggregation (DWFA) algorithm is developed to mitigate concept drift by adjusting aggregation weights in real time, ensuring that the global model adapts to changing data distributions. This approach enables continuous learning from distributed threat data while maintaining privacy and eliminating the need for historical data retention. Experimental results on real-world traffic datasets indicate that the proposed framework outperforms conventional FL baselines, reducing the overall error rate by approximately 56% and improving the detection rate for novel attack types by over 17.8%. Furthermore, the framework remains stable against performance decay while maintaining efficient communication overhead. This study provides an adaptive, privacy-preserving solution for identity authentication in zero-trust systems. Full article
(This article belongs to the Special Issue Cybersecurity in the Age of AI, IoT, and Edge Computing)
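The idea of adjusting federated aggregation weights by drift can be sketched in a few lines. The softmax-over-negative-drift rule below is a hypothetical stand-in for the paper's DWFA algorithm, whose exact weighting is not given in the abstract:

```python
import numpy as np

def dwfa_aggregate(client_params, drift_scores, alpha=1.0):
    """Hypothetical sketch of dynamic weighted aggregation: clients whose
    local data drifted more from the global distribution are down-weighted
    via a softmax over negative drift scores, then parameters are averaged.
    `alpha` (assumed knob) controls how sharply drift is penalized."""
    s = -alpha * np.asarray(drift_scores, dtype=float)
    s = s - s.max()                             # numerically stable softmax
    w = np.exp(s) / np.exp(s).sum()             # aggregation weights, sum to 1
    stacked = np.stack(client_params)           # (num_clients, ...)
    return np.tensordot(w, stacked, axes=1), w
```

With equal drift scores this reduces to plain FedAvg-style equal averaging; a client with higher drift receives a strictly smaller weight.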

31 pages, 23615 KB  
Article
A Memory-Efficient Class-Incremental Learning Framework for Remote Sensing Scene Classification via Feature Replay
by Yunze Wei, Yuhan Liu, Ben Niu, Xiantai Xiang, Jingdun Lin, Yuxin Hu and Yirong Wu
Remote Sens. 2026, 18(6), 896; https://doi.org/10.3390/rs18060896 - 15 Mar 2026
Viewed by 382
Abstract
Most existing deep learning models for remote sensing scene classification (RSSC) adopt an offline learning paradigm, where all classes are jointly optimized on fixed-class datasets. In dynamic real-world scenarios with streaming data and emerging classes, such paradigms are inherently prone to catastrophic forgetting when models are incrementally trained on new data. Recently, a growing number of class-incremental learning (CIL) methods have been proposed to tackle these issues, some of which achieve promising performance by rehearsing training data from previous tasks. However, implementing such a strategy in real-world scenarios is often challenging, as the requirement to store historical data frequently conflicts with strict memory constraints and data privacy protocols. To address these challenges, we propose a novel memory-efficient feature-replay CIL framework (FR-CIL) for RSSC that retains compact feature embeddings, rather than raw images, as exemplars for previously learned classes. Specifically, a progressive multi-scale feature enhancement (PMFE) module is proposed to alleviate representation ambiguity. It adopts a progressive construction scheme to enable fine-grained and interactive feature enhancement, thereby improving the model’s representation capability for remote sensing scenes. Then, a specialized feature calibration network (FCN) is trained in a transductive learning paradigm with manifold consistency regularization to adapt stored feature descriptors to the updated feature space, thereby effectively compensating for feature space drift and enabling a unified classifier. Following feature calibration, a bias rectification (BR) strategy is employed to mitigate prediction bias by exclusively optimizing the classifier on a balanced exemplar set. As a result, this memory-efficient CIL framework not only addresses data privacy concerns but also mitigates representation drift and classifier bias.
Extensive experiments on public datasets demonstrate the effectiveness and robustness of the proposed method. Notably, FR-CIL outperforms the leading state-of-the-art CIL methods in mean accuracy by margins of 3.75%, 3.09%, and 2.82% on the six-task AID, seven-task RSI-CB256, and nine-task NWPU-45 datasets, respectively. At the same time, it reduces memory storage requirements by over 94.7%, highlighting its strong potential for real-world RSSC applications under strict memory constraints. Full article
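Storing compact per-class feature statistics instead of raw images, the core idea of feature replay, might look like the following sketch. The Gaussian memory here is an illustrative simplification, not the FR-CIL implementation (which stores and calibrates actual feature exemplars):

```python
import numpy as np

def store_class_statistics(features, labels):
    """Keep per-class feature mean and diagonal variance instead of raw
    images -- a compact exemplar memory in the spirit of feature replay."""
    memory = {}
    for c in np.unique(labels):
        fc = features[labels == c]
        memory[int(c)] = (fc.mean(axis=0), fc.var(axis=0) + 1e-6)
    return memory

def replay_features(memory, n_per_class, seed=0):
    """Draw pseudo-features for old classes from the stored Gaussians,
    to be mixed with new-class features when updating the classifier."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for c, (mu, var) in memory.items():
        xs.append(rng.normal(mu, np.sqrt(var), size=(n_per_class, mu.size)))
        ys.append(np.full(n_per_class, c))
    return np.concatenate(xs), np.concatenate(ys)
```

The memory cost is two vectors per class rather than hundreds of images, which is the source of the large storage reduction such methods report.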

47 pages, 6456 KB  
Article
A Disentangled Prototype-Driven Continual Learning Framework for Fault Diagnosis of Cotton Harvester Picking-Head Drivetrains Under Gradually Expanding Operating Conditions
by Huachao Jiao, Wenlei Sun, Hongwei Wang and Xiaojing Wan
Agriculture 2026, 16(5), 566; https://doi.org/10.3390/agriculture16050566 - 2 Mar 2026
Viewed by 312
Abstract
The picking-head drivetrain is a critical transmission component of cotton harvesters, and its fault condition monitoring and diagnosis are essential for ensuring stable and reliable operation of the equipment. In practical engineering applications, diagnostic models for picking-head drivetrains are typically initialized using data collected under a limited number of representative operating conditions. Although sufficient fault samples can often be obtained during the initial training stage, the coverage of operating conditions is inherently restricted. As the model is deployed and used in the field, fault samples collected under new operating conditions are gradually acquired in a stage-wise manner. How to stably update the diagnostic model while the operating-condition coverage continuously expands, and how to avoid performance degradation and catastrophic forgetting, remain critical challenges. To address these issues, this paper proposes a continual learning method, termed DP-CL (Disentangled Prototype-Driven Continual Learning), for fault diagnosis of cotton harvester picking-head drivetrains under gradually expanding operating conditions. The proposed method is built upon an explicit disentanglement of condition-invariant features and condition-specific features. Within a unified framework, three types of structured prototypes, including class prototypes, condition prototypes, and condition-aware class prototypes, are constructed to form a multi-level representation hierarchy. A prototype-driven structured update mechanism is then employed to impose stable constraints on fault-discriminative semantics across different operating conditions. In addition, an operating-condition similarity measurement based on condition-specific features is introduced, based on which a proportion-adaptive sample selection strategy is designed. This strategy enables controlled knowledge transfer and preservation of discriminative structures during multi-stage model updates. 
Experimental results obtained under a laboratory-constructed cumulative operating-condition expansion scenario demonstrate that the proposed method achieves superior performance in terms of overall performance retention, cross-stage stability, and resistance to performance degradation. Moreover, as the number of operating conditions increases, the proposed method maintains a relatively smooth performance variation trend, while preserving clear class structures and a controllable level of confusion. These results validate the effectiveness of the proposed approach for stable fault diagnosis under expanding operating-condition coverage. Full article
(This article belongs to the Section Agricultural Technology)

24 pages, 4218 KB  
Article
SD-IDD: Selective Distillation for Incremental Defect Detection
by Jing Li, Chenggang Dai, Xiaobin Wang and Chengjun Chen
Sensors 2026, 26(5), 1413; https://doi.org/10.3390/s26051413 - 24 Feb 2026
Viewed by 294
Abstract
Surface defects in industrial production are complex and diverse. Therefore, deep learning-based defect detection models must consistently adapt to newly emerging defect categories. The trained models generally suffer from catastrophic forgetting as they learn new defect categories. To address this issue, we propose a selective distillation for incremental defect detection (SD-IDD) model based on GFLv1. Specifically, three selective distillation strategies are proposed, including high-confidence classification distillation, dual-stage cascaded regression distillation, and Intersection over Union (IoU)-driven difficulty-aware feature distillation. The high-confidence classification distillation aims to preserve critical discriminative knowledge of old categories within semantic confusion regions of the classification head, reducing interference from low-value regions. Dual-stage cascaded regression distillation focuses on high-quality anchors through geometric prior coarse filtering and statistical fine filtering, utilizing IoU-weighted KL divergence distillation loss to accurately transfer localization knowledge. IoU-driven difficulty-aware feature distillation adaptively allocates distillation resources, prioritizing features of high-difficulty targets. These selective distillation strategies significantly mitigate catastrophic forgetting while enhancing the detection accuracy of new classes, without requiring access to old training samples. Experimental results demonstrate that SD-IDD achieves superior performance, with mAP_old of 58.2% and 99.3%, mAP_new of 69.0% and 97.3%, and mAP_all of 63.6% and 98.3% on the NEU-DET and DeepPCB datasets, respectively, surpassing existing incremental detection methods. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)
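The IoU-driven weighting that steers distillation toward hard targets rests on the standard Intersection-over-Union computation. A toy version follows; the `gamma` exponent is an assumed knob for illustration, not a parameter from the paper:

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def difficulty_weight(pred_box, gt_box, gamma=2.0):
    """Toy difficulty-aware weight: low IoU (a poorly localized, 'hard'
    target) yields a larger weight, so the distillation budget is spent
    where the student's localization is weakest."""
    return (1.0 - iou(pred_box, gt_box)) ** gamma
```

A perfectly localized box gets weight near 0, a completely missed one weight 1, mirroring the "prioritize high-difficulty targets" behavior described above.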

27 pages, 2216 KB  
Article
Exploring a New Architecture for Efficient Parameter Fine-Tuning in SLoRA Multitasking Scenarios
by Ce Shi and Jin-Woo Jung
Appl. Sci. 2026, 16(5), 2174; https://doi.org/10.3390/app16052174 - 24 Feb 2026
Viewed by 423
Abstract
We propose SLoRA, an enhanced LoRA (Low-Rank Adaptation) Mixture-of-Experts (MoE) architecture aimed at the key problem of efficient parameter fine-tuning in multitasking scenarios. Traditional full fine-tuning becomes increasingly costly as the parameter size of visual language models grows, and LoRA, a popular parameter-efficient fine-tuning (PEFT) method, shows limitations in multitasking, such as inadequate adaptability and difficulty in capturing complex task patterns; existing work on integrating MoE mechanisms into LoRA further faces catastrophic forgetting and knowledge fragmentation. SLoRA addresses these issues with orthogonal constraint optimization, which constrains the solution space at initialization to reduce disturbance to existing knowledge and alleviate catastrophic forgetting (old-task accuracy retention reaches 92.4%, 16.1% higher than LoRA), and with an optimized MoE structure comprising general experts (retaining pre-trained knowledge) and task-specific experts (adapted to tasks via dynamic routing) to enhance multitask adaptability. Experimental results show that in commonsense reasoning tasks, SLoRA’s accuracy is 9.0% higher than LoRA and 3.7% higher than AdaLoRA on the WSC dataset, and its F1 score is 7.7% higher than LoRA and 2.9% higher than AdaLoRA on the CommonsenseQA dataset; in multimodal tasks, its average score is up to 15.3% higher than LoRA, demonstrating significant advantages over existing methods. Full article
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
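For readers unfamiliar with LoRA itself: the low-rank update and its zero-initialized B matrix, which leaves the pre-trained model untouched at step 0 (the "no disturbance to existing knowledge" property that SLoRA's orthogonal constraints build on), can be sketched as follows. This is generic LoRA, not SLoRA's MoE routing:

```python
import numpy as np

def init_lora(d_out, d_in, r, seed=0):
    """Standard LoRA init: A random, B zero, so the adapted model starts
    exactly equal to the frozen pre-trained one."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((r, d_in)) / np.sqrt(r)
    B = np.zeros((d_out, r))
    return A, B

def lora_forward(x, W0, A, B, scale=1.0):
    """LoRA forward pass: the frozen weight W0 is augmented by the
    low-rank product B @ A; only A and B (rank r << min(d_in, d_out))
    are trained, keeping the update parameter-efficient."""
    return x @ (W0 + scale * (B @ A)).T
```

Because B starts at zero, the first forward pass reproduces the base model exactly, and the trainable update B @ A can never exceed rank r.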

25 pages, 2131 KB  
Article
Symmetry-Aware Continual Learning for Dynamic Dimensional Multivariate Time Series Forecasting: Integrating Redundancy Clustering and Multi-LoRA Adapters
by Liyang Qin, Xiaoli Wang and Yulong Wang
Symmetry 2026, 18(2), 363; https://doi.org/10.3390/sym18020363 - 15 Feb 2026
Viewed by 425
Abstract
Continual learning of multivariate time series (MTS) forecasting is critical for process industries where working condition drift is frequent due to the variation in the feed properties and other factors. However, existing continual learning methods struggle with dynamic input dimension changes, and the lack of symmetry-aware feature and dimension regulation further exacerbates the interference of irrelevant variables and dimensional inconsistency. To overcome this problem, G-MLoRA, a continual learning method based on dynamic redundancy clustering and multiple low-rank adapters, is proposed in this paper. This method can effectively enhance the network’s capability for prediction of multivariate time series under dynamic input dimensions. First, it groups MTS via Wasserstein distance-K-means clustering to reduce irrelevant variable interference. Second, each group is assigned to an exclusive LoRA adapter, with pre-trained backbone weights frozen during fine-tuning to lower complexity and mitigate catastrophic forgetting. Third, mini-batch gradient accumulation enables reuse of inconsistent-dimensional historical knowledge. Extensive experiments on two real grinding classification datasets show G-MLoRA outperforms baselines in new/historical knowledge compatibility, especially under dynamic dimensions. Full article
(This article belongs to the Section Computer)

19 pages, 3571 KB  
Article
Few-Shot Class-Incremental SAR Target Recognition Based on Dynamic Task-Adaptive Classifier
by Dan Li, Feng Zhao, Yong Li and Wei Cheng
Remote Sens. 2026, 18(3), 527; https://doi.org/10.3390/rs18030527 - 6 Feb 2026
Viewed by 530
Abstract
Current synthetic aperture radar automatic target recognition (SAR ATR) tasks face challenges including limited training samples and poor generalization capability to novel classes. To address these issues, few-shot class-incremental learning (FSCIL) has emerged as a promising research direction. Few-shot learning facilitates the expedited adaptation to novel tasks utilizing a limited number of labeled samples, whereas incremental learning concentrates on the continuous refinement of the model as new categories are incorporated without eradicating previously learned knowledge. Although both methodologies present potential resolutions to the challenges of sample scarcity and class evolution in SAR target recognition, they are not without their own set of difficulties. Fine-tuning with emerging classes can perturb the feature distribution of established classes, culminating in catastrophic forgetting, while training exclusively on a handful of new samples can induce bias towards older classes, leading to distribution collapse and overfitting. To surmount these limitations and satisfy practical application requirements, we propose a Few-Shot Class-Incremental SAR Target Recognition method based on a Dynamic Task-Adaptive Classifier (DTAC). This approach underscores task adaptability through a feature extraction module, a task information encoding module, and a classifier generation module. The feature extraction module discerns both target-specific and task-specific characteristics, while the task information encoding module modulates the network parameters of the classifier generation module based on pertinent task information, thereby improving adaptability. Our innovative classifier generation module, honed with task-specific insights, dynamically assembles classifiers tailored to the current task, effectively accommodating a variety of scenarios and novel class samples. 
Our extensive experiments on SAR datasets demonstrate that our proposed method generally outperforms the baselines in few-shot class incremental SAR target recognition. Full article

27 pages, 6439 KB  
Article
Contrastive–Transfer-Synergized Dual-Stream Transformer for Hyperspectral Anomaly Detection
by Lei Deng, Jiaju Ying, Qianghui Wang, Yue Cheng and Bing Zhou
Remote Sens. 2026, 18(3), 516; https://doi.org/10.3390/rs18030516 - 5 Feb 2026
Viewed by 589
Abstract
Hyperspectral anomaly detection (HAD) aims to identify pixels that significantly differ from the background without prior knowledge. While deep learning-based reconstruction methods have shown promise, they often suffer from limited feature representation, inefficient training cycles, and sensitivity to imbalanced data distributions. To address these challenges, this paper proposes a novel contrastive–transfer-synergized dual-stream transformer for hyperspectral anomaly detection (CTDST-HAD). The framework integrates contrastive learning and transfer learning within a dual-stream architecture, comprising a spatial stream and a spectral stream, which are pre-trained separately and synergistically fine-tuned. Specifically, the spatial stream leverages general visual and hyperspectral-view datasets with adaptive elastic weight consolidation (EWC) to mitigate catastrophic forgetting. The spectral stream employs a variational autoencoder (VAE) enhanced with the RossThick–LiSparseR (R-L) physical-kernel-driven model for spectrally realistic data augmentation. During fine-tuning, spatial and spectral features are fused for pixel-level anomaly detection, with focal loss addressing class imbalance. Extensive experiments on nine real hyperspectral datasets demonstrate that CTDST-HAD outperforms state-of-the-art methods in detection accuracy and efficiency, particularly in complex backgrounds, while maintaining competitive inference speed. Full article
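The elastic weight consolidation (EWC) penalty used by the spatial stream to mitigate forgetting is a standard quadratic anchor; below is a minimal sketch with a diagonal Fisher approximation (`lam` is an assumed hyperparameter, and the paper's "adaptive" variant would adjust it dynamically):

```python
import numpy as np

def ewc_penalty(theta, theta_old, fisher, lam=1.0):
    """EWC: quadratic penalty anchoring each parameter to its old-task
    value, weighted by its (diagonal) Fisher information -- parameters
    important to the old task are allowed to move less."""
    return 0.5 * lam * np.sum(fisher * (theta - theta_old) ** 2)

def ewc_grad(theta, theta_old, fisher, lam=1.0):
    """Gradient of the penalty, added to the new-task gradient during
    fine-tuning."""
    return lam * fisher * (theta - theta_old)
```

The penalty is zero when parameters stay at their old-task values and grows fastest along high-Fisher directions, which is what protects previously learned spatial features during transfer.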

20 pages, 6530 KB  
Article
Multi-Center Prototype Feature Distribution Reconstruction for Class-Incremental SAR Target Recognition
by Ke Zhang, Bin Wu, Peng Li, Zhi Kang and Lin Zhang
Sensors 2026, 26(3), 979; https://doi.org/10.3390/s26030979 - 3 Feb 2026
Viewed by 354
Abstract
In practical applications of deep learning-based Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) systems, new target categories emerge continuously. This requires the systems to learn incrementally—acquiring new knowledge while retaining previously learned information. To mitigate catastrophic forgetting in Class-Incremental Learning (CIL), this paper proposes a CIL method for SAR ATR named Multi-center Prototype Feature Distribution Reconstruction (MPFR). It has two core components. First, a Multi-scale Hybrid Attention feature extractor is designed. Trained via a feature space optimization strategy, it fuses and extracts discriminative features from both SAR amplitude images and Attribute Scattering Center data, while preserving feature space capacity for new classes. Second, each class is represented by multiple prototypes to capture complex feature distributions. Old class knowledge is retained by modeling their feature distributions through parameterized Gaussian diffusion, alleviating feature confusion in incremental phases. Experiments on public SAR datasets show MPFR achieves superior performance compared to existing approaches, including recent SAR-specific CIL methods. Ablation studies validate each component’s contribution, confirming MPFR’s effectiveness in addressing CIL for SAR ATR without storing historical raw data. Full article
(This article belongs to the Section Radar Sensors)

27 pages, 9162 KB  
Article
Multi-Domain Incremental Learning for Semantic Segmentation via Visual Domain Prompt in Remote Sensing Data
by Junxi Li, Zhiyuan Yan, Wenhui Diao, Yidan Zhang, Zicong Zhu, Yichen Tian and Xian Sun
Remote Sens. 2026, 18(3), 464; https://doi.org/10.3390/rs18030464 - 1 Feb 2026
Viewed by 693
Abstract
Domain incremental learning for semantic segmentation has attracted considerable attention due to its importance for many fields, including urban planning and autonomous driving. The catastrophic forgetting problem caused by domain shift has been alleviated by structure expansion of the model or data rehearsal. However, these methods ignore similar contextual knowledge between the new and the old data domains and assume that new knowledge and old knowledge are completely mutually exclusive, which causes the model to be trained in a suboptimal direction. Motivated by prompt learning, we propose a new domain incremental learning framework named RS-VDP. The key innovation of RS-VDP is to utilize a visual domain prompt to change the optimization direction from both the input data space and the feature space. First, we design a domain prompt based on a dynamic location module, which applies a visual domain prompt according to a local entropy map to update the distribution of the input images. Second, in order to retain only high-confidence feature vectors, a representation feature alignment module based on an entropy map is proposed. This module ensures the accuracy and stability of the feature vectors involved in the regularization loss, alleviating the problem of semantic drift. Finally, we introduce a new evaluation metric to measure the overall performance of incremental learning models, solving the problem that the traditional evaluation metric is affected by single-task accuracy. Comprehensive experiments demonstrated the effectiveness of the proposed method by significantly reducing the degree of catastrophic forgetting. Full article
27 pages, 4422 KB  
Article
LaGu-RCL: Language-Guided Resolution-Continual Learning for Semantic Segmentation of Remote Sensing Images
by Penglong Li, Zezhong Ma, Haifeng Li and Zhenyang Huang
Remote Sens. 2026, 18(3), 452; https://doi.org/10.3390/rs18030452 - 1 Feb 2026
Viewed by 392
Abstract
Remote sensing image semantic segmentation faces substantial challenges in training and transferring models across images with varying resolutions. This issue can be effectively mitigated by continuously learning knowledge derived from new resolutions; however, this learning process is severely plagued by catastrophic forgetting. To [...] Read more.
Remote sensing image semantic segmentation faces substantial challenges in training and transferring models across images with varying resolutions. This issue can be effectively mitigated by continuously learning knowledge derived from new resolutions; however, this learning process is severely plagued by catastrophic forgetting. To address this problem, this paper proposes a novel continual learning framework termed Language-Guided Resolution-Continual Learning (i.e., LaGu-RCL), which alleviates catastrophic forgetting through two complementary strategies. On the one hand, a multi-resolution image augmentation pipeline is introduced to synthesize higher- and lower-resolution variants for each training batch, allowing the model to learn from images of diverse resolutions at every training step. On the other hand, a language-guided learning strategy is proposed to aggregate features of the same resolution while separating those of different resolutions. This ensures that the knowledge acquired from previously learned resolutions is not disrupted by that from unseen resolutions, thereby mitigating catastrophic forgetting. To validate the effectiveness of the proposed approach, we construct MR-ExcavSeg, a multi-resolution dataset covering several counties in Chongqing, and conduct comparative experiments between LaGu-RCL and several state-of-the-art continual learning baselines. Experimental results demonstrate that LaGu-RCL achieves significantly superior segmentation performance and continual learning capability compared with these baselines, confirming its effectiveness. Full article
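The multi-resolution augmentation pipeline described above can be sketched as follows: each training sample is rescaled to several resolutions, with image and segmentation label resized together. This is an illustrative sketch, not the paper's code; the scale factors and the nearest-neighbor resampling (chosen so label maps keep discrete class IDs) are assumptions.

```python
import numpy as np

def resize_nearest(img, scale):
    """Nearest-neighbor resize of a (C, H, W) array by a scale factor."""
    C, H, W = img.shape
    nh, nw = max(1, int(H * scale)), max(1, int(W * scale))
    rows = (np.arange(nh) / scale).astype(int).clip(0, H - 1)
    cols = (np.arange(nw) / scale).astype(int).clip(0, W - 1)
    return img[:, rows][:, :, cols]

def multi_resolution_batch(img, label, scales=(0.5, 1.0, 2.0)):
    """Synthesize lower- and higher-resolution variants of one sample,
    resizing the (C, H, W) image and (H, W) label map together."""
    return [(resize_nearest(img, s), resize_nearest(label[None], s)[0])
            for s in scales]
```

Each training step can then draw from all variants, so the model sees diverse resolutions continuously rather than encountering a new resolution only in a later task.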
(This article belongs to the Section AI Remote Sensing)
17 pages, 797 KB  
Article
Continued Electromagnetic Signal Classification Based on Vector Space Separation
by Lu Jia, Yan Zhao, Shichuan Chen and Zhijin Zhao
Electronics 2026, 15(3), 613; https://doi.org/10.3390/electronics15030613 - 30 Jan 2026
Viewed by 315
Abstract
Incremental electromagnetic signal classification is crucial in realistic wireless environments where new signal types continuously emerge and historical training data are often unavailable. This paper proposes a model-based incremental learning method driven by vector space separation to mitigate catastrophic forgetting without accessing old-task [...] Read more.
Incremental electromagnetic signal classification is crucial in realistic wireless environments where new signal types continuously emerge and historical training data are often unavailable. This paper proposes a model-based incremental learning method driven by vector space separation to mitigate catastrophic forgetting without accessing old-task samples or requiring semantic information. We show that forgetting is largely caused by insufficient separation between old and new classes in the classifier weight space. To address this issue, we jointly introduce weight normalization, a cosine-similarity separation loss, and regularization, together with cross-entropy supervision for new classes. Based on these designs, the resulting method enables the model to continually recognize modulation signals without requiring semantic information or access to raw data from previous tasks during incremental updates. Experiments on two simulated modulation datasets under multiple task sequences demonstrate that the proposed method consistently alleviates catastrophic forgetting and achieves stable incremental performance, outperforming baselines while avoiding data rehearsal. Full article
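A minimal sketch of the cosine-similarity separation idea described above: classifier weight rows are L2-normalized (the weight-normalization step) so that overlap between old-class and new-class weights is purely angular, and positive cosine overlap is penalized. This is an illustrative assumption of the loss shape; the paper's exact formulation, clamping, and reduction may differ.

```python
import numpy as np

def separation_loss(W_old, W_new):
    """Penalize angular overlap between frozen old-class classifier weights
    W_old (n_old, d) and trainable new-class weights W_new (n_new, d).
    Rows are L2-normalized first, so separation depends only on direction."""
    def l2norm(W):
        return W / np.linalg.norm(W, axis=1, keepdims=True)
    sims = l2norm(W_new) @ l2norm(W_old).T   # (n_new, n_old) cosines
    return np.maximum(sims, 0.0).mean()      # only positive overlap is penalized
```

Minimizing this term alongside cross-entropy on the new classes pushes new-class weight vectors away from the directions occupied by old classes, which is the separation in weight space that the abstract identifies as the key to reducing forgetting.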
(This article belongs to the Section Circuit and Signal Processing)