Search Results (873)

Search Parameters:
Keywords = pre-training–fine-tuning

26 pages, 2618 KB  
Article
A Cascaded Batch Bayesian Yield Optimization Method for Analog Circuits via Deep Transfer Learning
by Ziqi Wang, Kaisheng Sun and Xiao Shi
Electronics 2026, 15(3), 516; https://doi.org/10.3390/electronics15030516 - 25 Jan 2026
Viewed by 55
Abstract
In nanometer integrated-circuit (IC) manufacturing, advanced technology scaling has intensified the effects of process variations on circuit reliability and performance. Random fluctuations in parameters such as threshold voltage, channel length, and oxide thickness further degrade design margins and increase the likelihood of functional failures. These variations often lead to rare circuit failure events, underscoring the importance of accurate yield estimation and robust design methodologies. Conventional Monte Carlo yield estimation is computationally infeasible, as millions of simulations are required to capture failure events with extremely low probability. This paper presents a novel reliability-based circuit design optimization framework that leverages deep transfer learning to improve the efficiency of repeated yield analysis in optimization iterations. Based on pre-trained neural network models from prior design knowledge, we utilize model fine-tuning to accelerate importance sampling (IS) for yield estimation. To improve estimation accuracy, adversarial perturbations are introduced to calibrate uncertainty near the model decision boundary. Moreover, we propose a cascaded batch Bayesian optimization (CBBO) framework that incorporates a smart initialization strategy and a localized penalty mechanism, guiding the search process toward high-yield regions while satisfying nominal performance constraints. Experimental validation on SRAM circuits and amplifiers reveals that CBBO achieves a computational speedup of 2.02×–4.63× over state-of-the-art (SOTA) methods, without compromising accuracy or robustness. Full article
(This article belongs to the Topic Advanced Integrated Circuit Design and Application)
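The core idea behind accelerating yield estimation with importance sampling can be sketched generically: sample from a proposal distribution shifted toward the failure region and reweight by the likelihood ratio. This is an illustrative toy for a one-dimensional Gaussian tail probability, not the paper's method; the function name and the choice of proposal are assumptions.

```python
import math
import random

def is_tail_prob(threshold, n=100_000, seed=0):
    """Estimate the rare-event probability P(Z > threshold), Z ~ N(0, 1),
    via importance sampling: draw from a proposal N(threshold, 1) centred
    on the failure region and reweight each hit by the density ratio."""
    rng = random.Random(seed)
    mu = threshold  # proposal mean shifted onto the failure boundary
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, 1.0)
        if x > threshold:
            # N(0,1)/N(mu,1) density ratio = exp(-mu*x + mu^2/2)
            total += math.exp(-mu * x + mu * mu / 2.0)
    return total / n
```

With the proposal centred on the failure boundary, 100,000 samples give a few-percent-accurate estimate of a ~3×10⁻⁵ probability that plain Monte Carlo would need tens of millions of runs to resolve.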

18 pages, 321 KB  
Article
Instruction-Tuned Decoder-Only Large Language Models for Efficient Extreme Summarization on Consumer-Grade GPUs
by Attia Fathalla Elatiky, Ahmed M. Hamad, Heba Khaled and Mahmoud Fayez
Algorithms 2026, 19(2), 96; https://doi.org/10.3390/a19020096 - 25 Jan 2026
Viewed by 43
Abstract
Extreme summarization generates very short summaries, typically a single sentence, answering the question “What is the document about?”. Although large language models perform well in text generation, fine-tuning them for summarization often requires substantial computational resources that are unavailable to many researchers. In this study, we present an effective method for instruction-tuning open decoder-only large language models under limited GPU resources. The proposed approach combines parameter-efficient fine-tuning techniques, such as Low-Rank Adaptation (LoRA), with quantization to reduce memory requirements, enabling training on a single consumer-grade GPU. We fine-tuned a pre-trained decoder-only model on the XSum dataset using an instruction-following format. Experimental results demonstrate that the proposed decoder-only approach achieves competitive performance on the XSum dataset under strict GPU memory constraints. On the full test set, the proposed 2G–1R pipeline attains ROUGE-1/2/L F1 scores of 46.0/22.0/37.0 and a BERTScore F1 of 0.917, outperforming the individual generator models in lexical overlap and semantic similarity. Evaluation was conducted using traditional overlap-based metrics (ROUGE) and semantic metrics, including BERTScore and G-Eval. While remaining competitive in ROUGE compared to strong encoder–decoder baselines, the pipeline consistently produces summaries with higher semantic quality. These findings demonstrate that large decoder-only language models can be efficiently fine-tuned for extreme summarization on limited consumer-grade hardware without sacrificing output quality. Full article
(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)
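The parameter saving behind LoRA-style adaptation is easy to see in miniature: the frozen weight matrix W is augmented with a trainable low-rank product B·A. A minimal stdlib sketch (class name and initialisation scale are our assumptions, not the paper's code):

```python
import random

class LoRALinear:
    """Frozen weight W plus a low-rank update: y = (W + alpha * B @ A) x.

    Only A (r x d_in) and B (d_out x r) would be trained, so trainable
    parameters drop from d_out*d_in to r*(d_in + d_out) for small rank r."""

    def __init__(self, W, r, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W, self.alpha = W, alpha
        # A gets a small random init, B is zero, so training starts at y = Wx
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]

    def forward(self, x):
        Ax = [sum(a * xj for a, xj in zip(row, x)) for row in self.A]
        out = []
        for Wi, Bi in zip(self.W, self.B):
            base = sum(w * xj for w, xj in zip(Wi, x))
            delta = sum(b * ax for b, ax in zip(Bi, Ax))
            out.append(base + self.alpha * delta)
        return out
```

Because B starts at zero, the adapted layer initially reproduces the pre-trained layer exactly; quantizing the frozen W (as in QLoRA-style setups) shrinks memory further without touching the trainable path.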

22 pages, 1462 KB  
Article
Effects of Window and Batch Size on Autoencoder-LSTM Models for Remaining Useful Life Prediction
by Eugene Jeon, Donghwan Jin and Yeonhee Kim
Machines 2026, 14(2), 135; https://doi.org/10.3390/machines14020135 - 23 Jan 2026
Viewed by 84
Abstract
Remaining useful life (RUL) prediction is central to predictive maintenance, but acquiring sufficient run-to-failure data remains challenging. To better exploit limited labeled data, this study investigates a pipeline combining an unsupervised autoencoder (AE) and supervised LSTM regression on the NASA C-MAPSS dataset. Building on an AE-LSTM baseline, we analyze how window size and batch size affect accuracy and training efficiency. Using the FD001 and FD004 subsets with training-capped RUL labels, we perform multi-seed experiments over a wide grid of window lengths and batch sizes. The AE is pre-trained on normalized sensor streams and reused as a feature extractor, while the LSTM head is trained with early stopping. Performance was assessed using RMSE, C-MAPSS score, and training time, reporting 95% confidence intervals. Results show that fine-tuning the encoder with a batch size of 128 yielded the best mean RMSE of 13.99 (FD001) and 28.67 (FD004). We obtained stable optimal window ranges (40–70 for FD001; 60–80 for FD004) and found that batch sizes of 64–256 offer the best accuracy–efficiency trade-off. These optimal ranges were further validated using Particle Swarm Optimization (PSO). These findings offer practical recommendations for tuning AE-LSTM-based RUL prediction models and demonstrate that performance remains stable within specific hyperparameter ranges. Full article
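The window-size hyperparameter studied above enters at data-preparation time, when each run-to-failure series is sliced into fixed-length windows with capped RUL labels. A minimal sketch (function name and default cap are illustrative assumptions):

```python
def windowed_rul(series, window, cap=125):
    """Slice one run-to-failure sensor series into (window, label) pairs,
    using the piecewise-linear (capped) RUL labels common for C-MAPSS:
    early-life RUL is clipped to `cap` before degradation onset."""
    n = len(series)
    pairs = []
    for end in range(window, n + 1):
        rul = n - end  # cycles remaining after the last step in the window
        pairs.append((series[end - window:end], min(rul, cap)))
    return pairs
```

A longer `window` gives the LSTM more temporal context per sample but yields fewer training pairs per engine, which is exactly the accuracy–efficiency trade-off the grid search probes.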
17 pages, 5486 KB  
Article
Enhancing Parameter-Efficient Code Representations with Retrieval and Structural Priors
by Shihao Zheng, Yong Li and Xiang Ma
Appl. Sci. 2026, 16(2), 1106; https://doi.org/10.3390/app16021106 - 21 Jan 2026
Viewed by 71
Abstract
High-quality code representations are fundamental to code intelligence. Achieving such representations with parameter-efficient fine-tuning (PEFT) remains a key challenge. While code pre-trained models (CodePTMs) offer a robust foundation for general-purpose embeddings, current PEFT approaches face two main obstacles when adapting them: (i) they fail to adequately capture the deep structural characteristics of programs, and (ii) they are limited by the model’s finite internal parameters, restricting their ability to overcome inherent knowledge bottlenecks. To address these challenges, we introduce a parameter-efficient code representation learning framework that combines retrieval augmentation with structure-aware priors. Our framework features three complementary, lightweight modules: first, a structure–semantic dual-channel retrieval mechanism that infuses high-quality external code knowledge as non-parametric memory to alleviate the knowledge bottleneck; second, a graph relative bias module that strengthens the attention mechanism’s capacity to model structural relationships within programs; and third, a span-discriminative contrastive objective that sharpens the distinctiveness and boundary clarity of span-level representations. Extensive experiments on three benchmarks spanning six programming languages show that our method consistently outperforms state-of-the-art parameter-efficient baselines. Notably, on structure-sensitive tasks using the PLBART backbone, RS-Rep surpasses full fine-tuning, delivering a 22.1% improvement in Exact Match for code generation and a 4.4% increase in BLEU scores for code refinement, all while utilizing only about 5% of the trainable parameters. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
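A graph relative bias of the kind described can be illustrated as an additive penalty on raw attention scores before the softmax, proportional to pairwise graph (e.g. AST) distance. This is a generic sketch under our own naming, not the paper's module:

```python
import math

def attention_with_graph_bias(scores, dist, slope=-1.0):
    """Softmax attention with an additive structural prior: each raw score
    is penalised in proportion to the pairwise graph distance, so
    structurally close tokens receive more attention mass."""
    out = []
    for score_row, dist_row in zip(scores, dist):
        biased = [s + slope * d for s, d in zip(score_row, dist_row)]
        m = max(biased)  # subtract the max for numerical stability
        exps = [math.exp(b - m) for b in biased]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out
```

Since the bias is added pre-softmax, it reshapes the attention distribution without adding any per-head trainable parameters beyond the (scalar or learned) slope, which keeps the approach parameter-efficient.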

21 pages, 8669 KB  
Article
LLM4FB: A One-Sided CSI Feedback and Prediction Framework for Lightweight UEs via Large Language Models
by Xinxin Xie, Xinyu Ning, Yitong Liu, Hanning Wang, Jing Jin and Hongwen Yang
Sensors 2026, 26(2), 691; https://doi.org/10.3390/s26020691 - 20 Jan 2026
Viewed by 118
Abstract
Massive MIMO systems can substantially enhance spectral efficiency, but such gains rely on the availability of accurate channel state information (CSI). However, the increase in the number of antennas leads to a significant growth in feedback overhead, while conventional deep-learning-based CSI feedback methods also impose a substantial computational burden on the user equipment (UE). To address these challenges, this paper proposes LLM4FB, a one-sided CSI feedback framework that leverages a pre-trained large language model (LLM). In this framework, the UE performs only low-complexity linear projections to compress CSI. In contrast, the base station (BS) leverages a pre-trained LLM to accurately reconstruct and predict CSI. By utilizing the powerful modeling capabilities of the pre-trained LLM, only a small portion of the parameters needs to be fine-tuned to improve CSI recovery accuracy with low training cost. Furthermore, a multiobjective loss function is designed to simultaneously optimize normalized mean square error (NMSE) and spectral efficiency (SE). Simulation results show that LLM4FB outperforms existing methods across various compression ratios and mobility levels, achieving high-precision CSI feedback with minimal computational demands on terminal devices. Therefore, LLM4FB presents a highly promising solution for next-generation wireless sensor networks and industrial IoT applications, where terminal devices are often strictly constrained by energy and hardware resources. Full article
(This article belongs to the Section Communications)
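The "one-sided" aspect — the UE performing only a linear projection — can be sketched as a single matrix-vector product; everything else (reconstruction, prediction) happens at the base station. The Gaussian projection matrix and function name below are illustrative assumptions:

```python
import random

def compress_csi(h, ratio, seed=0):
    """UE-side compression as one fixed linear projection y = A h: the only
    UE computation is an (r x N) matrix-vector product with r ~ N * ratio.
    Reconstruction from y is left entirely to the base station."""
    rng = random.Random(seed)
    n = len(h)
    r = max(1, round(n * ratio))
    A = [[rng.gauss(0.0, 1.0 / n ** 0.5) for _ in range(n)] for _ in range(r)]
    return [sum(a * hj for a, hj in zip(row, h)) for row in A]
```

Because the projection is fixed and shared with the BS, the UE carries no neural network at all, which is what makes the scheme attractive for lightweight terminals.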

22 pages, 13507 KB  
Article
Integrating AI for In-Depth Segmentation of Coastal Environments in Remote Sensing Imagery
by Pelagia Drakopoulou, Paraskevi Tzouveli, Aikaterini Karditsa and Serafim Poulos
Remote Sens. 2026, 18(2), 325; https://doi.org/10.3390/rs18020325 - 19 Jan 2026
Viewed by 133
Abstract
Mapping coastal landforms is critical for the sustainable management of ecosystems influenced by both natural dynamics and human activity. This study investigates the application of Transformer-based semantic segmentation models for pixel-level classification of key surface types such as water, sandy shores, rocky areas, vegetation, and built structures. We utilize a diverse, multi-resolution dataset that includes NAIP (1 m), Quadrangle (6 m), Sentinel-2 (10 m), and Landsat-8 (15 m) imagery from U.S. coastlines, along with high-resolution aerial images of the Greek coastline provided by the Hellenic Land Registry. Due to the lack of labeled Greek data, models were pre-trained on U.S. datasets and fine-tuned using a manually annotated subset of Greek images. We evaluate the performance of three advanced Transformer architectures, with Mask2Former achieving the most robust results, further improved through a coastal-class weighted focal loss to enhance boundary precision. The findings demonstrate that Transformer-based models offer an effective, scalable, and cost-efficient solution for automated coastal monitoring. This work highlights the potential of AI-driven remote sensing to replace or complement traditional in-situ surveys, and lays the foundation for future research in multimodal data integration and regional adaptation for environmental analysis. Full article
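A class-weighted focal loss of the kind mentioned has a compact per-pixel form; the sketch below uses illustrative weights, not the paper's values:

```python
import math

def weighted_focal_loss(probs, target, weights, gamma=2.0):
    """Class-weighted focal loss for one pixel:
    loss = -w[target] * (1 - p_target)^gamma * log(p_target).
    Confidently-correct (easy) pixels are down-weighted by the focal term,
    while per-class weights up-weight rare coastal classes."""
    p = probs[target]
    return -weights[target] * (1.0 - p) ** gamma * math.log(p)
```

The focal exponent gamma shrinks the loss of well-classified pixels toward zero, so gradient signal concentrates on hard boundary pixels — the stated goal of improving boundary precision.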

25 pages, 19621 KB  
Article
Scrap-SAM-CLIP: Assembling Foundation Models for Typical Shape Recognition in Scrap Classification and Rating
by Guangda Bao, Wenzhi Xia, Haichuan Wang, Zhiyou Liao, Ting Wu and Yun Zhou
Sensors 2026, 26(2), 656; https://doi.org/10.3390/s26020656 - 18 Jan 2026
Viewed by 293
Abstract
To address the limitation of 2D methods in inferring absolute scrap dimensions from images, we propose Scrap-SAM-CLIP (SSC), a vision-language model integrating the segment anything model (SAM) and contrastive language-image pre-training in Chinese (CN-CLIP). The model enables identification of canonical scrap shapes, establishing a foundational framework for subsequent 3D reconstruction and dimensional extraction within the 3D recognition pipeline. Individual modules of SSC are fine-tuned on the self-constructed scrap dataset. For segmentation, the combined box-and-point prompt yields optimal performance among various prompting strategies. MobileSAM and SAM-HQ-Tiny serve as effective lightweight alternatives for edge deployment. Fine-tuning the SAM decoder significantly enhances robustness under noisy prompts, improving accuracy by at least 5.55% with a five-positive-points prompt and up to 15.00% with a five-positive-points-and-five-negative-points prompt. In classification, SSC achieves 95.3% accuracy, outperforming Swin Transformer V2_base by 2.9%, with t-SNE visualizations confirming superior feature learning capability. The performance advantages of SSC stem from its modular assembly strategy, enabling component-specific optimization through subtask decoupling and enhancing system interpretability. This work refines the scrap 3D identification pipeline and demonstrates the efficacy of adapted foundation models in industrial vision systems. Full article
(This article belongs to the Section Intelligent Sensors)

20 pages, 31235 KB  
Article
Muscle Fatigue Assessment in Healthcare Application by Using Surface Electromyography: A Transfer Learning Approach
by Andrea Manni, Gabriele Rescio, Andrea Caroppo and Alessandro Leone
Sensors 2026, 26(2), 654; https://doi.org/10.3390/s26020654 - 18 Jan 2026
Viewed by 221
Abstract
Monitoring muscle fatigue is essential to ensure safety and support activity in populations such as the elderly. This study introduces a novel deep learning framework for classifying muscle fatigue levels using data from wireless surface electromyographic sensors, with the long-term goal of supporting applications in Ambient Assisted Living. A new dataset was collected from healthy elderly and non-elderly adults performing dynamic tasks under controlled conditions, with muscle fatigue levels labelled through self-assessment. The proposed method employs a pipeline that transforms one-dimensional electromyographic signals into two-dimensional time–frequency images (scalograms) using the Continuous Wavelet Transform; these are then classified by a fine-tuned Convolutional Neural Network pre-trained on large-scale image datasets. The classification pipeline includes an initial binary discrimination between non-fatigued and fatigued conditions, followed by a refined three-level classification into No Fatigue, Moderate Fatigue, and Hard Fatigue. The system achieved an accuracy of 98.6% in the binary task and 95.6% in the multiclass setting. This integrated transfer learning pipeline outperformed traditional Machine Learning methods based on manually extracted features, which reached a maximum of 92% accuracy. These findings highlight the robustness and generalizability of the proposed approach, supporting its potential as a real-time, non-invasive muscle fatigue monitoring solution tailored to Ambient Assisted Living scenarios. Full article
(This article belongs to the Special Issue Feature Papers in Electronic Sensors 2025)
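The signal-to-scalogram step can be illustrated with a naive real-Morlet continuous wavelet transform; this brute-force O(n²) sketch (function name, ω₀ default, and normalisation are our assumptions) is for intuition only, not a production CWT:

```python
import math

def morlet_scalogram(signal, scales, w0=6.0):
    """Naive continuous wavelet transform with a real Morlet wavelet.
    Returns a (scales x time) magnitude array: the 'scalogram' image that
    would be handed to an ImageNet-pre-trained CNN for classification."""
    n = len(signal)
    rows = []
    for s in scales:
        row = []
        for tau in range(n):
            acc = sum(signal[t] * math.exp(-((t - tau) / s) ** 2 / 2.0)
                      * math.cos(w0 * (t - tau) / s) for t in range(n))
            row.append(abs(acc) / math.sqrt(s))
        rows.append(row)
    return rows
```

Each row localises one frequency band over time, which is why a 2-D image classifier trained on natural images can transfer to these time–frequency maps.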

18 pages, 2581 KB  
Article
Enhancing Approaches to Detect Papilloma-Associated Hyperostosis Using a Few-Shot Transfer Learning Framework in Extremely Scarce Radiological Datasets
by Pham Huu Duy, Nguyen Minh Trieu and Nguyen Truong Thinh
Diagnostics 2026, 16(2), 311; https://doi.org/10.3390/diagnostics16020311 - 18 Jan 2026
Viewed by 145
Abstract
Background/Objectives: The application of deep learning models for rare diseases faces significant difficulties due to severe data scarcity. The detection of papilloma-associated hyperostosis (PAH) is a crucial radiological sign for the surgical planning of sinonasal inverted papilloma, yet data is often limited. This study introduces and validates a robust methodological framework for building clinically meaningful deep learning models under extremely limited data conditions (n = 20). Methods: We propose a few-shot learning framework based on the nnU-Net architecture, which integrates an in-domain transfer learning strategy (fine-tuning a pre-trained skull segmentation model) to address data scarcity. To further enhance robustness, a specialized data augmentation technique called “window shifting” is introduced to simulate inter-scanner variability. The entire framework was evaluated using a rigorous 5-fold cross-validation strategy. Results: Our proposed framework achieved a stable mean Dice Similarity Coefficient (DSC) of 0.48 ± 0.06. This performance significantly outperformed a baseline model trained from scratch, which failed to converge and yielded a clinically insignificant mean DSC of 0.09 ± 0.02. Conclusions: The analysis demonstrates that this methodological approach effectively overcomes instability and overfitting, generating reproducible and valuable predictions suitable for rare data types where large-scale data collection is not feasible. Full article
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)
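A "window shifting" augmentation plausibly amounts to re-windowing CT intensities with a perturbed window centre before rescaling; the sketch below is our reading of the idea, with hypothetical parameter names, not the paper's implementation:

```python
def window_shift(hu_values, center, width, shift=0.0):
    """CT intensity windowing with a perturbed window centre: `shift` would
    be drawn randomly during training to mimic inter-scanner variability.
    Values are clipped to the window and rescaled to [0, 1]."""
    lo = center + shift - width / 2.0
    hi = center + shift + width / 2.0
    return [min(max((v - lo) / (hi - lo), 0.0), 1.0) for v in hu_values]
```

Randomising `shift` per training case exposes the network to the same anatomy under slightly different intensity mappings, which is the stated purpose of the augmentation.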

22 pages, 6241 KB  
Article
Using Large Language Models to Detect and Debunk Climate Change Misinformation
by Zeinab Shahbazi and Sara Behnamian
Big Data Cogn. Comput. 2026, 10(1), 34; https://doi.org/10.3390/bdcc10010034 - 17 Jan 2026
Viewed by 300
Abstract
The rapid spread of climate change misinformation across digital platforms undermines scientific literacy, public trust, and evidence-based policy action. Advances in Natural Language Processing (NLP) and Large Language Models (LLMs) create new opportunities for automating the detection and correction of misleading climate-related narratives. This study presents a multi-stage system that employs state-of-the-art large language models such as Generative Pre-trained Transformer 4 (GPT-4), Large Language Model Meta AI (LLaMA) version 3 (LLaMA-3), and RoBERTa-large (Robustly optimized BERT pretraining approach large) to identify, classify, and generate scientifically grounded corrections for climate misinformation. The system integrates several complementary techniques, including transformer-based text classification, semantic similarity scoring using Sentence-BERT, stance detection, and retrieval-augmented generation (RAG) for evidence-grounded debunking. Misinformation instances are detected through a fine-tuned RoBERTa–Multi-Genre Natural Language Inference (MNLI) classifier (RoBERTa-MNLI), grouped using BERTopic, and verified against curated climate-science knowledge sources using BM25 and dense retrieval via FAISS (Facebook AI Similarity Search). The debunking component employs RAG-enhanced GPT-4 to produce accurate and persuasive counter-messages aligned with authoritative scientific reports such as those from the Intergovernmental Panel on Climate Change (IPCC). A diverse dataset of climate misinformation categories covering denialism, cherry-picking of data, false causation narratives, and misleading comparisons is compiled for evaluation. Benchmarking experiments demonstrate that LLM-based models substantially outperform traditional machine-learning baselines such as Support Vector Machines, Logistic Regression, and Random Forests in precision, contextual understanding, and robustness to linguistic variation. 
Expert assessment further shows that generated debunking messages exhibit higher clarity, scientific accuracy, and persuasive effectiveness compared to conventional fact-checking text. These results highlight the potential of advanced LLM-driven pipelines to provide scalable, real-time mitigation of climate misinformation while offering guidelines for responsible deployment of AI-assisted debunking systems. Full article
(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)
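The BM25 sparse retrieval stage mentioned alongside dense FAISS retrieval has a standard closed form; a minimal stdlib sketch over tokenised documents (the toy corpus below is ours):

```python
import math

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score tokenised documents against a query with BM25: a saturating
    term-frequency weight times an inverse-document-frequency weight,
    normalised by document length relative to the corpus average."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    scores = []
    for doc in docs:
        s = 0.0
        for term in query:
            df = sum(1 for d in docs if term in d)
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            tf = doc.count(term)
            s += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(s)
    return scores
```

In a RAG pipeline like the one described, the top-scoring evidence passages are then passed to the generator as grounding context for the debunking message.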

23 pages, 1503 KB  
Article
Hallucination-Aware Interpretable Sentiment Analysis Model: A Grounded Approach to Reliable Social Media Content Classification
by Abdul Rahaman Wahab Sait and Yazeed Alkhurayyif
Electronics 2026, 15(2), 409; https://doi.org/10.3390/electronics15020409 - 16 Jan 2026
Viewed by 180
Abstract
Sentiment analysis (SA) has become an essential tool for analyzing social media content in order to monitor public opinion and support digital analytics. Although transformer-based SA models exhibit remarkable performance, they lack mechanisms to mitigate hallucinated sentiment, which refers to the generation of unsupported or overconfident predictions without explicit linguistic evidence. To address this limitation, this study presents a hallucination-aware SA model by incorporating semantic grounding, interpretability-congruent supervision, and neuro-symbolic reasoning within a unified architecture. The proposed model is based on a fine-tuned Open Pre-trained Transformer (OPT) model, using three fundamental mechanisms: a Sentiment Integrity Filter (SIF), a SHapley Additive exPlanations (SHAP)-guided regularization technique, and a confidence-based lexicon-deep fusion module. The experimental analysis was conducted on two multi-class sentiment datasets that contain Twitter (now X) and Reddit posts. In Dataset 1, the suggested model achieved an average accuracy of 97.6% and a hallucination rate of 2.3%, outperforming the current transformer-based and hybrid sentiment models. With Dataset 2, the framework demonstrated strong external generalization with an accuracy of 95.8%, and a hallucination rate of 3.4%, which is significantly lower than state-of-the-art methods. These findings indicate that it is possible to include hallucination mitigation into transformer optimization without any performance degradation, offering a deployable, interpretable, and linguistically complex social media SA framework, which will enhance the reliability of neural systems of language understanding. Full article
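A confidence-based lexicon–deep fusion of the kind described can be reduced to a simple gate: trust the transformer only when its confidence clears a threshold, otherwise fall back to evidence-backed lexicon output. This is our generic reading with hypothetical names, not the paper's module:

```python
def fuse_prediction(deep_probs, lexicon_label, tau=0.7):
    """Confidence-gated fusion: keep the transformer's argmax class only
    when its max probability clears tau; otherwise fall back to the
    lexicon label, which is backed by explicit sentiment-bearing words."""
    best = max(range(len(deep_probs)), key=deep_probs.__getitem__)
    return best if deep_probs[best] >= tau else lexicon_label
```

Such a gate directly targets hallucinated sentiment: low-confidence neural predictions, the ones most likely to lack linguistic evidence, are never emitted on their own.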

24 pages, 7667 KB  
Article
Trans-AODnet for Aerosol Optical Depth Retrieval and Atmospheric Correction of Moderate to High-Spatial-Resolution Satellite Imagery
by He Cai, Bo Zhong, Huilin Liu, Yao Li, Bailin Du, Yang Qiao, Xiaoya Wang, Shanlong Wu, Junjun Wu and Qinhuo Liu
Remote Sens. 2026, 18(2), 311; https://doi.org/10.3390/rs18020311 - 16 Jan 2026
Viewed by 110
Abstract
High-accuracy, time-synchronous aerosol optical depth (AOD) data are essential for atmospheric correction (AC) of medium and high spatial resolution (MHSR) remote sensing data. However, existing high-resolution AOD retrieval methods often rely on sparsely distributed ground-based measurements, which limits their capacity to resolve fine-scale spatial heterogeneity and consequently constrains retrieval performance. To address this limitation, we propose a framework that takes GF-1 top-of-atmosphere (TOA) reflectance as input, where the model is first pre-trained using MCD19A2 as pseudo-labels, with high-confidence samples weighted according to their spatial consistency and temporal stability, and then fine-tuned using Aerosol Robotic Network (AERONET) observations. This approach improves retrieval accuracy while better capturing surface variability. Validation across multiple regions demonstrates strong agreement with AOD measurements, achieving a correlation coefficient (R) of 0.941 and an RMSE of 0.113. Compared to models without pretraining, the proportion of AOD retrievals within the expected error (EE) improves by 13%. When applied to AC, the corrected surface reflectance also shows strong consistency with in situ observations (R > 0.93, RMSE < 0.04). The proposed Trans-AODnet significantly enhances the accuracy and reliability of AOD inputs for AC of high-resolution wide-field sensors (e.g., GF-WFV), offering robust support for regional environmental monitoring and exhibiting strong potential for broader remote sensing applications. Full article
(This article belongs to the Section Atmospheric Remote Sensing)

19 pages, 8046 KB  
Article
Instruction Fine-Tuning Through the Lens of Verbatim Memorization
by Jie Zhang, Chi-Ho Lin and Suan Lee
Electronics 2026, 15(2), 377; https://doi.org/10.3390/electronics15020377 - 15 Jan 2026
Viewed by 200
Abstract
Supervised fine-tuning is key for model alignment, but its mechanisms are debated, with conflicting evidence supporting either a superficial alignment hypothesis or significant task improvements. This paper examines supervised fine-tuning’s impact from the perspective of verbatim memorization. Using the open-source OLMo-2 model series and test datasets (instruction format, safety-sensitive, and factual knowledge) constructed from its pre-training corpus, we analyzed changes across memorization, linguistic styles, and task performance. We found that supervised fine-tuning significantly weakens the model’s verbatim memorization of pre-training data. Simultaneously, it improves generated text in terms of alignment objectives, such as polite expression and structured organization. However, this process also leads to performance degradation on knowledge-intensive downstream tasks. Further representation analysis reveals that these changes are mainly concentrated in the later layers of the model. We conclude that supervised fine-tuning acts as a continuation of the learning process on new data. By adjusting model representations, supervised fine-tuning induces a learning tilt toward the styles and content of the instruction-tuning dataset. This inclination successfully instills alignment objectives while consequently reducing the effective accessibility of previously learned knowledge, which explains the observed degradation in both pre-training data memorization and factual task performance. The source code is publicly available. Full article
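A simple way to quantify verbatim memorization is prefix-match overlap: prompt the model with the start of a pre-training passage and measure how much of the true continuation it reproduces token-for-token. A minimal scoring sketch (the metric's exact definition in the paper may differ):

```python
def verbatim_overlap(generated, reference):
    """Fraction of the reference continuation reproduced token-for-token
    from the start: 1.0 means the model regurgitated the pre-training
    passage exactly, 0.0 means it diverged at the first token."""
    match = 0
    for g, r in zip(generated, reference):
        if g != r:
            break
        match += 1
    return match / len(reference)
```

Averaging this score over many pre-training snippets before and after supervised fine-tuning gives the kind of memorization delta the study reports.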

27 pages, 4033 KB  
Article
Lightweight Fine-Tuning for Pig Cough Detection
by Xu Zhang, Baoming Li and Xiaoliu Xue
Animals 2026, 16(2), 253; https://doi.org/10.3390/ani16020253 - 14 Jan 2026
Viewed by 116
Abstract
Respiratory diseases pose a significant threat to intensive pig farming, and cough recognition serves as a key indicator for early intervention. However, its practical application is constrained by the scarcity of labeled samples and the complex acoustic conditions of farm environments. To address these challenges, this study proposes a lightweight pig cough recognition method based on a pre-trained model. By freezing the backbone of a pre-trained audio neural network and fine-tuning only the classifier, our approach achieves effective knowledge transfer and domain adaptation with very limited data. We further enhance the model’s ability to capture temporal–spectral features of coughs through a time–frequency dual-stream module. On a dataset consisting of 107 cough events and 590 environmental noise clips, the proposed method achieved an accuracy of 94.59% and an F1-score of 92.86%, significantly outperforming several traditional machine learning and deep learning baseline models. Ablation studies validated the effectiveness of each component, with the model attaining a mean accuracy of 96.99% in cross-validation and demonstrating good calibration. The results indicate that our framework can achieve high-accuracy and well-generalized pig cough recognition under small-sample conditions. The main contribution of this work lies in proposing a lightweight fine-tuning paradigm for small-sample audio recognition in agricultural settings, offering a reliable technical solution for early warning of respiratory diseases on farms. It also highlights the potential of transfer learning in resource-limited scenarios. Full article
(This article belongs to the Section Pigs)
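The "freeze the backbone, fine-tune only the classifier" recipe is equivalent to linear probing: the frozen network maps each audio clip to a fixed feature vector, and only a small head is trained on top. A stdlib logistic-head sketch (our toy, not the paper's classifier):

```python
import math
import random

def train_head(features, labels, epochs=200, lr=0.5, seed=0):
    """Train a logistic-regression head on features from a frozen backbone.
    The backbone's weights never change; only the head's weight vector and
    bias are updated by SGD on the log-loss, which is why this works with
    very few labelled samples."""
    rng = random.Random(seed)
    dim = len(features[0])
    w = [rng.gauss(0.0, 0.01) for _ in range(dim)]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = bias + sum(wi * xi for wi, xi in zip(w, x))
            grad = 1.0 / (1.0 + math.exp(-z)) - y  # dLoss/dz for log-loss
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            bias -= lr * grad
    return w, bias
```

With only `dim + 1` trainable numbers, overfitting on 107 cough events is far less likely than when fine-tuning the whole network.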

20 pages, 3780 KB  
Article
A Real-Time Dynamic Warning Method for MODS in Trauma Sepsis Patients Based on a Pre-Trained Transfer Learning Algorithm
by Jiahe Wen, Guanjun Liu, Panpan Chang, Pan Hu, Bin Liu, Chunliang Jiang, Xiaoyun Xu, Jun Ma and Guang Zhang
Diagnostics 2026, 16(2), 270; https://doi.org/10.3390/diagnostics16020270 - 14 Jan 2026
Viewed by 230
Abstract
Objectives: Multiple organ dysfunction syndrome (MODS) is a serious, prognostically poor complication in trauma sepsis. We developed an interpretable, multicenter-validated prediction model to enable early, individualized risk assessment and guide timely care. Methods: Using MIMIC-IV and eICU data, we built a pre-trained transfer-learning model with a separation processing strategy and assessed interpretability with SHAP. Results: Internal validation included 700 MIMIC-IV patients; external validation included 110 eICU patients. Across 6-, 12-, and 24-h prediction windows, the best pre-trained model achieved an average AUC of 0.906. Notably, fine-tuning on only 100 trauma sepsis cases (3.6% of the training set) still yielded an AUC of 0.846, surpassing the non-pre-trained model by 0.165. SHAP analysis further revealed that platelet count was one of the most important variables contributing to MODS prediction. Conclusions: Overall, the pre-trained MODS model demonstrated robust discrimination, generalizability, and clear interpretability in both internal and external validations, highlighting its portability and clinical potential for early identification of high-risk trauma sepsis patients. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
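The AUC figures quoted above are rank statistics: the probability that a randomly chosen positive case is scored above a randomly chosen negative one. A minimal stdlib sketch (function name is ours):

```python
def rank_auc(scores, labels):
    """AUC computed directly from its rank definition: the fraction of
    (positive, negative) pairs in which the positive case receives the
    higher risk score, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.906 thus means the model ranks a random MODS patient above a random non-MODS patient about 91% of the time, independent of any decision threshold.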
