Search Results (1,166)

Search Parameters:
Keywords = pretraining techniques

17 pages, 3892 KB  
Article
Transformer-Driven Semi-Supervised Learning for Prostate Cancer Histopathology: A DINOv2–TransUNet Framework
by Rubina Akter Rabeya, Jeong-Wook Seo, Nam Hoon Cho, Hee-Cheol Kim and Heung-Kook Choi
Mach. Learn. Knowl. Extr. 2026, 8(2), 26; https://doi.org/10.3390/make8020026 - 23 Jan 2026
Abstract
Prostate cancer is diagnosed through a comprehensive study of histopathology slides, which takes time and requires professional interpretation. To reduce this burden, we developed a semi-supervised learning technique that combines transformer-based representation learning with a custom TransUNet classifier. To capture a wide range of morphological structures without manual annotation, our method pretrains DINOv2 on 10,000 unlabeled prostate tissue patches. A bespoke CNN-based decoder then takes the transformer-derived features and merges information from multiple spatial scales using residual upsampling and carefully constructed skip connections. Expert pathologists labeled only 20% of the patches in the dataset; the remaining unlabeled samples were incorporated through a consistency-driven learning method that promotes reliable predictions across varied augmentations. On a held-out test set, the model achieved precision and recall of 91.81% and 89.02%, respectively, and an accuracy of 93.78%. These results exceed the performance of a conventional U-Net and a baseline encoder–decoder network. Overall, the combination of localized CNN (Convolutional Neural Network) decoding and global transformer attention provides a reliable method for prostate cancer classification in settings with little annotated data. Full article
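The abstract does not give the consistency objective explicitly, but a common way to implement consistency-driven learning on unlabeled patches is FixMatch-style pseudo-labeling: confident predictions on a weakly augmented view supervise a strongly augmented view. A minimal NumPy sketch, where the function names and the 0.9 confidence threshold are illustrative assumptions, not the authors' code:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_weak, logits_strong, threshold=0.9):
    """Consistency loss on unlabeled patches (hypothetical sketch).

    Pseudo-labels come from the weakly augmented view; only confident
    predictions (max prob >= threshold) contribute, encouraging stable
    predictions across augmentations."""
    p_weak = softmax(logits_weak)
    pseudo = p_weak.argmax(axis=-1)
    conf = p_weak.max(axis=-1)
    mask = conf >= threshold
    if not mask.any():
        return 0.0
    p_strong = softmax(logits_strong)
    # cross-entropy of the strongly augmented view against the pseudo-labels
    ce = -np.log(p_strong[mask, pseudo[mask]] + 1e-12)
    return float(ce.mean())
```

In training, a term like this would typically be added to the supervised loss on the 20% labeled patches, often with a ramp-up weight.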

41 pages, 2850 KB  
Article
Automated Classification of Humpback Whale Calls Using Deep Learning: A Comparative Study of Neural Architectures and Acoustic Feature Representations
by Jack C. Johnson and Yue Rong
Sensors 2026, 26(2), 715; https://doi.org/10.3390/s26020715 - 21 Jan 2026
Abstract
Passive acoustic monitoring (PAM) using hydrophones enables acoustic data to be collected in large and diverse quantities, necessitating a reliable automated classification system. This paper presents a data-processing pipeline and a set of neural networks designed for a humpback-whale-detection system. A collection of audio segments was compiled from publicly available audio repositories and extensively curated by hand, with thorough examination, editing, and clipping to produce a dataset that minimizes bias and categorization errors. An array of standard data-augmentation techniques was applied to the collected audio, diversifying and expanding the original dataset. Multiple neural networks were designed and trained using the TensorFlow 2.20.0 and Keras 3.13.1 frameworks, resulting in a custom architecture layout based on research and iterative improvements. The pre-trained model MobileNetV2 was also included for further analysis. Model performance demonstrates a strong dependence on both feature representation and network architecture. Mel spectrogram inputs consistently outperformed MFCC (Mel-Frequency Cepstral Coefficient) features across all model types. The highest performance was achieved by the pretrained MobileNetV2 using mel spectrograms without augmentation, reaching a test accuracy of 99.01% with balanced precision and recall of 99% and a Matthews correlation coefficient of 0.98. The custom CNN with mel spectrograms also achieved strong performance, with 98.92% accuracy and a false negative rate of only 0.75%. In contrast, models trained with MFCC representations exhibited consistently lower robustness and higher false negative rates. These results highlight the comparative strengths of the evaluated feature representations and network architectures for humpback whale detection. Full article
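Mel spectrograms, which the comparison above favors over MFCCs, are obtained by projecting an FFT magnitude spectrum through a bank of triangular filters spaced evenly on the mel scale. A self-contained sketch of that filterbank follows; the parameter values are illustrative, and libraries such as librosa provide tuned implementations:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels=40, n_fft=512, sr=16000):
    """Triangular mel filters mapping an FFT magnitude spectrum
    (n_fft // 2 + 1 bins) to n_mels mel bands."""
    # filter edge points, evenly spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        if c > l:  # rising slope of the triangle
            fb[i - 1, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:  # falling slope of the triangle
            fb[i - 1, c:r] = (r - np.arange(c, r)) / (r - c)
    return fb
```

Multiplying this matrix with a magnitude spectrum yields the mel band energies; taking a DCT of their logarithm would give the MFCCs the paper compares against.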
(This article belongs to the Section Sensor Networks)

18 pages, 2581 KB  
Article
Enhancing Approaches to Detect Papilloma-Associated Hyperostosis Using a Few-Shot Transfer Learning Framework in Extremely Scarce Radiological Datasets
by Pham Huu Duy, Nguyen Minh Trieu and Nguyen Truong Thinh
Diagnostics 2026, 16(2), 311; https://doi.org/10.3390/diagnostics16020311 - 18 Jan 2026
Abstract
Background/Objectives: The application of deep learning models to rare diseases faces significant difficulties due to severe data scarcity. Focal papilloma-associated hyperostosis (PAH) is a crucial radiological sign for the surgical planning of sinonasal inverted papilloma, yet data are often limited. This study introduces and validates a robust methodological framework for building clinically meaningful deep learning models under extremely limited data conditions (n = 20). Methods: We propose a few-shot learning framework based on the nnU-Net architecture, which integrates an in-domain transfer learning strategy (fine-tuning a pre-trained skull segmentation model) to address data scarcity. To further enhance robustness, a specialized data augmentation technique called “window shifting” is introduced to simulate inter-scanner variability. The entire framework was evaluated using a rigorous 5-fold cross-validation strategy. Results: Our proposed framework achieved a stable mean Dice Similarity Coefficient (DSC) of 0.48 ± 0.06, significantly outperforming a baseline model trained from scratch, which failed to converge and yielded a clinically insignificant mean DSC of 0.09 ± 0.02. Conclusions: The analysis demonstrates that this methodological approach effectively overcomes instability and overfitting, generating reproducible and valuable predictions for rare conditions where large-scale data collection is not feasible. Full article
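The paper's exact "window shifting" parameters are not given. For CT-style inputs, the idea can be sketched as randomly jittering the intensity window's level and width before normalization, so the network sees the scanner-to-scanner contrast variation it must tolerate. All numbers below are hypothetical:

```python
import numpy as np

def window_shift(ct_hu, level=400.0, width=1800.0, jitter=0.1, rng=None):
    """'Window shifting' augmentation sketch (hypothetical parameters):
    jitter the CT window level/width to mimic inter-scanner variability,
    then clip and rescale intensities to [0, 1]."""
    if rng is None:
        rng = np.random.default_rng(0)
    l = level * (1 + rng.uniform(-jitter, jitter))
    w = width * (1 + rng.uniform(-jitter, jitter))
    lo, hi = l - w / 2, l + w / 2
    return np.clip((ct_hu - lo) / (hi - lo), 0.0, 1.0)
```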
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)

22 pages, 6241 KB  
Article
Using Large Language Models to Detect and Debunk Climate Change Misinformation
by Zeinab Shahbazi and Sara Behnamian
Big Data Cogn. Comput. 2026, 10(1), 34; https://doi.org/10.3390/bdcc10010034 - 17 Jan 2026
Abstract
The rapid spread of climate change misinformation across digital platforms undermines scientific literacy, public trust, and evidence-based policy action. Advances in Natural Language Processing (NLP) and Large Language Models (LLMs) create new opportunities for automating the detection and correction of misleading climate-related narratives. This study presents a multi-stage system that employs state-of-the-art large language models such as Generative Pre-trained Transformer 4 (GPT-4), Large Language Model Meta AI (LLaMA) version 3 (LLaMA-3), and RoBERTa-large (Robustly optimized BERT pretraining approach large) to identify, classify, and generate scientifically grounded corrections for climate misinformation. The system integrates several complementary techniques, including transformer-based text classification, semantic similarity scoring using Sentence-BERT, stance detection, and retrieval-augmented generation (RAG) for evidence-grounded debunking. Misinformation instances are detected through a fine-tuned RoBERTa–Multi-Genre Natural Language Inference (MNLI) classifier (RoBERTa-MNLI), grouped using BERTopic, and verified against curated climate-science knowledge sources using BM25 and dense retrieval via FAISS (Facebook AI Similarity Search). The debunking component employs RAG-enhanced GPT-4 to produce accurate and persuasive counter-messages aligned with authoritative scientific reports such as those from the Intergovernmental Panel on Climate Change (IPCC). A diverse dataset of climate misinformation categories covering denialism, cherry-picking of data, false causation narratives, and misleading comparisons is compiled for evaluation. Benchmarking experiments demonstrate that LLM-based models substantially outperform traditional machine-learning baselines such as Support Vector Machines, Logistic Regression, and Random Forests in precision, contextual understanding, and robustness to linguistic variation. Expert assessment further shows that generated debunking messages exhibit higher clarity, scientific accuracy, and persuasive effectiveness compared to conventional fact-checking text. These results highlight the potential of advanced LLM-driven pipelines to provide scalable, real-time mitigation of climate misinformation while offering guidelines for responsible deployment of AI-assisted debunking systems. Full article
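Of the retrieval components named above, BM25 is simple enough to sketch in full. A minimal Okapi BM25 scorer over pre-tokenized documents, with k1 and b set to common defaults (the paper's configuration is not specified):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`
    with Okapi BM25 (sparse lexical retrieval)."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()  # document frequency of each term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            # term-frequency saturation and document-length normalization
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```

In a pipeline like the one described, such sparse scores would be combined with dense FAISS retrieval to pick the evidence passages fed to the RAG debunking stage.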
(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)

23 pages, 1503 KB  
Article
Hallucination-Aware Interpretable Sentiment Analysis Model: A Grounded Approach to Reliable Social Media Content Classification
by Abdul Rahaman Wahab Sait and Yazeed Alkhurayyif
Electronics 2026, 15(2), 409; https://doi.org/10.3390/electronics15020409 - 16 Jan 2026
Abstract
Sentiment analysis (SA) has become an essential tool for analyzing social media content to monitor public opinion and support digital analytics. Although transformer-based SA models exhibit remarkable performance, they lack mechanisms to mitigate hallucinated sentiment, i.e., unsupported or overconfident predictions generated without explicit linguistic evidence. To address this limitation, this study presents a hallucination-aware SA model that incorporates semantic grounding, interpretability-congruent supervision, and neuro-symbolic reasoning within a unified architecture. The proposed model is based on a fine-tuned Open Pre-trained Transformer (OPT) model and uses three fundamental mechanisms: a Sentiment Integrity Filter (SIF), a SHapley Additive exPlanations (SHAP)-guided regularization technique, and a confidence-based lexicon-deep fusion module. The experimental analysis was conducted on two multi-class sentiment datasets containing Twitter (now X) and Reddit posts. On Dataset 1, the proposed model achieved an average accuracy of 97.6% and a hallucination rate of 2.3%, outperforming current transformer-based and hybrid sentiment models. On Dataset 2, the framework demonstrated strong external generalization, with an accuracy of 95.8% and a hallucination rate of 3.4%, significantly lower than state-of-the-art methods. These findings indicate that hallucination mitigation can be incorporated into transformer optimization without performance degradation, offering a deployable, interpretable, and linguistically informed social media SA framework that enhances the reliability of neural language-understanding systems. Full article
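The confidence-based lexicon-deep fusion module is described only at a high level. One plausible reading is a gate that keeps the transformer's label when its top probability is high and defers to a sentiment lexicon otherwise. A toy sketch under that assumption, with an illustrative threshold and label set:

```python
def fuse_sentiment(model_probs, lexicon_polarity, threshold=0.7):
    """Hypothetical confidence-based fusion: keep the transformer's label
    when its top probability clears the threshold; otherwise defer to the
    lexicon polarity (-1, 0, +1 mapped onto the same label set)."""
    labels = ["negative", "neutral", "positive"]
    top = max(range(3), key=lambda i: model_probs[i])
    if model_probs[top] >= threshold:
        return labels[top]
    return labels[lexicon_polarity + 1]
```

A gate like this is one way to lower the hallucination rate: low-confidence neural predictions are replaced by evidence from an explicit lexicon rather than emitted as-is.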

36 pages, 2298 KB  
Review
Onboard Deployment of Remote Sensing Foundation Models: A Comprehensive Review of Architecture, Optimization, and Hardware
by Hanbo Sang, Limeng Zhang, Tianrui Chen, Weiwei Guo and Zenghui Zhang
Remote Sens. 2026, 18(2), 298; https://doi.org/10.3390/rs18020298 - 16 Jan 2026
Abstract
With the rapid growth of multimodal remote sensing (RS) data, there is an increasing demand for intelligent onboard computing to alleviate the transmission and latency bottlenecks of traditional orbit-to-ground downlinking workflows. While many lightweight AI algorithms have been widely developed and deployed for onboard inference, their limited generalization capability restricts performance under the diverse and dynamic conditions of advanced Earth observation. Recent advances in remote sensing foundation models (RSFMs) offer a promising solution by providing pretrained representations with strong adaptability across diverse tasks and modalities. However, deploying RSFMs on resource-constrained devices such as nanosatellites remains a significant challenge due to strict limitations in memory, energy, computation, and radiation tolerance. To this end, this review presents the first comprehensive survey of onboard RSFM deployment, introducing and surveying in detail a unified deployment pipeline covering RSFM development, model compression techniques, and hardware optimization. Available hardware platforms are also discussed and compared, on which basis typical case studies for low Earth orbit (LEO) CubeSats are presented to analyze the feasibility of onboard RSFM deployment. In conclusion, this review aims to serve as a practical roadmap for future research on deploying RSFMs on edge devices, bridging the gap between large-scale RSFMs and the resource constraints of spaceborne platforms for onboard computing. Full article

22 pages, 401 KB  
Article
Federated Learning for Intrusion Detection Under Class Imbalance: A Multi-Domain Ablation Study with Per-Client SMOTE
by Atike Demirbaş Paray and Murat Aydos
Appl. Sci. 2026, 16(2), 801; https://doi.org/10.3390/app16020801 - 13 Jan 2026
Abstract
Federated learning (FL) enables privacy-preserving collaboration for Network Intrusion Detection Systems (NIDSs), but its effectiveness under heterogeneous traffic, severe class imbalance, and domain shift remains insufficiently characterized. We evaluate FL in two settings: (i) single-domain training on CICIDS-2017, InSDN/OVS, and 5G-NIDD with cross-domain testing, and (ii) multi-domain training that learns a unified model across enterprise and Software-Defined Network (SDN) traffic. Using consistent preprocessing and controlled ablations over balancing strategy, loss function, and client sampling, we find that dataset structure (class separability) largely determines single-domain FL gains. On datasets with lower separability, FL with Per-Client Synthetic Minority Over-sampling Technique (SMOTE) substantially improves Macro-F1 over centralized baselines, while well-separated datasets show limited benefit. However, single-domain models degrade sharply under domain shift, with substantial performance losses in cross-domain transfer. To mitigate this, we combine multi-domain FL with AutoEncoder pretraining and achieve 77% Macro-F1 across environments, demonstrating that FL can learn domain-invariant representations when trained on diverse traffic sources. Overall, our results indicate that Per-Client SMOTE is the preferred balancing strategy for federated NIDS, and that multi-domain training is often necessary when deployment environments differ from training data. Full article
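Per-client SMOTE means each client rebalances its own minority class locally before federated training, so no raw samples leave the device. The classic SMOTE interpolation step can be sketched as follows; the neighbor count and RNG seeding are illustrative, not the paper's settings:

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Generate synthetic minority samples by interpolating each picked
    sample toward one of its k nearest minority neighbors (classic SMOTE).
    Run per client, so only model updates are ever shared."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(X_min)
    # pairwise squared distances among minority samples
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # exclude self as a neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]
    out = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = nbrs[i, rng.integers(min(k, n - 1))]
        lam = rng.random()  # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.stack(out)
```

Because every synthetic point lies on a segment between two real minority samples, the augmented set stays inside the client's local data distribution.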

15 pages, 665 KB  
Article
Comparative Evaluation of Deep Learning Models for the Classification of Impacted Maxillary Canines on Panoramic Radiographs
by Nazlı Tokatlı, Buket Erdem, Mustafa Özcan, Begüm Turan Maviş, Çağla Şar and Fulya Özdemir
Diagnostics 2026, 16(2), 219; https://doi.org/10.3390/diagnostics16020219 - 9 Jan 2026
Abstract
Background/Objectives: The early and accurate identification of impacted teeth in the maxilla is critical for effective dental treatment planning. Traditional diagnostic methods relying on manual interpretation of radiographic images are often time-consuming and subject to variability. Methods: This study presents a deep learning-based approach for automated classification of impacted maxillary canines using panoramic radiographs. A comparative evaluation of four pre-trained convolutional neural network (CNN) architectures—ResNet50, Xception, InceptionV3, and VGG16—was conducted through transfer learning techniques. In this retrospective single-center study, the dataset comprised 694 annotated panoramic radiographs sourced from the archives of a university dental hospital, with a mildly imbalanced representation of impacted and non-impacted cases. Models were assessed using accuracy, precision, recall, specificity, and F1-score. Results: Among the tested architectures, VGG16 demonstrated superior performance, achieving an accuracy of 99.28% and an F1-score of 99.43%. Additionally, a prototype diagnostic interface was developed to demonstrate the potential for clinical application. Conclusions: The findings underscore the potential of deep learning models, particularly VGG16, in enhancing diagnostic workflows; however, further validation on diverse, multi-center datasets is required to confirm clinical generalizability. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

28 pages, 2805 KB  
Review
Emerging Trends in Artificial Intelligence-Assisted Colorimetric Biosensors for Pathogen Diagnostics
by Muniyandi Maruthupandi and Nae Yoon Lee
Sensors 2026, 26(2), 439; https://doi.org/10.3390/s26020439 - 9 Jan 2026
Abstract
Infectious diseases caused by bacterial and viral pathogens remain a major global threat, particularly in areas with limited diagnostic resources. Conventional optical techniques are time-consuming, prone to operator error, and require sophisticated instruments. Colorimetric biosensors, which convert biorecognition events into visible color changes, enable simple and low-cost point-of-care testing. Artificial intelligence (AI) enhances decision-making by enabling learning, training, and pattern recognition. Machine learning (ML) and deep learning (DL) models, pre-trained on complex color variations, improve diagnostic accuracy even though they do not adapt autonomously, whereas traditional computer-based methods lack such analytical capability. This review summarizes major pathogens in terms of their types, toxicity, and infection-related mortality, while highlighting research gaps between conventional optical biosensors and emerging AI-assisted colorimetric approaches. Recent advances in AI models, such as ML and DL algorithms, are discussed with a focus on their applications to clinical samples over the past five years. Finally, we propose a prospective direction for developing robust, explainable, and smartphone-compatible AI-assisted assays to support rapid, accurate, and user-friendly pathogen detection for health and clinical applications. This review provides a comprehensive overview of the AI models available to assist physicians and researchers in selecting the most effective method for pathogen detection. Full article
(This article belongs to the Special Issue Colorimetric Sensors: Methods and Applications (2nd Edition))

30 pages, 4507 KB  
Article
Training-Free Lightweight Transfer Learning for Land Cover Segmentation Using Multispectral Calibration
by Hye-Jung Moon and Nam-Wook Cho
Remote Sens. 2026, 18(2), 205; https://doi.org/10.3390/rs18020205 - 8 Jan 2026
Abstract
This study proposes a lightweight framework for transferring pretrained land cover classification architectures without additional training. The system utilizes French IGN imagery and Korean UAV and aerial imagery. It employs FLAIR U-Net models with ResNet34 and MiTB5 backbones, along with the AI-HUB U-Net. The implementation consists of four sequential stages. First, we perform class mapping between heterogeneous schemes and unify coordinate systems. Second, a quadratic polynomial regression equation is constructed, using multispectral band statistics as hyperparameters and class-wise IoU as the dependent variable. Third, optimal parameters are identified using the stationary-point condition of Response Surface Methodology (RSM). Fourth, the final land cover map is generated by fusing class-wise optimal results at the pixel level. Experimental results show that optimization is typically completed within 60 inferences. This procedure achieves IoU improvements of up to 67.86 percentage points over the baseline. For automated application, the optimized values from a source domain are successfully transferred to target areas, including transfers between high-altitude mountainous and low-lying coastal territories via proportional mapping. This capability demonstrates cross-regional and cross-platform generalization between ResNet34 and MiTB5. Statistical validation confirmed that the performance surface followed a systematic quadratic response, with adjusted R² values ranging from 0.706 to 0.999 and all p-values below 0.001. Consequently, the performance function is universally applicable across diverse geographic zones, spectral distributions, spatial resolutions, sensors, neural networks, and land cover classes. This approach achieves more than a 4000-fold reduction in computational resources compared to full model training, using only 32 to 150 tiles. Furthermore, the proposed technique demonstrates 10–74× better resource efficiency (resource consumption per unit of error reduction) than prior transfer learning schemes. Finally, this study presents a practical solution for inference and performance optimization of land cover semantic segmentation on standard commodity CPUs, while maintaining equivalent or superior IoU. Full article
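The RSM stationary-point step reduces, in one dimension, to fitting a quadratic to observed (parameter, IoU) pairs and solving dy/dx = 0. The paper optimizes several band statistics jointly; this 1-D version is a simplification that illustrates only the optimality condition:

```python
import numpy as np

def quadratic_stationary_point(x, y):
    """Fit y = a*x^2 + b*x + c by least squares and return the stationary
    point x* = -b / (2a), i.e., the RSM condition dy/dx = 2a*x + b = 0."""
    a, b, c = np.polyfit(x, y, 2)  # coefficients, highest degree first
    return -b / (2 * a)
```

With a < 0 (a concave response surface, as the validated quadratic fits imply), x* is the IoU-maximizing parameter value and can be found from a handful of inference runs rather than retraining.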

33 pages, 4122 KB  
Article
Empirical Evaluation of UNet for Segmentation of Applicable Surfaces for Seismic Sensor Installation
by Mikhail Uzdiaev, Marina Astapova, Andrey Ronzhin and Aleksandra Figurek
J. Imaging 2026, 12(1), 34; https://doi.org/10.3390/jimaging12010034 - 8 Jan 2026
Abstract
The deployment of wireless seismic nodal systems necessitates the efficient identification of optimal locations for sensor installation, considering factors such as ground stability and the absence of interference. Semantic segmentation of satellite imagery has advanced significantly, yet its application to this specific task remains unexplored. This work presents a baseline empirical evaluation of the U-Net architecture for the semantic segmentation of surfaces suitable for seismic sensor installation. We utilize a novel dataset of Sentinel-2 multispectral images, specifically labeled for this purpose. The study investigates the impact of pretrained encoders (EfficientNetB2, Cross-Stage Partial Darknet53—CSPDarknet53, and Multi-Axis Vision Transformer—MaxViT), different combinations of Sentinel-2 spectral bands (Red, Green, Blue (RGB); RGB + Near Infrared (NIR); 10 bands at 10 and 20 m/pixel spatial resolution; full 13-band), and a technique for improving small object segmentation by modifying the stride of the input convolutional layer. Experimental results demonstrate that the CSPDarknet53 encoder generally outperforms the others (IoU = 0.534, Precision = 0.716, Recall = 0.635). The combination of RGB and near-infrared bands (10 m/pixel resolution) yielded the most robust performance across most configurations. Reducing the input stride from 2 to 1 proved beneficial for segmenting small linear objects such as roads. The findings establish a baseline for this novel task and provide practical insights for optimizing deep learning models in the context of automated seismic nodal network installation planning. Full article
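IoU, the headline metric above, is the ratio of the intersection to the union of the predicted and ground-truth masks. For binary masks it is a few lines:

```python
import numpy as np

def iou(pred, target):
    """Intersection-over-Union for binary masks (the segmentation metric
    reported above). Returns 1.0 when both masks are empty."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = (pred | target).sum()
    return (pred & target).sum() / union if union else 1.0
```

Multi-class IoU, as reported per encoder here, is typically the mean of this quantity computed one class at a time.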
(This article belongs to the Special Issue Image Segmentation: Trends and Challenges)

12 pages, 465 KB  
Article
Using QR Codes for Payment Card Fraud Detection
by Rachid Chelouah and Prince Nwaekwu
Information 2026, 17(1), 39; https://doi.org/10.3390/info17010039 - 4 Jan 2026
Abstract
Debit and credit card payments have become the preferred method of payment for consumers, replacing paper checks and cash. However, this shift has also led to an increase in concerns regarding identity theft and payment security. To address these challenges, it is crucial to develop an effective, secure, and reliable payment system. This research presents a comprehensive study on payment card fraud detection using deep learning techniques. The introduction highlights the significance of a strong financial system supported by a quick and secure payment system. It emphasizes the need for advanced methods to detect fraudulent activities in card transactions. The proposed methodology focuses on the conversion of a comma-separated values (CSV) dataset into quick response (QR) code images, enabling the application of deep neural networks and transfer learning. This representation enables leveraging pre-trained image-based architectures by encoding numeric transaction attributes into visual patterns suitable for convolutional neural networks. The feature extraction process involves the use of a convolutional neural network, specifically a residual network architecture. The results obtained through the under-sampling dataset balancing method revealed promising performance in terms of precision, accuracy, recall, and F1 score for the traditional models such as K-nearest neighbors (KNN), Decision Tree, Random Forest, AdaBoost, Bagging, and Gaussian Naïve Bayes. Furthermore, the proposed deep neural network model achieved high precision, indicating its effectiveness in detecting card fraud. The model also achieved high accuracy, recall, and F1 score, showcasing its superior performance compared to traditional machine learning models. In summary, this research contributes to the field of payment card fraud detection by leveraging deep learning techniques. The proposed methodology offers a sophisticated approach to detecting fraudulent activities in card payment systems, addressing the growing concerns of identity theft and payment security. By deploying the trained model in an Android application, real-time fraud detection becomes possible, further enhancing the security of card transactions. The findings of this study provide insights and avenues for future advancements in the field of payment card fraud detection. Full article
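The paper renders CSV rows as QR code images; generating true QR codes needs a dedicated library, but the underlying idea, mapping numeric transaction attributes to a 2-D binary pattern a CNN can consume, can be sketched without one. The encoding below is a hypothetical stand-in, not the authors' QR scheme:

```python
import numpy as np

def row_to_bit_image(row, size=32):
    """Encode a numeric transaction row as a binary image (hypothetical
    stand-in for the paper's QR encoding): serialize values to text,
    unpack the UTF-8 bytes to bits, and tile them into a size x size grid."""
    text = ",".join(f"{v:.4f}" for v in row)
    bits = np.unpackbits(np.frombuffer(text.encode(), dtype=np.uint8))
    img = np.zeros(size * size, dtype=np.uint8)
    img[: min(len(bits), img.size)] = bits[: img.size]
    return img.reshape(size, size)
```

Once each row is an image, a pre-trained residual network can be fine-tuned on the resulting dataset exactly as it would be on photographs.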
(This article belongs to the Section Information Security and Privacy)

40 pages, 5732 KB  
Review
From Context to Human: A Review of VLM Contextualization in the Recognition of Human States in Visual Data
by Corneliu Florea, Constantin-Bogdan Popescu, Andrei Racovițeanu, Andreea Nițu and Laura Florea
Mathematics 2026, 14(1), 175; https://doi.org/10.3390/math14010175 - 2 Jan 2026
Abstract
This paper presents a narrative review of the contextualization and contribution offered by vision–language models (VLMs) for human-centric understanding in images. Starting from the correlation between humans and their context (background), and by incorporating VLM-generated embeddings into recognition architectures, recent solutions have advanced the recognition of human actions, the detection and classification of violent behavior, and the inference of human emotions from body posture and facial expression. While powerful and general, VLMs may also introduce biases that can be reflected in overall performance. Unlike prior reviews that focus on a single task or generic image captioning, this review jointly examines multiple human-centric problems in VLM-based approaches. The study begins by describing the key elements of VLMs (including architectural foundations, pre-training techniques, and cross-modal fusion strategies) and explains why they are suitable for contextualization. In addition to highlighting the improvements brought by VLMs, it critically discusses their limitations (including human-related biases) and presents a mathematical perspective and strategies for mitigating them. This review aims to consolidate the technical landscape of VLM-based contextualization for human state recognition and detection, and to serve as a foundational reference for researchers seeking to harness the power of language-guided VLMs in recognizing human states correlated with contextual cues. Full article
(This article belongs to the Special Issue Advance in Neural Networks and Visual Learning)
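The abstract above describes incorporating VLM-generated context embeddings into recognition architectures. A minimal sketch of such cross-modal late fusion follows; the embedding sizes, random toy features, and linear recognition head are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embeddings: a vision-backbone feature for the person crop
# and a VLM (CLIP-style) embedding of the scene context / background.
visual_feat = rng.normal(size=512)   # person-centric visual feature
context_emb = rng.normal(size=512)   # VLM embedding of the surrounding context

def l2_normalize(v):
    return v / np.linalg.norm(v)

# Cross-modal late fusion: normalize each modality, then concatenate.
fused = np.concatenate([l2_normalize(visual_feat), l2_normalize(context_emb)])

# A linear recognition head over the fused representation (toy weights).
n_actions = 4
W = rng.normal(size=(n_actions, fused.size)) * 0.01
logits = W @ fused

# Softmax over action classes.
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs.shape)
```

In practice the two modalities would come from a pretrained backbone and a VLM text/image encoder, and the head would be trained on labeled actions; the fusion step itself is the point of the sketch.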

19 pages, 3524 KB  
Article
Research on Underwater Fish Scale Loss Detection Method Based on Improved YOLOv8m and Transfer Learning
by Qiang Wang, Zhengyang Yu, Renxin Liu, Xingpeng Peng, Xiaoling Yang and Xiuwen He
Fishes 2026, 11(1), 21; https://doi.org/10.3390/fishes11010021 - 29 Dec 2025
Abstract
Monitoring fish skin health is essential in aquaculture, where scale loss serves as a critical indicator of fish health and welfare. However, automatic detection of scale-loss regions remains challenging due to factors such as uneven underwater illumination, water turbidity, and complex background conditions. To address this issue, we constructed a scale-loss dataset comprising approximately 2750 images captured under both clear above-water and complex underwater conditions, featuring over 7200 annotated targets. Various image enhancement techniques were evaluated, and the Clarity method was selected for preprocessing underwater samples to enhance feature representation. Based on the YOLOv8m architecture, we replaced the original FPN + PAN structure with a weighted bidirectional feature pyramid network to improve multi-scale feature fusion. A convolutional block attention module was incorporated into the output layers to highlight scale-loss features in both the channel and spatial dimensions. Additionally, a two-stage transfer learning strategy was employed, involving pretraining the model on above-water data and subsequently fine-tuning it on a limited set of underwater samples to mitigate the effects of domain shift. Experimental results demonstrate that the proposed method achieves a mAP50 of 96.81%, a 5.98-percentage-point improvement over the baseline YOLOv8m, with Precision and Recall increased by 10.14% and 8.70%, respectively. The method reduces false positives and false negatives, showing excellent detection accuracy and robustness in complex underwater environments, and offers a practical and effective approach for early fish disease monitoring in aquaculture.
(This article belongs to the Special Issue Application of Artificial Intelligence in Aquaculture)
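The two-stage transfer learning strategy described above (pretrain on abundant above-water data, then fine-tune on scarce underwater samples) can be illustrated with a toy surrogate; the logistic-regression model and synthetic shifted domains below are stand-ins for the YOLOv8m detector and the real datasets, not the authors' pipeline:

```python
import numpy as np

rng = np.random.default_rng(42)

def make_data(n, shift):
    # Toy surrogate domains: "above-water" (shift=0) vs. a shifted
    # "underwater" distribution, mimicking domain shift.
    X = rng.normal(size=(n, 8)) + shift
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(float)
    return X, y

def train_logreg(X, y, w=None, lr=0.1, steps=200):
    # Gradient descent on logistic loss; passing w resumes from a
    # previous stage, which is the transfer-learning step.
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Stage 1: pretrain on abundant above-water data.
Xa, ya = make_data(500, shift=0.0)
w = train_logreg(Xa, ya)

# Stage 2: fine-tune on a small underwater set with a lower learning
# rate and fewer steps, starting from the pretrained weights.
Xu, yu = make_data(40, shift=0.5)
w = train_logreg(Xu, yu, w=w, lr=0.05, steps=50)

acc = np.mean(((Xu @ w) > 0) == (yu > 0.5))
print(acc > 0.5)
```

The design point is that stage 2 reuses stage-1 weights instead of training from scratch, so the few underwater samples only need to adapt the model to the shifted domain.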

22 pages, 1912 KB  
Article
Privacy-Aware Continual Self-Supervised Learning on Multi-Window Chest Computed Tomography for Domain-Shift Robustness
by Ren Tasai, Guang Li, Ren Togo, Takahiro Ogawa, Kenji Hirata, Minghui Tang, Takaaki Yoshimura, Hiroyuki Sugimori, Noriko Nishioka, Yukie Shimizu, Kohsuke Kudo and Miki Haseyama
Bioengineering 2026, 13(1), 32; https://doi.org/10.3390/bioengineering13010032 - 27 Dec 2025
Abstract
We propose a novel continual self-supervised learning (CSSL) framework for simultaneously learning diverse features from chest computed tomography (CT) images obtained under multiple window settings and ensuring data privacy. Achieving a robust and highly generalizable model in medical image diagnosis is challenging, mainly because of issues such as the scarcity of large-scale, accurately annotated datasets and the domain shifts inherent to dynamic healthcare environments. In chest CT specifically, these domain shifts often arise from differences in window settings, which are optimized for distinct clinical purposes. Previous CSSL frameworks often mitigated domain shift by reusing past data, an approach that is typically impractical owing to privacy constraints. Our approach addresses these challenges by effectively capturing the relationship between previously learned knowledge and new information across different training stages through continual pretraining on unlabeled images. Specifically, by incorporating a latent replay-based mechanism into CSSL, our method mitigates catastrophic forgetting caused by domain shifts during continual pretraining while ensuring data privacy. Additionally, we introduce a feature distillation technique that integrates Wasserstein distance-based knowledge distillation and batch-knowledge ensemble, enhancing the model's ability to learn meaningful, domain-shift-robust representations. Finally, we validate our approach using chest CT images obtained across two different window settings, demonstrating superior performance compared with other approaches.
(This article belongs to the Special Issue Modern Medical Imaging in Disease Diagnosis Applications)
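The Wasserstein distance-based feature distillation mentioned above can be sketched for 1-D empirical distributions, where the W1 distance between equal-size samples reduces to matching sorted values; the "teacher" and "student" feature batches here are synthetic placeholders, not the paper's actual representations:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy feature batches: "teacher" = representations from the previous
# training stage, "student" = the continually pretrained model.
teacher_feats = rng.normal(loc=0.0, size=(32, 16))
student_feats = rng.normal(loc=0.5, size=(32, 16))

def wasserstein_1d(a, b):
    # For equal-size 1-D samples, the W1 distance equals the mean
    # absolute difference of the sorted values (quantile matching).
    return np.mean(np.abs(np.sort(a) - np.sort(b)))

# Feature-distillation loss: average per-dimension W1 distance
# between the two batches of representations.
loss = np.mean([wasserstein_1d(teacher_feats[:, d], student_feats[:, d])
                for d in range(teacher_feats.shape[1])])

# Sanity check: identical distributions give zero loss.
self_loss = np.mean([wasserstein_1d(teacher_feats[:, d], teacher_feats[:, d])
                     for d in range(teacher_feats.shape[1])])
print(loss > self_loss)
```

Minimizing such a loss pulls the student's per-dimension feature distributions toward the teacher's without requiring the original images, which is what makes distribution-level distillation compatible with privacy constraints.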
