Search Results (568)

Search Parameters:
Keywords = semi-supervised approach

19 pages, 5315 KB  
Article
Style-Aware and Uncertainty-Guided Approach to Semi-Supervised Domain Generalization in Medical Imaging
by Zineb Tissir, Yunyoung Chang and Sang-Woong Lee
Mathematics 2025, 13(17), 2763; https://doi.org/10.3390/math13172763 - 28 Aug 2025
Abstract
Deep learning has significantly advanced medical image analysis by enabling accurate, automated diagnosis across diverse clinical tasks such as lesion classification and disease detection. However, the practical deployment of these systems is still hindered by two major challenges: the limited availability of expert-annotated data and substantial domain shifts caused by variations in imaging devices, acquisition protocols, and patient populations. Although recent semi-supervised domain generalization (SSDG) approaches attempt to address these challenges, they often suffer from two key limitations: (i) reliance on computationally expensive uncertainty modeling techniques such as Monte Carlo dropout, and (ii) inflexible shared-head classifiers that fail to capture domain-specific variability across heterogeneous imaging styles. To overcome these limitations, we propose MultiStyle-SSDG, a unified semi-supervised domain generalization framework designed to improve model generalization in low-label scenarios. Our method introduces a multi-style ensemble pseudo-labeling strategy guided by entropy-based filtering, incorporates prototype-based conformity and semantic alignment to regularize the feature space, and employs a domain-specific multi-head classifier fused through attention-weighted prediction. Additionally, we introduce a dual-level neural-style transfer pipeline that simulates realistic domain shifts while preserving diagnostic semantics. We validated our framework on the ISIC2019 skin lesion classification benchmark using 5% and 10% labeled data. MultiStyle-SSDG consistently outperformed recent state-of-the-art methods such as FixMatch, StyleMatch, and UPLM, achieving statistically significant improvements in classification accuracy under simulated domain shifts including style, background, and corruption. Specifically, our method achieved 78.6% accuracy with 5% labeled data and 80.3% with 10% labeled data on ISIC2019, surpassing FixMatch by 4.9–5.3 percentage points and UPLM by 2.1–2.4 points. Ablation studies further confirmed the individual contributions of each component, and t-SNE visualizations illustrate enhanced intra-class compactness and cross-domain feature consistency. These results demonstrate that our style-aware, modular framework offers a robust and scalable solution for generalizable computer-aided diagnosis in real-world medical imaging settings.
(This article belongs to the Section E1: Mathematics and Computer Science)
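The entropy-based filtering step that gates MultiStyle-SSDG's pseudo-labels can be sketched in a few lines. This is a generic illustration, not the authors' code; the 0.5-nat threshold is an arbitrary value chosen for the example:

```python
import math

def entropy(probs):
    # Shannon entropy (in nats) of a predicted class distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def filter_pseudo_labels(prob_batch, max_entropy=0.5):
    # Keep only predictions whose entropy falls below the threshold;
    # return (sample index, argmax class) pairs for retained samples.
    kept = []
    for i, probs in enumerate(prob_batch):
        if entropy(probs) <= max_entropy:
            kept.append((i, max(range(len(probs)), key=probs.__getitem__)))
    return kept

preds = [
    [0.97, 0.02, 0.01],  # confident: low entropy, kept
    [0.40, 0.35, 0.25],  # uncertain: high entropy, rejected
]
print(filter_pseudo_labels(preds))  # [(0, 0)]
```

In the full method, retained pairs would be pooled across multiple style-augmented views before entering the ensemble vote.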

18 pages, 2235 KB  
Article
FRAM-Based Safety Culture Model for the Analysis of Socio-Technical and Environmental Variability in Mechanised Agricultural Activities
by Pierluigi Rossi, Federica Caffaro and Massimo Cecchini
Safety 2025, 11(3), 80; https://doi.org/10.3390/safety11030080 - 25 Aug 2025
Abstract
Mechanised agricultural operations are often performed individually, under minimal supervision and across a wide range of unfavourable working conditions, resulting in a complex mixture of hazards and external stressors that severely affect safety conditions. Socio-technical and environmental constraints significantly affect safety culture and require continuous performance adjustments to overcome timing pressures, resource limitations, and unstable weather conditions. This study introduces a FRAM-based safety culture model that embeds the efficiency–thoroughness trade-off (ETTO) in four distinct operational modes that adhere to specific safety cultures, namely, thoroughness, risk awareness, compliance, and efficiency. This model has been instantiated for mechanised ploughing: foreground task functions were coupled with background functions that represent socio-technical constraints and environmental variability, while severity classes for potential incidents were derived from the US OSHA accident database. The framework was also supported by a semi-quantitative Resonance Index based on severity and coupling strength, the Total Resonance Index (TRI), to assess how variability propagates in foreground functions and to identify hot-spot functions where small adjustments can escalate into high resonance and hazardous conditions. Results showed that the safety-detriment effect on the TRI observed between the compliance and efficiency working modes was three times larger than the drift between risk awareness and compliance, demonstrating that efficiency comes at a much higher cost than keeping safety at compliance levels. Extending the proposed approach with quantitative assessments could further support the management of socio-technical and environmental drivers in mechanised farming, strengthening the role of safety as a competitive asset for enhancing resilience and service quality.

18 pages, 2639 KB  
Article
CA-NodeNet: A Category-Aware Graph Neural Network for Semi-Supervised Node Classification
by Zichang Lu, Meiyu Zhong, Qiguo Sun and Kai Ma
Electronics 2025, 14(16), 3215; https://doi.org/10.3390/electronics14163215 - 13 Aug 2025
Abstract
Graph convolutional networks (GCNs) have demonstrated remarkable effectiveness in processing graph-structured data and have been widely adopted across various domains. Existing methods mitigate over-smoothing through selective aggregation strategies such as attention mechanisms, edge dropout, and neighbor sampling. While some approaches incorporate global structural context, they often underexplore category-aware representations and inter-category differences, which are crucial for enhancing node discriminability. To address these limitations, a novel framework, CA-NodeNet, is proposed for semi-supervised node classification. CA-NodeNet comprises three key components: (1) coarse-grained node feature learning, (2) category-decoupled multi-branch attention, and (3) inter-category difference feature learning. Initially, a GCN-based encoder is employed to aggregate neighborhood information and learn coarse-grained representations. Subsequently, the category-decoupled multi-branch attention module employs a hierarchical multi-branch architecture, in which each branch incorporates category-specific attention mechanisms to project coarse-grained features into disentangled semantic subspaces. Furthermore, a layer-wise intermediate supervision strategy is adopted to facilitate the learning of discriminative category-specific features within each branch. To further enhance node feature discriminability, we introduce an inter-category difference feature learning module. This module first encodes pairwise differences between the category-specific features obtained from the previous stage and then integrates complementary information across multiple feature pairs to refine node representations. Finally, we design a dual-component optimization function that synergistically combines intermediate supervision loss with the final classification objective, encouraging the network to learn robust and fine-grained node representations. Extensive experiments on multiple real-world benchmark datasets demonstrate the superior performance of CA-NodeNet over existing state-of-the-art methods. Ablation studies further validate the effectiveness of each module in contributing to overall performance gains.
(This article belongs to the Special Issue How Graph Convolutional Networks Work: Mechanisms and Models)
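The neighborhood aggregation a GCN encoder performs, and the over-smoothing that motivates the selective strategies above, can be seen in a toy sketch with scalar node features (illustrative only, not CA-NodeNet's architecture):

```python
def gcn_layer(adj, feats):
    # One graph-convolution step: each node's new feature is the
    # self-inclusive mean of its neighbors' features. Stacking this
    # aggregation too deep is what causes over-smoothing.
    n = len(feats)
    out = []
    for i in range(n):
        neigh = [j for j in range(n) if adj[i][j]] + [i]
        out.append(sum(feats[j] for j in neigh) / len(neigh))
    return out

adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # path graph 0-1-2
print(gcn_layer(adj, [0.0, 1.0, 2.0]))  # [0.5, 1.0, 1.5]
```

Applying the layer repeatedly drives every node toward the graph-wide mean, which is exactly the loss of discriminability that category-aware branches try to counter.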

21 pages, 5025 KB  
Article
Cascaded Self-Supervision to Advance Cardiac MRI Segmentation in Low-Data Regimes
by Martin Urschler, Elisabeth Rechberger, Franz Thaler and Darko Štern
Bioengineering 2025, 12(8), 872; https://doi.org/10.3390/bioengineering12080872 - 12 Aug 2025
Abstract
Deep learning has shown remarkable success in medical image analysis over the last decade; however, many contributions focused on supervised methods which learn exclusively from labeled training samples. Acquiring expert-level annotations in large quantities is time-consuming and costly, even more so in medical image segmentation, where annotations are required on a pixel level and often in 3D. As a result, available labeled training data, and consequently performance, are often limited. Frequently, however, additional unlabeled data are available and can be readily integrated into model training, paving the way for semi- or self-supervised learning (SSL). In this work, we investigate popular SSL strategies in more detail, namely Transformation Consistency, Student–Teacher and Pseudo-Labeling, as well as exhaustive combinations thereof. We comprehensively evaluate these methods on 2D and 3D cardiac Magnetic Resonance datasets (ACDC, MMWHS) for which several different multi-compartment segmentation labels are available. To assess performance in limited dataset scenarios, different setups with a decreasing number of patients in the labeled dataset are investigated. We identify cascaded Self-Supervision as the best methodology, where we propose to employ Pseudo-Labeling and a self-supervised cascaded Student–Teacher model simultaneously. Our evaluation shows that in all scenarios, all investigated SSL methods outperform the respective low-data supervised baseline as well as state-of-the-art self-supervised approaches. This is most prominent in the very-low-labeled data regime, where for our proposed method we demonstrate 10.17% and 6.72% improvement in Dice Similarity Coefficient (DSC) for ACDC and MMWHS, respectively, compared with the low-data supervised approach, as well as 2.47% and 7.64% DSC improvement, respectively, when compared with related work. Moreover, in most experiments, our proposed method is able to greatly decrease the performance gap when compared to the fully supervised scenario, where all available labeled samples are used. We conclude that it is always beneficial to incorporate unlabeled data in cardiac MRI segmentation whenever it is present.
(This article belongs to the Special Issue Artificial Intelligence-Based Medical Imaging Processing)
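A common way to realize a Student–Teacher pair is to keep the teacher as an exponential moving average (EMA) of the student and penalize disagreement on unlabeled scans. The sketch below uses plain lists as stand-in weight vectors and is only a schematic of that idea, not the paper's cascaded model:

```python
def ema_update(teacher_w, student_w, decay=0.99):
    # Exponential moving average: the teacher tracks a smoothed
    # copy of the student's weights (Student-Teacher consistency).
    return [decay * t + (1 - decay) * s for t, s in zip(teacher_w, student_w)]

def consistency_loss(student_out, teacher_out):
    # Mean squared difference between the two models' predictions
    # on the same unlabeled sample.
    n = len(student_out)
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / n

teacher = [0.0, 0.0]
student = [1.0, 2.0]
teacher = ema_update(teacher, student)  # teacher drifts 1% toward the student
```

Minimizing the consistency loss on unlabeled data, alongside the supervised loss on the few labeled scans, is what lets the unlabeled pool contribute to training.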

43 pages, 5258 KB  
Article
Twin Self-Supervised Learning Framework for Glaucoma Diagnosis Using Fundus Images
by Suguna Gnanaprakasam and Rolant Gini John Barnabas
Appl. Syst. Innov. 2025, 8(4), 111; https://doi.org/10.3390/asi8040111 - 11 Aug 2025
Abstract
Glaucoma is a serious eye condition that damages the optic nerve and affects the transmission of visual information to the brain. It is the second leading cause of blindness worldwide. With deep learning, CAD systems have shown promising results in diagnosing glaucoma but mostly rely on small labeled datasets. Annotated fundus image datasets improve deep learning predictions by aiding pattern identification but require extensive curation. In contrast, unlabeled fundus images are more accessible. The proposed method employs a semi-supervised learning approach to utilize both labeled and unlabeled data effectively. It follows traditional supervised training with the generation of pseudo-labels for unlabeled data, and incorporates self-supervised techniques that eliminate the need for manual annotation. It uses a twin self-supervised learning approach to improve glaucoma diagnosis by integrating pseudo-labels from one model into another self-supervised model for effective detection. The self-supervised patch-based exemplar CNN generates pseudo-labels in the first stage. These pseudo-labeled data, combined with labeled data, train a convolutional auto-encoder classification model in the second stage to identify glaucoma features. A support vector machine classifier handles the final classification of glaucoma in the model, achieving 98% accuracy and 0.98 AUC on the internal, same-source combined fundus image datasets. The model also maintains reasonably good generalization to external (fully unseen) data, achieving an AUC of 0.91 on the CRFO dataset and 0.87 on the Papilla dataset. These results demonstrate the method's effectiveness, robustness, and adaptability in addressing limited labeled fundus data, supporting improved health outcomes.

24 pages, 3507 KB  
Article
A Semi-Supervised Wildfire Image Segmentation Network with Multi-Scale Structural Fusion and Pixel-Level Contrastive Consistency
by Yong Sun, Wei Wei, Jia Guo, Haifeng Lin and Yiqing Xu
Fire 2025, 8(8), 313; https://doi.org/10.3390/fire8080313 - 7 Aug 2025
Abstract
The increasing frequency and intensity of wildfires pose serious threats to ecosystems, property, and human safety worldwide. Accurate semantic segmentation of wildfire images is essential for real-time fire monitoring, spread prediction, and disaster response. However, existing deep learning methods heavily rely on large volumes of pixel-level annotated data, which are difficult and costly to obtain in real-world wildfire scenarios due to complex environments and urgent time constraints. To address this challenge, we propose a semi-supervised wildfire image segmentation framework that enhances segmentation performance under limited annotation conditions by integrating multi-scale structural information fusion and pixel-level contrastive consistency learning. Specifically, a Lagrange Interpolation Module (LIM) is designed to construct structured interpolation representations between multi-scale feature maps during the decoding stage, enabling effective fusion of spatial details and semantic information, and improving the model's ability to capture flame boundaries and complex textures. Meanwhile, a Pixel Contrast Consistency (PCC) mechanism is introduced to establish pixel-level semantic constraints between CutMix and Flip augmented views, guiding the model to learn consistent intra-class and discriminative inter-class feature representations, thereby reducing the reliance on large labeled datasets. Extensive experiments on two public wildfire image datasets, Flame and D-Fire, demonstrate that our method consistently outperforms other approaches under various annotation ratios. For example, with only half of the labeled data, our model achieves 5.0% and 6.4% mIoU improvements on the Flame and D-Fire datasets, respectively, compared to the baseline. This work provides technical support for efficient wildfire perception and response in practical applications.
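A dependency-free sketch of the CutMix view used by the Pixel Contrast Consistency mechanism, operating on H x W grids of pixel values (illustrative only; the paper applies it to wildfire imagery and adds pixel-level contrastive constraints on top):

```python
import random

def cutmix(img_a, img_b, rng=None):
    # Paste a random rectangular patch of img_b into a copy of img_a.
    # Images are H x W grids (lists of lists) of pixel values.
    rng = rng or random.Random(0)
    h, w = len(img_a), len(img_a[0])
    ch, cw = rng.randint(1, h), rng.randint(1, w)          # patch size
    top, left = rng.randint(0, h - ch), rng.randint(0, w - cw)
    out = [row[:] for row in img_a]
    for i in range(top, top + ch):
        for j in range(left, left + cw):
            out[i][j] = img_b[i][j]
    return out, (top, left, ch, cw)
```

For consistency training, pixel predictions on the mixed view are constrained to match the source image's predictions inside and outside the pasted box.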

21 pages, 1212 KB  
Article
A Semi-Supervised Approach to Characterise Microseismic Landslide Events from Big Noisy Data
by David Murray, Lina Stankovic and Vladimir Stankovic
Geosciences 2025, 15(8), 304; https://doi.org/10.3390/geosciences15080304 - 6 Aug 2025
Abstract
Most public seismic recordings, sampled at hundreds of Hz, tend to be unlabelled, i.e., not catalogued, mainly because of the sheer volume of samples and the amount of time needed by experts to confidently label detected events. This is especially challenging for very low signal-to-noise ratio microseismic events that characterise landslides during rock and soil mass displacement. Whilst numerous supervised machine learning models have been proposed to classify landslide events, they rely on large labelled datasets. Therefore, there is an urgent need to develop tools to effectively automate the data-labelling process from a small set of labelled samples. In this paper, we propose a semi-supervised method for labelling of signals recorded by seismometers that can reduce the time and expertise needed to create fully annotated datasets. The proposed Siamese network approach learns best class-exemplar anchors, leveraging learned similarity between these anchor embeddings and unlabelled signals. Classification is performed via soft-labelling and thresholding instead of hard class boundaries. Furthermore, network output explainability is used to explain misclassifications, and we demonstrate the effect of anchors on performance via ablation studies. The proposed approach classifies four landslide-related classes, namely earthquakes, micro-quakes, rockfall and anthropogenic noise, demonstrating good agreement with manually detected events while requiring little training data to be effective, hence reducing the time needed for labelling and updating models.
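The anchor-similarity soft-labelling idea can be sketched as follows. The 2-D embeddings, class names, and cosine threshold are all toy assumptions for illustration; in the paper, the Siamese network learns the embeddings and anchors:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def soft_label(embedding, anchors, threshold=0.6):
    # Similarity of an unlabelled signal's embedding to each class
    # anchor; assign the best class only if its similarity clears
    # the threshold, otherwise leave the signal unlabelled (None).
    sims = {cls: cosine(embedding, a) for cls, a in anchors.items()}
    best = max(sims, key=sims.get)
    return (best if sims[best] >= threshold else None), sims

anchors = {"earthquake": [1.0, 0.0], "rockfall": [0.0, 1.0]}
label, sims = soft_label([0.9, 0.1], anchors)
print(label)  # earthquake
```

Thresholding the soft scores, rather than forcing a hard class boundary, is what lets ambiguous signals remain unlabelled for later expert review.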

17 pages, 1519 KB  
Article
TOM-SSL: Tomato Disease Recognition Using Pseudo-Labelling-Based Semi-Supervised Learning
by Sathiyamohan Nishankar, Thurairatnam Mithuran, Selvarajah Thuseethan, Yakub Sebastian, Kheng Cher Yeo and Bharanidharan Shanmugam
AgriEngineering 2025, 7(8), 248; https://doi.org/10.3390/agriengineering7080248 - 5 Aug 2025
Abstract
In the agricultural domain, the availability of labelled data for disease recognition tasks is often limited due to the cost and expertise required for annotation. In this paper, a novel semi-supervised learning framework named TOM-SSL is proposed for automatic tomato leaf disease recognition using pseudo-labelling. TOM-SSL effectively addresses the challenge of limited labelled data by leveraging a small labelled subset and confidently pseudo-labelled samples from a large pool of unlabelled data to improve classification performance. Utilising only 10% of the labelled data, the proposed framework with a MobileNetV3-Small backbone achieves the best accuracy at 72.51% on the tomato subset of the PlantVillage dataset and 70.87% on the Taiwan tomato leaf disease dataset across 10 disease categories in PlantVillage and 6 in the Taiwan dataset. While achieving recognition performance on par with current state-of-the-art supervised methods, notably, the proposed approach offers a tenfold enhancement in label efficiency.
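A generic confidence-thresholded pseudo-labelling round, of the kind TOM-SSL builds on, can be sketched as follows (the threshold and data layout are illustrative assumptions, not the paper's exact selection rule):

```python
def pseudo_label_round(model_probs, unlabeled_ids, tau=0.95):
    # One pseudo-labelling round: unlabeled samples whose top-class
    # probability clears the confidence threshold tau are promoted
    # to the labelled pool with their argmax class as the label.
    promoted = []
    for uid in unlabeled_ids:
        probs = model_probs[uid]
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= tau:
            promoted.append((uid, best))
    return promoted

probs = {0: [0.98, 0.02], 1: [0.60, 0.40]}
print(pseudo_label_round(probs, [0, 1]))  # [(0, 0)]
```

Iterating this round (retrain, re-predict, promote) is what gradually grows the small labelled subset into an effective training set.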

21 pages, 6219 KB  
Article
Semi-Supervised Density Estimation with Background-Augmented Data for In Situ Seed Counting
by Baek-Gyeom Sung, Chun-Gu Lee, Yeong-Ho Kang, Seung-Hwa Yu and Dae-Hyun Lee
Agriculture 2025, 15(15), 1682; https://doi.org/10.3390/agriculture15151682 - 4 Aug 2025
Abstract
Direct seeding has gained prominence as a labor-efficient and environmentally sustainable alternative to conventional transplanting in rice cultivation. In direct seeding systems, early-stage management is crucial for stable seedling establishment, with sowing uniformity measured by seed counts being a critical indicator of success. However, conventional manual seed counting methods are time-consuming, prone to human error, and impractical for large-scale or repetitive tasks, necessitating advanced automated solutions. Recent advances in computer vision technologies and precision agriculture tools offer the potential to automate seed counting tasks. Nevertheless, challenges such as domain discrepancies and limited labeled data restrict robust real-world deployment. To address these issues, we propose a density estimation-based seed counting framework integrating semi-supervised learning and background augmentation. This framework includes a cost-effective data acquisition system enabling diverse domain data collection through indoor background augmentation, combined with semi-supervised learning to utilize augmented data effectively while minimizing labeling costs. The experimental results on field data from unknown domains show that our approach reduces seed counting errors by up to 58.5% compared to conventional methods, highlighting its potential as a scalable and effective solution for agricultural applications in real-world environments.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

22 pages, 2678 KB  
Article
Federated Semi-Supervised Learning with Uniform Random and Lattice-Based Client Sampling
by Mei Zhang and Feng Yang
Entropy 2025, 27(8), 804; https://doi.org/10.3390/e27080804 - 28 Jul 2025
Abstract
Federated semi-supervised learning (Fed-SSL) has emerged as a powerful framework that leverages both labeled and unlabeled data distributed across clients. To reduce communication overhead, real-world deployments often adopt partial client participation, where only a subset of clients is selected in each round. However, under non-i.i.d. data distributions, the choice of client sampling strategy becomes critical, as it significantly affects training stability and final model performance. To address this challenge, we propose a novel federated averaging semi-supervised learning algorithm, called FedAvg-SSL, that considers two sampling approaches: uniform random sampling (standard Monte Carlo) and structured lattice-based sampling inspired by quasi-Monte Carlo (QMC) techniques, which ensures more balanced client participation through structured deterministic selection. On the client side, each selected participant alternates between updating the global model and refining the pseudo-label model using local data. We provide a rigorous convergence analysis, showing that FedAvg-SSL achieves a sublinear convergence rate with linear speedup. Extensive experiments not only validate our theoretical findings but also demonstrate the advantages of lattice-based sampling in federated learning, offering insights into the interplay among algorithm performance, client participation rates, local update steps, and sampling strategies.
(This article belongs to the Special Issue Number Theoretic Methods in Statistics: Theory and Applications)
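The two client-sampling strategies can be contrasted in a short sketch. Since the abstract does not spell out the paper's lattice construction, the low-discrepancy variant below uses a Kronecker (golden-ratio) sequence as one concrete stand-in:

```python
import math, random

def uniform_sample(num_clients, m, rng):
    # Standard Monte Carlo: m distinct clients uniformly at random.
    return sorted(rng.sample(range(num_clients), m))

def lattice_sample(num_clients, m, round_idx):
    # Quasi-Monte Carlo flavored selection: a Kronecker (golden-ratio)
    # sequence spreads successive rounds evenly over the client index
    # space, giving more balanced participation than random draws.
    alpha = (math.sqrt(5) - 1) / 2           # fractional golden ratio
    picked = set()
    k = round_idx * m
    while len(picked) < m:
        point = (k * alpha) % 1.0            # low-discrepancy point in [0, 1)
        picked.add(int(point * num_clients))
        k += 1
    return sorted(picked)
```

The lattice variant is deterministic given the round index, so every client's participation schedule is known in advance, which is the balance property the paper exploits.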

19 pages, 43909 KB  
Article
DualBranch-AMR: A Semi-Supervised AMR Method Based on Dual-Student Consistency Regularization with Dynamic Stability Evaluation
by Jiankun Ma, Zhenxi Zhang, Linrun Zhang, Yu Li, Haoyue Tan, Xiaoran Shi and Feng Zhou
Sensors 2025, 25(15), 4553; https://doi.org/10.3390/s25154553 - 23 Jul 2025
Abstract
Modulation recognition, as one of the key technologies in the field of wireless communications, holds significant importance in applications such as spectrum resource management, interference suppression, and cognitive radio. While deep learning has substantially improved the performance of Automatic Modulation Recognition (AMR), it heavily relies on large amounts of labeled data. Given the high annotation costs and privacy concerns, researching semi-supervised AMR methods that leverage readily available unlabeled data for training is of great significance. This study constructs a semi-supervised AMR method based on a dual-student architecture. Specifically, we first adopt a dual-branch co-training architecture to fully exploit unlabeled data and effectively learn deep feature representations. Then, we develop a dynamic stability evaluation module using strong and weak augmentation strategies to improve the accuracy of generated pseudo-labels. Finally, based on the dual-student semi-supervised framework and pseudo-label stability evaluation, we propose a stability-guided consistency regularization constraint method and conduct semi-supervised AMR model training. The experimental results demonstrate that the proposed DualBranch-AMR method significantly outperforms traditional supervised baseline approaches on benchmark datasets. With only 5% labeled data, it achieves a recognition accuracy of 55.84%, reaching over 90% of the performance of fully supervised training. This validates the superiority of the proposed method under semi-supervised conditions.
(This article belongs to the Section Communications)
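The weak/strong-augmentation agreement at the core of the stability evaluation can be reduced to a single-step sketch (the paper's dynamic module additionally tracks stability across training steps, which is omitted here):

```python
def stable_pseudo_label(weak_probs, strong_probs):
    # Accept a pseudo-label only if the argmax class agrees between
    # the weakly and strongly augmented views of the same signal;
    # otherwise the sample is considered unstable and skipped.
    def argmax(p):
        return max(range(len(p)), key=p.__getitem__)
    w, s = argmax(weak_probs), argmax(strong_probs)
    return w if w == s else None
```

Only samples that survive this agreement check would feed the consistency regularization term.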

13 pages, 325 KB  
Article
“It Can Be Quite Daunting”: Promoting Mental Health Service Use for Vulnerable Young People
by Anne Gu, Michelle Kehoe, Kirsty Pope and Liza Hopkins
Healthcare 2025, 13(14), 1740; https://doi.org/10.3390/healthcare13141740 - 18 Jul 2025
Abstract
Background: Today, young people face a variety of social, environmental and psychological challenges, making them more vulnerable to developing mental health issues. Worldwide, 15% of adolescents experience poor mental health, with the majority not seeking help or receiving care. Therefore, it is critical that youth mental health services become more youth-friendly to encourage help-seeking. This study examines a new pilot volunteer model of care introduced into a youth mental health service in Melbourne, Australia. The aim of the study is to explore staff perspectives of the volunteer model. Methods: A qualitative research design was undertaken using semi-structured one-on-one interviews. Eight staff participated. Data were thematically analysed using an inductive approach. Results: Two main themes, 'promoting service use' and 'implementation to practice', were generated, along with sub-themes. The themes highlight benefits to staff such as reductions in workload and benefits to volunteers through the gaining of experience and knowledge. However, there was a need to support volunteers through greater training and supervision. Conclusions: Volunteers in youth mental health services can create a welcoming environment that enhances access and engagement for young people seeking help, while reducing staff workload and fostering meaningful engagement.

23 pages, 21197 KB  
Article
DLPLSR: Dual Label Propagation-Driven Least Squares Regression with Feature Selection for Semi-Supervised Learning
by Shuanghao Zhang, Zhengtong Yang and Zhaoyin Shi
Mathematics 2025, 13(14), 2290; https://doi.org/10.3390/math13142290 - 16 Jul 2025
Abstract
In the real world, most data are unlabeled, which drives the development of semi-supervised learning (SSL). Among SSL methods, least squares regression (LSR) has attracted attention for its simplicity and efficiency. However, existing semi-supervised LSR approaches suffer from challenges such as the insufficient use of unlabeled data, low pseudo-label accuracy, and inefficient label propagation. To address these issues, this paper proposes dual label propagation-driven least squares regression with feature selection, named DLPLSR, which is a pseudo-label-free SSL framework. DLPLSR employs a fuzzy-graph-based clustering strategy to capture global relationships among all samples, and manifold regularization preserves local geometric consistency, so that it implements the dual label propagation mechanism for comprehensive utilization of unlabeled data. Meanwhile, a dual-feature selection mechanism is established by integrating orthogonal projection for maximizing feature information with an ℓ2,1-norm regularization for eliminating redundancy, thereby jointly enhancing the discriminative power. Benefiting from these two designs, DLPLSR boosts learning performance without pseudo-labeling. Finally, the objective function admits an efficient closed-form solution solvable via an alternating optimization strategy. Extensive experiments on multiple benchmark datasets show the superiority of DLPLSR compared to state-of-the-art LSR-based SSL methods.
(This article belongs to the Special Issue Machine Learning and Optimization for Clustering Algorithms)
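The closed-form, manifold-regularized least squares step at the heart of methods like DLPLSR can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the kNN-graph Laplacian, the function and variable names, and the single ridge-plus-Laplacian objective are assumptions standing in for the paper's fuzzy-graph clustering and ℓ2,1 feature-selection terms.

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized graph Laplacian L = D - W of a symmetrized kNN graph."""
    n = X.shape[0]
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:k + 1]   # skip self (distance 0)
        W[i, idx] = 1.0
    W = np.maximum(W, W.T)                 # symmetrize
    return np.diag(W.sum(1)) - W

def manifold_regularized_lsr(X, Y, L, lam=0.1, gamma=0.01):
    """Closed-form semi-supervised LSR:

        min_W ||X_l W - Y||^2 + lam * tr(W^T X^T L X W) + gamma * ||W||^2

    Labeled samples (first len(Y) rows of X) carry targets; unlabeled
    samples enter only through the Laplacian smoothness term.
    """
    n_l = Y.shape[0]
    X_l = X[:n_l]
    d = X.shape[1]
    A = X_l.T @ X_l + lam * (X.T @ L @ X) + gamma * np.eye(d)
    return np.linalg.solve(A, X_l.T @ Y)
```

With `lam=0` this reduces to ordinary ridge regression on the labeled rows; the Laplacian term is what lets unlabeled samples shape the solution.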

27 pages, 6169 KB  
Article
Application of Semi-Supervised Clustering with Membership Information and Deep Learning in Landslide Susceptibility Assessment
by Hua Xia, Zili Qin, Yuanxin Tong, Yintian Li, Rui Zhang and Hongxia Luo
Land 2025, 14(7), 1472; https://doi.org/10.3390/land14071472 - 15 Jul 2025
Abstract
Landslide susceptibility assessment (LSA) plays a crucial role in disaster prevention and mitigation. Traditional random selection of non-landslide samples (labeled as 0) suffers from poor representativeness and high randomness, which may include potential landslide areas and reduce the accuracy of LSA. To address this issue, this study proposes a novel Landslide Susceptibility Index-based Semi-supervised Fuzzy C-Means (LSI-SFCM) sampling strategy that incorporates membership degrees. It uses landslide and unlabeled samples to map landslide membership degree via Semi-supervised Fuzzy C-Means (SFCM); non-landslide samples are then selected from low-membership regions and assigned their membership values as labels. This study developed three models for LSA (Convolutional Neural Network (CNN), U-Net, and Support Vector Machine (SVM)) and compared three negative-sample sampling strategies: Random Sampling (RS), SFCM (samples labeled 0), and LSI-SFCM. The results demonstrate that LSI-SFCM effectively enhances the representativeness and diversity of negative samples, improving predictive performance and classification reliability. Deep learning models using LSI-SFCM showed superior predictive capability: the CNN model achieved an area under the receiver operating characteristic curve (AUC) of 95.52% and a prediction rate curve value of 0.859. Furthermore, compared with traditional unsupervised fuzzy C-means (FCM) clustering, SFCM produced a more reasonable distribution of landslide membership degrees, better reflecting the distinction between landslides and non-landslides. This approach enhances the reliability of LSA and provides a scientific basis for disaster prevention and mitigation authorities.
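The core sampling idea, computing fuzzy membership degrees and then drawing negative samples only from low-membership regions while keeping the membership value as a soft label, can be sketched as follows. This is a hypothetical illustration using plain fuzzy C-means updates rather than the paper's semi-supervised variant; the function names, threshold, and two-feature toy data are assumptions.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Fuzzy C-means memberships: u_ik proportional to d_ik^(-2/(m-1))."""
    d = np.linalg.norm(X[:, None] - centers[None], axis=-1) + 1e-12
    inv = d ** (-2.0 / (m - 1))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm(X, c=2, m=2.0, iters=50, init=None):
    """Alternate membership and center updates for `iters` rounds."""
    centers = np.asarray(X[:c] if init is None else init, dtype=float)
    for _ in range(iters):
        U = fcm_memberships(X, centers, m) ** m
        centers = (U.T @ X) / U.sum(axis=0)[:, None]
    return centers, fcm_memberships(X, centers, m)

def pick_negative_samples(memberships, n, threshold=0.2, seed=0):
    """LSI-SFCM-style negative sampling: choose candidates only where the
    'landslide' membership is low, keeping the membership as a soft label."""
    idx = np.where(memberships < threshold)[0]
    rng = np.random.default_rng(seed)
    chosen = rng.choice(idx, size=min(n, len(idx)), replace=False)
    return chosen, memberships[chosen]
```

Compared with random sampling (hard label 0 everywhere), the soft labels let the downstream classifier see how confidently non-landslide each negative sample is.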

16 pages, 609 KB  
Article
Enhancing Software Defect Prediction Using Ensemble Techniques and Diverse Machine Learning Paradigms
by Ayesha Siddika, Momotaz Begum, Fahmid Al Farid, Jia Uddin and Hezerul Abdul Karim
Eng 2025, 6(7), 161; https://doi.org/10.3390/eng6070161 - 15 Jul 2025
Abstract
In today’s fast-paced world of software development, it is essential to ensure that programs run smoothly. When dealing with complex applications, the objective is to predict and resolve problems before they escalate, making software defect prediction a crucial element in maintaining the stability and reliability of software systems. This research addresses this need by combining ensemble techniques with seventeen machine learning algorithms for predicting software defects, categorised into three paradigms: supervised, semi-supervised, and self-supervised. In supervised learning, we experimented with random forest, k-nearest neighbors, support vector machines, logistic regression, gradient boosting, the AdaBoost classifier, quadratic discriminant analysis, Gaussian training, decision tree, passive aggressive, and ridge classifiers. In semi-supervised learning, we tested autoencoders, semi-supervised support vector machines, and generative adversarial networks. For self-supervised learning, we utilized autoencoders, a simple framework for contrastive learning of representations (SimCLR), and bootstrap your own latent (BYOL). After comparing the performance of each algorithm, we identified the most effective one: the gradient boosting classifier achieved the highest accuracy at 90%, closely followed by the AdaBoost classifier at 89%. Finally, we applied ensemble methods that leverage the collective strengths of these diverse approaches, enabling software developers to significantly improve defect prediction accuracy and thereby overall system robustness and reliability.
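An ensemble over the supervised learners named above can be assembled with scikit-learn's soft voting, which averages the members' predicted probabilities. The sketch below is illustrative only: it uses synthetic data as a stand-in for the paper's defect datasets, and the particular estimators and hyperparameters are assumptions, not the study's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for a defect dataset (defects are the minority).
X, y = make_classification(n_samples=600, n_features=20,
                           weights=[0.8, 0.2], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Soft voting: average predict_proba across the three boosted/bagged members.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
        ("gb", GradientBoostingClassifier(random_state=42)),
        ("ada", AdaBoostClassifier(random_state=42)),
    ],
    voting="soft",
)
ensemble.fit(X_tr, y_tr)
acc = accuracy_score(y_te, ensemble.predict(X_te))
```

Soft voting generally needs every member to expose `predict_proba`; with hard voting the members only need `predict`.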
