Search Results (281)

Search Parameters:
Keywords = automatic training samples generation

26 pages, 718 KiB  
Review
Advancements in Semi-Supervised Deep Learning for Brain Tumor Segmentation in MRI: A Literature Review
by Chengcheng Jin, Theam Foo Ng and Haidi Ibrahim
AI 2025, 6(7), 153; https://doi.org/10.3390/ai6070153 - 11 Jul 2025
Viewed by 307
Abstract
Deep learning offers powerful technical support for automatic tumor segmentation in magnetic resonance imaging (MRI), with significant results. However, the success of supervised learning is strongly dependent on the quantity and accuracy of labeled training data, which is challenging to acquire in MRI. Semi-supervised learning approaches have emerged to tackle this difficulty, yielding comparable brain tumor segmentation outcomes with fewer labeled samples. This literature review explores key semi-supervised learning techniques for medical image segmentation, including pseudo-labeling, consistency regularization, generative adversarial networks, contrastive learning, and holistic methods. We specifically examine the application of these approaches in brain tumor MRI segmentation. Our findings suggest that semi-supervised learning can outperform traditional supervised methods by providing more effective guidance, thereby enhancing the potential for clinical computer-aided diagnosis. This literature review serves as a comprehensive introduction to semi-supervised learning in tumor MRI segmentation, including glioma segmentation, offering valuable insights and a comparative analysis of current methods for researchers in the field. Full article
(This article belongs to the Section Medical & Healthcare AI)
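The pseudo-labeling technique this review covers can be sketched as follows. This is an illustrative sketch only: a classifier is fit on a few labeled points, then confidently predicted unlabeled points are added to the training set and the model is refit. The synthetic 2-D Gaussian features and the 0.95 confidence cutoff are stand-ins, not the paper's MRI setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic stand-in for voxel features: two Gaussian clusters
# ("background" = class 0, "tumor" = class 1).
X_lab = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y_lab = np.array([0] * 20 + [1] * 20)
X_unl = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])

# Fit on the small labeled set first.
clf = LogisticRegression().fit(X_lab, y_lab)

# Pseudo-labeling: keep only unlabeled points the current model predicts
# confidently, add them with their predicted labels, and refit.
proba = clf.predict_proba(X_unl).max(axis=1)
confident = proba > 0.95  # assumed confidence threshold
clf = LogisticRegression().fit(
    np.vstack([X_lab, X_unl[confident]]),
    np.concatenate([y_lab, clf.predict(X_unl[confident])]),
)
```

In practice the threshold and the number of self-training rounds are tuned; a single round is shown here for brevity.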

21 pages, 4359 KiB  
Article
Identification of NAPL Contamination Occurrence States in Low-Permeability Sites Using UNet Segmentation and Electrical Resistivity Tomography
by Mengwen Gao, Yu Xiao and Xiaolei Zhang
Appl. Sci. 2025, 15(13), 7109; https://doi.org/10.3390/app15137109 - 24 Jun 2025
Viewed by 194
Abstract
To address the challenges in identifying NAPL contamination within low-permeability clay sites, this study innovatively integrates high-density electrical resistivity tomography (ERT) with a UNet deep learning model to establish an intelligent contamination detection system. Taking an industrial site in Shanghai as the research object, we collected apparent resistivity data using the WGMD-9 system, obtained resistivity profiles through inversion imaging, and constructed training sets by generating contamination labels via K-means clustering. A semantic segmentation model with skip connections and multi-scale feature fusion was developed based on the UNet architecture to achieve automatic identification of contaminated areas. Experimental results demonstrate that the model achieves a mean Intersection over Union (mIoU) of 86.58%, an accuracy (Acc) of 99.42%, a precision (Pre) of 75.72%, a recall (Rec) of 76.80%, and an F1 score (f1) of 76.23%, effectively overcoming the noise interference in electrical anomaly interpretation through conventional geophysical methods in low-permeability clay, while outperforming DeepLabV3, DeepLabV3+, PSPNet, and LinkNet models. Time-lapse resistivity imaging verifies the feasibility of dynamic monitoring for contaminant migration, while the integration of the VGG-16 encoder and hyperparameter optimization (learning rate of 0.0001 and batch size of 8) significantly enhances model performance. Case visualization reveals high consistency between segmentation results and actual contamination distribution, enabling precise localization of spatial morphology for contamination plumes. This technological breakthrough overcomes the high-cost and low-efficiency limitations of traditional borehole sampling, providing a high-precision, non-destructive intelligent detection solution for contaminated site remediation. Full article
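The label-generation step described here (K-means clustering over inverted resistivity values to produce contamination masks for UNet training) can be sketched on a synthetic profile; the resistivity values below are illustrative, not from the Shanghai site.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Synthetic stand-in for an inverted resistivity profile (ohm*m):
# low-resistivity clay background with a high-resistivity NAPL-like anomaly.
profile = rng.normal(30.0, 3.0, (64, 64))
profile[20:35, 25:45] = rng.normal(120.0, 10.0, (15, 20))

# K-means over per-pixel resistivity yields a coarse contaminated/clean mask
# that can serve as a training label for a UNet-style segmentation model.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profile.reshape(-1, 1))
labels = km.labels_.reshape(profile.shape)
anomaly = int(np.argmax(km.cluster_centers_.ravel()))  # higher-mean cluster
mask = labels == anomaly
```

The resulting `mask` plays the role of the automatically generated contamination label; the real pipeline derives `profile` from ERT inversion rather than simulation.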

14 pages, 1580 KiB  
Article
Machine Learning Classification of Fossilized Pectinodon bakkeri Teeth Images: Insights into Troodontid Theropod Dinosaur Morphology
by Jacob Bahn, Germán H. Alférez and Keith Snyder
Mach. Learn. Knowl. Extr. 2025, 7(2), 45; https://doi.org/10.3390/make7020045 - 21 May 2025
Viewed by 1828
Abstract
Although the manual classification of microfossils is possible, it can become burdensome. Machine learning offers an alternative that allows for automatic classification. Our contribution is to use machine learning to develop an automated approach for classifying images of Pectinodon bakkeri teeth. This can be expanded for use with many other species. Our approach is composed of two steps. First, PCA and K-means were applied to a numerical dataset with 459 samples collected at the Hanson Ranch Bonebed in eastern Wyoming, containing the following features: crown height, fore-aft basal length, basal width, anterior denticles, and posterior denticles per millimeter. The results obtained in this step were used to automatically organize the P. bakkeri images using two of the three generated clusters. Finally, the tooth images were used to train a convolutional neural network with two classes. The model has an accuracy of 71%, a precision of 71%, a recall of 70.5%, and an F1-score of 70.5%. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition, 2nd Edition)
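The first step of this pipeline (PCA then K-means on the five tooth measurements) can be sketched as below. The measurement values are hypothetical stand-ins, and two clusters are used here for a deterministic illustration where the paper generates three.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
# Hypothetical stand-ins for the five features (crown height, fore-aft basal
# length, basal width, anterior/posterior denticles per mm); the real values
# come from the 459-sample Hanson Ranch Bonebed dataset.
morph_a = rng.normal([4.0, 2.0, 1.0, 8.0, 9.0], 0.3, (40, 5))
morph_b = rng.normal([6.0, 3.0, 1.5, 5.0, 6.0], 0.3, (40, 5))
X = np.vstack([morph_a, morph_b])

# Reduce with PCA, then cluster; the cluster IDs are what organize the tooth
# images into classes for the downstream CNN.
Z = PCA(n_components=2).fit_transform(X)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
```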

24 pages, 1212 KiB  
Article
Comparative Evaluation of Automatic Detection and Classification of Daily Living Activities Using Batch Learning and Stream Learning Algorithms
by Paula Sofía Muñoz, Ana Sofía Orozco, Jaime Pabón, Daniel Gómez, Ricardo Salazar-Cabrera, Jesús D. Cerón, Diego M. López and Bernd Blobel
J. Pers. Med. 2025, 15(5), 208; https://doi.org/10.3390/jpm15050208 - 20 May 2025
Viewed by 423
Abstract
Background/Objectives: Activities of Daily Living (ADLs) are crucial for assessing an individual’s autonomy, encompassing tasks such as eating, dressing, and moving around, among others. Predicting these activities is part of health monitoring, elderly care, and intelligent systems, improving quality of life and facilitating early dependency detection, all of which are relevant components of personalized health and social care. However, the automatic classification of ADLs from sensor data remains challenging due to high variability in human behavior, sensor noise, and discrepancies in data acquisition protocols. These challenges limit the accuracy and applicability of existing solutions. This study details the modeling and evaluation of real-time ADL classification models based on batch learning (BL) and stream learning (SL) algorithms. Methods: The methodology followed is the Cross-Industry Standard Process for Data Mining (CRISP-DM). The models were trained with a comprehensive dataset integrating 23 ADL-centric datasets using accelerometer and gyroscope data. The data were preprocessed by applying normalization and sampling rate unification techniques, and finally, relevant sensor locations on the body were selected. Results: After cleaning and debugging, a final dataset was generated, containing 238,990 samples, 56 activities, and 52 columns. The study compared models trained with BL and SL algorithms, evaluating their performance under various classification scenarios using accuracy, area under the curve (AUC), and F1-score metrics. Finally, a mobile application was developed to classify ADLs in real time (fed with data from a dataset). Conclusions: The outcome of this study can be used in various data science projects related to ADL and human activity recognition (HAR), and due to the integration of diverse data sources, it is potentially useful for addressing bias and improving generalizability in machine learning models. The principal advantage of online learning algorithms is dynamically adapting to data changes, representing a significant advance in personal autonomy and health care monitoring. Full article
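The stream-learning side of this comparison can be sketched with an incremental classifier that sees the data one mini-batch at a time. This is an illustrative sketch using scikit-learn's `partial_fit` on synthetic accelerometer-like features, not the paper's models or dataset.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(3)

def batch(n):
    # Synthetic stand-in for windowed accelerometer features of two activities.
    X = np.vstack([rng.normal(-1, 0.5, (n, 3)), rng.normal(1, 0.5, (n, 3))])
    y = np.array([0] * n + [1] * n)
    idx = rng.permutation(len(y))
    return X[idx], y[idx]

# Stream-learning style: the model is updated incrementally via partial_fit,
# so it can adapt as new sensor windows arrive (the adaptivity the abstract
# highlights), unlike batch learning which refits on the full dataset.
clf = SGDClassifier(random_state=0)
for _ in range(20):
    Xb, yb = batch(50)
    clf.partial_fit(Xb, yb, classes=[0, 1])

X_test, y_test = batch(100)
acc = clf.score(X_test, y_test)
```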

31 pages, 13317 KiB  
Article
3D Micro-Expression Recognition Based on Adaptive Dynamic Vision
by Weiyi Kong, Zhisheng You and Xuebin Lv
Sensors 2025, 25(10), 3175; https://doi.org/10.3390/s25103175 - 18 May 2025
Viewed by 665
Abstract
In the research on intelligent perception, dynamic emotion recognition has been a focus in recent years. Small samples and unbalanced data are the main reasons for the low recognition accuracy of current technologies. Inspired by circular convolution networks, this paper proposes an adaptive dynamic micro-expression recognition algorithm based on self-supervised learning, namely MADV-Net. First, a basic model is pre-trained with accurate tag data, and then an efficient facial motion encoder is used to embed facial coding unit tags. Finally, a cascaded pyramid structure is constructed by the multi-level adaptive dynamic encoder, and a multi-level head perceptron feeds the classification loss function to compute facial micro-motion features in the dynamic video stream. In this study, a large number of experiments were carried out on the open-source datasets SMIC, CASME-II, CAS(ME)2, and SAMM. Compared with 13 mainstream SOTA methods, the average recognition accuracy of MADV-Net is 72.87%, 89.94%, 83.32%, and 89.53% on the four datasets, respectively. These results demonstrate the stable generalization ability of the method, providing a new research paradigm for automatic emotion recognition. Full article
(This article belongs to the Section Intelligent Sensors)

14 pages, 723 KiB  
Article
RMPT: Reinforced Memory-Driven Pure Transformer for Automatic Chest X-Ray Report Generation
by Caijie Qin, Yize Xiong, Weibin Chen and Yong Li
Mathematics 2025, 13(9), 1492; https://doi.org/10.3390/math13091492 - 30 Apr 2025
Viewed by 361
Abstract
Automatic generation of chest X-ray reports, designed to produce clinically precise descriptions from chest X-ray images, is gaining significant research attention because of its vast potential in clinical applications. Despite considerable recent progress, current models typically adhere to a CNN–Transformer-based framework, which still fails to enlarge the receptive field during image feature extraction. To solve this problem, we propose the Reinforced Memory-driven Pure Transformer (RMPT), a novel Transformer–Transformer-based model. In implementation, our RMPT employs the Swin Transformer to extract visual features from given X-ray images; its larger receptive field better models the relationships between different regions. Furthermore, we adopt a memory-driven Transformer (MemTrans) to effectively model similar patterns across different reports, which facilitates generating long reports. Finally, we present a training approach leveraging Reinforcement Learning (RL) that efficiently steers the model to focus on challenging samples, consequently improving its performance across both straightforward and complex situations. Experimental results on the IU X-ray dataset show that our proposed RMPT achieves superior performance on various Natural Language Generation (NLG) evaluation metrics. Further ablation results demonstrate that our RMPT model achieves a 10.5% overall performance improvement compared to the base model. Full article
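The abstract does not specify the RL objective beyond steering the model toward challenging samples; REINFORCE with a baseline (the standard self-critical formulation in report generation) is one common instantiation, sketched here with random stand-ins for per-sample log-probabilities and NLG rewards.

```python
import numpy as np

rng = np.random.default_rng(9)
# Stand-ins: log p(report | image) for a mini-batch of 16 generated reports,
# and a per-sample reward (e.g. a CIDEr/BLEU score). Both are simulated here.
log_probs = rng.normal(-2.0, 0.5, 16)
rewards = rng.uniform(0.0, 1.0, 16)

# REINFORCE with a baseline: samples whose reward falls below the baseline
# (i.e. challenging samples) get a negative advantage and push the update
# harder in the corrective direction.
baseline = rewards.mean()              # greedy-decoding reward stand-in
advantage = rewards - baseline
loss = -np.mean(advantage * log_probs)
```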

27 pages, 10552 KiB  
Article
Enhancing Dongba Pictograph Recognition Using Convolutional Neural Networks and Data Augmentation Techniques
by Shihui Li, Lan Thi Nguyen, Wirapong Chansanam, Natthakan Iam-On and Tossapon Boongoen
Information 2025, 16(5), 362; https://doi.org/10.3390/info16050362 - 29 Apr 2025
Viewed by 514
Abstract
The recognition of Dongba pictographs presents significant challenges due to the limitations of traditional feature extraction methods, the high complexity of classification algorithms, and their limited generalization ability. This study proposes a convolutional neural network (CNN)-based image classification method to enhance the accuracy and efficiency of Dongba pictograph recognition. The research begins with collecting and manually categorizing Dongba pictograph images, followed by preprocessing steps to improve image quality: normalization, grayscale conversion, filtering, denoising, and binarization. The dataset, comprising 70,000 image samples, is categorized into 18 classes based on shape characteristics and manual annotations. A CNN model is then trained using a dataset split into training (70% of all samples), validation (20%), and test (10%) sets. In particular, data augmentation techniques, including rotation, affine transformation, scaling, and translation, are applied to enhance classification accuracy. Experimental results demonstrate that the proposed model achieves a classification accuracy of 99.43% and consistently outperforms other conventional methods, with its performance peaking at 99.84% under optimized training conditions (specifically, 75 training epochs and a batch size of 512). This study provides a robust and efficient solution for automatically classifying Dongba pictographs, contributing to their digital preservation and scholarly research. By leveraging deep learning techniques, the proposed approach facilitates the rapid and precise identification of Dongba hieroglyphs, supporting ongoing efforts in cultural heritage preservation and the broader application of artificial intelligence in linguistic studies. Full article
(This article belongs to the Special Issue Machine Learning and Data Mining: Innovations in Big Data Analytics)
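The augmentation step (rotation, translation, and related transforms to enlarge the training set) can be sketched in plain NumPy on a toy binary image; real pipelines would use an image library, and the shift range here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
# A toy 28x28 "pictograph": a bright square on a dark background.
img = np.zeros((28, 28))
img[10:18, 10:18] = 1.0

def translate(im, dy, dx):
    # Shift with zero padding: a simple stand-in for the affine transforms
    # (rotation, scaling, translation) used to enlarge the training set.
    out = np.zeros_like(im)
    h, w = im.shape
    out[max(dy, 0):min(h + dy, h), max(dx, 0):min(w + dx, w)] = \
        im[max(-dy, 0):min(h - dy, h), max(-dx, 0):min(w - dx, w)]
    return out

# Random small shifts plus the three 90-degree rotations of the original.
augmented = [translate(img, int(rng.integers(-3, 4)), int(rng.integers(-3, 4)))
             for _ in range(8)]
augmented += [np.rot90(img, k) for k in range(1, 4)]
```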

12 pages, 8198 KiB  
Article
A Convolutional Neural Network SAR Image Denoising Algorithm Based on Self-Learning Strategies
by Jun Wang and Ke Xu
Appl. Sci. 2025, 15(9), 4786; https://doi.org/10.3390/app15094786 - 25 Apr 2025
Viewed by 642
Abstract
Due to its high resolution and all-weather imaging capability, Synthetic Aperture Radar (SAR) is widely used in fields such as Earth observation and environmental monitoring. However, SAR images are prone to noise interference during the imaging process, which seriously affects the visualization and subsequent analysis of the images. This article proposes a convolutional neural network SAR image denoising algorithm based on a self-learning strategy. In particular, a denoising convolutional neural network is proposed that combines a self-learning denoising model with a twin convolutional network structure. By constructing noisy/original image sample pairs for training, the model can automatically learn image features and noise distributions, significantly improving the denoising of SAR images while possessing stronger generalization ability. Simulation experiments verified the effectiveness of the proposed method, indicating its potential application in SAR image denoising. Full article
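The sample-pair construction mentioned here can be sketched by speckling a clean image: SAR speckle is commonly modeled as multiplicative gamma noise, and pairing the speckled input with the original yields a (noisy, clean) training pair. The image and the number of looks below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy "clean" SAR intensity image: a bright target on a dark background.
clean = np.full((32, 32), 0.2)
clean[12:20, 12:20] = 0.9

# Multiplicative gamma speckle with unit mean; L is the assumed number of looks.
L = 4
speckle = rng.gamma(shape=L, scale=1.0 / L, size=clean.shape)
noisy = clean * speckle

# One (input, target) pair of the kind the self-learning strategy trains on.
pairs = [(noisy, clean)]
```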

22 pages, 4865 KiB  
Article
An Unsupervised Fusion Strategy for Anomaly Detection via Chebyshev Graph Convolution and a Modified Adversarial Network
by Hamideh Manafi, Farnaz Mahan and Habib Izadkhah
Biomimetics 2025, 10(4), 245; https://doi.org/10.3390/biomimetics10040245 - 17 Apr 2025
Viewed by 538
Abstract
Anomalies refer to data inconsistent with the overall trend of the dataset and may indicate an error or an unusual event. Time series prediction can detect anomalies that happen unexpectedly in critical situations during the usage of a system or a network. Detecting or predicting anomalies in the traditional way is time-consuming and error-prone. Accordingly, the automatic recognition of anomalies can reduce the cost of defects and will pave the way for companies to optimize their performance. This unsupervised technique is an efficient way of detecting abnormal samples amid the fluctuations of time series. In this paper, an unsupervised deep network is proposed to predict temporal information. The correlations between neighboring samples are acquired to construct a graph of neighboring fluctuations. The extracted features, which describe the temporal distribution of the time samples in the constructed graph representation, are fed to Chebyshev graph convolution layers. The output is used to train an adversarial network for anomaly detection. The generative adversarial network’s cost function is modified to match our purpose. Thus, the proposed method combines generative adversarial networks (GANs) and Chebyshev graph convolution, which have shown good results in various domains. Accordingly, the performance of the proposed fusion approach, a Chebyshev graph-based modified adversarial network (Cheb-MA), is evaluated on the Numenta dataset. The proposed model was evaluated on various indices, including the average F1-score, reaching 82.09%, which is very promising compared to recent research. Full article
(This article belongs to the Special Issue Biomimicry for Optimization, Control, and Automation: 3rd Edition)
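The Chebyshev graph convolution at the core of this model applies a polynomial filter of the rescaled graph Laplacian. A minimal NumPy sketch (a single scalar-signal filter on a tiny path graph of neighboring time samples, with made-up coefficients, not the paper's trained layers):

```python
import numpy as np

def cheb_filter(adj, x, theta):
    # y = sum_k theta_k * T_k(L_hat) @ x, where T_k are Chebyshev polynomials
    # of the graph Laplacian rescaled so its spectrum lies in [-1, 1].
    lap = np.diag(adj.sum(axis=1)) - adj             # combinatorial Laplacian
    lmax = np.linalg.eigvalsh(lap).max()
    l_hat = 2.0 * lap / lmax - np.eye(len(adj))
    t_prev, t_cur = x, l_hat @ x                     # T_0 x and T_1 x
    out = theta[0] * t_prev
    if len(theta) > 1:
        out = out + theta[1] * t_cur
    for k in range(2, len(theta)):
        # Chebyshev recurrence: T_k = 2 L_hat T_{k-1} - T_{k-2}
        t_prev, t_cur = t_cur, 2.0 * l_hat @ t_cur - t_prev
        out = out + theta[k] * t_cur
    return out

# Path graph over 4 neighboring time samples; an impulse signal on node 0.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
y = cheb_filter(adj, np.array([1.0, 0.0, 0.0, 0.0]), theta=[0.5, 0.3, 0.2])
```

A learned layer would hold one `theta` vector per input/output feature pair; here the coefficients are fixed for illustration.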

21 pages, 45568 KiB  
Article
Detecting Long-Term Spatiotemporal Dynamics of Urban Green Spaces with Training Sample Migration Method
by Mengyao Wang, Pan Li, Chunyu Wang, Wei Chen, Zhongen Niu, Na Zeng, Xingxing Han and Xinchao Sun
Remote Sens. 2025, 17(8), 1426; https://doi.org/10.3390/rs17081426 - 17 Apr 2025
Viewed by 502
Abstract
Urban green spaces (UGSs) are critical for landscape, ecological, and climate studies. However, the generation of long-term annual UGSs maps is often constrained by the lack of sufficient, high-quality training samples for training classifiers. In this study, we introduce an automatic training sample migration method based on visually interpreted reference data and long-term Landsat imagery, implemented on the Google Earth Engine (GEE) platform, to produce annual UGSs maps for Tianjin from 1984 to 2022. Migrating training samples to each year significantly improved classification performance, especially for UGSs and water bodies. UGSs coverage in sample areas increased from 5% to 38%, resulting in more reliable trend detection. Our spatiotemporal analysis revealed that green coverage in the study area reached up to 40%, dominated by tree cover that is significantly underestimated in existing global and regional land cover products. Distinct temporal patterns emerged between the old built-up area (OBUA) and new built-up area (NBUA). Early UGS decline was largely driven by NBUAs, while post-2007 greening involved both OBUAs and NBUAs, as captured by classification maps and vegetation indices. Our study proposes a scalable and practical framework for long-term land cover mapping in rapidly urbanizing regions, with enhanced potential as higher-resolution data becomes increasingly accessible. Full article
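One common way to migrate training samples across years, sketched below, is to keep a reference sample's label only where the pixel's spectrum in the target year remains spectrally similar (here via spectral angle). The band values, threshold, and class names are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(6)

def spectral_angle(a, b):
    # Spectral angle (radians) between two reflectance vectors.
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Reference-year samples: one spectrum per labeled pixel (illustrative bands).
ref_spectra = np.array([[0.05, 0.04, 0.30, 0.15],   # vegetation-like pixel
                        [0.10, 0.12, 0.05, 0.02]])  # water-like pixel
labels = ["ugs", "water"]

# Target-year spectra: pixel 0 is stable (small noise), pixel 1 has changed.
target_spectra = ref_spectra + rng.normal(0, 0.005, ref_spectra.shape)
target_spectra[1] = [0.20, 0.22, 0.25, 0.28]  # now built-up-like

threshold = 0.1  # radians; assumed similarity cutoff
migrated = [(labels[i], i) for i in range(len(labels))
            if spectral_angle(ref_spectra[i], target_spectra[i]) < threshold]
```

Only the stable pixel's label survives migration, so changed pixels never contaminate the target-year training set.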

20 pages, 12398 KiB  
Article
A Rice-Mapping Method with Integrated Automatic Generation of Training Samples and Random Forest Classification Using Google Earth Engine
by Yuqing Fan, Debao Yuan, Liuya Zhang, Maochen Zhao and Renxu Yang
Agronomy 2025, 15(4), 873; https://doi.org/10.3390/agronomy15040873 - 31 Mar 2025
Viewed by 618
Abstract
Accurate mapping of rice planting areas is of great significance in terms of food security and market stability. However, the existing research into high-resolution rice mapping has relied heavily on fine-scale temporal remote sensing image data. Due to cloud occlusion and banding problems, data extraction from Landsat series remote sensing images with medium spatial resolution is not optimal. Therefore, this study proposes a rice mapping method (LR) using Google Earth Engine (GEE), which uses Landsat images and integrates automatic generation of training samples and a machine learning algorithm, with the assistance of phenological methods. The proposed LR method initially generated rice distribution maps based on phenology, and 300 sample points were selected for meta-identification of rice images via an enhanced pixel-based phenological feature composite method (Eppf-CM) utilizing high-resolution imagery. Subsequently, the inundation frequency (F) and an improved sample point statistical feature, i.e., the ratio of change amplitude of LSWI to NDVI (RCLN), were introduced to combine Eppf-CM with combined consideration of vegetation phenology and surface water variation (CCVS) methods, to automate the generation of training data with the aid of phenology. The sample data were optimized by an alternate iterative method involving extraction of neighborhood information. Finally, a random forest (RF) probabilistic model trained by integrating data from different phenological periods was used for rice mapping. To test its performance, we mapped rice distribution at 30 m resolution (“LR_Rice”) across Heilongjiang Province, China from 2010 to 2022, with annual overall accuracy (OA) and Kappa coefficients greater than 0.97 and 0.95, respectively, and compared them with four existing rice mapping products. The spatial distribution characteristics of rice cultivation extracted by the LR algorithm were accurate and the performance was optimal. 
In addition, the extracted area of LR_Rice was highly consistent with the agricultural statistical area; the coefficient of determination R2 was 0.9915, and the RMSE was 22.5 kha. The results show that this method can accurately obtain large-scale rice planting information, which is of great significance for food security, water resource management, and environmentally sustainable development. Full article
(This article belongs to the Section Precision and Digital Agriculture)
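The phenological signal behind methods of this family can be sketched with the two indices the abstract names: during rice transplanting, standing water pushes LSWI above NDVI, which feeds the inundation-frequency feature F. The reflectances and the 0.05 threshold below are illustrative assumptions, not Landsat samples.

```python
def ndvi(nir, red):
    # Normalized Difference Vegetation Index.
    return (nir - red) / (nir + red)

def lswi(nir, swir):
    # Land Surface Water Index.
    return (nir - swir) / (nir + swir)

# Illustrative reflectances for one paddy pixel at two phenological stages.
nir_t, red_t, swir_t = 0.25, 0.20, 0.10   # transplanting (flooded) stage
nir_g, red_g, swir_g = 0.45, 0.08, 0.20   # growing (closed canopy) stage

# A common flooding test: LSWI + T >= NDVI (threshold T assumed to be 0.05).
flooded = lswi(nir_t, swir_t) + 0.05 >= ndvi(nir_t, red_t)
# High NDVI later in the season confirms a vegetated (not permanent water) pixel.
peak_vegetation = ndvi(nir_g, red_g) > 0.6
```

A pixel that is flooded at transplanting and strongly vegetated at peak growth is a rice candidate; counting flooded observations per year gives the inundation frequency F.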

19 pages, 3770 KiB  
Article
A New Pes Planus Automatic Diagnosis Method: ViT-OELM Hybrid Modeling
by Derya Avcı
Diagnostics 2025, 15(7), 867; https://doi.org/10.3390/diagnostics15070867 - 28 Mar 2025
Viewed by 430
Abstract
Background/Objectives: Pes planus (flat feet) is a condition characterized by flatter-than-normal soles of the foot. In this study, a Vision Transformer (ViT)-based deep learning architecture is proposed to automate the diagnosis of pes planus. The model analyzes foot images and classifies them into two classes: “pes planus” and “not pes planus”. In the literature, models based on convolutional neural networks (CNNs) can automatically perform such classification, regression, and prediction tasks, but these models cannot capture long-term dependencies and global context. Methods: In this study, the pes planus dataset, which is openly available in the Kaggle database, was used. This paper suggests a ViT-OELM hybrid model for automatic diagnosis from the obtained pes planus images. The suggested ViT-OELM hybrid model includes an attention mechanism for feature extraction from the pes planus images. A total of 1000 features obtained for each sample image from this attention mechanism are used as inputs to an Optimum Extreme Learning Machine (OELM) classifier with various activation functions. Results: The performance of the suggested ViT-OELM hybrid model is compared with that of other studies that used the same pes planus database. The suggested ViT-OELM hybrid model was trained for binary classification, and the performance metrics were computed in the testing phase. The model showed 98.04% accuracy, 98.04% recall, 98.05% precision, and an F1 score of 98.03%. Conclusions: Our suggested ViT-OELM hybrid model demonstrates superior performance compared to those of other studies in the literature that used the same dataset. Full article
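The ELM half of the hybrid can be sketched in a few lines: a random hidden layer followed by output weights solved in closed form by least squares. The synthetic 2-D inputs below stand in for the 1000 ViT attention features per image, and the hidden size is an assumption.

```python
import numpy as np

rng = np.random.default_rng(7)

def elm_fit(X, y_onehot, n_hidden=50):
    # Extreme Learning Machine: hidden weights are random and never trained;
    # only the output weights beta are solved, in closed form.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                 # random feature map (tanh activation)
    beta = np.linalg.pinv(H) @ y_onehot    # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Synthetic 2-class data standing in for ViT features of the two foot classes.
X = np.vstack([rng.normal(-1, 0.3, (50, 2)), rng.normal(1, 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
W, b, beta = elm_fit(X, np.eye(2)[y])
acc = (elm_predict(X, W, b, beta) == y).mean()
```

The "optimum" variant in the paper additionally tunes the activation function; tanh is used here as one example.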

13 pages, 2295 KiB  
Article
Seafloor Sediment Classification Using Small-Sample Multi-Beam Data Based on Convolutional Neural Networks
by Haibo Ma, Xianghua Lai, Taojun Hu, Xiaoming Fu, Xingwei Zhang and Sheng Song
J. Mar. Sci. Eng. 2025, 13(4), 671; https://doi.org/10.3390/jmse13040671 - 27 Mar 2025
Viewed by 447
Abstract
Accurate, rapid, and automatic seafloor sediment classification represents a crucial challenge in marine sediment research. To address this, our study proposes a seafloor sediment classification method integrating convolutional neural networks (CNNs) with small-sample multi-beam backscatter data. We implemented four CNN architectures for classification—LeNet, AlexNet, GoogLeNet, and VGG—all achieving an overall accuracy exceeding 92%. To overcome the scarcity of seafloor sediment acoustic image data, we applied a deep convolutional generative adversarial network (DCGAN) for data augmentation, incorporating a de-normalization and anti-normalization module into the original DCGAN framework. Through comparative analysis of the generated versus original datasets using visual inspection and grayscale co-occurrence matrix methods, we substantially enhanced the similarity between synthetic and authentic images. Subsequent model training using the augmented dataset demonstrated improved classification performance across all architectures: LeNet showed a 1.88% accuracy increase, AlexNet an increase of 1.06%, GoogLeNet an increase of 2.59%, and VGG16 achieved a 2.97% improvement. Full article
(This article belongs to the Section Ocean Engineering)
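The gray-level co-occurrence comparison used to judge generated-versus-original similarity can be sketched as below: build a normalized co-occurrence matrix per image for one pixel offset, then compare the matrices. The random images and the L1 distance are illustrative stand-ins for the paper's data and exact comparison protocol.

```python
import numpy as np

def glcm(img, levels=4, dx=1, dy=0):
    # Gray-level co-occurrence matrix for a single (dx, dy) offset,
    # normalized so its entries sum to 1.
    m = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

rng = np.random.default_rng(8)
# Stand-ins for an original and a DCGAN-generated backscatter patch,
# quantized to 4 gray levels.
real = rng.integers(0, 4, (32, 32))
fake = rng.integers(0, 4, (32, 32))

# Smaller distance between co-occurrence matrices = more similar texture stats.
score = np.abs(glcm(real) - glcm(fake)).sum()
```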

20 pages, 15232 KiB  
Article
Swift Transfer of Lactating Piglet Detection Model Using Semi-Automatic Annotation Under an Unfamiliar Pig Farming Environment
by Qi’an Ding, Fang Zheng, Luo Liu, Peng Li and Mingxia Shen
Agriculture 2025, 15(7), 696; https://doi.org/10.3390/agriculture15070696 - 25 Mar 2025
Viewed by 343
Abstract
Manual annotation of piglet imagery across varied farming environments is labor-intensive. To address this, we propose a semi-automatic approach within an active learning framework that integrates a pre-annotation model for piglet detection. We further examine how data sample composition influences pre-annotation efficiency to enhance the deployment of lactating piglet detection models. Our study utilizes original samples from pig farms in Jingjiang, Suqian, and Sheyang, along with new data from the Yinguang pig farm in Danyang. Using the YOLOv5 framework, we constructed both single and mixed training sets of piglet images, evaluated their performance, and selected the optimal pre-annotation model. This model generated bounding box coordinates on processed new samples, which were subsequently manually refined to train the final model. Results indicate that expanding the dataset and diversifying pigpen scenes significantly improve pre-annotation performance. The best model achieved a test precision of 0.921 on new samples, and after manual calibration, the final model exhibited a training precision of 0.968, a recall of 0.952, and an average precision of 0.979 at the IoU threshold of 0.5. The model demonstrated robust detection under various lighting conditions, with bounding boxes closely conforming to piglet contours, thereby substantially reducing manual labor. This approach is cost-effective for piglet segmentation tasks and offers strong support for advancing smart agricultural technologies. Full article
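The pre-annotation step of this semi-automatic loop can be sketched as a confidence split: the detector's proposed boxes above a cutoff become draft labels, and the rest are queued for manual refinement. The detections and the 0.5 cutoff below are simulated assumptions, not YOLOv5 output.

```python
# Simulated detector output on one new-farm image:
# (x1, y1, x2, y2, confidence) per proposed piglet bounding box.
detections = [
    (10, 20, 60, 70, 0.95),
    (80, 15, 130, 60, 0.88),
    (40, 90, 55, 105, 0.42),
]

conf_threshold = 0.5  # assumed cutoff
# High-confidence boxes are accepted as draft annotations; low-confidence
# ones are flagged for the human-in-the-loop calibration pass.
draft_labels = [d for d in detections if d[4] >= conf_threshold]
needs_review = [d for d in detections if d[4] < conf_threshold]
```

The refined boxes then retrain the final model, which is the active-learning cycle the abstract describes.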

21 pages, 6196 KiB  
Article
Building a Gender-Bias-Resistant Super Corpus as a Deep Learning Baseline for Speech Emotion Recognition
by Babak Abbaschian and Adel Elmaghraby
Sensors 2025, 25(7), 1991; https://doi.org/10.3390/s25071991 - 22 Mar 2025
Viewed by 523
Abstract
The focus on Speech Emotion Recognition (SER) has dramatically increased in recent years, driven by the need for automatic speech-recognition-based systems and intelligent assistants to enhance user experience by incorporating emotional content. While deep learning techniques have significantly advanced SER systems, their robustness concerning speaker gender and out-of-distribution data has not been thoroughly examined. Furthermore, standards for SER remain rooted in landmark papers from the 2000s, even though modern deep learning architectures can achieve results comparable or superior to the state of the art of that era. In this research, we address these challenges by creating a new super corpus from existing databases, providing a larger pool of samples. We benchmark this dataset using various deep learning architectures, setting a new baseline for the task. Additionally, our experiments reveal that models trained on this super corpus demonstrate superior generalization and accuracy and exhibit lower gender bias compared to models trained on individual databases. We further show that traditional preprocessing techniques, such as denoising and normalization, are insufficient to address inherent biases in the data. However, our data augmentation approach effectively shifts these biases, improving model fairness across gender groups and emotions and, in some cases, fully debiasing the models. Full article
(This article belongs to the Special Issue Emotion Recognition and Cognitive Behavior Analysis Based on Sensors)