Search Results (197)

Search Parameters:
Keywords = convolutional pose machines

31 pages, 9956 KB  
Article
A Study on Flood Susceptibility Mapping in the Poyang Lake Basin Based on Machine Learning Model Comparison and SHapley Additive exPlanations Interpretation
by Zhuojia Li, Jie Tian, Youchen Zhu, Danlu Chen, Qin Ji and Deliang Sun
Water 2025, 17(20), 2955; https://doi.org/10.3390/w17202955 - 14 Oct 2025
Abstract
Floods are among the most destructive natural disasters, and accurate flood susceptibility mapping (FSM) is crucial for disaster prevention and mitigation amid climate change. The Poyang Lake basin, characterized by complex flood formation mechanisms and high spatial heterogeneity, poses challenges for the application of FSM models. Currently, the use of machine learning models in this field faces several bottlenecks, including unclear model applicability, limited sample quality, and insufficient model interpretability. To address these issues, we take the 2020 Poyang Lake flood as a case study and establish a high-precision flood inundation sample database. After feature screening, the performance of three hybrid models optimized by Particle Swarm Optimization (PSO)—Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Convolutional Neural Network (CNN)—is compared. Furthermore, the Shapley Additive exPlanations (SHAP) framework is employed to interpret the contributions and interaction effects of the driving factors. The results demonstrate that the ensemble learning models exhibit superior performance, indicating their greater applicability for flood susceptibility mapping in complex basins such as Poyang Lake. The RF model has the best predictive performance, achieving an area under the receiver operating characteristic curve (AUC) value of 0.9536. Elevation is the most important global driving factor, while SHAP local interpretation reveals that the driving mechanism has significant spatial heterogeneity, and the susceptibility of local depressions is mainly controlled by the topographic wetness index. A nonlinear phenomenon is observed in which the SHAP value becomes negative under extremely high late-stage rainfall; this is preliminarily attributed to a spatial transfer of flood-prone areas triggered by the backwater effect, highlighting the complex nonlinear interactions among factors.
The proposed “high-precision sampling, model comparison, SHAP explanation” framework effectively improves the accuracy and interpretability of FSM. These research findings can provide a scientific basis for smart flood control and precise flood risk management in basins. Full article
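The SHAP interpretation used above rests on Shapley values from cooperative game theory: each feature's attribution is its average marginal contribution over all feature coalitions. As a minimal illustration (not the authors' pipeline), the sketch below computes exact Shapley values for a hypothetical additive "susceptibility" score over three toy factors; for an additive function, each factor's Shapley value equals its contribution exactly:

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values: weighted average marginal contribution of
    each feature over all coalitions (tractable only for few features)."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi

# Hypothetical additive score with elevation dominating, mirroring the
# paper's finding that elevation is the top global driver.
contrib = {"elevation": 0.5, "rainfall": 0.3, "twi": 0.2}
score = lambda s: sum(contrib[f] for f in s)
phi = shapley_values(list(contrib), score)
```

Practical SHAP implementations approximate these sums efficiently for tree ensembles such as the RF model used in the study, rather than enumerating coalitions.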

19 pages, 2194 KB  
Article
Intelligent Motion Classification via Computer Vision for Smart Manufacturing and Ergonomic Risk Prevention in SMEs
by Armando Mares-Castro, Valentin Calzada-Ledesma, María Blanca Becerra-Rodríguez, Raúl Santiago-Montero and Anayansi Estrada-Monje
Appl. Sci. 2025, 15(20), 10914; https://doi.org/10.3390/app152010914 - 11 Oct 2025
Abstract
The transition toward Industry 4.0 and the emerging concept of Industry 5.0 demand intelligent tools that integrate efficiency, adaptability, and human-centered design. This paper presents a Computer Vision-based framework for automated motion classification in Methods-Time Measurement 2 (MTM-2), with the aim of supporting industrial time studies and ergonomic risk assessment. The system uses a Convolutional Neural Network (CNN) for pose estimation and derives angular kinematic features of key joints to characterize upper limb movements. A two-stage experimental design was conducted: first, three lightweight classifiers—K-Nearest Neighbors (KNN), Support Vector Machines (SVMs), and a Shallow Neural Network (SNN)—were compared, with KNN demonstrating the best trade-off between accuracy and efficiency; second, KNN was tested under noisy conditions to assess robustness. The results show near-perfect accuracy (≈100%) on 8919 motion instances, with an average inference time of 1 microsecond per sample, reducing the analysis time compared to manual transcription. Beyond efficiency, the framework addresses ergonomic risks such as wrist hyperextension, offering a scalable and cost-effective solution for Small and Medium-sized Enterprises. It also facilitates integration with Manufacturing Execution Systems and Digital Twins, and is therefore aligned with Industry 5.0 goals. Full article
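The K-Nearest Neighbors classifier that won the paper's accuracy/efficiency trade-off can be sketched in a few lines. The example below is a minimal pure-numpy KNN over hypothetical joint-angle features; the angle values and motion-class labels are invented for illustration and are not the paper's data:

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Minimal k-nearest-neighbors classifier: Euclidean distance over
    angular kinematic features, majority vote among the k neighbors."""
    preds = []
    for q in X_query:
        d = np.linalg.norm(X_train - q, axis=1)
        nearest = y_train[np.argsort(d)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])
    return np.array(preds)

# Hypothetical joint-angle vectors (elbow, shoulder, wrist) in degrees,
# labeled with invented MTM-2-style motion classes 0 and 1.
X = np.array([[150., 30., 10.], [145., 35., 12.],
              [60., 80., 45.], [65., 75., 40.]])
y = np.array([0, 0, 1, 1])
pred = knn_predict(X, y, np.array([[148., 32., 11.]]), k=3)
```

Because KNN defers all work to query time, inference cost is dominated by the distance computation, which stays small for the low-dimensional angular features described above.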

20 pages, 4033 KB  
Article
AI-Based Virtual Assistant for Solar Radiation Prediction and Improvement of Sustainable Energy Systems
by Tomás Gavilánez, Néstor Zamora, Josué Navarrete, Nino Vega and Gabriela Vergara
Sustainability 2025, 17(19), 8909; https://doi.org/10.3390/su17198909 - 8 Oct 2025
Abstract
Advances in machine learning have improved the ability to predict critical environmental conditions, including solar radiation levels that, while essential for life, can pose serious risks to human health. In Ecuador, due to its geographical location and altitude, UV radiation reaches extreme levels. This study presents the development of a chatbot system driven by a hybrid artificial intelligence model, combining Random Forest, CatBoost, Gradient Boosting, and a 1D Convolutional Neural Network. The model was trained with meteorological data, optimized using hyperparameters (iterations: 500–1500, depth: 4–8, learning rate: 0.01–0.3), and evaluated through MAE, MSE, R2, and F1-Score. The hybrid model achieved superior accuracy (MAE = 13.77 W/m2, MSE = 849.96, R2 = 0.98), outperforming traditional methods. A 15% error margin was observed without significantly affecting classification. The chatbot, implemented via Telegram and hosted on Heroku, provided real-time personalized alerts, demonstrating an effective, accessible, and scalable solution for health safety and environmental awareness. Furthermore, it facilitates decision-making in the efficient generation of renewable energy and supports a more sustainable energy transition. It offers a tool that strengthens the relationship between artificial intelligence and sustainability by providing a practical instrument for integrating clean energy and mitigating climate change. Full article
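The MAE, MSE, and R² figures reported above are standard regression metrics, and the "hybrid" idea can be illustrated by averaging base-model predictions. The sketch below uses invented irradiance values and a simple two-model average as a stand-in for the paper's RF/CatBoost/Gradient Boosting/CNN combination:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """MAE, MSE and R^2, the regression metrics used in the study."""
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mse = np.mean(err ** 2)
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return mae, mse, r2

# Hypothetical irradiance targets (W/m^2) and two base-model outputs;
# the hybrid here is a plain average of the two.
y_true = np.array([420., 515., 610., 700.])
m1 = np.array([400., 520., 600., 720.])
m2 = np.array([430., 505., 625., 690.])
hybrid = (m1 + m2) / 2
mae, mse, r2 = evaluate(y_true, hybrid)
```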

22 pages, 3493 KB  
Article
NeuroFed-LightTCN: Federated Lightweight Temporal Convolutional Networks for Privacy-Preserving Seizure Detection in EEG Data
by Zheng You Lim, Ying Han Pang, Shih Yin Ooi, Wee How Khoh and Yee Jian Chew
Appl. Sci. 2025, 15(17), 9660; https://doi.org/10.3390/app15179660 - 2 Sep 2025
Abstract
This study investigates on-edge seizure detection, aiming to resolve two major constraints that currently hinder the deployment of deep learning models in clinical settings. First, centralized training requires gathering and consolidating data across institutions, which poses a serious privacy risk. Second, the high computational overhead of inference imposes a heavy burden on resource-limited edge devices. Hence, we propose NeuroFed-LightTCN, a federated learning (FL) framework incorporating a lightweight temporal convolutional network (TCN), designed for resource-efficient and privacy-preserving seizure detection. The proposed framework combines depthwise separable convolutions with structured pruning to enhance efficiency, scalability, and performance. Furthermore, asynchronous aggregation is employed to mitigate training overhead. Empirical tests demonstrate that the network can be pruned to a 70% sparsity rate, with a 44.9% decrease in parameters (from 65.4 M to 34.9 M) and an inference latency of 56 ms, while still maintaining 97.11% accuracy, outperforming both the non-FL and FL TCN variants. Ablation shows that asynchronous aggregation reduces training times by 3.6% to 18%, and pruning sustains performance even at extreme sparsity: an F1-score of 97.17% at a 70% pruning rate. Overall, the proposed NeuroFed-LightTCN addresses the trade-off between computational efficiency and model performance, delivering a viable solution for federated edge-device learning. Through the combination of federated optimization and lightweight architectural design, scalable and privacy-aware machine learning becomes practical without compromising accuracy, expanding its potential utility in real-world deployments. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
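The parameter savings from depthwise separable convolutions, a key ingredient of the lightweight TCN above, can be computed directly. The layer sizes below are illustrative, not the paper's exact architecture:

```python
def conv1d_params(k, c_in, c_out):
    """Parameters of a standard 1D convolution (no bias):
    each output channel has a k x c_in kernel."""
    return k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise conv (one k-tap filter per input channel) followed by
    a 1x1 pointwise conv mixing channels."""
    return k * c_in + c_in * c_out

# Illustrative layer: kernel size 7, 128 channels in and out.
k, c_in, c_out = 7, 128, 128
std = conv1d_params(k, c_in, c_out)               # 114688
sep = depthwise_separable_params(k, c_in, c_out)  # 17280
reduction = 1 - sep / std                         # ~85% fewer parameters
```

Savings of this kind are what make a further 44.9% structured-pruning reduction feasible without collapsing accuracy.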

19 pages, 1724 KB  
Article
Advancing Air Quality Monitoring: Deep Learning-Based CNN-RNN Hybrid Model for PM2.5 Forecasting
by Anıl Utku, Umit Can, Mustafa Alpsülün, Hasan Celal Balıkçı, Azadeh Amoozegar, Abdulmuttalip Pilatin and Abdulkadir Barut
Atmosphere 2025, 16(9), 1003; https://doi.org/10.3390/atmos16091003 - 24 Aug 2025
Abstract
Particulate matter, particularly PM2.5, poses a significant threat to public health due to its ability to disperse widely and its detrimental impact on the respiratory and circulatory systems upon inhalation. Consequently, it is imperative to maintain regular monitoring and assessment of particulate matter levels to anticipate air pollution events and promptly mitigate their adverse effects. However, predicting air quality is inherently complex, given the multitude of variables that influence it. Deep learning models, renowned for their ability to capture nonlinear relationships, offer a promising approach to address this challenge, with hybrid architectures demonstrating enhanced performance. This study aims to develop and evaluate a hybrid model integrating Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) for forecasting PM2.5 levels in India, Milan, and Frankfurt. A comparative analysis with established deep learning and machine learning techniques substantiates the superior predictive capabilities of the proposed CNN-RNN model. The findings underscore its potential as an effective tool for air quality prediction, with implications for informed decision-making and proactive intervention strategies to safeguard public health. Full article
(This article belongs to the Section Air Quality)
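CNN-RNN forecasters like the hybrid above are typically trained on sliding windows of the pollutant series: a fixed lookback of past readings predicts the next value. A minimal windowing sketch (the PM2.5 values are invented for the example):

```python
import numpy as np

def make_windows(series, lookback, horizon=1):
    """Slice a univariate series into (lookback -> horizon) pairs,
    the standard supervised shape for CNN/RNN forecasting."""
    X, y = [], []
    for i in range(len(series) - lookback - horizon + 1):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback:i + lookback + horizon])
    return np.array(X), np.array(y)

# Hypothetical hourly PM2.5 readings (ug/m^3):
pm25 = np.array([35., 40., 52., 61., 58., 49., 44., 38.])
X, y = make_windows(pm25, lookback=3)
```

In a hybrid architecture, the convolutional front end extracts local patterns from each window and the recurrent layers model the temporal dependence between them.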

23 pages, 14694 KB  
Article
PLCNet: A 3D-CNN-Based Plant-Level Classification Network Hyperspectral Framework for Sweetpotato Virus Disease Detection
by Qiaofeng Zhang, Wei Wang, Han Su, Gaoxiang Yang, Jiawen Xue, Hui Hou, Xiaoyue Geng, Qinghe Cao and Zhen Xu
Remote Sens. 2025, 17(16), 2882; https://doi.org/10.3390/rs17162882 - 19 Aug 2025
Abstract
Sweetpotato virus disease (SPVD) poses a significant threat to global sweetpotato production; therefore, early, accurate field-scale detection is necessary. To address the limitations of the currently utilized assays, we propose PLCNet (Plant-Level Classification Network), a rapid, non-destructive SPVD identification framework using UAV-acquired hyperspectral imagery. High-resolution data from early sweetpotato growth stages were processed via three feature selection methods—Random Forest (RF), Minimum Redundancy Maximum Relevance (mRMR), and Local Covariance Matrix (LCM)—in combination with 24 vegetation indices. Variance Inflation Factor (VIF) analysis reduced multicollinearity, yielding an optimized SPVD-sensitive feature set. First, using the RF-selected bands and vegetation indices, we benchmarked four classifiers—Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), Residual Network (ResNet), and 3D Convolutional Neural Network (3D-CNN). Under identical inputs, the 3D-CNN achieved superior performance (OA = 96.55%, Macro F1 = 95.36%, UA_mean = 0.9498, PA_mean = 0.9504), outperforming SVM, GBDT, and ResNet. Second, with the same spectral–spatial features and 3D-CNN backbone, we compared a pixel-level baseline (CropdocNet) against our plant-level PLCNet. CropdocNet exhibited spatial fragmentation and isolated errors, whereas PLCNet’s two-stage pipeline—deep feature extraction followed by connected-component analysis and majority voting—aggregated voxel predictions into coherent whole-plant labels, substantially reducing noise and enhancing biological interpretability. By integrating optimized feature selection, deep learning, and plant-level post-processing, PLCNet delivers a scalable, high-throughput solution for precise SPVD monitoring in agricultural fields. Full article
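PLCNet's plant-level post-processing, connected-component analysis followed by majority voting, can be sketched without any deep learning machinery. The toy grid below stands in for per-pixel class predictions; a single mislabeled pixel inside a plant is overridden by the component's majority class:

```python
def plant_level_labels(pixel_classes, mask):
    """Group foreground pixels into 4-connected components, then assign
    each component the majority class of its pixels (majority voting)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in pixel_classes]
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                stack, comp = [(i, j)], []
                seen[i][j] = True
                while stack:                      # flood fill
                    y, x = stack.pop()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                votes = {}
                for y, x in comp:
                    c = pixel_classes[y][x]
                    votes[c] = votes.get(c, 0) + 1
                label = max(votes, key=votes.get)
                for y, x in comp:                 # whole-plant label
                    out[y][x] = label
    return out

# Toy 3x4 field: one plant (mask=1) with one mislabeled pixel (class 1
# inside a class-2 plant); classes are invented for illustration.
mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
cls  = [[2, 2, 0, 0], [2, 1, 0, 0], [0, 0, 0, 0]]
clean = plant_level_labels(cls, mask)
```

This is exactly the mechanism by which the two-stage pipeline suppresses the spatial fragmentation and isolated errors seen in the pixel-level baseline.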

16 pages, 5104 KB  
Article
Integrating OpenPose for Proactive Human–Robot Interaction Through Upper-Body Pose Recognition
by Shih-Huan Tseng, Jhih-Ciang Chiang, Cheng-En Shiue and Hsiu-Ping Yueh
Electronics 2025, 14(15), 3112; https://doi.org/10.3390/electronics14153112 - 5 Aug 2025
Abstract
This paper introduces a novel system that utilizes OpenPose for skeleton estimation to enable a tabletop robot to interact with humans proactively. By accurately recognizing upper-body poses based on the skeleton information, the robot autonomously approaches individuals and initiates conversations. The contributions of this paper can be summarized into three main features. Firstly, we conducted a comprehensive data collection process, capturing five different table-front poses: looking down, looking at the screen, looking at the robot, resting the head on hands, and stretching both hands. These poses were selected to represent common interaction scenarios. Secondly, we designed the robot’s dialog content and movement patterns to correspond with the identified table-front poses. By aligning the robot’s responses with the specific pose, we aimed to create a more engaging and intuitive interaction experience for users. Finally, we performed an extensive evaluation by exploring the performance of three classification models—non-linear Support Vector Machine (SVM), Artificial Neural Network (ANN), and convolutional neural network (CNN)—for accurately recognizing table-front poses. We used an Asus Zenbo Junior robot to acquire images and leveraged OpenPose to extract 12 upper-body skeleton points as input for training the classification models. The experimental results indicate that the ANN model outperformed the other models, demonstrating its effectiveness in pose recognition. Overall, the proposed system not only showcases the potential of utilizing OpenPose for proactive human–robot interaction but also demonstrates its real-world applicability. By combining advanced pose recognition techniques with carefully designed dialog and movement patterns, the tabletop robot successfully engages with humans in a proactive manner. Full article
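Skeleton keypoints such as those extracted by OpenPose are usually normalized before classification so that the pose features do not depend on where the person stands or how far the camera is. The sketch below is a generic preprocessing step, not the paper's exact pipeline, using four invented keypoints in place of the 12 upper-body points:

```python
import numpy as np

def normalize_skeleton(points, root=0):
    """Center 2D keypoints on a root joint (e.g., the neck) and scale by
    the largest joint distance, giving position- and scale-invariant
    features for a pose classifier."""
    pts = points - points[root]
    scale = np.max(np.linalg.norm(pts, axis=1))
    return pts / scale if scale > 0 else pts

# Hypothetical (x, y) pixel coordinates for 4 upper-body keypoints:
kp = np.array([[320., 120.], [300., 180.], [340., 180.], [320., 240.]])
feat = normalize_skeleton(kp).flatten()  # flat input vector for SVM/ANN/CNN
```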

27 pages, 1326 KB  
Systematic Review
Application of Artificial Intelligence in Pancreatic Cyst Management: A Systematic Review
by Donghyun Lee, Fadel Jesry, John J. Maliekkal, Lewis Goulder, Benjamin Huntly, Andrew M. Smith and Yazan S. Khaled
Cancers 2025, 17(15), 2558; https://doi.org/10.3390/cancers17152558 - 2 Aug 2025
Abstract
Background: Pancreatic cystic lesions (PCLs), including intraductal papillary mucinous neoplasms (IPMNs) and mucinous cystic neoplasms (MCNs), pose a diagnostic challenge due to their variable malignant potential. Current guidelines, such as Fukuoka and American Gastroenterological Association (AGA), have moderate predictive accuracy and may lead to overtreatment or missed malignancies. Artificial intelligence (AI), incorporating machine learning (ML) and deep learning (DL), offers the potential to improve risk stratification, diagnosis, and management of PCLs by integrating clinical, radiological, and molecular data. This is the first systematic review to evaluate the application, performance, and clinical utility of AI models in the diagnosis, classification, prognosis, and management of pancreatic cysts. Methods: A systematic review was conducted in accordance with PRISMA guidelines and registered on PROSPERO (CRD420251008593). Databases searched included PubMed, EMBASE, Scopus, and Cochrane Library up to March 2025. The inclusion criteria encompassed original studies employing AI, ML, or DL in human subjects with pancreatic cysts, evaluating diagnostic, classification, or prognostic outcomes. Data were extracted on the study design, imaging modality, model type, sample size, performance metrics (accuracy, sensitivity, specificity, and area under the curve (AUC)), and validation methods. Study quality and bias were assessed using the PROBAST and adherence to TRIPOD reporting guidelines. Results: From 847 records, 31 studies met the inclusion criteria. Most were retrospective observational (n = 27, 87%) and focused on preoperative diagnostic applications (n = 30, 97%), with only one addressing prognosis. Imaging modalities included Computed Tomography (CT) (48%), endoscopic ultrasound (EUS) (26%), and Magnetic Resonance Imaging (MRI) (9.7%). 
Neural networks, particularly convolutional neural networks (CNNs), were the most common AI models (n = 16), followed by logistic regression (n = 4) and support vector machines (n = 3). The median reported AUC across studies was 0.912, with 55% of models achieving AUC ≥ 0.80. The models outperformed clinicians or existing guidelines in 11 studies. IPMN stratification and subtype classification were common focuses, with CNN-based EUS models achieving accuracies of up to 99.6%. Only 10 studies (32%) performed external validation. The risk of bias was high in 93.5% of studies, and TRIPOD adherence averaged 48%. Conclusions: AI demonstrates strong potential in improving the diagnosis and risk stratification of pancreatic cysts, with several models outperforming current clinical guidelines and human readers. However, widespread clinical adoption is hindered by high risk of bias, lack of external validation, and limited interpretability of complex models. Future work should prioritise multicentre prospective studies, standardised model reporting, and development of interpretable, externally validated tools to support clinical integration. Full article
(This article belongs to the Section Methods and Technologies Development)

30 pages, 4409 KB  
Article
Accident Impact Prediction Based on a Deep Convolutional and Recurrent Neural Network Model
by Pouyan Sajadi, Mahya Qorbani, Sobhan Moosavi and Erfan Hassannayebi
Urban Sci. 2025, 9(8), 299; https://doi.org/10.3390/urbansci9080299 - 1 Aug 2025
Abstract
Traffic accidents pose a significant threat to public safety, resulting in numerous fatalities, injuries, and a substantial economic burden each year. The development of predictive models capable of the real-time forecasting of post-accident impact using readily available data can play a crucial role in preventing adverse outcomes and enhancing overall safety. However, existing accident predictive models encounter two main challenges: first, a reliance on either costly or non-real-time data, and second, the absence of a comprehensive metric to measure post-accident impact accurately. To address these limitations, this study proposes a deep neural network model known as the cascade model. It leverages readily available real-world data from Los Angeles County to predict post-accident impacts. The model consists of two components: Long Short-Term Memory (LSTM) and a Convolutional Neural Network (CNN). The LSTM model captures temporal patterns, while the CNN extracts patterns from the sparse accident dataset. Furthermore, an external traffic congestion dataset is incorporated to derive a new feature called the “accident impact” factor, which quantifies the influence of an accident on surrounding traffic flow. Extensive experiments were conducted to demonstrate the effectiveness of the proposed hybrid machine learning method in predicting the post-accident impact compared to state-of-the-art baselines. The results reveal a higher precision in predicting minimal impacts (i.e., cases with no reported accidents) and a higher recall in predicting more significant impacts (i.e., cases with reported accidents). Full article

25 pages, 11507 KB  
Article
Accurate EDM Calibration of a Digital Twin for a Seven-Axis Robotic EDM System and 3D Offline Cutting Path
by Sergio Tadeu de Almeida, John P. T. Mo, Cees Bil, Songlin Ding and Chi-Tsun Cheng
Micromachines 2025, 16(8), 892; https://doi.org/10.3390/mi16080892 - 31 Jul 2025
Abstract
The increasing utilization of hard-to-cut materials in high-performance sectors such as aerospace and defense has pushed manufacturing systems to be flexible in processing large workpieces with a wide range of materials while also delivering high precision. Recent studies have highlighted the potential of integrating industrial robots (IRs) with electric discharge machining (EDM) to create a non-contact, low-force manufacturing platform, particularly suited for the accurate machining of hard-to-cut materials into complex and large-scale monolithic components. In response to this potential, a novel robotic EDM system has been developed. However, the manual programming and control of such a convoluted system present a significant challenge, often leading to inefficiencies and increased error rates, creating a scenario where the EDM process becomes unfeasible. To enhance the industrial applicability of this robotic EDM technology, this study focuses on a novel methodology to develop and validate a digital twin (DT) of the physical robotic EDM system. The digital twin functions as a virtual experimental environment for tool motion, effectively addressing the challenges posed by collisions and kinematic singularities inherent in the physical system, yet with proven 20-micron EDM gap accuracy. Furthermore, it facilitates a CNC-like, user-friendly offline programming framework for robotic EDM cutting path generation. Full article

19 pages, 1339 KB  
Article
Convolutional Graph Network-Based Feature Extraction to Detect Phishing Attacks
by Saif Safaa Shakir, Leyli Mohammad Khanli and Hojjat Emami
Future Internet 2025, 17(8), 331; https://doi.org/10.3390/fi17080331 - 25 Jul 2025
Abstract
Phishing attacks pose significant risks to security, drawing considerable attention from both security professionals and customers. Despite extensive research, the current phishing website detection mechanisms often fail to efficiently diagnose unknown attacks due to their poor performances in the feature selection stage. Many techniques suffer from overfitting when working with huge datasets. To address this issue, we propose a feature selection strategy based on a convolutional graph network, which utilizes a dataset containing both labels and features, along with hyperparameters for a Support Vector Machine (SVM) and a graph neural network (GNN). Our technique consists of three main stages: (1) preprocessing the data by dividing them into testing and training sets, (2) constructing a graph from pairwise feature distances using the Manhattan distance and adding self-loops to nodes, and (3) implementing a GraphSAGE model with node embeddings and training the GNN by updating the node embeddings through message passing from neighbors, calculating the hinge loss, applying the softmax function, and updating weights via backpropagation. Additionally, we compute the neighborhood random walk (NRW) distance using a random walk with restart to create an adjacency matrix that captures the node relationships. The node features are ranked based on gradient significance to select the top k features, and the SVM is trained using the selected features, with the hyperparameters tuned through cross-validation. We evaluated our model on a test set, calculating the performance metrics and validating the effectiveness of the PhishGNN dataset. Our model achieved a precision of 90.78%, an F1-score of 93.79%, a recall of 97%, and an accuracy of 93.53%, outperforming the existing techniques. Full article
(This article belongs to the Section Cybersecurity)
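The neighborhood random walk (NRW) step mentioned above iterates a restart-biased diffusion over the feature graph. A minimal numpy sketch of random walk with restart on a toy 3-node adjacency matrix with self-loops (the graph and restart probability are invented for illustration):

```python
import numpy as np

def random_walk_with_restart(W, seed, restart=0.15, tol=1e-10):
    """Iterate p = (1 - c) * W_norm @ p + c * e to convergence, where
    W is an adjacency matrix with self-loops, W_norm its column-stochastic
    normalization, and e a one-hot restart vector at the seed node."""
    Wn = W / W.sum(axis=0, keepdims=True)  # column-normalize
    e = np.zeros(W.shape[0])
    e[seed] = 1.0
    p = e.copy()
    while True:
        p_next = (1 - restart) * Wn @ p + restart * e
        if np.abs(p_next - p).max() < tol:
            return p_next
        p = p_next

# Toy feature graph: nodes 0-1 and 1-2 connected, self-loops on all.
W = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
p = random_walk_with_restart(W, seed=0)
```

The converged vector scores each node's proximity to the seed; collecting these scores for every seed yields the relationship-aware adjacency structure the paper builds on.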

24 pages, 9767 KB  
Article
Improved Binary Classification of Underwater Images Using a Modified ResNet-18 Model
by Mehrunnisa, Mikolaj Leszczuk, Dawid Juszka and Yi Zhang
Electronics 2025, 14(15), 2954; https://doi.org/10.3390/electronics14152954 - 24 Jul 2025
Abstract
In recent years, the classification of underwater images has become a prominent research area in computer vision due to its useful applications in marine sciences, aquatic robotics, and sea exploration. Underwater imaging is pivotal for the evaluation of marine ecosystems, the analysis of biological habitats, and the monitoring of underwater infrastructure. Extracting useful information from underwater images is highly challenging due to factors such as light distortion, scattering, poor contrast, and complex foreground patterns. These difficulties cause traditional image processing and machine learning techniques to struggle to analyze images accurately, and classification performance suffers as a result. Recently, deep learning techniques, especially convolutional neural networks (CNNs), have emerged as influential tools for underwater image classification, contributing noteworthy improvements in accuracy and performance in the presence of all these challenges. In this paper, we propose a modified ResNet-18 model for the binary classification of underwater images into raw and enhanced images. In the proposed modified ResNet-18 model, we add Linear, rectified linear unit (ReLU), and dropout layers, arranged in a block repeated three times to enhance feature extraction and improve learning. This enables the model to learn the complex patterns present in the images in greater detail, improving classification performance. Due to these newly added layers, the proposed model addresses various complexities such as noise, distortion, varying illumination conditions, and complex patterns by learning robust features from underwater image datasets. To handle the issue of class imbalance present in the dataset, we applied a data augmentation technique.
The proposed model achieved outstanding performance, with 96% accuracy, 99% precision, 92% sensitivity, 99% specificity, a 95% F1-score, and a 96% area under the receiver operating characteristic curve (AUC-ROC). These results demonstrate the strength and reliability of the proposed model in handling the challenges posed by underwater imagery, making it a favorable solution for advancing underwater image classification tasks. Full article
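The metrics reported above all follow from the four confusion-matrix counts of a binary classifier. A short sketch with hypothetical counts chosen to land near the reported figures (these are not the paper's actual counts):

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity (recall), specificity and F1
    from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, specificity, f1

# Hypothetical counts for a balanced 200-image test split:
acc, prec, sens, spec, f1 = binary_metrics(tp=92, fp=1, tn=99, fn=8)
```

Note how a single false positive keeps precision and specificity near 99% while the eight false negatives pull sensitivity down to 92%, the same asymmetry visible in the reported results.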

20 pages, 41202 KB  
Article
Copper Stress Levels Classification in Oilseed Rape Using Deep Residual Networks and Hyperspectral False-Color Images
by Yifei Peng, Jun Sun, Zhentao Cai, Lei Shi, Xiaohong Wu, Chunxia Dai and Yubin Xie
Horticulturae 2025, 11(7), 840; https://doi.org/10.3390/horticulturae11070840 - 16 Jul 2025
Abstract
In recent years, heavy metal contamination in agricultural products has become a growing concern in the field of food safety. Copper (Cu) stress in crops not only leads to significant reductions in both yield and quality but also poses potential health risks to humans. This study proposes an efficient and precise non-destructive detection method for Cu stress in oilseed rape, which is based on hyperspectral false-color image construction using principal component analysis (PCA). By comprehensively capturing the spectral representation of oilseed rape plants, both the one-dimensional (1D) spectral sequence and spatial image data were utilized for multi-class classification. The classification performance of models based on 1D spectral sequences was compared from two perspectives: first, between machine learning and deep learning methods (best accuracy: 93.49% vs. 96.69%); and second, between shallow and deep convolutional neural networks (CNNs) (best accuracy: 95.15% vs. 96.69%). For spatial image data, deep residual networks were employed to evaluate the effectiveness of visible-light and false-color images. The RegNet architecture was chosen for its flexible parameterization and proven effectiveness in extracting multi-scale features from hyperspectral false-color images. This flexibility enabled RegNetX-6.4GF to achieve optimal performance on the dataset constructed from three types of false-color images, with the model reaching a Macro-Precision, Macro-Recall, Macro-F1, and Accuracy of 98.17%, 98.15%, 98.15%, and 98.15%, respectively. Furthermore, Grad-CAM visualizations revealed that latent physiological changes in plants under heavy metal stress guided feature learning within CNNs, and demonstrated the effectiveness of false-color image construction in extracting discriminative features. 
Overall, the proposed technique can be integrated into portable hyperspectral imaging devices, enabling real-time and non-destructive detection of heavy metal stress in modern agricultural practices.
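The PCA-based false-color construction the abstract describes (projecting a hyperspectral cube onto its top principal components and mapping them to RGB) can be sketched in a few lines of numpy. This is a generic illustration of the technique, not code from the paper; the function name and array shapes are assumptions:

```python
import numpy as np

def pca_false_color(cube):
    """Project a hyperspectral cube (H, W, B) onto its top-3 principal
    components and min-max rescale each to 0-255, yielding an RGB-like
    false-color image."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)
    flat -= flat.mean(axis=0)                    # center each band
    cov = np.cov(flat, rowvar=False)             # (B, B) band covariance
    vals, vecs = np.linalg.eigh(cov)
    top3 = vecs[:, np.argsort(vals)[::-1][:3]]   # top-3 eigenvectors
    scores = flat @ top3                         # (H*W, 3) PC scores
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    img = (scores - lo) / np.where(hi > lo, hi - lo, 1.0) * 255.0
    return img.reshape(h, w, 3).astype(np.uint8)
```

The resulting three-channel image can then be fed to a standard CNN exactly like an ordinary photograph, which is what makes the false-color route convenient for off-the-shelf architectures such as RegNet.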
32 pages, 2302 KB  
Review
Early Detection of Alzheimer’s Disease Using Generative Models: A Review of GANs and Diffusion Models in Medical Imaging
by Md Minul Alam and Shahram Latifi
Algorithms 2025, 18(7), 434; https://doi.org/10.3390/a18070434 - 15 Jul 2025
Viewed by 1882
Abstract
Alzheimer’s disease (AD) is a progressive, non-curable neurodegenerative disorder that poses persistent challenges for early diagnosis due to its gradual onset and the difficulty in distinguishing pathological changes from normal aging. Neuroimaging, particularly MRI and PET, plays a key role in detection; however, limitations in data availability and the complexity of early structural biomarkers constrain traditional diagnostic approaches. This review investigates the use of generative models, specifically Generative Adversarial Networks (GANs) and Diffusion Models, as emerging tools to address these challenges. These models are capable of generating high-fidelity synthetic brain images, augmenting datasets, and enhancing machine learning performance in classification tasks. The review synthesizes findings across multiple studies, revealing that GAN-based models achieved diagnostic accuracies up to 99.70%, with image quality metrics such as SSIM reaching 0.943 and PSNR up to 33.35 dB. Diffusion Models, though relatively new, demonstrated strong performance with up to 92.3% accuracy and FID scores as low as 11.43. Integrating generative models with convolutional neural networks (CNNs) and multimodal inputs further improved diagnostic reliability. Despite these advancements, challenges remain, including high computational demands, limited interpretability, and ethical concerns regarding synthetic data. This review offers a comprehensive perspective to inform future AI-driven research in early AD detection.
(This article belongs to the Special Issue Advancements in Signal Processing and Machine Learning for Healthcare)
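The forward (noising) process that diffusion models learn to invert has a simple closed form: x_t = sqrt(ᾱ_t)·x₀ + sqrt(1−ᾱ_t)·ε with ᾱ_t the cumulative product of (1−β). A minimal numpy sketch under standard DDPM assumptions (this is a generic illustration, not code from any study reviewed here):

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention
    eps = rng.standard_normal(x0.shape)      # Gaussian noise to be predicted
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps
```

Training a diffusion model amounts to regressing a network's prediction of `eps` from `xt` and `t`; sampling synthetic brain images then runs the learned denoiser in reverse from pure noise.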
22 pages, 6645 KB  
Article
Visual Detection on Aircraft Wing Icing Process Using a Lightweight Deep Learning Model
by Yang Yan, Chao Tang, Jirong Huang, Zhixiong Cen and Zonghong Xie
Aerospace 2025, 12(7), 627; https://doi.org/10.3390/aerospace12070627 - 12 Jul 2025
Viewed by 589
Abstract
Aircraft wing icing significantly threatens aviation safety, causing substantial losses to the aviation industry each year. The high transparency and blurred edges of icing areas in wing images pose challenges for machine-vision-based icing detection. To address these challenges, this study proposes a detection model, Wing Icing Detection DeeplabV3+ (WID-DeeplabV3+), for efficient and precise detection of icing on the aircraft wing leading edge under natural lighting conditions. WID-DeeplabV3+ adopts the lightweight MobileNetV3 as its backbone network to enhance the extraction of edge features in icing areas. Ghost Convolution and Atrous Spatial Pyramid Pooling modules are incorporated to reduce model parameters and computational complexity. The model is optimized using transfer learning, where pre-trained weights are used to accelerate convergence and enhance performance. Experimental results show that WID-DeeplabV3+ segments the icing edge of a 1920 × 1080 image within 0.03 s. The model achieves an accuracy of 97.15%, an IOU of 94.16%, a precision of 97%, and a recall of 96.96%, representing respective improvements of 1.83%, 3.55%, 1.79%, and 2.04% over DeeplabV3+. The number of parameters and the computational complexity are reduced by 92% and 76%, respectively. With high accuracy, superior IOU, and fast inference speed, WID-DeeplabV3+ provides an effective solution for wing icing detection.
(This article belongs to the Section Aeronautics)
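The IOU, accuracy, precision, and recall figures quoted above are standard binary-segmentation metrics computed from the confusion counts of predicted versus ground-truth masks. A minimal sketch of the computation (function name and return layout are illustrative assumptions, not from the paper):

```python
import numpy as np

def seg_metrics(pred, truth):
    """Binary segmentation metrics from predicted and ground-truth masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()    # icing pixels correctly found
    fp = np.logical_and(pred, ~truth).sum()   # background flagged as icing
    fn = np.logical_and(~pred, truth).sum()   # icing pixels missed
    tn = np.logical_and(~pred, ~truth).sum()  # background correctly rejected
    return {
        "iou": tp / (tp + fp + fn) if (tp + fp + fn) else 1.0,
        "accuracy": (tp + tn) / pred.size,
        "precision": tp / (tp + fp) if (tp + fp) else 1.0,
        "recall": tp / (tp + fn) if (tp + fn) else 1.0,
    }
```

Note that IOU is the strictest of the four: it penalizes both false positives and false negatives in a single ratio, which is why a 3.55% IOU gain is a meaningful improvement even when accuracy moves less.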