Search Results (89)

Search Parameters:
Keywords = deep multimodel learning

13 pages, 1712 KB  
Article
Deep Learning-Driven Insights into Hardness and Electrical Conductivity of Low-Alloyed Copper Alloys
by Mihail Kolev, Juliana Javorova, Tatiana Simeonova, Yasen Hadjitodorov and Boyko Krastev
Alloys 2025, 4(4), 22; https://doi.org/10.3390/alloys4040022 - 10 Oct 2025
Viewed by 241
Abstract
Understanding the intricate relationship between composition, processing conditions, and material properties is essential for optimizing Cu-based alloys. Machine learning offers a powerful tool for decoding these complex interactions, enabling more efficient alloy design. This work introduces a comprehensive machine learning framework aimed at accurately predicting key properties such as hardness and electrical conductivity of low-alloyed Cu-based alloys. By integrating various input parameters, including chemical composition and thermo-mechanical processing parameters, the study develops and validates multiple machine learning models, including Multi-Layer Perceptron with Production-Aware Deep Architecture (MLP-PADA), Deep Feedforward Network with Multi-Regularization Framework (DFF-MRF), Feedforward Network with Self-Adaptive Optimization (FFN-SAO), and Feedforward Network with Materials Mapping (FFN-TMM). On a held-out test set, DFF-MRF achieved the best generalization (R2_test = 0.9066; RMSE_test = 5.3644), followed by MLP-PADA (R2_test = 0.8953; RMSE_test = 5.7080) and FFN-TMM (R2_test = 0.8914; RMSE_test = 5.8126), with FFN-SAO slightly lower (R2_test = 0.8709). Additionally, a computational performance analysis was conducted to evaluate inference time, memory usage, energy consumption, and batch scalability across all models. Feature importance analysis was conducted, revealing that aging temperature, Cr, and aging duration were the most influential factors for hardness. In contrast, aging duration, aging temperature, solution treatment temperature, and Cu played key roles in electrical conductivity. The results demonstrate the effectiveness of these advanced machine learning models in predicting critical material properties, offering insightful advancements for materials science research. 
This study introduces the first controlled, statistically validated, multi-model benchmark that integrates composition and thermo-mechanical processing with deployment-grade profiling for property prediction of low-alloyed Cu alloys. Full article
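The R2 and RMSE figures reported above are standard regression metrics; a minimal pure-Python sketch of how such scores are computed (the hardness values below are illustrative, not from the paper):

```python
import math

def rmse(y_true, y_pred):
    # root mean squared error
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    # coefficient of determination: 1 - SS_res / SS_tot
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# hypothetical hardness values (HV) and model predictions
y_true = [120.0, 135.0, 150.0, 165.0]
y_pred = [122.0, 133.0, 151.0, 166.0]
```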

28 pages, 6039 KB  
Article
Detection and Classification of Unhealthy Heartbeats Using Deep Learning Techniques
by Abdullah M. Albarrak, Raneem Alharbi and Ibrahim A. Ibrahim
Sensors 2025, 25(19), 5976; https://doi.org/10.3390/s25195976 - 26 Sep 2025
Viewed by 511
Abstract
Arrhythmias are a common and potentially life-threatening category of cardiac disorders, making accurate and early detection crucial for improving clinical outcomes. Electrocardiograms are widely used to monitor heart rhythms, yet their manual interpretation remains prone to inconsistencies due to the complexity of the signals. This research investigates the effectiveness of machine learning and deep learning techniques for automated arrhythmia classification using ECG signals from the MIT-BIH dataset. We compared Gradient Boosting Machine (GBM) and Multilayer Perceptron (MLP) as traditional machine learning models with a hybrid deep learning model combining one-dimensional convolutional neural networks (1D-CNNs) and long short-term memory (LSTM) networks. Furthermore, the Grey Wolf Optimizer (GWO) was utilized to automatically optimize the hyperparameters of the 1D-CNN-LSTM model, enhancing its performance. Experimental results show that the proposed 1D-CNN-LSTM model achieved the highest accuracy of 97%, outperforming both classical machine learning and other deep learning baselines. The classification report and confusion matrix confirm the model’s robustness in identifying various arrhythmia types. These findings emphasize the possible benefits of integrating metaheuristic optimization with hybrid deep learning. Full article
(This article belongs to the Special Issue Sensors Technology and Application in ECG Signal Processing)
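The Grey Wolf Optimizer used above for hyperparameter tuning follows a simple update rule: the three best wolves (alpha, beta, delta) steer the rest of the pack while an exploration coefficient decays. A minimal sketch on a toy objective (in the paper the objective would be the 1D-CNN-LSTM's validation loss; everything below is illustrative):

```python
import random

def gwo(f, dim, bounds, n_wolves=10, iters=60, seed=0):
    # Grey Wolf Optimizer: alpha/beta/delta leaders guide the pack;
    # exploration parameter a decays linearly from 2 toward 0
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    for t in range(iters):
        wolves.sort(key=f)
        alpha, beta, delta = (w[:] for w in wolves[:3])
        a = 2 - 2 * t / iters
        for i, w in enumerate(wolves):
            new = []
            for d in range(dim):
                x = 0.0
                for leader in (alpha, beta, delta):
                    r1, r2 = rng.random(), rng.random()
                    A, C = 2 * a * r1 - a, 2 * r2
                    # each leader pulls the wolf toward its own position
                    x += leader[d] - A * abs(C * leader[d] - w[d])
                new.append(min(hi, max(lo, x / 3)))
            wolves[i] = new
    return min(wolves, key=f)
```

Minimizing a 2-D sphere function with the defaults drives the best wolf close to the origin.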

25 pages, 7348 KB  
Article
Intelligent Segmentation of Urban Building Roofs and Solar Energy Potential Estimation for Photovoltaic Applications
by Junsen Zeng, Minglong Yang, Xiujuan Tang, Xiaotong Guan and Tingting Ma
J. Imaging 2025, 11(10), 334; https://doi.org/10.3390/jimaging11100334 - 25 Sep 2025
Viewed by 256
Abstract
To support dual-carbon objectives and enhance the accuracy of rooftop distributed photovoltaic (PV) planning, this study proposes a multidimensional coupled evaluation framework that integrates an improved rooftop segmentation network (CESW-TransUNet), a residual-fusion ensemble, and physics-based shading and performance simulations, thereby correcting the bias of conventional 2-D area–based methods. First, CESW-TransUNet, equipped with convolution-enhanced modules, achieves robust multi-scale rooftop extraction and reaches an IoU of 78.50% on the INRIA benchmark, representing a 2.27 percentage point improvement over TransUNet. Second, the proposed residual fusion strategy adaptively integrates multiple models, including DeepLabV3+ and PSPNet, further improving the IoU to 79.85%. Finally, by coupling Ecotect-based shadow analysis with PVsyst performance modeling, the framework systematically quantifies dynamic inter-building shading, rooftop equipment occupancy, and installation suitability. A case study demonstrates that the method reduces the systematic overestimation of annual generation by 27.7% compared with traditional 2-D assessments. The framework thereby offers a quantitative, end-to-end decision tool for urban rooftop PV planning, enabling more reliable evaluation of generation and carbon-mitigation potential. Full article
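The IoU figures above (78.50%, 79.85%) measure segmentation overlap; a minimal sketch over flat binary masks (illustrative, not the authors' evaluation code):

```python
def mask_iou(pred, truth):
    # intersection over union for flat binary masks (sequences of 0/1)
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # convention: two empty masks agree perfectly
    return inter / union if union else 1.0
```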

17 pages, 3604 KB  
Article
Cloud-Edge Collaborative Inference-Based Smart Detection Method for Small Objects
by Cong Ye, Shengkun Li, Jianlei Wang, Hongru Li, Xiao Li and Sujie Shao
Modelling 2025, 6(4), 112; https://doi.org/10.3390/modelling6040112 - 24 Sep 2025
Viewed by 427
Abstract
Emerging technologies are revolutionizing power system operation and maintenance. Intelligent state perception is pivotal for stable grid operation, with small object detection technology being vital for identifying minor hazards in power facilities. However, challenges like small object size, low resolution, occlusion, and low confidence arise in small object detection for power operation and maintenance. This paper proposes PyraFAN, a feature fusion method designed for small object detection, and introduces a cloud-edge collaborative inference based smart detection method. This method boosts detection accuracy while ensuring real-time performance. Additionally, a graph-guided distillation method is developed for edge models. By quantifying model performance and task similarity, multi-model collaborative training is realized to improve detection accuracy. Experimental results show that compared with standalone edge models, the proposed method improves detection accuracy by 6.98% and reduces the false negative rate by 19.56%. The PyraFAN module can enhance edge model detection accuracy by approximately 12.2%. Updating edge models via cloud model distillation increases the mAP@0.5 of edge models by 2.7%. Compared to cloud models, the cloud-edge collaboration method reduces average inference latency by 0.8%. This research offers an effective solution for improving the accuracy of deep learning based small object detection in power operation and maintenance within cloud-edge computing environments. Full article

26 pages, 5305 KB  
Article
Development of Real-Time IoT-Based Air Quality Forecasting System Using Machine Learning Approach
by Onem Yildiz and Hilmi Saygin Sucuoglu
Sustainability 2025, 17(19), 8531; https://doi.org/10.3390/su17198531 - 23 Sep 2025
Viewed by 1055
Abstract
Air quality monitoring and forecasting have become increasingly critical in urban environments due to rising pollution levels and their impact on public health. Recent advances in Internet of Things (IoT) technology and machine learning offer promising alternatives to traditional monitoring stations, which are limited by high costs and sparse deployment. This paper presents the development of a real-time, low-cost air quality forecasting system that integrates IoT-based sensing units with predictive machine learning algorithms. The proposed system employs low-cost gas sensors and microcontroller-based hardware to monitor pollutants such as particulate matter, carbon monoxide, carbon dioxide and volatile organic compounds. A fully functional prototype device was designed and manufactured using Fused Deposition Modeling (FDM) with modular and scalable features. The data acquisition pipeline includes on-device adjustment, local smoothing, and cloud transfer for real-time storage and visualization. Advanced feature engineering and a multi-model training strategy were used to generate accurate short-term forecasts. Among the models tested, the GRU-based deep learning model yielded the highest performance, achieving R2 values above 0.93 and maintaining latency below 130 ms, suitable for real-time use. The system also achieved over 91% accuracy in health-based AQI category predictions and demonstrated stable performance without sensor saturation under high-pollution conditions. This study demonstrates that combining embedded hardware, real-time analytics, and ML-driven forecasting enables robust and scalable air quality management solutions, contributing directly to sustainable development goals through enhanced environmental monitoring and public health responsiveness. Full article
(This article belongs to the Special Issue Achieving Sustainability in New Product Development and Supply Chain)

21 pages, 1827 KB  
Article
A Multi-Model Fusion Framework for Aeroengine Remaining Useful Life Prediction
by Bing Tan, Yang Zhang, Xia Wei, Lei Wang, Yanming Chang, Li Zhang, Yingzhe Fan and Caio Graco Rodrigues Leandro Roza
Eng 2025, 6(9), 210; https://doi.org/10.3390/eng6090210 - 1 Sep 2025
Viewed by 498
Abstract
As the core component of aircraft systems, aeroengines require accurate Remaining Useful Life (RUL) prediction to ensure flight safety, which serves as a key part of Prognostics and Health Management (PHM). Traditional RUL prediction methods primarily fall into two main categories: physics-based and data-driven approaches. Physics-based methods mainly rely on extensive prior knowledge, limiting their scalability, while data-driven methods (including statistical analysis and machine learning) struggle with handling high-dimensional data and suboptimal modeling of multi-scale temporal dependencies. To address these challenges and enhance prediction accuracy and robustness, we propose a novel hybrid deep learning framework (CLSTM-TCN) integrating 2D Convolutional Neural Network (2D-CNN), Long Short-Term Memory (LSTM) network, and Temporal Convolutional Network (TCN) modules. The CLSTM-TCN framework follows a progressive feature refinement logic: 2D-CNN first extracts short-term local features and inter-feature interactions from input data; the LSTM network then models long-term temporal dependencies in time series to strengthen global temporal dynamics representation; and TCN ultimately captures multi-scale temporal features via dilated convolutions, overcoming the limitations of the LSTM network in long-range dependency modeling while enabling parallel computing. Validated on the NASA C-MAPSS data set (focusing on FD001), the CLSTM-TCN model achieves a root mean square error (RMSE) of 13.35 and a score function (score) of 219. Compared to the CNN-LSTM, CNN-TCN, and LSTM-TCN models, it reduces the RMSE by 27.94%, 30.79%, and 30.88%, respectively, and significantly outperforms the traditional single-model methods (e.g., standalone CNN or LSTM network). Notably, the model maintains stability across diverse operational conditions, with RMSE fluctuations capped within 15% for all test cases. 
Ablation studies confirm the synergistic effect of each module: removing 2D-CNN, LSTM, or TCN leads to an increase in the RMSE and score. This framework effectively handles high-dimensional data and multi-scale temporal dependencies, providing an accurate and robust solution for aeroengine RUL prediction. While current performance is validated under single operating conditions, ongoing efforts to optimize hyperparameter tuning, enhance adaptability to complex operating scenarios, and integrate uncertainty analysis will further strengthen its practical value in aircraft health management. Full article
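The TCN stage above relies on dilated causal convolutions, which widen the receptive field exponentially as layers stack. A minimal sketch of one causal dilated 1-D convolution and the receptive-field arithmetic (illustrative only):

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    # y[t] = sum_k kernel[k] * x[t - k*dilation]; left zero-padding keeps it causal
    out = []
    for t in range(len(x)):
        s = 0.0
        for k, w in enumerate(kernel):
            i = t - k * dilation
            if i >= 0:
                s += w * x[i]
        out.append(s)
    return out

def receptive_field(kernel_size, dilations):
    # stacking layers with dilations 1, 2, 4, ... grows the field exponentially
    return 1 + (kernel_size - 1) * sum(dilations)
```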

29 pages, 4725 KB  
Article
Feature Fusion Using Deep Learning Algorithms in Image Classification for Security Purposes by Random Weight Network
by Mustafa Servet Kiran, Gokhan Seyfi, Merve Yilmaz, Engin Esme and Xizhao Wang
Appl. Sci. 2025, 15(16), 9053; https://doi.org/10.3390/app15169053 - 17 Aug 2025
Viewed by 842
Abstract
Automated threat detection in X-ray security imagery is a critical yet challenging task, where conventional deep learning models often struggle with low accuracy and overfitting. This study addresses these limitations by introducing a novel framework based on feature fusion. The proposed method extracts features from multiple and diverse deep learning architectures and classifies them using a Random Weight Network (RWN), whose hyperparameters are optimized for maximum performance. The results show substantial improvements at each stage: while the best standalone deep learning model achieved a test accuracy of 83.55%, applying the RWN to a single feature set increased accuracy to 94.82%. Notably, the proposed feature fusion framework achieved a state-of-the-art test accuracy of 97.44%. These findings demonstrate that a modular approach combining multi-model feature fusion with an efficient classifier is a highly effective strategy for improving the accuracy and generalization capability of automated threat detection systems. Full article
(This article belongs to the Special Issue Deep Learning for Image Processing and Computer Vision)
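The pipeline above fuses features from several backbones and classifies them with a Random Weight Network. A sketch of concatenation-based fusion plus an RWN-style fixed random hidden layer (in an RWN only the linear readout on top of these activations is trained; all names and sizes here are illustrative assumptions):

```python
import math
import random

def concat_fusion(*feature_sets):
    # per-sample concatenation of feature vectors from multiple backbones
    return [sum((list(f) for f in feats), []) for feats in zip(*feature_sets)]

def random_hidden_layer(features, n_hidden, seed=0):
    # fixed random input weights + tanh nonlinearity; these weights are
    # never trained -- only an output layer fitted on top of them would be
    rng = random.Random(seed)
    dim = len(features[0])
    W = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_hidden)]
    b = [rng.gauss(0, 1) for _ in range(n_hidden)]
    return [[math.tanh(sum(w * x for w, x in zip(row, f)) + bi)
             for row, bi in zip(W, b)] for f in features]
```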

23 pages, 8286 KB  
Article
Context-Guided SAR Ship Detection with Prototype-Based Model Pretraining and Check–Balance-Based Decision Fusion
by Haowen Zhou, Zhe Geng, Minjie Sun, Linyi Wu and He Yan
Sensors 2025, 25(16), 4938; https://doi.org/10.3390/s25164938 - 10 Aug 2025
Cited by 1 | Viewed by 664
Abstract
To address the challenging problem of multi-scale inshore–offshore ship detection in synthetic aperture radar (SAR) remote sensing images, we propose a novel deep learning-based automatic ship detection method within the framework of compositional learning. The proposed method is supported by three pillars: context-guided region proposal, prototype-based model-pretraining, and multi-model ensemble learning. To reduce the false alarms induced by the discrete ground clutters, the prior knowledge of the harbour’s layout is exploited to generate land masks for terrain delimitation. To prepare the model for the diverse ship targets of different sizes and orientations it might encounter in the test environment, a novel cross-dataset model pretraining strategy is devised, where the SAR images of several key ship target prototypes from the auxiliary dataset are used to support class-incremental learning. To combine the advantages of diverse model architectures, an adaptive decision-level fusion framework is proposed, which consists of three components: a dynamic confidence threshold assignment strategy based on the sizes of targets, a weighted fusion mechanism based on president-senate check–balance, and Soft-NMS-based Dense Group Target Bounding Box Fusion (Soft-NMS-DGT-BBF). The performance enhancement brought by contextual knowledge-aided terrain delimitation, cross-dataset prototype-based model pretraining and check–balance-based adaptive decision-level fusion are validated with a series of ingeniously devised experiments based on the FAIR-CSAR-Ship dataset. Full article
(This article belongs to the Special Issue SAR Imaging Technologies and Applications)
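The Soft-NMS component above builds on the standard Gaussian Soft-NMS idea: overlapping detections have their scores decayed rather than being discarded outright. A minimal sketch (the paper's dense-group bounding-box fusion refinements are not reproduced here):

```python
import math

def box_iou(a, b):
    # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    # Gaussian Soft-NMS: decay, rather than discard, overlapping detections
    scores = list(scores)
    idx = list(range(len(boxes)))
    keep = []
    while idx:
        best = max(idx, key=lambda i: scores[i])
        keep.append((best, scores[best]))
        idx.remove(best)
        for i in idx:
            scores[i] *= math.exp(-box_iou(boxes[best], boxes[i]) ** 2 / sigma)
        idx = [i for i in idx if scores[i] > score_thresh]
    return keep
```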

25 pages, 2915 KB  
Article
Multi-Model Identification of Rice Leaf Diseases Based on CEL-DL-Bagging
by Zhenghua Zhang, Rufeng Wang and Siqi Huang
AgriEngineering 2025, 7(8), 255; https://doi.org/10.3390/agriengineering7080255 - 7 Aug 2025
Viewed by 646
Abstract
This study proposes CEL-DL-Bagging (Cross-Entropy Loss-optimized Deep Learning Bagging), a multi-model fusion framework that integrates cross-entropy loss-weighted voting with Bootstrap Aggregating (Bagging). First, we develop a lightweight recognition architecture by embedding a salient position attention (SPA) mechanism into four base networks (YOLOv5s-cls, EfficientNet-B0, MobileNetV3, and ShuffleNetV2), significantly enhancing discriminative feature extraction for disease patterns. Our experiments show that these SPA-enhanced models achieve consistent accuracy gains of 0.8–1.7 percentage points, peaking at 97.86%. Building on this, we introduce DB-CEWSV—an ensemble framework combining Deep Bootstrap Aggregating (DB) with adaptive Cross-Entropy Weighted Soft Voting (CEWSV). The system dynamically optimizes model weights based on their cross-entropy performance, using SPA-augmented networks as base learners. The final integrated model attains 98.33% accuracy, outperforming the strongest individual base learner by 0.48 percentage points. Compared with single models, the ensemble learning algorithm proposed in this study led to better generalization and robustness of the ensemble learning model and better identification of rice diseases in the natural background. It provides a technical reference for applying rice disease identification in practical engineering. Full article
(This article belongs to the Topic Digital Agriculture, Smart Farming and Crop Monitoring)
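The cross-entropy-weighted soft voting described above can be sketched as follows; this assumes weights inversely proportional to each model's cross-entropy (the paper's exact weighting rule may differ, and the probabilities below are illustrative):

```python
import math

def ce_loss(probs, labels):
    # mean cross-entropy of one model's predicted class probabilities
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def ce_weighted_soft_vote(model_probs, labels):
    # weight each model inversely to its cross-entropy, then average the
    # class-probability vectors with the normalized weights
    inv = [1 / ce_loss(p, labels) for p in model_probs]
    weights = [v / sum(inv) for v in inv]
    n_samples, n_classes = len(labels), len(model_probs[0][0])
    fused = [[sum(w * m[i][c] for w, m in zip(weights, model_probs))
              for c in range(n_classes)] for i in range(n_samples)]
    return weights, fused
```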

28 pages, 5208 KB  
Article
ORC System Temperature and Evaporation Pressure Control Based on DDPG-MGPC
by Jing Li, Zexu Gao, Xi Zhou and Junyuan Zhang
Processes 2025, 13(7), 2314; https://doi.org/10.3390/pr13072314 - 21 Jul 2025
Cited by 1 | Viewed by 694
Abstract
The organic Rankine cycle (ORC) is a key technology for recovering low-grade waste heat, but its efficient and stable operation is challenged by complex dynamic coupling. This paper proposes a model partitioning strategy based on the gap metric to construct a high-fidelity ORC system model, and combines observer-based decoupling with multi-model switching strategies to reduce coupling effects and enhance adaptability. For control optimization, the Deep Deterministic Policy Gradient (DDPG) reinforcement learning method is adopted to move beyond the limitations of the traditional discrete action space and achieve precise optimization in continuous space. The proposed DDPG-MGPC (hybrid model predictive control) framework significantly enhances robustness and adaptability through the synergy of reinforcement learning and model prediction. Simulations show that, compared with existing hybrid reinforcement learning and MPC methods, DDPG-MGPC achieves better tracking performance and disturbance rejection under dynamic working conditions, providing a more efficient solution for the practical application of ORC. Full article
(This article belongs to the Section Energy Systems)

24 pages, 824 KB  
Article
MMF-Gait: A Multi-Model Fusion-Enhanced Gait Recognition Framework Integrating Convolutional and Attention Networks
by Kamrul Hasan, Khandokar Alisha Tuhin, Md Rasul Islam Bapary, Md Shafi Ud Doula, Md Ashraful Alam, Md Atiqur Rahman Ahad and Md. Zasim Uddin
Symmetry 2025, 17(7), 1155; https://doi.org/10.3390/sym17071155 - 19 Jul 2025
Viewed by 893
Abstract
Gait recognition is a reliable biometric approach that uniquely identifies individuals based on their natural walking patterns. It is widely used because gait is difficult to camouflage and does not require a person’s cooperation. General face-based person recognition systems often fail to determine an offender’s identity when the offender conceals their face with a helmet or mask to evade identification. In such cases, gait-based recognition is ideal for identifying offenders, and most existing work leverages a deep learning (DL) model. However, a single model often fails to capture a comprehensive selection of refined patterns in input data when external factors are present, such as variation in viewing angle, clothing, and carrying conditions. In response, this paper introduces a fusion-based multi-model gait recognition framework that leverages the potential of convolutional neural networks (CNNs) and a vision transformer (ViT) in an ensemble manner to enhance gait recognition performance. Here, the CNNs capture spatiotemporal features, while the ViT’s multiple attention layers focus on particular regions of the gait image. The first step in this framework is to obtain the Gait Energy Image (GEI) by averaging a height-normalized gait silhouette sequence over a gait cycle, which handles the left–right symmetry of gait. After that, the GEI is fed through multiple pre-trained models and fine-tuned precisely to extract deep spatiotemporal features. Three separate fusion strategies are then conducted. The first is decision-level fusion (DLF), which takes each model’s decision and employs majority voting for the final decision. The second is feature-level fusion (FLF), which combines the features from individual models through pointwise addition before performing gait recognition. Finally, a hybrid fusion combines DLF and FLF for gait recognition.
The performance of the multi-model fusion-based framework was evaluated on three publicly available gait databases: CASIA-B, OU-ISIR D, and the OU-ISIR Large Population dataset. The experimental results demonstrate that the fusion-enhanced framework achieves superior performance. Full article
(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)
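The two base fusion strategies described above are simple to state: majority voting over per-model decisions (DLF) and pointwise addition of per-model feature vectors (FLF). A minimal sketch with illustrative shapes:

```python
from collections import Counter

def decision_level_fusion(per_model_preds):
    # majority vote across models for each sample
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*per_model_preds)]

def feature_level_fusion(per_model_feats):
    # pointwise addition of each model's feature vector, sample by sample
    return [[sum(vals) for vals in zip(*feats)] for feats in zip(*per_model_feats)]
```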

14 pages, 3012 KB  
Article
Deep Learning-Based Automated Detection of Welding Defects in Pressure Pipeline Radiograph
by Wenpin Zhang, Wangwang Liu, Xinghua Yu, Dugang Kang, Zhi Xiong, Xiao Lv, Song Huang and Yan Li
Coatings 2025, 15(7), 808; https://doi.org/10.3390/coatings15070808 - 10 Jul 2025
Cited by 1 | Viewed by 2694
Abstract
This study applies deep learning-based object detection to defect detection in weld radiographs, proposing a technical solution for accurately identifying the types and locations of defects in weld X-ray radiographs. The research encompasses the construction of a defect dataset, the design of a multi-model object detection network, and the development of an automated film evaluation algorithm. This technology significantly enhances the efficiency and accuracy of detecting and identifying harmful defects in weld radiographs, providing critical technical support for the safe operation and efficient maintenance of pressure pipelines. Full article
(This article belongs to the Special Issue Advances in Protective Coatings for Metallic Surfaces)

14 pages, 6074 KB  
Article
Cross-Modal Data Fusion via Vision-Language Model for Crop Disease Recognition
by Wenjie Liu, Guoqing Wu, Han Wang and Fuji Ren
Sensors 2025, 25(13), 4096; https://doi.org/10.3390/s25134096 - 30 Jun 2025
Viewed by 920
Abstract
Crop diseases pose a significant threat to agricultural productivity and global food security. Timely and accurate disease identification is crucial for improving crop yield and quality. While most existing deep learning-based methods focus primarily on image datasets for disease recognition, they often overlook the complementary role of textual features in enhancing visual understanding. To address this problem, we proposed a cross-modal data fusion via a vision-language model for crop disease recognition. Our approach leverages the Zhipu.ai multi-model to generate comprehensive textual descriptions of crop leaf diseases, including global description, local lesion description, and color-texture description. These descriptions are encoded into feature vectors, while an image encoder extracts image features. A cross-attention mechanism then iteratively fuses multimodal features across multiple layers, and a classification prediction module generates classification probabilities. Extensive experiments on the Soybean Disease, AI Challenge 2018, and PlantVillage datasets demonstrate that our method outperforms state-of-the-art image-only approaches with higher accuracy and fewer parameters. Specifically, with only 1.14M model parameters, our model achieves a 98.74%, 87.64% and 99.08% recognition accuracy on the three datasets, respectively. The results highlight the effectiveness of cross-modal learning in leveraging both visual and textual cues for precise and efficient disease recognition, offering a scalable solution for crop disease recognition. Full article
(This article belongs to the Section Smart Agriculture)
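The cross-attention fusion step above lets features from one modality attend over the other (e.g., image features as queries, text-description features as keys/values). A single-head scaled dot-product sketch, not the authors' multi-layer iterative version:

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(queries, keys, values):
    # each query vector (one modality) attends over keys/values (the other)
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out
```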

27 pages, 12000 KB  
Article
Multi-Model Synergistic Satellite-Derived Bathymetry Fusion Approach Based on Mamba Coral Reef Habitat Classification
by Xuechun Zhang, Yi Ma, Feifei Zhang, Zhongwei Li and Jingyu Zhang
Remote Sens. 2025, 17(13), 2134; https://doi.org/10.3390/rs17132134 - 21 Jun 2025
Cited by 1 | Viewed by 657
Abstract
As fundamental geophysical information, high-precision shallow-water bathymetry provides critical data support for the utilization of island resources and the delimitation of coral reef protection zones. In recent years, the combination of active and passive remote sensing technologies has led to a revolutionary breakthrough in satellite-derived bathymetry (SDB). Optical SDB extracts bathymetry by quantifying light–water–bottom interactions; the apparent differences in the reflectance of different bottom types in specific wavelength bands are therefore a core component of SDB. In this study, refined classification was performed for complex seafloor sediment and geomorphic features in coral reef habitats, and a multi-model synergistic SDB fusion approach constrained by coral reef habitat classification, based on the deep learning framework Mamba, was constructed. The dual error of a global single model was suppressed by exploiting sediment and geomorphic partitions as well as the complementary accuracy of different models. Based on Sentinel-2 multispectral remote sensing imagery and Ice, Cloud, and Land Elevation Satellite-2 (ICESat-2) active spaceborne lidar bathymetry data, wide-range, high-accuracy coral reef habitat classification results and bathymetry information were obtained for the Yuya Shoal (0–23 m) and Niihau Island (0–40 m). The overall Mean Absolute Errors (MAEs) in the two study areas were 0.2 m and 0.5 m, the Mean Absolute Percentage Errors (MAPEs) were 9.77% and 6.47%, respectively, and R2 reached 0.98 in both areas. The estimated error of the SDB fusion strategy based on coral reef habitat classification was reduced by more than 90% compared with classical SDB models and a single machine learning method, improving the capability of SDB in complex geomorphic ocean areas. Full article
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)
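The MAE and MAPE figures above are computed as follows; a minimal sketch (the depth values in the test are illustrative, not from the paper):

```python
def mae(y_true, y_pred):
    # mean absolute error, in the units of the measurements (here, meters)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    # mean absolute percentage error, reported in %; assumes no zero depths
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```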

18 pages, 373 KB  
Article
Machine Learning- and Deep Learning-Based Multi-Model System for Hate Speech Detection on Facebook
by Amna Naseeb, Muhammad Zain, Nisar Hussain, Amna Qasim, Fiaz Ahmad, Grigori Sidorov and Alexander Gelbukh
Algorithms 2025, 18(6), 331; https://doi.org/10.3390/a18060331 - 1 Jun 2025
Cited by 2 | Viewed by 1309
Abstract
Hate speech is a complex topic that transcends language, culture, and social spheres. Recently, the spread of hate speech on social media sites like Facebook has added a new layer of complexity to the issues of online safety and content moderation. This study seeks to mitigate this problem by developing a tool for automatically detecting hate speech in Roman Urdu, an informal script used widely in South Asian digital communication. Roman Urdu is relatively complex because it has no standardized spellings, leading to syntactic variations that increase the difficulty of hate speech detection. To tackle this problem, we adopt a holistic strategy combining six machine learning (ML) and four deep learning (DL) models, a dataset of Facebook comments that was preprocessed (tokenization, stopword removal, etc.), and text vectorization (TF-IDF, word embeddings). The ML algorithms used in this study are LR, SVM, RF, NB, KNN, and GBM. We also use deep learning architectures such as CNN, RNN, LSTM, and GRU to further increase classification accuracy. Experimental results show that the deep learning models outperform the traditional ML approaches by a significant margin, with CNN and LSTM achieving accuracies of 95.1% and 96.2%, respectively. As far as we are aware, this is the first work that investigates QLoRA for fine-tuning large models for the task of offensive language detection in Roman Urdu. Full article
(This article belongs to the Special Issue Linguistic and Cognitive Approaches to Dialog Agents)
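TF-IDF, one of the vectorizations mentioned above, weights a term by its in-document frequency and its rarity across the corpus. A minimal sketch using scikit-learn-style smoothed IDF (a design choice here, not necessarily the paper's exact formula):

```python
import math
from collections import Counter

def tfidf(docs):
    # docs: list of token lists; returns one {term: tf-idf weight} dict per doc
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    # smoothed idf: terms present in every document still get a small weight
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    out = []
    for d in docs:
        tf = Counter(d)
        out.append({t: (c / len(d)) * idf[t] for t, c in tf.items()})
    return out
```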
