Search Results (11,029)

Search Parameters:
Keywords = convolutional neural network (CNN)

30 pages, 1130 KiB  
Review
Beyond the Backbone: A Quantitative Review of Deep-Learning Architectures for Tropical Cyclone Track Forecasting
by He Huang, Difei Deng, Liang Hu, Yawen Chen and Nan Sun
Remote Sens. 2025, 17(15), 2675; https://doi.org/10.3390/rs17152675 (registering DOI) - 2 Aug 2025
Abstract
Accurate forecasting of tropical cyclone (TC) tracks is critical for disaster preparedness and risk mitigation. While traditional numerical weather prediction (NWP) systems have long served as the backbone of operational forecasting, they face limitations in computational cost and sensitivity to initial conditions. In recent years, deep learning (DL) has emerged as a promising alternative, offering data-driven modeling capabilities for capturing nonlinear spatiotemporal patterns. This paper presents a comprehensive review of DL-based approaches for TC track forecasting. We categorize all DL-based TC tracking models according to the architecture, including recurrent neural networks (RNNs), convolutional neural networks (CNNs), Transformers, graph neural networks (GNNs), generative models, and Fourier-based operators. To enable rigorous performance comparison, we introduce a Unified Geodesic Distance Error (UGDE) metric that standardizes evaluation across diverse studies and lead times. Based on this metric, we conduct a critical comparison of state-of-the-art models and identify key insights into their relative strengths, limitations, and suitable application scenarios. Building on this framework, we conduct a critical cross-model analysis that reveals key trends, performance disparities, and architectural tradeoffs. Our analysis also highlights several persistent challenges, such as long-term forecast degradation, limited physical integration, and generalization to extreme events, pointing toward future directions for developing more robust and operationally viable DL models for TC track forecasting. To support reproducibility and facilitate standardized evaluation, we release an open-source UGDE conversion tool on GitHub. Full article
(This article belongs to the Section AI Remote Sensing)
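The abstract above evaluates track forecasts with a geodesic distance error. As a rough illustration only, the sketch below computes a mean great-circle (haversine) distance between forecast and observed storm positions. It is not the UGDE metric, whose definition is not given in this listing; the function names and the Earth-radius constant are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_track_error_km(forecast, observed):
    """Mean great-circle error over paired (lat, lon) positions at matching lead times."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("forecast and observed must be non-empty and of equal length")
    errors = [great_circle_km(f[0], f[1], o[0], o[1]) for f, o in zip(forecast, observed)]
    return sum(errors) / len(errors)
```

A call such as `mean_track_error_km([(20.0, 130.0)], [(20.5, 129.5)])` returns the error in kilometres for a single lead time; comparing studies would additionally require the lead-time and unit standardization that the review describes.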
18 pages, 7062 KiB  
Article
Multimodal Feature Inputs Enable Improved Automated Textile Identification
by Magken George Enow Gnoupa, Andy T. Augousti, Olga Duran, Olena Lanets and Solomiia Liaskovska
Textiles 2025, 5(3), 31; https://doi.org/10.3390/textiles5030031 (registering DOI) - 2 Aug 2025
Abstract
This study presents an advanced framework for fabric texture classification by leveraging macro- and micro-texture extraction techniques integrated with deep learning architectures. Co-occurrence histograms, local binary patterns (LBPs), and albedo-dependent feature maps were employed to comprehensively capture the surface properties of fabrics. A late fusion approach was applied using four state-of-the-art convolutional neural networks (CNNs): InceptionV3, ResNet50_V2, DenseNet, and VGG-19. Excellent results were obtained, with the ResNet50_V2 achieving a precision of 0.929, recall of 0.914, and F1 score of 0.913. Notably, the integration of multimodal inputs allowed the models to effectively distinguish challenging fabric types, such as cotton–polyester and satin–silk pairs, which exhibit overlapping texture characteristics. This research not only enhances the accuracy of textile classification but also provides a robust methodology for material analysis, with significant implications for industrial applications in fashion, quality control, and robotics. Full article
14 pages, 841 KiB  
Article
Enhanced Deep Learning for Robust Stress Classification in Sows from Facial Images
by Syed U. Yunas, Ajmal Shahbaz, Emma M. Baxter, Mark F. Hansen, Melvyn L. Smith and Lyndon N. Smith
Agriculture 2025, 15(15), 1675; https://doi.org/10.3390/agriculture15151675 (registering DOI) - 2 Aug 2025
Abstract
Stress in pigs poses significant challenges to animal welfare and productivity in modern pig farming, contributing to increased antimicrobial use and the rise of antimicrobial resistance (AMR). This study involves stress classification in pregnant sows by exploring five deep learning models: ConvNeXt, EfficientNet_V2, MobileNet_V3, RegNet, and Vision Transformer (ViT). These models are used for stress detection from facial images, leveraging an expanded dataset. A facial image dataset of sows was collected at Scotland’s Rural College (SRUC) and the images were categorized into primiparous Low-Stress (LS) and High-Stress (HS) groups based on expert behavioural assessments and cortisol level analysis. The selected deep learning models were then trained on this enriched dataset and their performance was evaluated using cross-validation on unseen data. The Vision Transformer (ViT) model outperformed the others across the dataset of annotated facial images, achieving an average accuracy of 0.75, an F1 score of 0.78 for high-stress detection, and consistent batch-level performance (up to 0.88 F1 score). These findings highlight the efficacy of transformer-based models for automated stress detection in sows, supporting early intervention strategies to enhance welfare, optimize productivity, and mitigate AMR risks in livestock production. Full article
27 pages, 1326 KiB  
Systematic Review
Application of Artificial Intelligence in Pancreatic Cyst Management: A Systematic Review
by Donghyun Lee, Fadel Jesry, John J. Maliekkal, Lewis Goulder, Benjamin Huntly, Andrew M. Smith and Yazan S. Khaled
Cancers 2025, 17(15), 2558; https://doi.org/10.3390/cancers17152558 (registering DOI) - 2 Aug 2025
Abstract
Background: Pancreatic cystic lesions (PCLs), including intraductal papillary mucinous neoplasms (IPMNs) and mucinous cystic neoplasms (MCNs), pose a diagnostic challenge due to their variable malignant potential. Current guidelines, such as Fukuoka and American Gastroenterological Association (AGA), have moderate predictive accuracy and may lead to overtreatment or missed malignancies. Artificial intelligence (AI), incorporating machine learning (ML) and deep learning (DL), offers the potential to improve risk stratification, diagnosis, and management of PCLs by integrating clinical, radiological, and molecular data. This is the first systematic review to evaluate the application, performance, and clinical utility of AI models in the diagnosis, classification, prognosis, and management of pancreatic cysts. Methods: A systematic review was conducted in accordance with PRISMA guidelines and registered on PROSPERO (CRD420251008593). Databases searched included PubMed, EMBASE, Scopus, and Cochrane Library up to March 2025. The inclusion criteria encompassed original studies employing AI, ML, or DL in human subjects with pancreatic cysts, evaluating diagnostic, classification, or prognostic outcomes. Data were extracted on the study design, imaging modality, model type, sample size, performance metrics (accuracy, sensitivity, specificity, and area under the curve (AUC)), and validation methods. Study quality and bias were assessed using the PROBAST and adherence to TRIPOD reporting guidelines. Results: From 847 records, 31 studies met the inclusion criteria. Most were retrospective observational (n = 27, 87%) and focused on preoperative diagnostic applications (n = 30, 97%), with only one addressing prognosis. Imaging modalities included Computed Tomography (CT) (48%), endoscopic ultrasound (EUS) (26%), and Magnetic Resonance Imaging (MRI) (9.7%). 
Neural networks, particularly convolutional neural networks (CNNs), were the most common AI models (n = 16), followed by logistic regression (n = 4) and support vector machines (n = 3). The median reported AUC across studies was 0.912, with 55% of models achieving AUC ≥ 0.80. The models outperformed clinicians or existing guidelines in 11 studies. IPMN stratification and subtype classification were common focuses, with CNN-based EUS models achieving accuracies of up to 99.6%. Only 10 studies (32%) performed external validation. The risk of bias was high in 93.5% of studies, and TRIPOD adherence averaged 48%. Conclusions: AI demonstrates strong potential in improving the diagnosis and risk stratification of pancreatic cysts, with several models outperforming current clinical guidelines and human readers. However, widespread clinical adoption is hindered by high risk of bias, lack of external validation, and limited interpretability of complex models. Future work should prioritise multicentre prospective studies, standardised model reporting, and development of interpretable, externally validated tools to support clinical integration. Full article
(This article belongs to the Section Methods and Technologies Development)
22 pages, 4300 KiB  
Article
Optimised DNN-Based Agricultural Land Cover Mapping Using Sentinel-2 and Landsat-8 with Google Earth Engine
by Nisha Sharma, Sartajvir Singh and Kawaljit Kaur
Land 2025, 14(8), 1578; https://doi.org/10.3390/land14081578 (registering DOI) - 1 Aug 2025
Abstract
Agriculture is the backbone of Punjab’s economy, and with much of India’s population dependent on agriculture, the requirement for accurate and timely monitoring of land has become even more crucial. Blending remote sensing with state-of-the-art machine learning algorithms enables the detailed classification of agricultural lands through thematic mapping, which is critical for crop monitoring, land management, and sustainable development. Here, a Hyper-tuned Deep Neural Network (Hy-DNN) model was created and used for land use and land cover (LULC) classification into four classes: agricultural land, vegetation, water bodies, and built-up areas. The technique made use of multispectral data from Sentinel-2 and Landsat-8, processed on the Google Earth Engine (GEE) platform. To measure classification performance, Hy-DNN was contrasted with traditional classifiers—Convolutional Neural Network (CNN), Random Forest (RF), Classification and Regression Tree (CART), Minimum Distance Classifier (MDC), and Naive Bayes (NB)—using performance metrics including producer’s and consumer’s accuracy, Kappa coefficient, and overall accuracy. Hy-DNN performed the best, with overall accuracy being 97.60% using Sentinel-2 and 91.10% using Landsat-8, outperforming all base models. These results further highlight the superiority of the optimised Hy-DNN in agricultural land mapping and its potential use in crop health monitoring, disease diagnosis, and strategic agricultural planning. Full article
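The abstract above reports overall accuracy, the Kappa coefficient, and producer's and consumer's accuracy. A minimal sketch of how these are commonly derived from a confusion matrix follows. It assumes rows hold reference classes and columns hold predicted classes, and it treats "consumer's accuracy" as the quantity usually called user's accuracy; the function names and the example matrix are illustrative and not taken from the paper.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what row/column marginals predict by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    return (observed - expected) / (1 - expected)


def producers_accuracy(cm):
    """Per-class correct / reference total (rows are assumed to be reference classes)."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


def users_accuracy(cm):
    """Per-class correct / predicted total (columns are assumed to be predictions)."""
    n = len(cm)
    return [cm[j][j] / sum(cm[i][j] for i in range(n)) for j in range(n)]


# Made-up two-class example: 100 samples, 85 on the diagonal.
example_cm = [[45, 5], [10, 40]]
```

For `example_cm`, the overall accuracy is 0.85 and the chance agreement from the marginals is 0.5, so kappa works out to 0.7.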
17 pages, 1340 KiB  
Article
Enhanced Respiratory Sound Classification Using Deep Learning and Multi-Channel Auscultation
by Yeonkyeong Kim, Kyu Bom Kim, Ah Young Leem, Kyuseok Kim and Su Hwan Lee
J. Clin. Med. 2025, 14(15), 5437; https://doi.org/10.3390/jcm14155437 (registering DOI) - 1 Aug 2025
Abstract
Background/Objectives: Identifying and classifying abnormal lung sounds is essential for diagnosing patients with respiratory disorders. In particular, the simultaneous recording of auscultation signals from multiple clinically relevant positions offers greater diagnostic potential compared to traditional single-channel measurements. This study aims to improve the accuracy of respiratory sound classification by leveraging multichannel signals and capturing positional characteristics from multiple sites in the same patient. Methods: We evaluated the performance of respiratory sound classification using multichannel lung sound data with a deep learning model that combines a convolutional neural network (CNN) and long short-term memory (LSTM), based on mel-frequency cepstral coefficients (MFCCs). We analyzed the impact of the number and placement of channels on classification performance. Results: The results demonstrated that using four-channel recordings improved accuracy, sensitivity, specificity, precision, and F1-score by approximately 1.11, 1.15, 1.05, 1.08, and 1.13 times, respectively, compared to using three, two, or single-channel recordings. Conclusion: This study confirms that multichannel data capture a richer set of features corresponding to various respiratory sound characteristics, leading to significantly improved classification performance. The proposed method holds promise for enhancing sound classification accuracy not only in clinical applications but also in broader domains such as speech and audio processing. Full article
(This article belongs to the Section Respiratory Medicine)
23 pages, 3427 KiB  
Article
Visual Narratives and Digital Engagement: Decoding Seoul and Tokyo’s Tourism Identity Through Instagram Analytics
by Seung Chul Yoo and Seung Mi Kang
Tour. Hosp. 2025, 6(3), 149; https://doi.org/10.3390/tourhosp6030149 (registering DOI) - 1 Aug 2025
Abstract
Social media platforms like Instagram significantly shape destination images and influence tourist behavior. Understanding how different cities are represented and perceived on these platforms is crucial for effective tourism marketing. This study provides a comparative analysis of Instagram content and engagement patterns in Seoul and Tokyo, two major Asian metropolises, to derive actionable marketing insights. We collected and analyzed 59,944 public Instagram posts geotagged or location-tagged within Seoul (n = 29,985) and Tokyo (n = 29,959). We employed a mixed-methods approach involving content categorization using a fine-tuned convolutional neural network (CNN) model, engagement metric analysis (likes, comments), Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis and thematic classification of comments, geospatial analysis (Kernel Density Estimation [KDE], Moran’s I), and predictive modeling (Gradient Boosting with SHapley Additive exPlanations [SHAP] value analysis). A validation analysis using balanced samples (n = 2000 each) was conducted to address Tokyo’s lower geotagged data proportion. While both cities showed ‘Person’ as the dominant content category, notable differences emerged. Tokyo exhibited higher like-based engagement across categories, particularly for ‘Animal’ and ‘Food’ content, while Seoul generated slightly more comments, often expressing stronger sentiment. Qualitative comment analysis revealed Seoul comments focused more on emotional reactions, whereas Tokyo comments were often shorter, appreciative remarks. Geospatial analysis identified distinct hotspots. The validation analysis confirmed these spatial patterns despite Tokyo’s data limitations. Predictive modeling highlighted hashtag counts as the key engagement driver in Seoul and the presence of people in Tokyo. Seoul and Tokyo project distinct visual narratives and elicit different engagement patterns on Instagram. 
These findings offer practical implications for destination marketers, suggesting tailored content strategies and location-based campaigns targeting identified hotspots and specific content themes. This study underscores the value of integrating quantitative and qualitative analyses of social media data for nuanced destination marketing insights. Full article
19 pages, 1889 KiB  
Article
Infrared Thermographic Signal Analysis of Bioactive Edible Oils Using CNNs for Quality Assessment
by Danilo Pratticò and Filippo Laganà
Signals 2025, 6(3), 38; https://doi.org/10.3390/signals6030038 (registering DOI) - 1 Aug 2025
Abstract
Nutrition plays a fundamental role in promoting health and preventing chronic diseases, with bioactive food components offering a therapeutic potential in biomedical applications. Among these, edible oils are recognised for their functional properties, which contribute to disease prevention and metabolic regulation. The proposed study aims to evaluate the quality of four bioactive oils (olive oil, sunflower oil, tomato seed oil, and pumpkin seed oil) by analysing their thermal behaviour through infrared (IR) imaging. The study designed a customised electronic system to acquire thermographic signals under controlled temperature and humidity conditions. The acquisition system was used to extract thermal data. Analysis of the acquired thermal signals revealed characteristic heat absorption profiles used to infer differences in oil properties related to stability and degradation potential. A hybrid deep learning model that integrates Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) units was used to classify and differentiate the oils based on stability, thermal reactivity, and potential health benefits. A signal analysis showed that the AI-based method improves both the accuracy (achieving an F1-score of 93.66%) and the repeatability of quality assessments, providing a non-invasive and intelligent framework for the validation and traceability of nutritional compounds. Full article
26 pages, 1790 KiB  
Article
A Hybrid Deep Learning Model for Aromatic and Medicinal Plant Species Classification Using a Curated Leaf Image Dataset
by Shareena E. M., D. Abraham Chandy, Shemi P. M. and Alwin Poulose
AgriEngineering 2025, 7(8), 243; https://doi.org/10.3390/agriengineering7080243 - 1 Aug 2025
Abstract
In the era of smart agriculture, accurate identification of plant species is critical for effective crop management, biodiversity monitoring, and the sustainable use of medicinal resources. However, existing deep learning approaches often underperform when applied to fine-grained plant classification tasks due to the lack of domain-specific, high-quality datasets and the limited representational capacity of traditional architectures. This study addresses these challenges by introducing a novel, well-curated leaf image dataset consisting of 39 classes of medicinal and aromatic plants collected from the Aromatic and Medicinal Plant Research Station in Odakkali, Kerala, India. To overcome performance bottlenecks observed with a baseline Convolutional Neural Network (CNN) that achieved only 44.94% accuracy, we progressively enhanced model performance through a series of architectural innovations. These included the use of a pre-trained VGG16 network, data augmentation techniques, and fine-tuning of deeper convolutional layers, followed by the integration of Squeeze-and-Excitation (SE) attention blocks. Ultimately, we propose a hybrid deep learning architecture that combines VGG16 with Batch Normalization, Gated Recurrent Units (GRUs), Transformer modules, and Dilated Convolutions. This final model achieved a peak validation accuracy of 95.24%, significantly outperforming several baseline models, such as custom CNN (44.94%), VGG-19 (59.49%), VGG-16 before augmentation (71.52%), Xception (85.44%), Inception v3 (87.97%), VGG-16 after data augmentation (89.24%), VGG-16 after fine-tuning (90.51%), MobileNetV2 (93.67%), and VGG16 with SE block (94.94%). These results demonstrate superior capability in capturing both local textures and global morphological features. The proposed solution not only advances the state of the art in plant classification but also contributes a valuable dataset to the research community.
Its real-world applicability spans field-based plant identification, biodiversity conservation, and precision agriculture, offering a scalable tool for automated plant recognition in complex ecological and agricultural environments. Full article
(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)
30 pages, 4409 KiB  
Article
Accident Impact Prediction Based on a Deep Convolutional and Recurrent Neural Network Model
by Pouyan Sajadi, Mahya Qorbani, Sobhan Moosavi and Erfan Hassannayebi
Urban Sci. 2025, 9(8), 299; https://doi.org/10.3390/urbansci9080299 (registering DOI) - 1 Aug 2025
Abstract
Traffic accidents pose a significant threat to public safety, resulting in numerous fatalities, injuries, and a substantial economic burden each year. The development of predictive models capable of the real-time forecasting of post-accident impact using readily available data can play a crucial role in preventing adverse outcomes and enhancing overall safety. However, existing accident predictive models encounter two main challenges: first, a reliance on either costly or non-real-time data, and second, the absence of a comprehensive metric to measure post-accident impact accurately. To address these limitations, this study proposes a deep neural network model known as the cascade model. It leverages readily available real-world data from Los Angeles County to predict post-accident impacts. The model consists of two components: Long Short-Term Memory (LSTM) and a Convolutional Neural Network (CNN). The LSTM model captures temporal patterns, while the CNN extracts patterns from the sparse accident dataset. Furthermore, an external traffic congestion dataset is incorporated to derive a new feature called the “accident impact” factor, which quantifies the influence of an accident on surrounding traffic flow. Extensive experiments were conducted to demonstrate the effectiveness of the proposed hybrid machine learning method in predicting the post-accident impact compared to state-of-the-art baselines. The results reveal a higher precision in predicting minimal impacts (i.e., cases with no reported accidents) and a higher recall in predicting more significant impacts (i.e., cases with reported accidents). Full article
25 pages, 10331 KiB  
Article
Forest Fire Detection Method Based on Dual-Branch Multi-Scale Adaptive Feature Fusion Network
by Qinggan Wu, Chen Wei, Ning Sun, Xiong Xiong, Qingfeng Xia, Jianmeng Zhou and Xingyu Feng
Forests 2025, 16(8), 1248; https://doi.org/10.3390/f16081248 - 31 Jul 2025
Abstract
There are significant scale and morphological differences between fire and smoke features in forest fire detection. This paper proposes a detection method based on a dual-branch multi-scale adaptive feature fusion network (DMAFNet). In this method, a convolutional neural network (CNN) and a Transformer form a dual-branch backbone network that extracts local texture and global context information, respectively. To overcome the differences in feature distribution and response scale between the two branches, a feature correction module (FCM) is designed; its spatial and channel correction mechanisms adaptively align the features of the two branches. A Fusion Feature Module (FFM) is further introduced to fully integrate the dual-branch features through a two-way cross-attention mechanism and to effectively suppress redundant information. Finally, a Multi-Scale Fusion Attention Unit (MSFAU) is designed to enhance the multi-scale detection capability for fire targets. Experimental results show that the proposed DMAFNet achieves significantly higher mAP (mean average precision) than existing mainstream detection methods. Full article
(This article belongs to the Section Natural Hazards and Risk Management)
22 pages, 4399 KiB  
Article
Deep Learning-Based Fingerprint–Vein Biometric Fusion: A Systematic Review with Empirical Evaluation
by Sarah Almuwayziri, Abeer Al-Nafjan, Hessah Aljumah and Mashael Aldayel
Appl. Sci. 2025, 15(15), 8502; https://doi.org/10.3390/app15158502 (registering DOI) - 31 Jul 2025
Abstract
User authentication is crucial for safeguarding access to digital systems and services. Biometric authentication serves as a strong and user-friendly alternative to conventional security methods such as passwords and PINs, which are often susceptible to breaches. This study proposes a deep learning-based multimodal biometric system that combines fingerprint (FP) and finger vein (FV) modalities to improve accuracy and security. The system explores three fusion strategies: feature-level fusion (combining feature vectors from each modality), score-level fusion (integrating prediction scores from each modality), and a hybrid approach that leverages both feature and score information. The implementation involved five pretrained convolutional neural network (CNN) models: two unimodal (FP-only and FV-only) and three multimodal models corresponding to each fusion strategy. The models were assessed using the NUPT-FPV dataset, which consists of 33,600 images collected from 140 subjects with a dual-mode acquisition device in varied environmental conditions. The results indicate that the hybrid-level fusion with a dominant score weight (0.7 score, 0.3 feature) achieved the highest accuracy (99.79%) and the lowest equal error rate (EER = 0.0018), demonstrating superior robustness. Overall, the results demonstrate that integrating deep learning with multimodal fusion is highly effective for advancing scalable and accurate biometric authentication solutions suitable for real-world deployments. Full article
28 pages, 5699 KiB  
Article
Multi-Modal Excavator Activity Recognition Using Two-Stream CNN-LSTM with RGB and Point Cloud Inputs
by Hyuk Soo Cho, Kamran Latif, Abubakar Sharafat and Jongwon Seo
Appl. Sci. 2025, 15(15), 8505; https://doi.org/10.3390/app15158505 (registering DOI) - 31 Jul 2025
Abstract
Recently, deep learning algorithms have been increasingly applied in construction for activity recognition, particularly for excavators, to automate processes and enhance safety and productivity through continuous monitoring of earthmoving activities. These deep learning algorithms analyze construction videos to classify excavator activities for earthmoving purposes. However, previous studies have solely focused on single-source external videos, which limits the activity recognition capabilities of the deep learning algorithm. This paper introduces a novel multi-modal deep learning-based methodology for recognizing excavator activities, utilizing multi-stream input data. It processes point clouds and RGB images using the two-stream long short-term memory convolutional neural network (CNN-LSTM) method to extract spatiotemporal features, enabling the recognition of excavator activities. A comprehensive dataset comprising 495,000 video frames of synchronized RGB and point cloud data was collected across multiple construction sites under varying conditions. The dataset encompasses five key excavator activities: Approach, Digging, Dumping, Idle, and Leveling. To assess the effectiveness of the proposed method, the performance of the two-stream CNN-LSTM architecture is compared with that of single-stream CNN-LSTM models on the same RGB and point cloud datasets, separately. The results demonstrate that the proposed multi-stream approach achieved an accuracy of 94.67%, outperforming existing state-of-the-art single-stream models, which achieved 90.67% accuracy for the RGB-based model and 92.00% for the point cloud-based model. These findings underscore the potential of the proposed activity recognition method, making it highly effective for automatic real-time monitoring of excavator activities, thereby laying the groundwork for future integration into digital twin systems for proactive maintenance and intelligent equipment management. Full article
(This article belongs to the Special Issue AI-Based Machinery Health Monitoring)

29 pages, 15488 KiB  
Article
GOFENet: A Hybrid Transformer–CNN Network Integrating GEOBIA-Based Object Priors for Semantic Segmentation of Remote Sensing Images
by Tao He, Jianyu Chen and Delu Pan
Remote Sens. 2025, 17(15), 2652; https://doi.org/10.3390/rs17152652 (registering DOI) - 31 Jul 2025
Abstract
Geographic object-based image analysis (GEOBIA) has demonstrated substantial utility in remote sensing tasks. However, its integration with deep learning remains largely confined to image-level classification. This is primarily due to the irregular shapes and fragmented boundaries of segmented objects, which limit its applicability in semantic segmentation. While convolutional neural networks (CNNs) excel at local feature extraction, they inherently struggle to capture long-range dependencies. In contrast, Transformer-based models are well suited for global context modeling but often lack fine-grained local detail. To overcome these limitations, we propose GOFENet (Geo-Object Feature Enhanced Network), a hybrid semantic segmentation architecture that effectively fuses object-level priors into deep feature representations. GOFENet employs a dual-encoder design combining CNN and Swin Transformer architectures, enabling multi-scale feature fusion through skip connections to preserve both local and global semantics. An auxiliary branch incorporating cascaded atrous convolutions is introduced to inject information about segmented objects into the learning process. Furthermore, we develop a cross-channel selection module (CSM) for refined channel-wise attention, a feature enhancement module (FEM) to merge global and local representations, and a shallow–deep feature fusion module (SDFM) to integrate pixel- and object-level cues across scales. Experimental results on the GID and LoveDA datasets demonstrate that GOFENet achieves superior segmentation performance, with 66.02% mIoU and 51.92% mIoU, respectively. The model exhibits strong capability in delineating large-scale land cover features, producing sharper object boundaries and reducing classification noise, while preserving the integrity and discriminability of land cover categories.

21 pages, 1928 KiB  
Article
A CNN-Transformer Hybrid Framework for Multi-Label Predator–Prey Detection in Agricultural Fields
by Yifan Lyu, Feiyu Lu, Xuaner Wang, Yakui Wang, Zihuan Wang, Yawen Zhu, Zhewei Wang and Min Dong
Sensors 2025, 25(15), 4719; https://doi.org/10.3390/s25154719 (registering DOI) - 31 Jul 2025
Abstract
Accurate identification of predator–pest relationships is essential for implementing effective and sustainable biological control in agriculture. However, existing image-based methods struggle to recognize insect co-occurrence under complex field conditions, limiting their ecological applicability. To address this challenge, we propose a hybrid deep learning framework that integrates convolutional neural networks (CNNs) and Transformer architectures for multi-label recognition of predator–pest combinations. The model leverages a novel co-occurrence attention mechanism to capture semantic relationships between insect categories and employs a pairwise label matching loss to enhance ecological pairing accuracy. Evaluated on a field-constructed dataset of 5,037 images across eight categories, the model achieved an F1-score of 86.5% and an mAP50 of 85.1%, and demonstrated strong generalization to unseen predator–pest pairs with an average F1-score of 79.6%. These results outperform several strong baselines, including ResNet-50, YOLOv8, and Vision Transformer. This work contributes a robust, interpretable approach for multi-object ecological detection and offers practical potential for deployment in smart farming systems, UAV-based monitoring, and precision pest management.
(This article belongs to the Special Issue Sensor and AI Technologies in Intelligent Agriculture: 2nd Edition)