Saved Queries

Agriculture is the backbone of Punjab’s economy, and with much of India’s population dependent on agriculture, the requirement for accurate and timely monitoring of land has become even more crucial. Blending remote sensing with state-of-the-art machine learning algorithms enables the detailed classification of agricultural lands through thematic mapping, which is critical for crop monitoring, land management, and sustainable development. Here, a Hyper-tuned Deep Neural Network (Hy-DNN) model was created and used for land use and land cover (LULC) classification into four classes: agricultural land, vegetation, water bodies, and built-up areas. The technique made use of multispectral data from Sentinel-2 and Landsat-8, processed on the Google Earth Engine (GEE) platform. To measure classification performance, Hy-DNN was contrasted with traditional classifiers—Convolutional Neural Network (CNN), Random Forest (RF), Classification and Regression Tree (CART), Minimum Distance Classifier (MDC), and Naive Bayes (NB)—using performance metrics including producer’s and consumer’s accuracy, Kappa coefficient, and overall accuracy. Hy-DNN performed the best, with overall accuracy being 97.60% using Sentinel-2 and 91.10% using Landsat-8, outperforming all base models. These results further highlight the superiority of the optimised Hy-DNN in agricultural land mapping and its potential use in crop health monitoring, disease diagnosis, and strategic agricultural planning. Full article

(This article belongs to the Special Issue Advances on Land Cover/Land Use Ontologies for Innovative Production/Utilization of Land Information)

►▼ Show Figures

Figure 1

17 pages, 1340 KiB

Open AccessArticle

Enhanced Respiratory Sound Classification Using Deep Learning and Multi-Channel Auscultation

by Yeonkyeong Kim, Kyu Bom Kim, Ah Young Leem, Kyuseok Kim and Su Hwan Lee

J. Clin. Med. 2025, 14(15), 5437; https://doi.org/10.3390/jcm14155437 (registering DOI) - 1 Aug 2025

Abstract

Background/Objectives: Identifying and classifying abnormal lung sounds is essential for diagnosing patients with respiratory disorders. In particular, the simultaneous recording of auscultation signals from multiple clinically relevant positions offers greater diagnostic potential compared to traditional single-channel measurements. This study aims to improve the accuracy of respiratory sound classification by leveraging multichannel signals and capturing positional characteristics from multiple sites in the same patient. Methods: We evaluated the performance of respiratory sound classification using multichannel lung sound data with a deep learning model that combines a convolutional neural network (CNN) and long short-term memory (LSTM), based on mel-frequency cepstral coefficients (MFCCs). We analyzed the impact of the number and placement of channels on classification performance. Results: The results demonstrated that using four-channel recordings improved accuracy, sensitivity, specificity, precision, and F1-score by approximately 1.11, 1.15, 1.05, 1.08, and 1.13 times, respectively, compared to using three, two, or single-channel recordings. Conclusion: This study confirms that multichannel data capture a richer set of features corresponding to various respiratory sound characteristics, leading to significantly improved classification performance. The proposed method holds promise for enhancing sound classification accuracy not only in clinical applications but also in broader domains such as speech and audio processing. Full article

(This article belongs to the Section Respiratory Medicine)

23 pages, 3427 KiB

Open AccessArticle

Visual Narratives and Digital Engagement: Decoding Seoul and Tokyo’s Tourism Identity Through Instagram Analytics

by Seung Chul Yoo and Seung Mi Kang

Tour. Hosp. 2025, 6(3), 149; https://doi.org/10.3390/tourhosp6030149 (registering DOI) - 1 Aug 2025

Abstract

Social media platforms like Instagram significantly shape destination images and influence tourist behavior. Understanding how different cities are represented and perceived on these platforms is crucial for effective tourism marketing. This study provides a comparative analysis of Instagram content and engagement patterns in Seoul and Tokyo, two major Asian metropolises, to derive actionable marketing insights. We collected and analyzed 59,944 public Instagram posts geotagged or location-tagged within Seoul (n = 29,985) and Tokyo (n = 29,959). We employed a mixed-methods approach involving content categorization using a fine-tuned convolutional neural network (CNN) model, engagement metric analysis (likes, comments), Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis and thematic classification of comments, geospatial analysis (Kernel Density Estimation [KDE], Moran’s I), and predictive modeling (Gradient Boosting with SHapley Additive exPlanations [SHAP] value analysis). A validation analysis using balanced samples (n = 2000 each) was conducted to address Tokyo’s lower geotagged data proportion. While both cities showed ‘Person’ as the dominant content category, notable differences emerged. Tokyo exhibited higher like-based engagement across categories, particularly for ‘Animal’ and ‘Food’ content, while Seoul generated slightly more comments, often expressing stronger sentiment. Qualitative comment analysis revealed Seoul comments focused more on emotional reactions, whereas Tokyo comments were often shorter, appreciative remarks. Geospatial analysis identified distinct hotspots. The validation analysis confirmed these spatial patterns despite Tokyo’s data limitations. Predictive modeling highlighted hashtag counts as the key engagement driver in Seoul and the presence of people in Tokyo. Seoul and Tokyo project distinct visual narratives and elicit different engagement patterns on Instagram. These findings offer practical implications for destination marketers, suggesting tailored content strategies and location-based campaigns targeting identified hotspots and specific content themes. This study underscores the value of integrating quantitative and qualitative analyses of social media data for nuanced destination marketing insights. Full article

(This article belongs to the Special Issue Data-Driven Insights in Tourism and Hospitality: Smart Technologies and Data Science)

►▼ Show Figures

Figure 1

19 pages, 1889 KiB

Open AccessArticle

Infrared Thermographic Signal Analysis of Bioactive Edible Oils Using CNNs for Quality Assessment

by Danilo Pratticò and Filippo Laganà

Signals 2025, 6(3), 38; https://doi.org/10.3390/signals6030038 (registering DOI) - 1 Aug 2025

Abstract

Nutrition plays a fundamental role in promoting health and preventing chronic diseases, with bioactive food components offering a therapeutic potential in biomedical applications. Among these, edible oils are recognised for their functional properties, which contribute to disease prevention and metabolic regulation. The proposed study aims to evaluate the quality of four bioactive oils (olive oil, sunflower oil, tomato seed oil, and pumpkin seed oil) by analysing their thermal behaviour through infrared (IR) imaging. The study designed a customised electronic system to acquire thermographic signals under controlled temperature and humidity conditions. The acquisition system was used to extract thermal data. Analysis of the acquired thermal signals revealed characteristic heat absorption profiles used to infer differences in oil properties related to stability and degradation potential. A hybrid deep learning model that integrates Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) units was used to classify and differentiate the oils based on stability, thermal reactivity, and potential health benefits. A signal analysis showed that the AI-based method improves both the accuracy (achieving an F1-score of 93.66%) and the repeatability of quality assessments, providing a non-invasive and intelligent framework for the validation and traceability of nutritional compounds. Full article

►▼ Show Figures

Figure 1

26 pages, 1790 KiB

Open AccessArticle

A Hybrid Deep Learning Model for Aromatic and Medicinal Plant Species Classification Using a Curated Leaf Image Dataset

by Shareena E. M., D. Abraham Chandy, Shemi P. M. and Alwin Poulose

AgriEngineering 2025, 7(8), 243; https://doi.org/10.3390/agriengineering7080243 - 1 Aug 2025

Abstract

In the era of smart agriculture, accurate identification of plant species is critical for effective crop management, biodiversity monitoring, and the sustainable use of medicinal resources. However, existing deep learning approaches often underperform when applied to fine-grained plant classification tasks due to the lack of domain-specific, high-quality datasets and the limited representational capacity of traditional architectures. This study addresses these challenges by introducing a novel, well-curated leaf image dataset consisting of 39 classes of medicinal and aromatic plants collected from the Aromatic and Medicinal Plant Research Station in Odakkali, Kerala, India. To overcome performance bottlenecks observed with a baseline Convolutional Neural Network (CNN) that achieved only 44.94% accuracy, we progressively enhanced model performance through a series of architectural innovations. These included the use of a pre-trained VGG16 network, data augmentation techniques, and fine-tuning of deeper convolutional layers, followed by the integration of Squeeze-and-Excitation (SE) attention blocks. Ultimately, we propose a hybrid deep learning architecture that combines VGG16 with Batch Normalization, Gated Recurrent Units (GRUs), Transformer modules, and Dilated Convolutions. This final model achieved a peak validation accuracy of 95.24%, significantly outperforming several baseline models, such as custom CNN (44.94%), VGG-19 (59.49%), VGG-16 before augmentation (71.52%), Xception (85.44%), Inception v3 (87.97%), VGG-16 after data augumentation (89.24%), VGG-16 after fine-tuning (90.51%), MobileNetV2 (93.67), and VGG16 with SE block (94.94%). These results demonstrate superior capability in capturing both local textures and global morphological features. The proposed solution not only advances the state of the art in plant classification but also contributes a valuable dataset to the research community. Its real-world applicability spans field-based plant identification, biodiversity conservation, and precision agriculture, offering a scalable tool for automated plant recognition in complex ecological and agricultural environments. Full article

(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)

►▼ Show Figures

Figure 1

30 pages, 4409 KiB

Open AccessArticle

Accident Impact Prediction Based on a Deep Convolutional and Recurrent Neural Network Model

by Pouyan Sajadi, Mahya Qorbani, Sobhan Moosavi and Erfan Hassannayebi

Urban Sci. 2025, 9(8), 299; https://doi.org/10.3390/urbansci9080299 (registering DOI) - 1 Aug 2025

Abstract

Traffic accidents pose a significant threat to public safety, resulting in numerous fatalities, injuries, and a substantial economic burden each year. The development of predictive models capable of the real-time forecasting of post-accident impact using readily available data can play a crucial role in preventing adverse outcomes and enhancing overall safety. However, existing accident predictive models encounter two main challenges: first, a reliance on either costly or non-real-time data, and second, the absence of a comprehensive metric to measure post-accident impact accurately. To address these limitations, this study proposes a deep neural network model known as the cascade model. It leverages readily available real-world data from Los Angeles County to predict post-accident impacts. The model consists of two components: Long Short-Term Memory (LSTM) and a Convolutional Neural Network (CNN). The LSTM model captures temporal patterns, while the CNN extracts patterns from the sparse accident dataset. Furthermore, an external traffic congestion dataset is incorporated to derive a new feature called the “accident impact” factor, which quantifies the influence of an accident on surrounding traffic flow. Extensive experiments were conducted to demonstrate the effectiveness of the proposed hybrid machine learning method in predicting the post-accident impact compared to state-of-the-art baselines. The results reveal a higher precision in predicting minimal impacts (i.e., cases with no reported accidents) and a higher recall in predicting more significant impacts (i.e., cases with reported accidents). Full article

►▼ Show Figures

Figure 1

25 pages, 10331 KiB

Open AccessArticle

Forest Fire Detection Method Based on Dual-Branch Multi-Scale Adaptive Feature Fusion Network

by Qinggan Wu, Chen Wei, Ning Sun, Xiong Xiong, Qingfeng Xia, Jianmeng Zhou and Xingyu Feng

Forests 2025, 16(8), 1248; https://doi.org/10.3390/f16081248 - 31 Jul 2025

Abstract

There are significant scale and morphological differences between fire and smoke features in forest fire detection. This paper proposes a detection method based on dual-branch multi-scale adaptive feature fusion network (DMAFNet). In this method, convolutional neural network (CNN) and transformer are used to form a dual-branch backbone network to extract local texture and global context information, respectively. In order to overcome the difference in feature distribution and response scale between the two branches, a feature correction module (FCM) is designed. Through space and channel correction mechanisms, the adaptive alignment of two branch features is realized. The Fusion Feature Module (FFM) is further introduced to fully integrate dual-branch features based on the two-way cross-attention mechanism and effectively suppress redundant information. Finally, the Multi-Scale Fusion Attention Unit (MSFAU) is designed to enhance the multi-scale detection capability of fire targets. Experimental results show that the proposed DMAFNet has significantly improved in mAP (mean average precision) indicators compared with existing mainstream detection methods. Full article

(This article belongs to the Section Natural Hazards and Risk Management)

►▼ Show Figures

Figure 1

22 pages, 4399 KiB

Open AccessArticle

Deep Learning-Based Fingerprint–Vein Biometric Fusion: A Systematic Review with Empirical Evaluation

by Sarah Almuwayziri, Abeer Al-Nafjan, Hessah Aljumah and Mashael Aldayel

Appl. Sci. 2025, 15(15), 8502; https://doi.org/10.3390/app15158502 (registering DOI) - 31 Jul 2025

Abstract

User authentication is crucial for safeguarding access to digital systems and services. Biometric authentication serves as a strong and user-friendly alternative to conventional security methods such as passwords and PINs, which are often susceptible to breaches. This study proposes a deep learning-based multimodal biometric system that combines fingerprint (FP) and finger vein (FV) modalities to improve accuracy and security. The system explores three fusion strategies: feature-level fusion (combining feature vectors from each modality), score-level fusion (integrating prediction scores from each modality), and a hybrid approach that leverages both feature and score information. The implementation involved five pretrained convolutional neural network (CNN) models: two unimodal (FP-only and FV-only) and three multimodal models corresponding to each fusion strategy. The models were assessed using the NUPT-FPV dataset, which consists of 33,600 images collected from 140 subjects with a dual-mode acquisition device in varied environmental conditions. The results indicate that the hybrid-level fusion with a dominant score weight (0.7 score, 0.3 feature) achieved the highest accuracy (99.79%) and the lowest equal error rate (EER = 0.0018), demonstrating superior robustness. Overall, the results demonstrate that integrating deep learning with multimodal fusion is highly effective for advancing scalable and accurate biometric authentication solutions suitable for real-world deployments. Full article

(This article belongs to the Special Issue Applied Deep Learning in Sensitive and Biometric Information Protection)

►▼ Show Figures

Figure 1

28 pages, 5699 KiB

Open AccessArticle

Multi-Modal Excavator Activity Recognition Using Two-Stream CNN-LSTM with RGB and Point Cloud Inputs

by Hyuk Soo Cho, Kamran Latif, Abubakar Sharafat and Jongwon Seo

Appl. Sci. 2025, 15(15), 8505; https://doi.org/10.3390/app15158505 (registering DOI) - 31 Jul 2025

Abstract

Recently, deep learning algorithms have been increasingly applied in construction for activity recognition, particularly for excavators, to automate processes and enhance safety and productivity through continuous monitoring of earthmoving activities. These deep learning algorithms analyze construction videos to classify excavator activities for earthmoving purposes. However, previous studies have solely focused on single-source external videos, which limits the activity recognition capabilities of the deep learning algorithm. This paper introduces a novel multi-modal deep learning-based methodology for recognizing excavator activities, utilizing multi-stream input data. It processes point clouds and RGB images using the two-stream long short-term memory convolutional neural network (CNN-LSTM) method to extract spatiotemporal features, enabling the recognition of excavator activities. A comprehensive dataset comprising 495,000 video frames of synchronized RGB and point cloud data was collected across multiple construction sites under varying conditions. The dataset encompasses five key excavator activities: Approach, Digging, Dumping, Idle, and Leveling. To assess the effectiveness of the proposed method, the performance of the two-stream CNN-LSTM architecture is compared with that of single-stream CNN-LSTM models on the same RGB and point cloud datasets, separately. The results demonstrate that the proposed multi-stream approach achieved an accuracy of 94.67%, outperforming existing state-of-the-art single-stream models, which achieved 90.67% accuracy for the RGB-based model and 92.00% for the point cloud-based model. These findings underscore the potential of the proposed activity recognition method, making it highly effective for automatic real-time monitoring of excavator activities, thereby laying the groundwork for future integration into digital twin systems for proactive maintenance and intelligent equipment management. Full article

(This article belongs to the Special Issue AI-Based Machinery Health Monitoring)

►▼ Show Figures

Figure 1

29 pages, 15488 KiB

Open AccessArticle

GOFENet: A Hybrid Transformer–CNN Network Integrating GEOBIA-Based Object Priors for Semantic Segmentation of Remote Sensing Images

by Tao He, Jianyu Chen and Delu Pan

Remote Sens. 2025, 17(15), 2652; https://doi.org/10.3390/rs17152652 (registering DOI) - 31 Jul 2025

Abstract

Geographic object-based image analysis (GEOBIA) has demonstrated substantial utility in remote sensing tasks. However, its integration with deep learning remains largely confined to image-level classification. This is primarily due to the irregular shapes and fragmented boundaries of segmented objects, which limit its applicability in semantic segmentation. While convolutional neural networks (CNNs) excel at local feature extraction, they inherently struggle to capture long-range dependencies. In contrast, Transformer-based models are well suited for global context modeling but often lack fine-grained local detail. To overcome these limitations, we propose GOFENet (Geo-Object Feature Enhanced Network)—a hybrid semantic segmentation architecture that effectively fuses object-level priors into deep feature representations. GOFENet employs a dual-encoder design combining CNN and Swin Transformer architectures, enabling multi-scale feature fusion through skip connections to preserve both local and global semantics. An auxiliary branch incorporating cascaded atrous convolutions is introduced to inject information of segmented objects into the learning process. Furthermore, we develop a cross-channel selection module (CSM) for refined channel-wise attention, a feature enhancement module (FEM) to merge global and local representations, and a shallow–deep feature fusion module (SDFM) to integrate pixel- and object-level cues across scales. Experimental results on the GID and LoveDA datasets demonstrate that GOFENet achieves superior segmentation performance, with 66.02% mIoU and 51.92% mIoU, respectively. The model exhibits strong capability in delineating large-scale land cover features, producing sharper object boundaries and reducing classification noise, while preserving the integrity and discriminability of land cover categories. Full article

►▼ Show Figures

Figure 1

21 pages, 1928 KiB

Open AccessArticle

A CNN-Transformer Hybrid Framework for Multi-Label Predator–Prey Detection in Agricultural Fields

by Yifan Lyu, Feiyu Lu, Xuaner Wang, Yakui Wang, Zihuan Wang, Yawen Zhu, Zhewei Wang and Min Dong

Sensors 2025, 25(15), 4719; https://doi.org/10.3390/s25154719 (registering DOI) - 31 Jul 2025

Abstract

Accurate identification of predator–pest relationships is essential for implementing effective and sustainable biological control in agriculture. However, existing image-based methods struggle to recognize insect co-occurrence under complex field conditions, limiting their ecological applicability. To address this challenge, we propose a hybrid deep learning framework that integrates convolutional neural networks (CNNs) and Transformer architectures for multi-label recognition of predator–pest combinations. The model leverages a novel co-occurrence attention mechanism to capture semantic relationships between insect categories and employs a pairwise label matching loss to enhance ecological pairing accuracy. Evaluated on a field-constructed dataset of 5,037 images across eight categories, the model achieved an F1-score of 86.5%, mAP50 of 85.1%, and demonstrated strong generalization to unseen predator–pest pairs with an average F1-score of 79.6%. These results outperform several strong baselines, including ResNet-50, YOLOv8, and Vision Transformer. This work contributes a robust, interpretable approach for multi-object ecological detection and offers practical potential for deployment in smart farming systems, UAV-based monitoring, and precision pest management. Full article

(This article belongs to the Special Issue Sensor and AI Technologies in Intelligent Agriculture: 2nd Edition)

►▼ Show Figures

Figure 1

25 pages, 4145 KiB

Open AccessArticle

Advancing Early Blight Detection in Potato Leaves Through ZeroShot Learning

by Muhammad Shoaib Farooq, Ayesha Kamran, Syed Atir Raza, Muhammad Farooq Wasiq, Bilal Hassan and Nitsa J. Herzog

J. Imaging 2025, 11(8), 256; https://doi.org/10.3390/jimaging11080256 (registering DOI) - 31 Jul 2025

Abstract

Potatoes are one of the world’s most widely cultivated crops, but their yield is coming under mounting pressure from early blight, a fungal disease caused by Alternaria solani. Early detection and accurate identification are key to effective disease management and yield protection. This paper introduces a novel deep learning framework called ZeroShot CNN, which integrates convolutional neural networks (CNNs) and ZeroShot Learning (ZSL) for the efficient classification of seen and unseen disease classes. The model utilizes convolutional layers for feature extraction and employs semantic embedding techniques to identify previously untrained classes. Implemented on the Kaggle potato disease dataset, ZeroShot CNN achieved 98.50% accuracy for seen categories and 99.91% accuracy for unseen categories, outperforming conventional methods. The hybrid approach demonstrated superior generalization, providing a scalable, real-time solution for detecting agricultural diseases. The success of this solution validates the potential in harnessing deep learning and ZeroShot inference to transform plant pathology and crop protection practices. Full article

(This article belongs to the Section Image and Video Processing)

►▼ Show Figures

Figure 1

18 pages, 651 KiB

Open AccessArticle

Enhancing IoT Connectivity in Suburban and Rural Terrains Through Optimized Propagation Models Using Convolutional Neural Networks

by George Papastergiou, Apostolos Xenakis, Costas Chaikalis, Dimitrios Kosmanos and Menelaos Panagiotis Papastergiou

IoT 2025, 6(3), 41; https://doi.org/10.3390/iot6030041 (registering DOI) - 31 Jul 2025

Abstract

The widespread adoption of the Internet of Things (IoT) has driven major advancements in wireless communication, especially in rural and suburban areas where low population density and limited infrastructure pose significant challenges. Accurate Path Loss (PL) prediction is critical for the effective deployment and operation of Wireless Sensor Networks (WSNs) in such environments. This study explores the use of Convolutional Neural Networks (CNNs) for PL modeling, utilizing a comprehensive dataset collected in a smart campus setting that captures the influence of terrain and environmental variations. Several CNN architectures were evaluated based on different combinations of input features—such as distance, elevation, clutter height, and altitude—to assess their predictive accuracy. The findings reveal that CNN-based models outperform traditional propagation models (Free Space Path Loss (FSPL), Okumura–Hata, COST 231, Log-Distance), achieving lower error rates and more precise PL estimations. The best performing CNN configuration, using only distance and elevation, highlights the value of terrain-aware modeling. These results underscore the potential of deep learning techniques to enhance IoT connectivity in sparsely connected regions and support the development of more resilient communication infrastructures. Full article

►▼ Show Figures

Figure 1

24 pages, 4039 KiB

Open AccessReview

A Mathematical Survey of Image Deep Edge Detection Algorithms: From Convolution to Attention

by Gang Hu

Mathematics 2025, 13(15), 2464; https://doi.org/10.3390/math13152464 - 31 Jul 2025

Abstract

Edge detection, a cornerstone of computer vision, identifies intensity discontinuities in images, enabling applications from object recognition to autonomous navigation. This survey presents a mathematically grounded analysis of edge detection’s evolution, spanning traditional gradient-based methods, convolutional neural networks (CNNs), attention-driven architectures, transformer-backbone models, and generative paradigms. Beginning with Sobel and Canny’s kernel-based approaches, we trace the shift to data-driven CNNs like Holistically Nested Edge Detection (HED) and Bidirectional Cascade Network (BDCN), which leverage multi-scale supervision and achieve ODS (Optimal Dataset Scale) scores 0.788 and 0.806, respectively. Attention mechanisms, as in EdgeNAT (ODS 0.860) and RankED (ODS 0.824), enhance global context, while generative models like GED (ODS 0.870) achieve state-of-the-art precision via diffusion and GAN frameworks. Evaluated on BSDS500 and NYUDv2, these methods highlight a trajectory toward accuracy and robustness, yet challenges in efficiency, generalization, and multi-modal integration persist. By synthesizing mathematical formulations, performance metrics, and future directions, this survey equips researchers with a comprehensive understanding of edge detection’s past, present, and potential, bridging theoretical insights with practical advancements. Full article

(This article belongs to the Special Issue Artificial Intelligence and Algorithms with Their Applications)

►▼ Show Figures

Figure 1

13 pages, 11739 KiB

Open AccessArticle

DeepVinci: Organ and Tool Segmentation with Edge Supervision and a Densely Multi-Scale Pyramid Module for Robot-Assisted Surgery

by Li-An Tseng, Yuan-Chih Tsai, Meng-Yi Bai, Mei-Fang Li, Yi-Liang Lee, Kai-Jo Chiang, Yu-Chi Wang and Jing-Ming Guo

Diagnostics 2025, 15(15), 1917; https://doi.org/10.3390/diagnostics15151917 - 30 Jul 2025

Abstract

Background: Automated surgical navigation can be separated into three stages: (1) organ identification and localization, (2) identification of the organs requiring further surgery, and (3) automated planning of the operation path and steps. With its ideal visual and operating system, the da Vinci surgical system provides a promising platform for automated surgical navigation. This study focuses on the first step in automated surgical navigation by identifying organs in gynecological surgery. Methods: Due to the difficulty of collecting da Vinci gynecological endoscopy data, we propose DeepVinci, a novel end-to-end high-performance encoder–decoder network based on convolutional neural networks (CNNs) for pixel-level organ semantic segmentation. Specifically, to overcome the drawback of a limited field of view, we incorporate a densely multi-scale pyramid module and feature fusion module, which can also enhance the global context information. In addition, the system integrates an edge supervision network to refine the segmented results on the decoding side. Results: Experimental results show that DeepVinci can achieve state-of-the-art accuracy, obtaining dice similarity coefficient and mean pixel accuracy values of 0.684 and 0.700, respectively. Conclusions: The proposed DeepVinci network presents a practical and competitive semantic segmentation solution for da Vinci gynecological surgery. Full article

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

►▼ Show Figures

Figure 1

Show export options Show export options

Select all

Export citation of selected articles as:

Error

Oops... you haven't selected anything for export.

Displaying article 1-50 on page 1 of 221.

Go to page 1 2 3 4 5

Search Results (11,025)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI