Search Results (124)

Search Parameters:
Keywords = darknet

19 pages, 3520 KiB  
Article
Vision-Guided Maritime UAV Rescue System with Optimized GPS Path Planning and Dual-Target Tracking
by Suli Wang, Yang Zhao, Chang Zhou, Xiaodong Ma, Zijun Jiao, Zesheng Zhou, Xiaolu Liu, Tianhai Peng and Changxing Shao
Drones 2025, 9(7), 502; https://doi.org/10.3390/drones9070502 - 16 Jul 2025
Viewed by 496
Abstract
With the global increase in maritime activities, the frequency of maritime accidents has risen, underscoring the urgent need for faster and more efficient search and rescue (SAR) solutions. This study presents an intelligent unmanned aerial vehicle (UAV)-based maritime rescue system that combines GPS-driven dynamic path planning with vision-based dual-target detection and tracking. Developed within the Gazebo simulation environment and based on a modular ROS architecture, the system supports stable takeoff and smooth transitions between multi-rotor and fixed-wing flight modes. An external command module enables real-time waypoint updates. This study proposes three path-planning schemes based on the characteristics of drones. Comparative experiments demonstrated that the triangular path is the optimal route; compared with the other schemes, it reduces the flight distance by 30–40%. Robust target recognition is achieved using a darknet-ROS implementation of the YOLOv4 model, enhanced with data augmentation to improve performance in complex maritime conditions. A monocular vision-based ranging algorithm ensures accurate distance estimation and continuous tracking of rescue vessels. Furthermore, a dual-target-tracking algorithm—integrating motion prediction with color-based landing zone recognition—achieves a 96% success rate in precision landings under dynamic conditions. Experimental results show a 4% increase in the overall mission success rate compared to traditional SAR methods, along with significant gains in responsiveness and reliability. This research delivers a technically innovative and cost-effective UAV solution, offering strong potential for real-world maritime emergency response applications. Full article
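A minimal sketch of the monocular ranging idea, assuming a pinhole-camera model with a known target size; the focal length, target size, and variable names below are illustrative, not the authors' exact formulation:

```python
# Illustrative pinhole-camera range estimate (assumed formulation, not the
# paper's exact algorithm): distance scales with the known target size, the
# focal length in pixels, and the detected bounding-box height in pixels.

def estimate_distance(focal_length_px: float,
                      real_height_m: float,
                      bbox_height_px: float) -> float:
    """Estimate camera-to-target distance from a single frame."""
    if bbox_height_px <= 0:
        raise ValueError("bounding box height must be positive")
    return focal_length_px * real_height_m / bbox_height_px

# Example: a 2.5 m vessel section imaged at 80 px with an 800 px focal length
print(estimate_distance(focal_length_px=800.0, real_height_m=2.5, bbox_height_px=80.0))  # -> 25.0 m
```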

17 pages, 4666 KiB  
Article
Lightweight YOLOv5s Model for Early Detection of Agricultural Fires
by Saydirasulov Norkobil Saydirasulovich, Sabina Umirzakova, Abduazizov Nabijon Azamatovich, Sanjar Mukhamadiev, Zavqiddin Temirov, Akmalbek Abdusalomov and Young Im Cho
Fire 2025, 8(5), 187; https://doi.org/10.3390/fire8050187 - 8 May 2025
Viewed by 821
Abstract
Agricultural fires significantly threaten global food systems, ecosystems, and rural economies, necessitating timely detection to prevent widespread damage. This study presents a lightweight and enhanced version of the YOLOv5s model, optimized for early-stage agricultural fire detection. The core innovation involves deepening the C3 block and integrating DarknetBottleneck modules to extract finer visual features from subtle fire indicators such as light smoke and small flames. Experimental evaluations were conducted on a custom dataset of 3200 annotated agricultural fire images. The proposed model achieved a precision of 88.9%, a recall of 85.7%, and a mean Average Precision (mAP) of 87.3%, outperforming baseline YOLOv5s and several state-of-the-art (SOTA) detectors such as YOLOv7-tiny and YOLOv8n. The model maintains a compact size (7.5 M parameters) and real-time capability (74 FPS), making it suitable for resource-constrained deployment. Our findings demonstrate that focused architectural refinement can significantly improve early fire detection accuracy, enabling more effective response strategies and reducing agricultural losses. Full article
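A rough PyTorch sketch of a C3 block built from DarknetBottleneck modules of the kind described above; the channel widths, depth, and activation are assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ConvBNAct(nn.Module):
    """Conv -> BatchNorm -> SiLU, the basic unit of YOLOv5-style backbones."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()
    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class DarknetBottleneck(nn.Module):
    """1x1 reduce -> 3x3 expand with a residual shortcut."""
    def __init__(self, c, shortcut=True):
        super().__init__()
        self.cv1 = ConvBNAct(c, c // 2, 1)
        self.cv2 = ConvBNAct(c // 2, c, 3)
        self.add = shortcut
    def forward(self, x):
        y = self.cv2(self.cv1(x))
        return x + y if self.add else y

class C3(nn.Module):
    """CSP-style block: one branch stacks n bottlenecks, the other bypasses them."""
    def __init__(self, c, n=3):
        super().__init__()
        self.cv1 = ConvBNAct(c, c // 2, 1)
        self.cv2 = ConvBNAct(c, c // 2, 1)
        self.m = nn.Sequential(*[DarknetBottleneck(c // 2) for _ in range(n)])
        self.cv3 = ConvBNAct(c, c, 1)
    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))

x = torch.randn(1, 64, 80, 80)
print(C3(64, n=4)(x).shape)  # torch.Size([1, 64, 80, 80])
```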

21 pages, 3228 KiB  
Article
TransECA-Net: A Transformer-Based Model for Encrypted Traffic Classification
by Ziao Liu, Yuanyuan Xie, Yanyan Luo, Yuxin Wang and Xiangmin Ji
Appl. Sci. 2025, 15(6), 2977; https://doi.org/10.3390/app15062977 - 10 Mar 2025
Cited by 2 | Viewed by 2122
Abstract
Encrypted network traffic classification remains a critical component in network security monitoring. However, existing approaches face two fundamental limitations: (1) conventional methods rely on manual feature engineering and are inadequate in handling high-dimensional features; and (2) they lack the capability to capture dynamic temporal patterns. This paper introduces TransECA-Net, a novel hybrid deep learning architecture that addresses these limitations through two key innovations. First, we integrate ECA-Net modules with CNN architecture to enable automated feature extraction and efficient dimension reduction via channel selection. Second, we incorporate a Transformer encoder to model global temporal dependencies through multi-head self-attention, supplemented by residual connections for optimal gradient flow. Extensive experiments on the ISCX VPN-nonVPN dataset demonstrate the superiority of our approach. TransECA-Net achieved an average accuracy of 98.25% in classifying 12 types of encrypted traffic, outperforming classical baseline models such as 1D-CNN, CNN + LSTM, and TFE-GNN by 6.2–14.8%. Additionally, it demonstrated a 37.44–48.84% improvement in convergence speed during the training process. Our proposed framework presents a new paradigm for encrypted traffic feature disentanglement and representation learning. This paradigm enables cybersecurity systems to achieve fine-grained service identification of encrypted traffic (e.g., 98.9% accuracy in VPN traffic detection) and real-time responsiveness (48.8% faster than conventional methods), providing technical support for combating emerging cybercrimes such as monitoring illegal transactions on darknet networks and contributing significantly to adaptive network security monitoring systems. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
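The ECA channel-attention module referenced above can be sketched as follows; the kernel size and the way it is inserted into the CNN are assumptions, not the exact TransECA-Net layout:

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: global average pooling followed by a 1-D
    convolution across channels, producing per-channel gating weights."""
    def __init__(self, k_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=k_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                      # x: (B, C, H, W)
        y = self.pool(x)                       # (B, C, 1, 1)
        y = y.squeeze(-1).transpose(1, 2)      # (B, 1, C)
        y = self.conv(y)                       # local cross-channel interaction
        y = self.sigmoid(y).transpose(1, 2).unsqueeze(-1)  # (B, C, 1, 1)
        return x * y                           # reweight channels

print(ECA()(torch.randn(2, 32, 16, 16)).shape)  # torch.Size([2, 32, 16, 16])
```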

14 pages, 7611 KiB  
Article
Detection of Apple Trees in Orchard Using Monocular Camera
by Stephanie Nix, Airi Sato, Hirokazu Madokoro, Satoshi Yamamoto, Yo Nishimura and Kazuhito Sato
Agriculture 2025, 15(5), 564; https://doi.org/10.3390/agriculture15050564 - 6 Mar 2025
Viewed by 827
Abstract
This study proposes an object detector for apple trees as a first step in developing agricultural digital twins. An original dataset of orchard images was created and used to train Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) models. Performance was evaluated using mean Average Precision (mAP). YOLO significantly outperformed SSD, achieving 91.3% mAP compared to the SSD’s 46.7%. Results indicate YOLO’s Darknet-53 backbone extracts more complex features suited to tree detection. This work demonstrates the potential of deep learning for automated data collection in smart farming applications. Full article
(This article belongs to the Special Issue Innovations in Precision Farming for Sustainable Agriculture)
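Both detectors are compared by mean Average Precision; the sketch below shows how AP for a single class can be computed from ranked detections (all-point interpolation assumed; the study's exact evaluation protocol may differ):

```python
import numpy as np

def average_precision(scores, is_tp, num_gt):
    """AP for one class: rank detections by confidence, build the precision-recall
    curve, and integrate the monotone precision envelope over recall."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_tp, dtype=float)[order]
    recall = np.cumsum(tp) / max(num_gt, 1)
    precision = np.cumsum(tp) / np.arange(1, len(tp) + 1)
    # pad, then take the running maximum from the right (precision envelope)
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]
    return float(np.sum((r[1:] - r[:-1]) * p[1:]))

# Toy example: 4 detections, 3 ground-truth trees -> AP ~ 0.83
print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 1], num_gt=3))
```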

21 pages, 17670 KiB  
Article
Advancing Traffic Sign Recognition: Explainable Deep CNN for Enhanced Robustness in Adverse Environments
by Ilyass Benfaress, Afaf Bouhoute and Ahmed Zinedine
Computers 2025, 14(3), 88; https://doi.org/10.3390/computers14030088 - 4 Mar 2025
Viewed by 2047
Abstract
This paper presents a traffic sign recognition (TSR) system based on a deep convolutional neural network (CNN) architecture that proves to be highly accurate in recognizing traffic signs under challenging conditions such as bad weather, low-resolution images, and various environmental factors. The proposed CNN is compared with other architectures, including GoogLeNet, AlexNet, DarkNet-53, ResNet-34, VGG-16, and MicronNet-BF. Experimental results confirm that the proposed CNN significantly improves recognition accuracy compared to existing models. In order to make our model interpretable, we utilize explainable AI (XAI) approaches, specifically Gradient-weighted Class Activation Mapping (Grad-CAM), which give insight into how the system reaches its decisions. Evaluation on the Tsinghua-Tencent 100K (TT100K) traffic sign dataset showed that the proposed method significantly outperformed existing state-of-the-art methods. Additionally, we evaluated our model on the German Traffic Sign Recognition Benchmark (GTSRB) dataset to ensure generalization, demonstrating its ability to perform well in diverse traffic sign conditions. Distortions such as noise, contrast changes, blurring, and zoom effects were added to improve performance in real applications. These results indicate the strength and reliability of the proposed CNN architecture for TSR tasks and its suitability for integration into intelligent transportation systems (ITSs). Full article
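A minimal PyTorch sketch of Grad-CAM as used above; the model and target layer are placeholders, not the paper's proposed CNN:

```python
import torch
import torch.nn.functional as F

def grad_cam(model, image, target_layer, class_idx=None):
    """Grad-CAM: weight the target layer's feature maps by the spatial mean of
    the class score's gradients, then ReLU and normalize to a heatmap."""
    feats, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))
    try:
        logits = model(image)                       # image: (1, 3, H, W)
        if class_idx is None:
            class_idx = int(logits.argmax(dim=1))
        model.zero_grad()
        logits[0, class_idx].backward()
        weights = grads["a"].mean(dim=(2, 3), keepdim=True)        # (1, C, 1, 1)
        cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    finally:
        h1.remove(); h2.remove()
    return cam.squeeze().detach()

# Usage with any torchvision-style classifier (placeholder names):
# heatmap = grad_cam(cnn, preprocessed_sign, cnn.layer4)  # overlay on the input image
```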

28 pages, 14219 KiB  
Article
Classification and Analysis of Agaricus bisporus Diseases with Pre-Trained Deep Learning Models
by Umit Albayrak, Adem Golcuk, Sinan Aktas, Ugur Coruh, Sakir Tasdemir and Omer Kaan Baykan
Agronomy 2025, 15(1), 226; https://doi.org/10.3390/agronomy15010226 - 17 Jan 2025
Cited by 2 | Viewed by 1629
Abstract
This research evaluates 20 advanced convolutional neural network (CNN) architectures for classifying mushroom diseases in Agaricus bisporus, utilizing a custom dataset of 3195 images (2464 infected and 731 healthy mushrooms) captured under uniform white-light conditions. The consistent illumination in the dataset enhances the robustness and practical usability of the assessed models. Using a weighted scoring system that incorporates precision, recall, F1-score, area under the ROC curve (AUC), and average precision (AP), ResNet-50 achieved the highest overall score of 99.70%, demonstrating outstanding performance across all disease categories. DenseNet-201 and DarkNet-53 followed closely, confirming their reliability in classification tasks with high recall and precision values. Confusion matrices and ROC curves further validated the classification capabilities of the models. These findings underscore the potential of CNN-based approaches for accurate and efficient early detection of mushroom diseases, contributing to more sustainable and data-driven agricultural practices. Full article
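The weighted scoring that ranks the networks can be illustrated as below; equal metric weights and a binary (infected vs. healthy) toy example are assumed here, since the abstract does not state the study's actual weighting:

```python
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             roc_auc_score, average_precision_score)

def weighted_model_score(y_true, y_pred, y_prob, weights=(0.2, 0.2, 0.2, 0.2, 0.2)):
    """Combine precision, recall, F1, AUC, and AP into one ranking score.
    Equal weights are assumed; the study's actual weighting is not given."""
    metrics = (
        precision_score(y_true, y_pred),
        recall_score(y_true, y_pred),
        f1_score(y_true, y_pred),
        roc_auc_score(y_true, y_prob),
        average_precision_score(y_true, y_prob),
    )
    return sum(w * m for w, m in zip(weights, metrics))

# y_true/y_pred are binary labels; y_prob are predicted probabilities
print(weighted_model_score([1, 0, 1, 1, 0], [1, 0, 1, 0, 0], [0.9, 0.2, 0.8, 0.4, 0.1]))
```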

14 pages, 2060 KiB  
Article
Detection of Acromion Types in Shoulder Magnetic Resonance Image Examination with Developed Convolutional Neural Network and Textural-Based Content-Based Image Retrieval System
by Mehmet Akçiçek, Mücahit Karaduman, Bülent Petik, Serkan Ünlü, Hursit Burak Mutlu and Muhammed Yildirim
J. Clin. Med. 2025, 14(2), 505; https://doi.org/10.3390/jcm14020505 - 14 Jan 2025
Viewed by 1189
Abstract
Background: The morphological type of the acromion may play a role in the etiopathogenesis of various pathologies, such as shoulder impingement syndrome and rotator cuff disorders. Therefore, it is important to determine the acromion’s morphological types accurately and quickly. This study aimed to detect the acromion shape, one of the etiological causes of chronic shoulder disorders that can reduce work capacity and quality of life, on shoulder MR images by developing a new image retrieval model for Content-Based Image Retrieval (CBIR) systems. Methods: Image retrieval was performed in a CBIR system built on Convolutional Neural Network (CNN) architectures and textural methods. Feature maps of the images were extracted to measure image similarities in the developed CBIR system. Features were extracted with Histogram of Oriented Gradients (HOG), Local Binary Pattern (LBP), Darknet53, and DenseNet201 architectures, and the Minimum Redundancy Maximum Relevance (mRMR) method was used for feature selection. The feature maps obtained after dimensionality reduction were combined. Euclidean distance and Peak Signal-to-Noise Ratio (PSNR) were used as similarity measures. For comparison, image retrieval was also performed using features obtained from individual CNN architectures and textural models. Results: The proposed model reached its highest Average Precision (AP) of 0.76 with the PSNR similarity measure. Conclusions: The proposed model is promising for accurately and rapidly determining morphological types of the acromion, thus aiding in the diagnosis and understanding of chronic shoulder disorders. Full article
(This article belongs to the Section Nuclear Medicine & Radiology)
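The retrieval step, ranking database images by Euclidean distance to a query's fused feature vector, can be sketched as follows; the feature fusion is simplified here, and PSNR-based similarity is handled analogously in the study:

```python
import numpy as np

def retrieve_top_k(query_vec, db_vecs, db_ids, k=5):
    """Rank database images by Euclidean distance to the query's fused
    (HOG + LBP + CNN, mRMR-selected) feature vector; smaller distance = more similar."""
    dists = np.linalg.norm(db_vecs - query_vec, axis=1)
    order = np.argsort(dists)[:k]
    return [(db_ids[i], float(dists[i])) for i in order]

# Toy example with 4 database images and 8-dimensional fused features
rng = np.random.default_rng(0)
db = rng.normal(size=(4, 8))
print(retrieve_top_k(db[2] + 0.01, db, ["type1_a", "type2_b", "type2_c", "type3_d"], k=2))
```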

20 pages, 1343 KiB  
Article
Fast Design Space Exploration for Always-On Neural Networks
by Jeonghun Kim and Sunggu Lee
Electronics 2024, 13(24), 4971; https://doi.org/10.3390/electronics13244971 - 17 Dec 2024
Viewed by 829
Abstract
An analytical model can quickly predict performance and energy efficiency based on information about the neural network model and neural accelerator architecture, making it ideal for rapid pre-synthesis design space exploration. This paper proposes a new analytical model specifically targeted for convolutional neural networks used in always-on applications. To validate the proposed model, the performance and energy efficiency estimated by the model were compared with actual hardware and post-synthesis gate-level simulations of hardware synthesized with a state-of-the-art electronic design automation (EDA) synthesis tool. Comparisons with hardware created for the Eyeriss neural accelerator showed average execution time and energy consumption error rates of 3.33% and 13.54%, respectively. Comparisons with hardware synthesis results showed an error of 3.18% to 9.44% for two example neural accelerator configurations used to execute MobileNet, EfficientNet, and DarkNet neural network models. Finally, the utility of the proposed model was demonstrated by using it to evaluate the effects of different channel sizes, pruning rates, and batch sizes in several neural network designs for always-on vision, text, and audio processing. Full article
(This article belongs to the Section Artificial Intelligence)
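A toy version of such an analytical estimate is sketched below, approximating layer latency from MAC counts and PE-array throughput; this is purely illustrative and ignores the dataflow, memory hierarchy, and energy terms the paper models:

```python
import math

def conv_layer_cycles(h_out, w_out, c_in, c_out, k, num_pes, macs_per_pe_per_cycle=1):
    """Rough cycle estimate for one convolutional layer on a PE array:
    total MACs divided by peak MAC throughput (100% utilization assumed)."""
    total_macs = h_out * w_out * c_out * c_in * k * k
    return math.ceil(total_macs / (num_pes * macs_per_pe_per_cycle))

def network_latency_ms(layers, num_pes, clock_hz=200e6):
    cycles = sum(conv_layer_cycles(*layer, num_pes=num_pes) for layer in layers)
    return 1e3 * cycles / clock_hz

# (h_out, w_out, c_in, c_out, k) for two example layers
layers = [(112, 112, 3, 32, 3), (56, 56, 32, 64, 3)]
print(f"{network_latency_ms(layers, num_pes=168):.2f} ms on a 168-PE array at 200 MHz")
```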

14 pages, 2268 KiB  
Article
Enhanced Occupational Safety in Agricultural Machinery Factories: Artificial Intelligence-Driven Helmet Detection Using Transfer Learning and Majority Voting
by Simge Özüağ and Ömer Ertuğrul
Appl. Sci. 2024, 14(23), 11278; https://doi.org/10.3390/app142311278 - 3 Dec 2024
Cited by 2 | Viewed by 1455
Abstract
The objective of this study was to develop an artificial intelligence (AI)-driven model for the detection of helmet usage among workers in tractor and agricultural machinery factories with the aim of enhancing occupational safety. A transfer learning approach was employed, utilizing nine pre-trained neural networks for the extraction of deep features. The following neural networks were employed: MobileNetV2, ResNet50, DarkNet53, AlexNet, ShuffleNet, DenseNet201, InceptionV3, Inception-ResNetV2, and GoogLeNet. Subsequently, the extracted features were subjected to iterative neighborhood component analysis (INCA) for feature selection, after which they were classified using the k-nearest neighbor (kNN) algorithm. The classification outputs of all networks were combined through iterative majority voting (IMV) to achieve optimal results. To evaluate the model, an image dataset comprising 662 images of individuals wearing helmets and 722 images of individuals without helmets sourced from the internet was constructed. The proposed model achieved an accuracy of 90.39%, with DenseNet201 producing the most accurate results. This AI-driven helmet detection model demonstrates significant potential in improving occupational safety by assisting safety officers, especially in confined environments, reducing human error, and enhancing efficiency. Full article
(This article belongs to the Section Agricultural Science and Technology)
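The pipeline of deep features, kNN classification, and voting across backbones can be sketched as below; INCA feature selection is omitted and simple (rather than iterative) majority voting is assumed for brevity:

```python
import numpy as np
from collections import Counter
from sklearn.neighbors import KNeighborsClassifier

def majority_vote_predict(feature_sets_train, feature_sets_test, y_train, k=1):
    """Fit one kNN per backbone's feature matrix and combine test predictions
    by majority voting (the study uses iterative majority voting, IMV)."""
    votes = []
    for X_tr, X_te in zip(feature_sets_train, feature_sets_test):
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_train)
        votes.append(clf.predict(X_te))
    votes = np.stack(votes, axis=0)              # (num_backbones, num_test_samples)
    return np.array([Counter(votes[:, i]).most_common(1)[0][0] for i in range(votes.shape[1])])

# Toy run: three "backbones", 6 training and 2 test images, labels 0 = no helmet, 1 = helmet
rng = np.random.default_rng(1)
train = [rng.normal(size=(6, 16)) for _ in range(3)]
test = [f[:2] + 0.05 for f in train]
print(majority_vote_predict(train, test, y_train=[0, 0, 0, 1, 1, 1]))
```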

18 pages, 3041 KiB  
Article
Enhancing Traffic Accident Severity Prediction Using ResNet and SHAP for Interpretability
by Ilyass Benfaress, Afaf Bouhoute and Ahmed Zinedine
AI 2024, 5(4), 2568-2585; https://doi.org/10.3390/ai5040124 - 29 Nov 2024
Cited by 1 | Viewed by 3126
Abstract
Background/Objectives: This paper presents a Residual Neural Network (ResNet) based framework tailored for structured traffic accident data, aiming to improve accident severity prediction. The proposed model leverages residual learning to effectively model intricate relationships between numerical and categorical variables, resulting in a notable increase in prediction accuracy. Methods: A comparative analysis was performed with other Deep Learning (DL) architectures, including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Darknet, and Extreme Inception (Xception), showing superior performance of the proposed Resnet. Key factors influencing accident severity were identified, with Shapley Additive Explanations (SHAP) values helping to address the need for transparent and explainable Artificial Intelligence (AI) in critical decision-making areas. Results: The generalizability of the ResNet model was assessed by training it, initially, on a UK road accidents dataset and validating it on a distinct dataset from India. The model consistently demonstrated high predictive accuracy, underscoring its robustness across diverse contexts, despite regional differences. Conclusions: These results suggest that the adapted ResNet model could significantly enhance traffic safety evaluations and contribute to the formulation of more effective traffic management strategies. Full article
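A residual block adapted to structured (tabular) accident features might look like the sketch below; the layer widths and the three-class output are assumptions, not the paper's exact design. SHAP values could then be computed on such a model with a deep-learning explainer.

```python
import torch
import torch.nn as nn

class TabularResidualBlock(nn.Module):
    """Two fully connected layers with batch norm and a skip connection,
    so deeper stacks can model feature interactions without losing the identity path."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU(),
            nn.Linear(hidden, dim), nn.BatchNorm1d(dim),
        )
        self.act = nn.ReLU()
    def forward(self, x):
        return self.act(x + self.net(x))

# numeric features concatenated with encoded categorical features -> severity classes
model = nn.Sequential(
    nn.Linear(24, 64),
    TabularResidualBlock(64, 128),
    TabularResidualBlock(64, 128),
    nn.Linear(64, 3),            # e.g., slight / serious / fatal (illustrative)
)
print(model(torch.randn(8, 24)).shape)  # torch.Size([8, 3])
```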

20 pages, 2772 KiB  
Article
Activities of Daily Living Object Dataset: Advancing Assistive Robotic Manipulation with a Tailored Dataset
by Md Tanzil Shahria and Mohammad H. Rahman
Sensors 2024, 24(23), 7566; https://doi.org/10.3390/s24237566 - 27 Nov 2024
Cited by 2 | Viewed by 1486
Abstract
The increasing number of individuals with disabilities—over 61 million adults in the United States alone—underscores the urgent need for technologies that enhance autonomy and independence. Among these individuals, millions rely on wheelchairs and often require assistance from another person with activities of daily living (ADLs), such as eating, grooming, and dressing. Wheelchair-mounted assistive robotic arms offer a promising solution to enhance independence, but their complex control interfaces can be challenging for users. Automating control through deep learning-based object detection models presents a viable pathway to simplify operation, yet progress is impeded by the absence of specialized datasets tailored for ADL objects suitable for robotic manipulation in home environments. To bridge this gap, we present a novel ADL object dataset explicitly designed for training deep learning models in assistive robotic applications. We curated over 112,000 high-quality images from four major open-source datasets—COCO, Open Images, LVIS, and Roboflow Universe—focusing on objects pertinent to daily living tasks. Annotations were standardized to the YOLO Darknet format, and data quality was enhanced through a rigorous filtering process involving a pre-trained YOLOv5x model and manual validation. Our dataset provides a valuable resource that facilitates the development of more effective and user-friendly semi-autonomous control systems for assistive robots. By offering a focused collection of ADL-related objects, we aim to advance assistive technologies that empower individuals with mobility impairments, addressing a pressing societal need and laying the foundation for future innovations in human–robot interaction within home settings. Full article
(This article belongs to the Special Issue Vision Sensors for Object Detection and Tracking)
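Standardizing annotations to the YOLO Darknet format means one text file per image, each line holding a class index and a box center and size normalized to the image dimensions; a minimal converter from pixel-coordinate boxes is sketched below (the class index is illustrative):

```python
def to_yolo_darknet(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to a YOLO Darknet label line:
    '<class> <x_center> <y_center> <width> <height>', all normalized to [0, 1]."""
    x_c = (x_min + x_max) / 2.0 / img_w
    y_c = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# e.g., a cup annotated in a 640x480 kitchen image (class index 7 is illustrative)
print(to_yolo_darknet(7, 120, 200, 220, 320, img_w=640, img_h=480))
```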

20 pages, 1025 KiB  
Article
Empirical Evaluation and Analysis of YOLO Models in Smart Transportation
by Lan Anh Nguyen, Manh Dat Tran and Yongseok Son
AI 2024, 5(4), 2518-2537; https://doi.org/10.3390/ai5040122 - 26 Nov 2024
Cited by 5 | Viewed by 2071
Abstract
You Only Look Once (YOLO) and its variants have emerged as the most popular real-time object detection algorithms. They have been widely used in real-time smart transportation applications due to their low-latency detection and high accuracy. However, because of the diverse characteristics of YOLO models, selecting the optimal model according to various applications and environments in smart transportation is critical. In this article, we conduct an empirical evaluation and analysis study for most YOLO versions to assess their performance in smart transportation. To achieve this, we first measure the average precision of YOLO models across multiple datasets (i.e., COCO and PASCAL VOC). Second, we analyze the performance of YOLO models on multiple object categories within each dataset, focusing on classes relevant to road transportation such as those commonly used in smart transportation applications. Third, multiple Intersection over Union (IoU) thresholds are considered in our performance measurement and analysis. By examining the performance of various YOLO models across datasets, IoU thresholds, and object classes, we make six observations on these three aspects while aiming to identify optimal models for road transportation scenarios. It was found that YOLOv5 and YOLOv8 outperform other models in all three aspects due to their novel performance features. For instance, YOLOv5 achieves stable performance thanks to its cross-stage partial Darknet-53 (CSPDarknet53) backbone, auto-anchor mechanism, and efficient loss functions including IoU loss, complete IoU loss, focal loss, and gradient harmonizing mechanism loss. Similarly, YOLOv8 outperforms others with its upgraded CSPDarknet53 backbone, anchor-free mechanism, and efficient loss functions like complete IoU loss and distribution focal loss. Full article
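The IoU thresholds used in the evaluation reduce to the standard box-overlap ratio; a reference implementation is sketched below:

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

# A detection matches a ground-truth vehicle only if IoU clears the chosen threshold
print(iou((10, 10, 60, 60), (30, 30, 80, 80)) >= 0.5)  # False: IoU here is about 0.22
```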

13 pages, 5271 KiB  
Article
Visualizing Plant Disease Distribution and Evaluating Model Performance for Deep Learning Classification with YOLOv8
by Abdul Ghafar, Caikou Chen, Syed Atif Ali Shah, Zia Ur Rehman and Gul Rahman
Pathogens 2024, 13(12), 1032; https://doi.org/10.3390/pathogens13121032 - 22 Nov 2024
Cited by 3 | Viewed by 1961
Abstract
This paper presents a novel methodology for plant disease detection using YOLOv8 (You Only Look Once version 8), a state-of-the-art object detection model designed for real-time image classification and recognition tasks. The proposed approach involves training a custom YOLOv8 model to detect and classify various plant conditions accurately. The model was evaluated using a testing subset to measure its performance in detecting different plant diseases. To ensure the model’s robustness and generalizability beyond the training dataset, it was further tested on a set of unseen images sourced from Google Images. This additional testing aimed to assess the model’s effectiveness in real-world scenarios, where it might encounter new data. The evaluation results were promising, demonstrating the model’s capability to classify plant conditions, such as diseases, with high accuracy. Moreover, the use of YOLOv8 offers significant improvements in speed and precision, making it suitable for real-time plant disease monitoring applications. The findings highlight the potential of this methodology for broader agricultural applications, including early disease detection and prevention. Full article
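Training a custom YOLOv8 detector of this kind typically goes through the ultralytics package; a minimal sketch follows, with the dataset YAML path, epoch count, and image size as placeholders rather than the paper's settings:

```python
from ultralytics import YOLO

# Start from pre-trained weights and fine-tune on a plant-disease dataset
# described by a YOLO-format data YAML (paths below are placeholders).
model = YOLO("yolov8n.pt")
model.train(data="plant_disease.yaml", epochs=100, imgsz=640)

# Evaluate on the validation split, then run inference on unseen web images
metrics = model.val()
results = model.predict("unseen_google_images/", conf=0.25)
```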

17 pages, 2432 KiB  
Article
Non-Destructive Estimation of Paper Fiber Using Macro Images: A Comparative Evaluation of Network Architectures and Patch Sizes for Patch-Based Classification
by Naoki Kamiya, Kosuke Ashino, Yasuhiro Sakai, Yexin Zhou, Yoichi Ohyanagi and Koji Shibazaki
NDT 2024, 2(4), 487-503; https://doi.org/10.3390/ndt2040030 - 7 Nov 2024
Viewed by 1067
Abstract
Over the years, research in the field of cultural heritage preservation and document analysis has exponentially grown. In this study, we propose an advanced approach for non-destructive estimation of paper fibers using macro images. Expanding on studies that implemented EfficientNet-B0, we explore the effectiveness of six other deep learning networks, including DenseNet-201, DarkNet-53, Inception-v3, Xception, Inception-ResNet-v2, and NASNet-Large, in conjunction with enlarged patch sizes. We experimentally classified three types of paper fibers, namely, kozo, mitsumata, and gampi. During the experiments, patch sizes of 500, 750, and 1000 pixels were evaluated and their impact on classification accuracy was analyzed. The experiments demonstrated that Inception-ResNet-v2 with 1000-pixel patches achieved the highest patch classification accuracy of 82.7%, whereas Xception with 750-pixel patches exhibited the best macro-image-based fiber estimation performance at 84.9%. Additionally, we assessed the efficacy of the method for images containing text, observing consistent improvements in the case of larger patch sizes. However, limitations exist in background patch availability for text-heavy images. This comprehensive evaluation of network architectures and patch sizes can significantly advance the field of non-destructive paper analysis, offering valuable insights into future developments in historical document examination and conservation science. Full article
(This article belongs to the Special Issue Advances in Imaging-Based NDT Methods)
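Patch-based estimation of a whole macro image amounts to cropping patches, classifying each, and aggregating the per-patch predictions; a simplified majority-vote aggregation is sketched below (the study's exact aggregation rule may differ):

```python
from collections import Counter

def crop_patches(img_w, img_h, patch):
    """Yield top-left corners of non-overlapping patch x patch crops that fit the image."""
    for y in range(0, img_h - patch + 1, patch):
        for x in range(0, img_w - patch + 1, patch):
            yield x, y

def estimate_fiber(patch_labels):
    """Macro-image decision: majority vote over per-patch predictions
    ('kozo', 'mitsumata', or 'gampi')."""
    return Counter(patch_labels).most_common(1)[0][0]

print(len(list(crop_patches(4000, 3000, patch=750))))                  # patches per macro image
print(estimate_fiber(["kozo", "gampi", "kozo", "kozo", "mitsumata"]))   # 'kozo'
```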

8 pages, 2328 KiB  
Proceeding Paper
Object Detection for Autonomous Logistics: A YOLOv4 Tiny Approach with ROS Integration and LOCO Dataset Evaluation
by Souhaila Khalfallah, Mohamed Bouallegue and Kais Bouallegue
Eng. Proc. 2024, 67(1), 65; https://doi.org/10.3390/engproc2024067065 - 12 Oct 2024
Cited by 4 | Viewed by 1549
Abstract
This paper presents an object detection model for logistics-centered objects deployed and used by autonomous warehouse robots. Using the Robot Operating System (ROS) infrastructure, our work leverages existing models and a dedicated dataset to create a system that meets the requirements of Autonomous Mobile Robots (AMRs). We describe an innovative method whose primary emphasis is the Logistics Objects in Context (LOCO) dataset, focusing on training the model and determining optimal performance and accuracy for the object detection task. Using neural networks as pattern recognition tools, we took advantage of the one-stage YOLO detection architecture, which prioritizes speed and accuracy. Focusing on a lightweight variant of this architecture, YOLOv4 Tiny, we were able to optimize for deployment on resource-constrained edge devices without compromising detection accuracy, resulting in a significant performance boost over previous benchmarks. The YOLOv4 Tiny model was implemented with Darknet, chosen for its adaptability to the ROS Melodic framework and its suitability for edge devices. Notably, our network achieved a mean average precision (mAP) of 46% at an intersection over union (IoU) threshold of 50%, surpassing the baseline metrics established by the initial LOCO study. These results demonstrate a significant improvement in performance and accuracy for real-world logistics applications of AMRs. Our contribution lies in providing valuable insights into the capabilities of AMRs within the logistics environment, thus paving the way for further advancements in this field. Full article
(This article belongs to the Proceedings of The 3rd International Electronic Conference on Processes)
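On the ROS side, detections from a Darknet-based YOLO node are typically consumed by subscribing to a bounding-box topic; the sketch below assumes the darknet_ros package's BoundingBoxes message and its default topic name, which may differ in a given setup:

```python
#!/usr/bin/env python
import rospy
from darknet_ros_msgs.msg import BoundingBoxes  # assumed: darknet_ros message package

def on_detections(msg):
    """Log each detected logistics object (pallet, forklift, ...) with its confidence."""
    for box in msg.bounding_boxes:
        rospy.loginfo("%s (%.2f): [%d, %d, %d, %d]",
                      box.Class, box.probability, box.xmin, box.ymin, box.xmax, box.ymax)

if __name__ == "__main__":
    rospy.init_node("loco_detection_listener")
    rospy.Subscriber("/darknet_ros/bounding_boxes", BoundingBoxes, on_detections)
    rospy.spin()
```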
