Search Results (49)

Search Parameters:
Keywords = ORB-SIFT

22 pages, 7096 KB  
Article
An Improved ORB-KNN-Ratio Test Algorithm for Robust Underwater Image Stitching on Low-Cost Robotic Platforms
by Guanhua Yi, Tianxiang Zhang, Yunfei Chen and Dapeng Yu
J. Mar. Sci. Eng. 2026, 14(2), 218; https://doi.org/10.3390/jmse14020218 - 21 Jan 2026
Viewed by 101
Abstract
Underwater optical images often exhibit severe color distortion, weak texture, and uneven illumination due to light absorption and scattering in water. These issues result in unstable feature detection and inaccurate image registration. To address these challenges, this paper proposes an underwater image stitching method that integrates ORB (Oriented FAST and Rotated BRIEF) feature extraction with a fixed-ratio constraint matching strategy. First, lightweight color and contrast enhancement techniques are employed to restore color balance and improve local texture visibility. Then, ORB descriptors are extracted and matched via a KNN (K-Nearest Neighbors) search, and Lowe's ratio test is applied to eliminate false matches caused by weak texture similarity. Finally, the geometric transformation between image frames is estimated by incorporating robust optimization, ensuring stable homography computation. Experimental results on real underwater datasets show that the proposed method significantly improves stitching continuity and structural consistency, achieving 40–120% improvements in SSIM (Structural Similarity Index) and PSNR (peak signal-to-noise ratio) over conventional Harris–ORB + KNN, SIFT (scale-invariant feature transform) + BF (brute force), SIFT + KNN, and AKAZE (accelerated KAZE) + BF methods while maintaining processing times within one second. These results indicate that the proposed method is well-suited for real-time underwater environment perception and panoramic mapping on low-cost, micro-sized underwater robotic platforms.
(This article belongs to the Section Ocean Engineering)
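
Below is a minimal sketch of the general pattern this abstract describes (ORB extraction, k-NN matching with Lowe's ratio test, and robust homography estimation with RANSAC) using standard OpenCV calls; the file names, feature budget, and thresholds are illustrative placeholders rather than the authors' settings.

```python
import cv2
import numpy as np

# Two overlapping, already enhanced frames; the paths are placeholders.
img1 = cv2.imread("frame_left.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_right.png", cv2.IMREAD_GRAYSCALE)

# ORB keypoints and binary descriptors.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# k-NN search (k=2) under Hamming distance, then Lowe's ratio test
# to discard ambiguous matches from weakly textured regions.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
knn = matcher.knnMatch(des1, des2, k=2)
good = [p[0] for p in knn if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]

# Robust homography estimation; RANSAC rejects the remaining outliers.
src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=5.0)

# Warp one frame into the other's coordinates and paste as a naive stitch.
h, w = img2.shape
panorama = cv2.warpPerspective(img1, H, (2 * w, h))
panorama[0:h, 0:w] = img2
```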

16 pages, 87659 KB  
Article
UAV-TIRVis: A Benchmark Dataset for Thermal–Visible Image Registration from Aerial Platforms
by Costin-Emanuel Vasile, Călin Bîră and Radu Hobincu
J. Imaging 2025, 11(12), 432; https://doi.org/10.3390/jimaging11120432 - 4 Dec 2025
Viewed by 779
Abstract
Registering UAV-based thermal and visible images is a challenging task due to differences in appearance across spectra and the lack of public benchmarks. To address this issue, we introduce UAV-TIRVis, a dataset consisting of 80 accurately and manually registered UAV-based thermal (640 × 512) and visible (4K) image pairs, captured across diverse environments. We benchmark our dataset using well-known registration methods, including feature-based (ORB, SURF, SIFT, KAZE), correlation-based, and intensity-based methods, as well as a custom, heuristic intensity-based method. We evaluate the performance of these methods using four metrics: RMSE, PSNR, SSIM, and NCC, averaged per scenario and across the entire dataset. The results show that conventional methods often fail to generalize across scenes, yielding <0.6 NCC on average, whereas the heuristic method achieves 0.77 SSIM and 0.82 NCC, highlighting the difficulty of cross-spectral UAV alignment and the need for further research to improve optimization in existing registration methods.
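
For readers reproducing this kind of benchmark, the four reported metrics can be computed for an aligned pair along these lines; a sketch assuming same-shape grayscale arrays scaled to [0, 1], with PSNR and SSIM from scikit-image and RMSE and NCC written out in NumPy:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def registration_metrics(ref, warped):
    """Compute RMSE, PSNR, SSIM, and NCC for two aligned grayscale images
    in [0, 1] of identical shape (an assumption of this sketch)."""
    ref = ref.astype(np.float64)
    warped = warped.astype(np.float64)
    rmse = np.sqrt(np.mean((ref - warped) ** 2))
    psnr = peak_signal_noise_ratio(ref, warped, data_range=1.0)
    ssim = structural_similarity(ref, warped, data_range=1.0)
    # Zero-mean normalized cross-correlation.
    a = ref - ref.mean()
    b = warped - warped.mean()
    ncc = (a * b).sum() / (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + 1e-12)
    return rmse, psnr, ssim, ncc
```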

20 pages, 10851 KB  
Article
Evaluating Feature-Based Homography Pipelines for Dual-Camera Registration in Acupoint Annotation
by Thathsara Nanayakkara, Hadi Sedigh Malekroodi, Jaeuk Sul, Chang-Su Na, Myunggi Yi and Byeong-il Lee
J. Imaging 2025, 11(11), 388; https://doi.org/10.3390/jimaging11110388 - 1 Nov 2025
Viewed by 885
Abstract
Reliable acupoint localization is essential for developing artificial intelligence (AI) and extended reality (XR) tools in traditional Korean medicine; however, conventional annotation of 2D images often suffers from inter- and intra-annotator variability. This study presents a low-cost dual-camera imaging system that fuses infrared (IR) and RGB views on a Raspberry Pi 5 platform, incorporating an IR ink pen in conjunction with a 780 nm emitter array to standardize point visibility. Among the tested marking materials, the IR ink showed the highest contrast and visibility under IR illumination, making it the most suitable for acupoint detection. Five feature detectors (SIFT, ORB, KAZE, AKAZE, and BRISK) were evaluated with two matchers (FLANN and BF) to construct representative homography pipelines. Comparative evaluations across multiple camera-to-surface distances revealed that KAZE + FLANN achieved the lowest mean 2D error (1.17 ± 0.70 px) and the lowest mean aspect-aware error (0.08 ± 0.05%) while remaining computationally feasible on the Raspberry Pi 5. In hand-image experiments across multiple postures, the dual-camera registration maintained a mean 2D error below ~3 px and a mean aspect-aware error below ~0.25%, confirming stable and reproducible performance. The proposed framework provides a practical foundation for generating high-quality acupoint datasets, supporting future AI-based localization, XR integration, and automated acupuncture-education systems.
(This article belongs to the Section Computer Vision and Pattern Recognition)
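
A sketch of the best-performing combination reported above (KAZE + FLANN) in OpenCV; KAZE produces floating-point descriptors, so FLANN can use its KD-tree index. The ratio and reprojection thresholds are illustrative assumptions:

```python
import cv2
import numpy as np

def kaze_flann_homography(img_ir, img_rgb, ratio=0.7):
    """Estimate an IR-to-RGB homography with KAZE + FLANN.
    Parameter values here are illustrative, not the paper's settings."""
    kaze = cv2.KAZE_create()
    kp1, des1 = kaze.detectAndCompute(img_ir, None)
    kp2, des2 = kaze.detectAndCompute(img_rgb, None)
    # KAZE descriptors are floating point, so a KD-tree index applies.
    flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
    knn = flann.knnMatch(des1, des2, k=2)
    good = [p[0] for p in knn if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H

# A point marked in the IR view could then be projected into the RGB view with:
# rgb_pt = cv2.perspectiveTransform(ir_pt.reshape(-1, 1, 2), H)
```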

36 pages, 4464 KB  
Article
Efficient Image-Based Memory Forensics for Fileless Malware Detection Using Texture Descriptors and LIME-Guided Deep Learning
by Qussai M. Yaseen, Esraa Oudat, Monther Aldwairi and Salam Fraihat
Computers 2025, 14(11), 467; https://doi.org/10.3390/computers14110467 - 1 Nov 2025
Viewed by 1404
Abstract
Memory forensics is an essential cybersecurity tool that comprehensively examines volatile memory to detect the malicious activity of fileless malware that can bypass disk analysis. Image-based detection techniques provide a promising solution by visualizing memory data as images that can be analyzed with image processing tools and machine learning methods. However, effective image-based detection and classification requires high computational effort. This paper investigates the efficacy of texture-based methods in detecting and classifying memory-resident or fileless malware using different image resolutions, identifying the best feature descriptors, classifiers, and resolutions that accurately classify malware into specific families and differentiate them from benign software. Moreover, this paper uses both local and global descriptors, where the local descriptors include Oriented FAST and Rotated BRIEF (ORB), Scale-Invariant Feature Transform (SIFT), and Histogram of Oriented Gradients (HOG), and the global descriptors include Discrete Wavelet Transform (DWT), GIST, and Gray Level Co-occurrence Matrix (GLCM). The results indicate that as image resolution increases, most feature descriptors yield more discriminative features but require higher computational effort in terms of time and processing resources. To address this challenge, this paper proposes a novel approach that integrates Local Interpretable Model-agnostic Explanations (LIME) with deep learning models to automatically identify and crop the most important regions of memory images. The LIME ROI was extracted separately from the ResNet50 and MobileNet models' predictions; the images were resized to 128 × 128, and the sampling process was performed dynamically to speed up LIME computation. The ROIs of the images are cropped into new 100 × 100 images in two stages: a coarse stage and a fine stage. The two LIME-based cropped images generated using ResNet50 and MobileNet are fed to a lightweight neural network to evaluate the effectiveness of the LIME-identified regions. The results demonstrate that LIME-based cropping guided by the MobileNet model's predictions improves the model's efficiency while preserving important features, achieving a classification accuracy of 85% on multi-class classification.
(This article belongs to the Special Issue Using New Technologies in Cyber Security Solutions (2nd Edition))
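
A sketch of the LIME-guided cropping idea using the lime package: explain the model's prediction for a memory image, take the most influential superpixels, and crop a fixed-size window around them. The predict_fn wrapper, sampling budget, and center-crop fallback are assumptions of this sketch, not the paper's exact two-stage procedure:

```python
import numpy as np
from lime import lime_image

def lime_roi_crop(image_128, predict_fn, out_size=100, num_samples=300):
    """Crop the region LIME marks as most influential for the model's
    prediction. `predict_fn` maps a batch of RGB images to class
    probabilities (e.g., a MobileNet wrapper); the 128 -> 100 sizes follow
    the paper, but the sampling budget here is an illustrative choice."""
    explainer = lime_image.LimeImageExplainer()
    explanation = explainer.explain_instance(
        image_128, predict_fn, top_labels=1, hide_color=0,
        num_samples=num_samples)
    _, mask = explanation.get_image_and_mask(
        explanation.top_labels[0], positive_only=True,
        num_features=5, hide_rest=False)
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        # Fallback assumed by this sketch: center crop when no region is found.
        cy = cx = image_128.shape[0] // 2
    else:
        cy, cx = int(ys.mean()), int(xs.mean())
    half = out_size // 2
    y0 = min(max(cy - half, 0), image_128.shape[0] - out_size)
    x0 = min(max(cx - half, 0), image_128.shape[1] - out_size)
    return image_128[y0:y0 + out_size, x0:x0 + out_size]
```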

16 pages, 7958 KB  
Article
Development and Evaluation of a Keypoint-Based Video Stabilization Pipeline for Oral Capillaroscopy
by Vito Gentile, Vincenzo Taormina, Luana Conte, Giorgio De Nunzio, Giuseppe Raso and Donato Cascio
Sensors 2025, 25(18), 5738; https://doi.org/10.3390/s25185738 - 15 Sep 2025
Viewed by 859
Abstract
Capillaroscopy imaging is a non-invasive technique used to examine the microcirculation of the oral mucosa. However, the acquired video sequences are often affected by motion noise and shaking, which can compromise diagnostic accuracy and hinder the development of automated systems for capillary identification and segmentation. To address these challenges, we implemented a comprehensive video stabilization model, structured as a multi-phase pipeline and visually represented through a flow-chart. The proposed method integrates keypoint extraction, optical flow estimation, and affine transformation-based frame alignment to enhance video stability. Within this framework, we evaluated the performance of three keypoint extraction algorithms—Scale-Invariant Feature Transform (SIFT), Oriented FAST and Rotated BRIEF (ORB) and Good Features to Track (GFTT)—on a curated dataset of oral capillaroscopy videos. To simulate real-world acquisition conditions, synthetic tremors were introduced via Gaussian affine transformations. Experimental results demonstrate that all three algorithms yield comparable stabilization performance, with GFTT offering slightly higher structural fidelity and ORB excelling in computational efficiency. These findings validate the effectiveness of the proposed model and highlight its potential for improving the quality and reliability of oral videocapillaroscopy imaging. Experimental evaluation showed that the proposed pipeline achieved an average SSIM of 0.789 and reduced jitter to 25.8, compared to the perturbed input sequences. In addition, path smoothness and RMS errors (translation and rotation) consistently indicated improved stabilization across all tested feature extractors. Compared to previous stabilization approaches in nailfold capillaroscopy, our method achieved comparable or superior structural fidelity while maintaining computational efficiency.
(This article belongs to the Special Issue Biomedical Signals, Images and Healthcare Data Analysis: 2nd Edition)
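
One alignment step of such a pipeline might look like the following sketch, using GFTT keypoints, pyramidal Lucas-Kanade optical flow, and a robust partial-affine fit; all parameters are illustrative:

```python
import cv2
import numpy as np

def stabilize_step(prev_gray, curr_gray, curr_frame):
    """Align the current frame to the previous one: GFTT keypoints,
    pyramidal Lucas-Kanade optical flow, and a robust similarity
    (partial affine) transform. Parameters are illustrative."""
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=300,
                                 qualityLevel=0.01, minDistance=10)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None)
    ok = status.ravel() == 1
    # Map current-frame points back onto the previous frame (RANSAC inside).
    M, _ = cv2.estimateAffinePartial2D(p1[ok], p0[ok])
    h, w = curr_gray.shape
    # Warp the current frame toward the previous one to cancel the shake.
    return cv2.warpAffine(curr_frame, M, (w, h))
```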

14 pages, 949 KB  
Article
A New Approach to ORB Acceleration Using a Modern Low-Power Microcontroller
by Jorge Aráez, Santiago Real and Alvaro Araujo
Sensors 2025, 25(12), 3796; https://doi.org/10.3390/s25123796 - 18 Jun 2025
Viewed by 969
Abstract
A key component in visual Simultaneous Localization and Mapping (SLAM) systems is feature extraction and description. One common algorithm that accomplishes this purpose is Oriented FAST and Rotated BRIEF (ORB), which is used in state-of-the-art SLAM systems like ORB-SLAM. While it is faster than other feature detectors like SIFT (340 times faster) or SURF (15 times faster), it is one of the most computationally expensive algorithms in these types of systems. This problem has commonly been solved by delegating the task to hardware-accelerated solutions like FPGAs or ASICs. While this solution is useful, it incurs a greater economic cost. This work proposes a solution for feature extraction and description based on a modern low-power mainstream microcontroller. The execution time of ORB, along with power consumption, is analyzed in relation to the number of feature points and internal variables. The results show a maximum of 0.6 s for ORB execution on 1241 × 376 resolution images, which is significantly slower than other hardware-accelerated solutions but remains viable for certain applications. Additionally, the power consumption ranges between 30 and 40 milliwatts, which is lower than FPGA solutions. This work also allows for future optimizations that will improve upon the results reported here.
(This article belongs to the Special Issue Sensors and Sensory Algorithms for Intelligent Transportation Systems)

23 pages, 2719 KB  
Article
An Implementation of Web-Based Answer Platform in the Flutter Programming Learning Assistant System Using Docker Compose
by Lynn Htet Aung, Soe Thandar Aung, Nobuo Funabiki, Htoo Htoo Sandi Kyaw and Wen-Chung Kao
Electronics 2024, 13(24), 4878; https://doi.org/10.3390/electronics13244878 - 11 Dec 2024
Cited by 2 | Viewed by 2267
Abstract
Programming has gained significant importance worldwide as societies increasingly rely on computer application systems. To support novices in learning various programming languages, we have developed the Programming Learning Assistant System (PLAS). It offers several types of exercise problems with different learning goals and levels for step-by-step self-study. As a personal answer platform in PLAS, we have implemented a web application using Node.js and EJS for Java and Python programming. Recently, the Flutter framework with Dart programming has become popular, enabling developers to build applications for mobile, web, and desktop environments from a single codebase. Thus, we have extended PLAS by implementing a Flutter environment with Visual Studio Code. Additionally, we have developed an image-based user interface (UI) testing tool that verifies student source code by comparing its generated UI image with the standard one using the ORB and SIFT algorithms in OpenCV. For efficient distribution to students, we have generated Docker images of the answer platform, the Flutter environment, and the image-based UI testing tool. In this paper, we present the implementation of a web-based answer platform for the Flutter Programming Learning Assistant System (FPLAS) by integrating the three Docker images using Docker Compose. Additionally, to capture UI images automatically, an Nginx web application server is adopted with its own Docker image. For evaluation, we asked 10 graduate students at Okayama University, Japan, to install the answer platform on their PCs and solve five exercise problems. All the students successfully completed the problems, which confirms the validity and effectiveness of the proposed system.

32 pages, 21133 KB  
Article
An Automated Feature-Based Image Registration Strategy for Tool Condition Monitoring in CNC Machine Applications
by Eden Lazar, Kristin S. Bennett, Andres Hurtado Carreon and Stephen C. Veldhuis
Sensors 2024, 24(23), 7458; https://doi.org/10.3390/s24237458 - 22 Nov 2024
Cited by 5 | Viewed by 2165
Abstract
The implementation of Machine Vision (MV) systems for Tool Condition Monitoring (TCM) plays a critical role in reducing the total cost of operation in manufacturing while expediting tool wear testing in research settings. However, conventional MV-TCM edge detection strategies process each image independently to infer edge positions, rendering them susceptible to inaccuracies when tool edges are compromised by material adhesion or chipping, resulting in imprecise wear measurements. In this study, an MV system is developed alongside an automated, feature-based image registration strategy to spatially align tool wear images, enabling more consistent and accurate detection of tool edge position. The MV system was shown to be robust to the machining environment, versatile across both turning and milling machining centers, and capable of reducing tool wear image capture time by up to 85% relative to standard approaches. A comparison of feature detector-descriptor algorithms found SIFT, KAZE, and ORB to be the most suitable for MV-TCM registration, with KAZE offering the highest accuracy and ORB being the most computationally efficient. The automated registration algorithm was shown to be efficient, performing registrations in 1.3 s on average, and effective across a wide range of tool geometries and coating variations. The proposed tool reference line detection strategy, based on spatially aligned tool wear images, outperformed standard methods, resulting in average tool wear measurement errors of 2.5% and 4.5% in the turning and milling tests, respectively. Such a system allows machine tool operators to more efficiently capture cutting tool images while ensuring more reliable tool wear measurements.
(This article belongs to the Special Issue Feature Papers in Sensing and Imaging 2024)

22 pages, 9206 KB  
Article
An Enhanced Multiscale Retinex, Oriented FAST and Rotated BRIEF (ORB), and Scale-Invariant Feature Transform (SIFT) Pipeline for Robust Key Point Matching in 3D Monitoring of Power Transmission Line Icing with Binocular Vision
by Nalini Rizkyta Nusantika, Jin Xiao and Xiaoguang Hu
Electronics 2024, 13(21), 4252; https://doi.org/10.3390/electronics13214252 - 30 Oct 2024
Cited by 7 | Viewed by 1686
Abstract
Power transmission line icing (PTLI) poses significant threats to the reliability and safety of electrical power systems, particularly in cold regions. Accumulation of ice on power lines can lead to severe consequences, such as line breaks, tower collapses, and widespread power outages, resulting in economic losses and infrastructure damage. This study proposes an enhanced image processing pipeline to accurately detect and match key points in PTLI images for 3D monitoring of ice thickness using binocular vision. The pipeline integrates established techniques such as multiscale retinex (MSR) and the oriented FAST and rotated BRIEF (ORB) and scale-invariant feature transform (SIFT) algorithms, further refined with marginalizing sample consensus (MAGSAC)-based random sample consensus (RANSAC) optimization. The image processing steps include automatic cropping, image enhancement, feature detection, and robust key point matching, all designed to operate in challenging environments with poor lighting and noise. Experiments demonstrate that the proposed method significantly improves key point matching accuracy and computational efficiency, reducing processing time enough to make it suitable for real-time applications. The effectiveness of the pipeline is validated through 3D ice thickness measurements, with results showing high precision and low error rates, making it a valuable tool for monitoring power transmission lines in harsh conditions.
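
OpenCV (4.5 and later) exposes MAGSAC++ through its USAC flags, so the robust estimation step of a pipeline like this one can be sketched as follows; the inputs are assumed to be ratio-tested ORB/SIFT correspondences, and the threshold and iteration budget are illustrative:

```python
import cv2
import numpy as np

def magsac_homography(src, dst, thresh=3.0):
    """Robust homography via MAGSAC++. `src` and `dst` are matched key point
    coordinates as Nx1x2 float32 arrays (e.g., from a ratio-tested ORB/SIFT
    matching stage); returns the homography and the inlier mask."""
    H, inlier_mask = cv2.findHomography(
        src, dst, cv2.USAC_MAGSAC,
        ransacReprojThreshold=thresh, maxIters=5000, confidence=0.999)
    return H, inlier_mask
```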

20 pages, 7583 KB  
Article
Object/Scene Recognition Based on a Directional Pixel Voting Descriptor
by Abiel Aguilar-González, Alejandro Medina Santiago and J. A. de Jesús Osuna-Coutiño
Appl. Sci. 2024, 14(18), 8187; https://doi.org/10.3390/app14188187 - 11 Sep 2024
Viewed by 1625
Abstract
Detecting objects in images is crucial for several applications, including surveillance, autonomous navigation, and augmented reality. Although AI-based approaches such as Convolutional Neural Networks (CNNs) have proven highly effective in object detection, in scenarios where the objects being recognized are unknown, it is difficult to generalize an AI model for such tasks. In another trend, feature-based approaches like SIFT, SURF, and ORB offer the capability to search for any object but have limitations under complex visual variations. In this work, we introduce a novel edge-based object/scene recognition method. We propose that utilizing feature edges, instead of feature points, offers high performance under complex visual variations. Our primary contribution is a directional pixel voting descriptor based on image segments. Experimental results are promising; compared to previous approaches, ours demonstrates superior performance under complex visual variations and high processing speed.
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)

15 pages, 7315 KB  
Article
Computer Vision Algorithms on a Raspberry Pi 4 for Automated Depalletizing
by Danilo Greco, Majid Fasihiany, Ali Varasteh Ranjbar, Francesco Masulli, Stefano Rovetta and Alberto Cabri
Algorithms 2024, 17(8), 363; https://doi.org/10.3390/a17080363 - 18 Aug 2024
Cited by 1 | Viewed by 4675
Abstract
The primary objective of a depalletizing system is to automate the process of detecting and locating specific variable-shaped objects on a pallet, allowing a robotic system to accurately unstack them. Although many solutions exist for the problem in industrial and manufacturing settings, the application to small-scale scenarios such as retail vending machines and small warehouses has not received much attention so far. This paper presents a comparative analysis of four different computer vision algorithms for the depalletizing task, implemented on a Raspberry Pi 4, a very popular single-board computer with low computing power suitable for the IoT and edge computing. The algorithms evaluated include pattern matching, scale-invariant feature transform, Oriented FAST and Rotated BRIEF, and the Haar cascade classifier. Each technique is described and their implementations are outlined. Their evaluation is performed on the task of box detection and localization in test images to assess their suitability in a depalletizing system. The performance of the algorithms is given in terms of accuracy, robustness to variability, computational speed, detection sensitivity, and resource consumption. The results reveal the strengths and limitations of each algorithm, providing valuable insights for selecting the most appropriate technique based on the specific requirements of a depalletizing system.
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)

26 pages, 15055 KB  
Article
Building Better Models: Benchmarking Feature Extraction and Matching for Structure from Motion at Construction Sites
by Carlos Roberto Cueto Zumaya, Iacopo Catalano and Jorge Peña Queralta
Remote Sens. 2024, 16(16), 2974; https://doi.org/10.3390/rs16162974 - 14 Aug 2024
Cited by 2 | Viewed by 6195
Abstract
The popularity of Structure from Motion (SfM) techniques has significantly advanced 3D reconstruction in various domains, including construction site mapping. Central to SfM is the feature extraction and matching process, which identifies and correlates keypoints across images. Previous benchmarks have assessed traditional and learning-based methods for these tasks but have not specifically focused on construction sites, often evaluating isolated components of the SfM pipeline. This study provides a comprehensive evaluation of traditional methods (e.g., SIFT, AKAZE, ORB) and learning-based methods (e.g., D2-Net, DISK, R2D2, SuperPoint, SOSNet) within the SfM pipeline for construction site mapping. It also compares matching techniques, including SuperGlue and LightGlue, against traditional approaches such as nearest neighbor. Our findings demonstrate that deep learning-based methods such as DISK with LightGlue and SuperPoint with various matchers consistently outperform traditional methods like SIFT in both reconstruction quality and computational efficiency. Overall, the deep learning methods exhibited better adaptability to complex construction environments, leveraging modern hardware effectively and highlighting their potential for large-scale and real-time applications in construction site mapping. This benchmark aims to assist researchers in selecting the optimal combination of feature extraction and matching methods for SfM applications at construction sites.
(This article belongs to the Special Issue Remote Sensing for 2D/3D Mapping)

17 pages, 3841 KB  
Article
An Image-Based User Interface Testing Method for Flutter Programming Learning Assistant System
by Soe Thandar Aung, Nobuo Funabiki, Lynn Htet Aung, Safira Adine Kinari, Khaing Hsu Wai and Mustika Mentari
Information 2024, 15(8), 464; https://doi.org/10.3390/info15080464 - 3 Aug 2024
Cited by 4 | Viewed by 2870
Abstract
Flutter has become popular for providing a uniform development environment for user interfaces (UIs) on smartphones, web browsers, and desktop applications. We have developed the Flutter programming learning assistant system (FPLAS) to assist novice students' self-study. We implemented a Docker-based Flutter environment with Visual Studio Code and three introductory exercise projects. However, the correctness of students' answers is currently checked manually, although automatic checking is necessary to reduce teachers' workload and provide quick responses to students. This paper presents an image-based user interface (UI) testing method that automates the checking of answer code using the Flask framework. The method produces a UI image by running the answer code and compares it with the image produced by the model code for the assignment, using the ORB and SIFT algorithms in the OpenCV library. One notable aspect is the necessity of capturing multiple UI screenshots through page transitions driven by user input actions for the accurate detection of changes in UI elements. For evaluation, we assigned five Flutter exercise projects to fourth-year bachelor and first-year master engineering students at Okayama University, Japan, and applied the proposed method to their answers. The results confirm the effectiveness of the proposal.
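
The comparison step can be reduced to a feature-match score between the two screenshots, as in the sketch below; this is an illustrative simplification (ORB only, with an assumed normalization), not the paper's exact scoring:

```python
import cv2

def ui_similarity(answer_png, reference_png, ratio=0.75):
    """Score how closely a student's rendered UI screenshot matches the
    model answer's, in the spirit of the ORB/SIFT comparison described
    above. The normalization is an illustrative assumption."""
    img1 = cv2.imread(answer_png, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(reference_png, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    knn = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des1, des2, k=2)
    good = [p for p in knn if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    # Fraction of the smaller keypoint set that found an unambiguous match.
    return len(good) / max(min(len(kp1), len(kp2)), 1)
```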

19 pages, 2793 KB  
Article
Key-Point-Descriptor-Based Image Quality Evaluation in Photogrammetry Workflows
by Dalius Matuzevičius, Vytautas Urbanavičius, Darius Miniotas, Šarūnas Mikučionis, Raimond Laptik and Andrius Ušinskas
Electronics 2024, 13(11), 2112; https://doi.org/10.3390/electronics13112112 - 29 May 2024
Cited by 4 | Viewed by 3104
Abstract
Photogrammetry depends critically on the quality of the images used to reconstruct accurate and detailed 3D models. Selection of high-quality images not only improves the accuracy and resolution of the resulting 3D models, but also contributes to the efficiency of the photogrammetric process by reducing data redundancy and computational demands. This study presents a novel approach to image quality evaluation tailored for photogrammetric applications that uses the key point descriptors typically encountered in image matching. Using a LightGBM ranker model, this research evaluates the effectiveness of key point descriptors such as SIFT, SURF, BRISK, ORB, KAZE, FREAK, and SuperPoint in predicting image quality. These descriptors are evaluated for their ability to indicate image quality based on the image patterns they capture. Experiments conducted on various publicly available image datasets show that descriptor-based methods outperform traditional no-reference image quality metrics such as BRISQUE, NIQE, PIQE, and BIQAA and a simple sharpness-based image quality evaluation method. The experimental results highlight the potential of using key-point-descriptor-based image quality evaluation methods to improve the photogrammetric workflow by selecting high-quality images for 3D modeling.
(This article belongs to the Special Issue IoT-Enabled Smart Devices and Systems in Smart Environments)
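
The learning setup is a standard ranking problem; the sketch below uses LightGBM's ranker, with the descriptor-statistics features, labels, and group sizes as random placeholders standing in for the real extraction step:

```python
import numpy as np
import lightgbm as lgb

# Hypothetical setup: each image is summarized by statistics of its key point
# descriptors (counts, response strengths, scale spread, ...) and a ranker is
# trained to order images within each scene by quality. Feature extraction is
# elided; the shapes, labels, and group sizes below are placeholders.
X = np.random.rand(120, 8)             # 120 images x 8 descriptor statistics
y = np.random.randint(0, 5, size=120)  # graded quality label per image
groups = [30, 30, 30, 30]              # images grouped per scene/dataset

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=200)
ranker.fit(X, y, group=groups)
scores = ranker.predict(X)             # higher score = predicted better image
```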

21 pages, 5094 KB  
Article
TQU-SLAM Benchmark Dataset for Comparative Study to Build Visual Odometry Based on Extracted Features from Feature Descriptors and Deep Learning
by Thi-Hao Nguyen, Van-Hung Le, Huu-Son Do, Trung-Hieu Te and Van-Nam Phan
Future Internet 2024, 16(5), 174; https://doi.org/10.3390/fi16050174 - 17 May 2024
Cited by 4 | Viewed by 4524
Abstract
Data enrichment for training visual SLAM and VO construction models using deep learning (DL) is an urgent problem in computer vision today. DL requires a large amount of data to train a model, and more data covering many different contexts and conditions will produce a more accurate visual SLAM and VO construction model. In this paper, we introduce the TQU-SLAM benchmark dataset, which includes 160,631 RGB-D frame pairs. It was collected from the corridors of three interconnected buildings with a total length of about 230 m. The ground-truth data of the TQU-SLAM benchmark dataset were prepared manually, including 6-DOF camera poses, 3D point cloud data, intrinsic parameters, and the transformation matrix between the camera coordinate system and the real world. We also tested the TQU-SLAM benchmark dataset using the PySLAM framework with traditional features such as SHI_TOMASI, SIFT, SURF, ORB, ORB2, AKAZE, KAZE, and BRISK, as well as features extracted with DL methods such as VGG, DPVO, and TartanVO. The camera pose estimation results are evaluated, and we show that the ORB2 features give the best results (Errd = 5.74 mm), while the SHI_TOMASI feature has the best ratio of frames with detected keypoints (rd = 98.97%). We also present and analyze the challenges the TQU-SLAM benchmark dataset poses for building visual SLAM and VO systems.
(This article belongs to the Special Issue Machine Learning Techniques for Computer Vision)