Search Results (12)

Search Parameters:
Keywords = pixel remapping

11 pages, 4106 KB  
Article
UAV Detection in Low-Altitude Scenarios Based on the Fusion of Unaligned Dual-Spectrum Images
by Zishuo Huang, Guhao Zhao, Yarong Wu and Chuanjin Dai
Drones 2026, 10(1), 40; https://doi.org/10.3390/drones10010040 - 7 Jan 2026
Viewed by 706
Abstract
The threat posed by unauthorized drones to public airspace has become increasingly critical. To address the challenge of UAV detection in unaligned visible–infrared dual-spectrum images, we present a novel framework comprising two sequential stages: image alignment and object detection. The Speeded-Up Robust Features (SURF) algorithm is applied for feature matching, combined with the gray centroid method to remove mismatched feature points. A plane-adaptive pixel remapping algorithm is further developed to achieve image fusion. In addition, an enhanced YOLOv11 model with a modified loss function is employed to achieve robust object detection in the fused images. Experimental results demonstrate that the proposed method enables precise pixel-level dual-spectrum fusion and reliable UAV detection under diverse and complex conditions.
(This article belongs to the Special Issue Detection, Identification and Tracking of UAVs and Drones)
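As a rough illustration of the alignment stage described in this abstract, the sketch below matches SURF features between the visible and infrared frames and remaps the infrared pixels through a RANSAC-estimated homography. The paper's gray-centroid mismatch filtering and plane-adaptive remapping are not reproduced here, and SURF assumes an opencv-contrib build.

```python
# Minimal sketch of dual-spectrum alignment: SURF matching + homography
# remap. A single RANSAC homography stands in for the paper's mismatch
# removal and plane-adaptive pixel remapping steps.
import cv2
import numpy as np

def align_ir_to_visible(vis_gray: np.ndarray, ir_gray: np.ndarray) -> np.ndarray:
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # opencv-contrib
    kp1, des1 = surf.detectAndCompute(vis_gray, None)
    kp2, des2 = surf.detectAndCompute(ir_gray, None)

    # Lowe's ratio test to discard ambiguous matches.
    pairs = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des2, des1, k=2)
    matches = [m for m, n in pairs if m.distance < 0.7 * n.distance]

    src = np.float32([kp2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Estimate the plane-induced homography and remap IR onto the visible grid.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = vis_gray.shape
    return cv2.warpPerspective(ir_gray, H, (w, h))
```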

18 pages, 2150 KB  
Article
Balancing Feature Symmetry: IFEM-YOLOv13 for Robust Underwater Object Detection Under Degradation
by Zhen Feng and Fanghua Liu
Symmetry 2025, 17(9), 1531; https://doi.org/10.3390/sym17091531 - 13 Sep 2025
Cited by 4 | Viewed by 1658
Abstract
This paper proposes IFEM-YOLOv13, a high-precision underwater target detection method designed to address challenges such as image degradation, low contrast, and small-target obscurity caused by light attenuation, scattering, and biofouling. Its core innovation is an end-to-end degradation-aware system featuring (1) an Intelligent Feature Enhancement Module (IFEM) that employs learnable sharpening and pixel-level filtering for adaptive optical compensation, incorporating principles of symmetry in its multi-branch enhancement to balance color and structural recovery; (2) a degradation-aware focal loss incorporating dynamic gradient remapping and class balancing to mitigate sample imbalance through symmetry-preserving optimization; and (3) a cross-layer feature association mechanism for multi-scale contextual modeling that respects the inherent scale symmetry of natural objects. Evaluated on the J-EDI dataset, IFEM-YOLOv13 achieves 98.6% mAP@0.5 and 82.1% mAP@0.5:0.95, outperforming the baseline YOLOv13 by 0.7% and 3.0%, respectively. With only 2.5 M parameters and operating at 217 FPS, it surpasses methods including Faster R-CNN, YOLO variants, and RE-DETR. These results demonstrate its robust real-time detection capability for diverse underwater targets such as plastic debris, biofouled objects, and artificial structures, while effectively handling the symmetry-breaking distortions introduced by the underwater environment.
(This article belongs to the Section Engineering and Materials)
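The degradation-aware focal loss is described only at a high level, but it builds on the standard focal-loss form, whose (1 - p_t)^gamma factor already performs a kind of gradient remapping away from easy examples. A minimal PyTorch sketch of that base form with alpha class balancing follows; the paper's dynamic, degradation-aware weighting is not reproduced.

```python
# Standard binary focal loss: the modulating factor (1 - p_t)^gamma
# down-weights easy examples, and alpha rebalances the classes. The paper's
# degradation-aware variant modifies this base form.
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    """logits and targets share shape (N, ...); targets are 0/1."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)          # prob. of true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```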

20 pages, 4527 KB  
Article
Hyperspectral Image Classification with the Orthogonal Self-Attention ResNet and Two-Step Support Vector Machine
by Heting Sun, Liguo Wang, Haitao Liu and Yinbang Sun
Remote Sens. 2024, 16(6), 1010; https://doi.org/10.3390/rs16061010 - 13 Mar 2024
Cited by 12 | Viewed by 2778
Abstract
Hyperspectral image classification plays a crucial role in remote sensing image analysis by classifying pixels. However, existing methods lack sufficient spatial–global information interaction and feature extraction capability. To overcome these challenges, this paper proposes a novel model for hyperspectral image classification using an orthogonal self-attention ResNet and a two-step support vector machine (OSANet-TSSVM). The OSANet-TSSVM model comprises two essential components: a deep feature extraction network and an improved support vector machine (SVM) classification module. The deep feature extraction network incorporates an orthogonal self-attention module (OSM) and a channel attention module (CAM) to enhance spatial–spectral feature extraction. The OSM computes 2D self-attention weights over the orthogonal dimensions of an image, reducing the number of parameters while capturing comprehensive global contextual information. The CAM independently learns attention weights along the channel dimension, enabling the deep network to emphasise crucial channel information and enhance the spectral feature extraction capability. In addition to the feature extraction network, the OSANet-TSSVM model leverages an improved SVM classification module, the two-step support vector machine (TSSVM). This module preserves the discriminative outputs of the first-level SVM subclassifier and remaps them as new features for TSSVM training. By integrating the results of the two classifiers, the deficiencies of the individual classifiers are effectively compensated, significantly enhancing classification accuracy. The performance of the proposed OSANet-TSSVM model was thoroughly evaluated on public datasets; the experimental results demonstrate that it performs well on both subjective and objective evaluation metrics, highlighting its potential for advancing hyperspectral image classification in remote sensing applications.
(This article belongs to the Special Issue Recent Advances in the Processing of Hyperspectral Images)
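The TSSVM step lends itself to a compact sketch: keep the first-level SVM's decision values and remap them as additional features for a second SVM. The sklearn-based code below is an illustrative reading of the abstract, not the authors' implementation.

```python
# Two-step SVM sketch: stage-1 decision values are remapped as extra
# features for a stage-2 SVM trained on the augmented representation.
import numpy as np
from sklearn.svm import SVC

def _as_2d(scores: np.ndarray) -> np.ndarray:
    # Binary problems yield a 1-D decision array; make it a column.
    return scores if scores.ndim == 2 else scores[:, None]

def two_step_svm(X_train, y_train, X_test):
    # Stage 1: ordinary SVM; its discriminative outputs are kept, not discarded.
    svm1 = SVC(kernel="rbf").fit(X_train, y_train)
    f_train = _as_2d(svm1.decision_function(X_train))
    f_test = _as_2d(svm1.decision_function(X_test))

    # Stage 2: original features plus remapped stage-1 outputs.
    svm2 = SVC(kernel="rbf").fit(np.hstack([X_train, f_train]), y_train)
    return svm2.predict(np.hstack([X_test, f_test]))
```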

18 pages, 15469 KB  
Article
Representation Learning Method for Circular Seal Based on Modified MLP-Mixer
by Yuan Cao, You Zhou, Zhiwen Zhang and Enyi Yao
Entropy 2023, 25(11), 1521; https://doi.org/10.3390/e25111521 - 6 Nov 2023
Viewed by 2386
Abstract
This study proposes Stamp-MLP, an enhanced seal impression representation learning technique based on MLP-Mixer. Instead of the patch linear mapping preprocessing step, this technique uses circular seal remapping, which preserves the seals' underlying pixel-level information. In the proposed Stamp-MLP, average pooling is replaced by attention-based global pooling to extract information more comprehensively. The proposed method addresses three classification tasks: categorizing the seal surface, identifying the product type, and distinguishing individual seals. The three tasks shared an identical dataset comprising 81 seals, encompassing 16 distinct seal surfaces, with each surface featuring six diverse product types. The experimental results showed that, in comparison to MLP-Mixer, VGG16, and ResNet50, the proposed Stamp-MLP achieved the highest classification accuracy (89.61%) in the seal surface classification task with fewer training samples. Meanwhile, Stamp-MLP outperformed the others with accuracy rates of 90.68% and 91.96% in the product type and seal impression classification tasks, respectively. Moreover, Stamp-MLP had the fewest model parameters (2.67 M).
(This article belongs to the Special Issue Representation Learning: Theory, Applications and Ethical Issues II)
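A plausible reading of "circular seal remapping" is a polar unwrapping that replaces square-patch linear mapping while preserving pixel-level detail along the ring. The OpenCV sketch below shows that idea; the paper's exact sampling scheme may differ.

```python
# Unwrap a circular seal impression into a rectangular (radius x angle)
# grid, so structure along the ring maps onto image rows/columns.
import cv2
import numpy as np

def unwrap_seal(img: np.ndarray, out_size=(224, 224)) -> np.ndarray:
    h, w = img.shape[:2]
    center = (w / 2.0, h / 2.0)
    max_radius = min(center)          # assume the seal fills the frame
    return cv2.warpPolar(img, out_size, center, max_radius,
                         cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR)
```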

26 pages, 5025 KB  
Article
Memory-Efficient Fixed-Length Representation of Synchronous Event Frames for Very-Low-Power Chip Integration
by Ionut Schiopu and Radu Ciprian Bilcu
Electronics 2023, 12(10), 2302; https://doi.org/10.3390/electronics12102302 - 19 May 2023
Cited by 4 | Viewed by 2136
Abstract
Event cameras are now widely used in many computer vision applications. Their high raw data bitrates require a more efficient fixed-length representation for low-bandwidth transmission from the event sensor to the processing chip. A novel low-complexity lossless compression framework is proposed for encoding synchronous event frames (EFs) by introducing a novel memory-efficient fixed-length representation suitable for hardware implementation in a very-low-power (VLP) event-processing chip. The first contribution is an improved representation of the ternary frames using pixel-group frame partitioning and symbol remapping. Another contribution is a novel low-complexity, memory-efficient fixed-length representation using multi-level lookup tables (LUTs). An extensive experimental analysis is performed over a set of group-size configurations; for very large group-size configurations, an improved representation is proposed using a mask-LUT structure. The experimental evaluation on a public dataset demonstrates that the proposed fixed-length coding framework provides at least twice the compression ratio of the raw EF representation, and performance close to that of variable-length video coding standards and variable-length state-of-the-art image codecs, for lossless compression of ternary EFs generated at frequencies below 1 kHz. To our knowledge, this paper is the first to introduce a low-complexity, memory-efficient fixed-length representation for lossless compression of synchronous EFs suitable for integration into a VLP event-processing chip.
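The pixel-group partitioning and symbol remapping idea can be sketched directly: each group of ternary symbols is remapped through a base-3 index into one fixed-length code word, which is exactly the kind of mapping a hardware LUT can hold. The group size and packing below are illustrative; the paper's multi-level and mask-LUT structures are not reproduced.

```python
# Fixed-length remapping of ternary event frames: with groups of g = 5
# pixels in {-1, 0, +1}, the 3**5 = 243 possible groups fit in one byte,
# giving a fixed 8-bit word per 5 pixels.
import numpy as np

G = 5
POW3 = 3 ** np.arange(G)               # base-3 place values for one group

def encode_frame(frame: np.ndarray) -> np.ndarray:
    """frame: flat array of ternary symbols in {-1, 0, 1}; length % G == 0."""
    sym = (frame.reshape(-1, G) + 1).astype(np.uint32)   # remap to {0, 1, 2}
    return (sym @ POW3).astype(np.uint8)                 # one byte per group

def decode_frame(codes: np.ndarray) -> np.ndarray:
    sym = (codes[:, None].astype(np.uint32) // POW3) % 3
    return (sym - 1).astype(np.int8).ravel()
```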

16 pages, 12607 KB  
Article
Generation of Synthetic-Pseudo MR Images from Real CT Images
by Isam F. Abu-Qasmieh, Ihssan S. Masad, Hiam H. Al-Quran and Khaled Z. Alawneh
Tomography 2022, 8(3), 1244-1259; https://doi.org/10.3390/tomography8030103 - 3 May 2022
Cited by 5 | Viewed by 3401
Abstract
This study aimed to generate synthetic MR images from real CT images. The CT# mean and standard deviation of a moving window across every pixel in the reconstructed CT images were mapped to their corresponding tissue-mimicking types. Identifying the tissue enabled remapping it to its corresponding intrinsic parameters: T1, T2, and proton density (ρ). Lastly, synthetic weighted MR images of a selected slice were generated by simulating a spin-echo sequence using the intrinsic parameters and appropriate contrast parameters (TE and TR). Experiments were performed on a 3D multimodality abdominal phantom and on human knees at different TE and TR parameters to confirm the clinical effectiveness of the approach. The results demonstrate the validity of the approach of generating synthetic MR images at different weightings using only CT images and the three predefined mapping functions. The slope of the fitting line and the percentage root-mean-square difference (PRD) between real and synthetic image vector representations were (0.73, 10%), (0.9, 18%), and (0.2, 8.7%) for the T1-, T2-, and ρ-weighted images of the phantom, respectively. The slope and PRD for human knee images, on average, were 0.89 and 18.8%, respectively. The generated MR images provide valuable guidance for physicians in deciding whether acquiring real MR images is crucial.
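The synthesis step rests on the standard spin-echo signal equation, S = ρ(1 − e^(−TR/T1)) e^(−TE/T2), which is easy to state as code once the tissue parameter maps exist. The sketch below assumes the paper's CT#-to-tissue mapping functions have already produced those maps.

```python
# Simulate a spin-echo acquisition from per-pixel tissue-intrinsic maps.
import numpy as np

def spin_echo(rho, t1_ms, t2_ms, tr_ms: float, te_ms: float):
    """rho, t1_ms, t2_ms: parameter maps of the slice shape; times in ms."""
    return rho * (1.0 - np.exp(-tr_ms / t1_ms)) * np.exp(-te_ms / t2_ms)

# Typical contrast choices (illustrative values):
#   T1-weighted:  spin_echo(rho, t1, t2, tr_ms=500,  te_ms=15)
#   T2-weighted:  spin_echo(rho, t1, t2, tr_ms=4000, te_ms=100)
#   rho-weighted: spin_echo(rho, t1, t2, tr_ms=4000, te_ms=15)
```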

10 pages, 2061 KB  
Article
Deep Vision for Breast Cancer Classification and Segmentation
by Lawrence Fulton, Alex McLeod, Diane Dolezel, Nathaniel Bastian and Christopher P. Fulton
Cancers 2021, 13(21), 5384; https://doi.org/10.3390/cancers13215384 - 27 Oct 2021
Cited by 13 | Viewed by 3575
Abstract
(1) Background: The odds of a female breast cancer diagnosis have increased from 1 in 11 in 1975 to 1 in 8 today. Mammography false positive rates (FPR) are associated with overdiagnosis and overtreatment, while false negative rates (FNR) increase morbidity and mortality. (2) Methods: Deep vision supervised learning classifies 299 × 299 pixel de-noised mammography images as negative or non-negative using models built on 55,890 pre-processed training images and applied to 15,364 unseen test images. A small image representation from the fitted training model is returned to evaluate the portion of the loss function gradient, with respect to the image, that maximizes the classification probability. This gradient is then remapped back to the original image, highlighting the areas most influential for classification (perhaps masses or boundary areas). (3) Results: Initial classification results were 97% accurate, 99% specific, and 83% sensitive. Gradient techniques for unsupervised region-of-interest mapping clearly identified the areas most associated with the classification results on positive mammograms and might be used to support clinician analysis. (4) Conclusions: Deep vision techniques hold promise for addressing overdiagnosis and overtreatment, underdiagnosis, and automated region-of-interest identification in mammography.
(This article belongs to the Topic Application of Big Medical Data in Precision Medicine)
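The gradient remapping described in the Methods is essentially a saliency map: the gradient of the class score with respect to the input image, remapped onto the image plane. A generic PyTorch sketch follows; the paper's specific architecture and small-image representation are not assumed.

```python
# Gradient-based region-of-interest sketch: backpropagate the class score
# to the input pixels and remap the gradient magnitude as a saliency map.
import torch

def saliency_map(model: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    """image: (1, C, 299, 299); returns (299, 299) saliency magnitudes."""
    model.eval()
    image = image.clone().requires_grad_(True)
    score = model(image).max()          # class score to explain
    score.backward()
    # Max over channels highlights the pixels most influential for the score.
    return image.grad.abs().max(dim=1)[0].squeeze(0)
```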

24 pages, 21801 KB  
Article
Assessment and Improvement of the Pattern Recognition Performance of Memdiode-Based Cross-Point Arrays with Randomly Distributed Stuck-at-Faults
by Fernando L. Aguirre, Sebastián M. Pazos, Félix Palumbo, Antoni Morell, Jordi Suñé and Enrique Miranda
Electronics 2021, 10(19), 2427; https://doi.org/10.3390/electronics10192427 - 6 Oct 2021
Cited by 3 | Viewed by 3549
Abstract
In this work, the effect of randomly distributed stuck-at faults (SAFs) in memristive cross-point array (CPA)-based single- and multi-layer perceptrons (SLPs and MLPs, respectively) intended for pattern recognition tasks is investigated by means of realistic SPICE simulations. The quasi-static memdiode model (QMM) is considered here for modelling the synaptic weights implemented with memristors. Following the standard memristive approach, the QMM comprises two coupled equations: one for the electron transport, based on the double-diode equation with a single series resistance, and a second for the internal memory state of the device, based on the so-called logistic hysteron. By modifying the state parameter in the current–voltage characteristic, SAFs of different severity are simulated and the final outcome is analysed. Supervised ex situ training and two well-known image datasets involving hand-written digits and human faces are employed to assess the inference accuracy of the SLP as a function of the faulty device ratio. The roles played by the memristor's electrical parameters, line resistance, mapping strategy, image pixelation, and fault type (stuck-at-ON or stuck-at-OFF) in the CPA performance are statistically analysed following a Monte Carlo approach. Three different re-mapping schemes to help mitigate the effect of the SAFs in the SLP inference phase are thoroughly investigated.
(This article belongs to the Special Issue RRAM Devices: Multilevel State Control and Applications)
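The fault-injection side of this experiment can be sketched at the weight-matrix level: clamp a random fraction of conductances to the ON or OFF state and re-run inference. The numpy code below is only that abstraction; the SPICE-level memdiode model and line resistances are what the paper actually simulates.

```python
# Stuck-at-fault injection sketch for a crossbar conductance matrix.
import numpy as np

rng = np.random.default_rng(0)

def apply_safs(G: np.ndarray, fault_ratio: float,
               g_on: float, g_off: float, p_on: float = 0.5) -> np.ndarray:
    """Clamp a random subset of conductances to the ON or OFF state."""
    G = G.copy()
    faulty = rng.random(G.shape) < fault_ratio
    stuck_on = faulty & (rng.random(G.shape) < p_on)
    G[stuck_on] = g_on
    G[faulty & ~stuck_on] = g_off
    return G

# Inference accuracy vs. fault ratio can then be swept, e.g.:
# for r in (0.0, 0.05, 0.1, 0.2): evaluate(apply_safs(G0, r, 1e-4, 1e-7))
```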

16 pages, 6987 KB  
Article
Outdoor Scene Understanding Based on Multi-Scale PBA Image Features and Point Cloud Features
by Yisha Liu, Yufeng Gu, Fei Yan and Yan Zhuang
Sensors 2019, 19(20), 4546; https://doi.org/10.3390/s19204546 - 19 Oct 2019
Cited by 3 | Viewed by 3256
Abstract
Outdoor scene understanding based on point cloud classification plays an important role in mobile robots and autonomous vehicles equipped with light detection and ranging (LiDAR) systems. In this paper, a novel model named Panoramic Bearing Angle (PBA) images is proposed, generated from 3D point clouds. In the PBA model, laser point clouds are projected onto a spherical surface to establish the correspondence between laser ranging points and image pixels, and the relative location of each laser point in 3D space is then used to calculate the gray value of the corresponding pixel. To extract robust features from 3D laser point clouds, both an image pyramid model and a point cloud pyramid model are utilized to extract multi-scale features from the PBA images and the original point clouds, respectively. A Random Forest classifier is used for feature screening on the extracted high-dimensional features to obtain the initial classification results. Moreover, reclassification is carried out to correct misclassified points by remapping the classification results into the PBA images and applying superpixel segmentation, which makes full use of the contextual information between laser points: within each superpixel block, points are reclassified based on the initial classification results, correcting some misclassified points and improving the classification accuracy. Two datasets published by ETH Zurich and MINES ParisTech are used to test the classification performance, and the results are reported in terms of the precision and recall rates of the proposed algorithm.
(This article belongs to the Section Physical Sensors)
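The point-to-pixel correspondence behind PBA images is a spherical projection, which the sketch below illustrates under an assumed sensor geometry; the bearing-angle gray-value computation from neighbouring points is omitted.

```python
# Spherical projection of LiDAR points onto a panoramic pixel grid.
import numpy as np

def project_to_panorama(points: np.ndarray, rows: int = 64, cols: int = 1024):
    """points: (N, 3) xyz; returns per-point (row, col) pixel indices."""
    x, y, z = points.T
    r = np.linalg.norm(points, axis=1)
    azimuth = np.arctan2(y, x)                       # [-pi, pi)
    elevation = np.arcsin(z / np.maximum(r, 1e-9))
    col = ((azimuth + np.pi) / (2 * np.pi) * cols).astype(int) % cols
    # Assume a fixed vertical field of view of +/- 30 degrees (illustrative).
    fov = np.radians(30.0)
    row = ((fov - elevation) / (2 * fov) * rows).clip(0, rows - 1).astype(int)
    return row, col
```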

24 pages, 59965 KB  
Article
A Parametric Method for Remapping and Calibrating Fisheye Images for Glare Analysis
by Ayman Wagdy, Veronica Garcia-Hansen, Gillian Isoardi and Kieu Pham
Buildings 2019, 9(10), 219; https://doi.org/10.3390/buildings9100219 - 16 Oct 2019
Cited by 17 | Viewed by 9196
Abstract
High Dynamic Range (HDR) imaging using a fisheye lens has provided new opportunities to evaluate the luminous environment in visual comfort research. For glare analysis, strict calibration is necessary to extract accurate luminance maps and achieve reliable glare results. Most studies have focused on correcting the vignetting effect in HDR imaging during post-calibration. However, the lens projection also contributes to luminance map errors because of its inherent distortion. To date, there is no simple method to correct this distortion for glare analysis. This paper presents a parametric methodology to correct the projection distortion of fisheye lenses specifically for glare analysis. HDR images were captured to examine two devices: a 190° equisolid SIGMA 8 mm F3.5 EX DG fisheye lens mounted on a Canon 5D camera, and a 195° commercial fisheye lens with an unknown projection mounted on the rear camera of a Samsung Galaxy S7. A mathematical and geometrical model was developed to remap each pixel and correct the projection distortion using Grasshopper and MATLAB. The parametric method was validated using Radiance and MATLAB by checking the accuracy of the pixel remapping and measuring color distortion with the Structural Similarity Index (SSIM). Glare scores were used to compare the results between the two devices, which validates the use of mobile phones in photometric research. The results show that this method can correct projection distortion in HDR images for more accurate evaluation of the luminous environment in glare research.
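For the SIGMA lens, whose equisolid projection (r = 2f sin(θ/2)) is stated in the abstract, the correction can be sketched as a per-pixel remap onto the equidistant projection (r = fθ) that glare tools commonly assume. The normalization below (both projections filling the same image circle) is an assumption, not the paper's calibration.

```python
# Remap an equisolid fisheye image onto an equidistant projection.
import cv2
import numpy as np

def equisolid_to_equidistant(img: np.ndarray, fov_deg: float = 190.0):
    h, w = img.shape[:2]
    cx, cy, radius = w / 2.0, h / 2.0, min(w, h) / 2.0
    theta_max = np.radians(fov_deg / 2.0)

    # Output grid: radius proportional to angle (equidistant).
    yy, xx = np.mgrid[0:h, 0:w]
    r_out = np.hypot(xx - cx, yy - cy)
    theta = np.clip(r_out / radius, 0, 1) * theta_max

    # Source radius for the same angle under the equisolid model.
    r_src = radius * np.sin(theta / 2.0) / np.sin(theta_max / 2.0)
    scale = np.divide(r_src, r_out, out=np.zeros_like(r_src), where=r_out > 0)
    map_x = (cx + (xx - cx) * scale).astype(np.float32)
    map_y = (cy + (yy - cy) * scale).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)
```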

21 pages, 10301 KB  
Article
An Assessment of Satellite-Derived Rainfall Products Relative to Ground Observations over East Africa
by Margaret Wambui Kimani, Joost C. B. Hoedjes and Zhongbo Su
Remote Sens. 2017, 9(5), 430; https://doi.org/10.3390/rs9050430 - 2 May 2017
Cited by 149 | Viewed by 11734
Abstract
Accurate and consistent rainfall observations are vital for climatological studies in support of better agricultural and water management decision-making and planning. In East Africa, accurate rainfall estimation with an adequate spatial distribution is limited by sparse rain gauge networks. Satellite rainfall products can potentially increase the spatial coverage of rainfall estimates; however, their performance needs to be understood across space–time scales, along with the factors relating to their errors. This study assesses the performance of seven satellite products: the Tropical Applications of Meteorology using Satellite and ground-based observations (TAMSAT) African Rainfall Climatology And Time series (TARCAT), the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS), the Tropical Rainfall Measuring Mission (TRMM-3B43), the Climate Prediction Centre (CPC) Morphing technique (CMORPH), the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks Climate Data Record (PERSIANN-CDR), the CPC Merged Analysis of Precipitation (CMAP), and the Global Precipitation Climatology Project (GPCP), using locally developed gridded (0.05°) rainfall data for 15 years (1998–2012) over East Africa. The products were assessed at monthly and yearly timescales after being remapped to the spatial scale of the gridded rain gauge data, for the March to May (MAM) and October to December (OND) rainy seasons. A grid-based statistical comparison between the two datasets was used, with only pixel values located at the rainfall stations considered for validation. Additionally, the impact of topography on the performance of the products was assessed by analyzing the pixels in areas of highest negative bias. All the products could substantially replicate rainfall patterns; their differences lie mainly in retrieving high rainfall amounts, especially of localized orographic types. The products exhibited systematic errors, which decreased with temporal aggregation from the monthly to the yearly scale. Challenges in retrieving orographic rainfall, especially during the OND season, were identified as the main cause of the large underestimations. Underestimation was observed at elevations below 2500 m; above this threshold, overestimation was evident in mountainous areas. CMORPH, CHIRPS, and TRMM showed consistently high performance during both seasons, attributed to their ability to retrieve rainfall across different rainfall regimes.
(This article belongs to the Special Issue Uncertainties in Remote Sensing)
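The grid-based statistical comparison reduces, per product and season, to skill scores over gauge-co-located pixels. A minimal sketch under the assumption that the product has already been remapped to the gauge grid:

```python
# Standard validation scores for co-located satellite and gauge rainfall.
import numpy as np

def validation_stats(sat: np.ndarray, gauge: np.ndarray) -> dict:
    """sat, gauge: 1-D arrays of co-located monthly rainfall (mm)."""
    bias = np.mean(sat - gauge)
    rmse = np.sqrt(np.mean((sat - gauge) ** 2))
    corr = np.corrcoef(sat, gauge)[0, 1]
    rel_bias = bias / np.mean(gauge) * 100.0       # percent of gauge mean
    return {"bias": bias, "rmse": rmse, "r": corr, "rel_bias_%": rel_bias}
```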

11 pages, 1079 KB  
Article
Dynamic Programming for Re-Mapping Noisy Fixations in Translation Tasks
by Michael Carl
J. Eye Mov. Res. 2013, 6(2), 1-11; https://doi.org/10.16910/jemr.6.2.5 - 5 Aug 2013
Cited by 11 | Viewed by 290
Abstract
Eye trackers which allow for free head movements are in many cases imprecise to the extent that reading patterns become heavily distorted. The poor usability and interpretability of these gaze patterns is compounded by a "naïve" fixation-to-symbol mapping, which often wrongly maps the possibly drifted center of the observed fixation onto the symbol directly below it. In this paper I extend this naïve fixation-to-symbol mapping by introducing background knowledge about the translation task. In a first step, the sequence of fixation-to-symbol mappings is extended into a lattice of several possible fixated symbols, including those on the lines above and below the naïve fixation mapping. In a second step, a dynamic programming algorithm applies a number of heuristics to find the best path through the lattice, based on the probable distance in characters, words, and pixels between successive fixations and the symbol locations, so as to smooth the gaze path according to the background gazing model. A qualitative and quantitative evaluation shows that the algorithm increases the accuracy of the re-mapped symbol sequence.
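The two-step procedure maps naturally onto a Viterbi-style dynamic program: a lattice of candidate symbols per fixation, an emission cost for fixation-to-symbol distance, and a transition cost that prefers symbol steps consistent with the gaze movement. The cost terms below only roughly imitate the paper's character, word, and pixel heuristics.

```python
# Viterbi-style remapping of noisy fixations through a symbol lattice.
import numpy as np

def remap_fixations(fixations, candidates, positions, lam: float = 1.0):
    """
    fixations:  list of (x, y) fixation centers.
    candidates: candidates[i] = list of symbol ids for fixation i.
    positions:  dict symbol id -> (x, y) center of that symbol on screen.
    """
    n = len(fixations)
    # Emission cost only for the first fixation.
    cost = [{s: float(np.hypot(*np.subtract(fixations[0], positions[s])))
             for s in candidates[0]}]
    back = [{}]
    for i in range(1, n):
        cost.append({})
        back.append({})
        gaze_step = np.subtract(fixations[i], fixations[i - 1])
        for s in candidates[i]:
            emit = float(np.hypot(*np.subtract(fixations[i], positions[s])))
            best_prev, best = None, np.inf
            for p in candidates[i - 1]:
                # Transition: successive symbols should move like the gaze did.
                sym_step = np.subtract(positions[s], positions[p])
                c = cost[i - 1][p] + lam * float(np.hypot(*(sym_step - gaze_step)))
                if c < best:
                    best, best_prev = c, p
            cost[i][s] = emit + best
            back[i][s] = best_prev
    # Backtrace the cheapest path through the lattice.
    s = min(cost[-1], key=cost[-1].get)
    path = [s]
    for i in range(n - 1, 0, -1):
        s = back[i][s]
        path.append(s)
    return path[::-1]
```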
