MDPI - Publisher of Open Access Journals

47 pages, 9682 KB

Open Access Article

Unsupervised Hierarchical Visual Taxonomy of Marble Natural Stone Using Cluster-Aware Self-Supervised Vision Transformers

by Margarida Figueiredo, Carlos M. A. Diogo, Gustavo Paneiro, Pedro Amaral and António Alves de Campos

Appl. Sci. 2026, 16(9), 4137; https://doi.org/10.3390/app16094137 - 23 Apr 2026

Viewed by 122

Abstract

The marble industry relies on proprietary commercial names rather than objective visual categories, creating market inefficiencies for stakeholders who select stones based on appearance. Supervised classification perpetuates this problem by replicating inconsistent commercial labels instead of discovering intrinsic visual structure. We propose an unsupervised pipeline that combines a two-stage training strategy (pure self-supervised pretraining followed by cluster-aware fine-tuning of a DINO Vision Transformer) with empirically selected dimensionality reduction and agglomerative hierarchical clustering. Systematic ablation studies on 1480 marble images spanning 10 commercial varieties validate each design choice: cluster-aware training at k = 10 yields geometrically improved embeddings over the self-supervised baseline (mean Silhouette Score 0.693 ± 0.053 vs. 0.660 ± 0.030; mean Davies–Bouldin Index 0.386 ± 0.075 vs. 0.569 ± 0.012; N = 9 independent evaluations across 3 data partitions × 3 training initializations). The resulting taxonomy reveals three phenomena invisible to commercial classification: cross-category merging of visually indistinguishable stones carrying different market names, intra-category splitting of heterogeneous sub-populations within single varieties, and coherent grouping where commercial and visual boundaries coincide, with all three confirmed in every independent run. We further demonstrate that standard extrinsic metrics are misaligned with unsupervised taxonomy objectives when reference labels encode the inconsistencies the method aims to resolve. Validating this methodology across diverse stone types, larger datasets, and varied acquisition conditions represents a natural and necessary next step toward establishing its cross-domain generalizability.

(This article belongs to the Special Issue Recent Advances and New Trends in Computer Vision and Image Processing)
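
The abstract above compares embeddings by their mean Silhouette Score. As a rough illustration of what that metric measures, the following minimal sketch computes it in plain Python for a toy 2D point set. The function name, the toy points, and both labelings are assumptions made for illustration; they are not the authors' code or data, and the sketch expects at least two clusters.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points; assumes at least two distinct labels."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the geometry
```

Values near 1 indicate compact, well-separated clusters, so the labeling that follows the geometry scores much higher than the one that cuts across it.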

21 pages, 956 KB

Open Access Article

GPU-Based Parallel Euclidean Distance Transform Algorithm

by Yucheng Lu, Xiaoying Zhu, Anlong Pang and Xi He

Mathematics 2026, 14(4), 597; https://doi.org/10.3390/math14040597 - 9 Feb 2026

Viewed by 469

Abstract

Euclidean distance transform (EDT) often suffers from high computational complexity and limited processing efficiency, especially when applied to large-scale images. To address these challenges, this paper proposes a GPU-based parallel EDT algorithm. The proposed approach first partitions the input image into multiple horizontal sub-blocks. For each sub-block, a row-wise recursive computation strategy is adopted to construct its Voronoi diagram in parallel, thereby reducing computational overhead by exploiting the strong structural similarity between the Voronoi diagrams of adjacent rows. Based on the Voronoi diagrams of all sub-blocks, the Euclidean distance from each pixel to the nearest background pixel is subsequently evaluated, completing the transform. Experimental results demonstrate that the proposed algorithm achieves up to a 52× speedup over traditional CPU-based EDT methods, leading to a substantial improvement in computational performance. Nevertheless, the scalability of the method is influenced by GPU memory capacity and the chosen sub-block partitioning strategy when processing extremely large images. Moreover, the core idea of leveraging inter-row Voronoi similarity to reduce redundant computation can be naturally extended to higher-dimensional exact EDT as well as approximate EDT variants.

(This article belongs to the Special Issue Advances in High-Speed Computing and Parallel Algorithm)
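
To make the quantity being computed concrete, the sketch below is a brute-force reference for the exact EDT as the abstract defines it: the Euclidean distance from each pixel to the nearest background pixel. It is not the paper's Voronoi-based GPU algorithm; it is the kind of slow baseline one might use to check a faster implementation on small inputs. The function name and the convention that 0 marks background are assumptions.

```python
import math

def brute_force_edt(mask):
    """Exact EDT by exhaustive search; 0 marks background (assumed convention)."""
    rows, cols = len(mask), len(mask[0])
    background = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c] == 0]
    if not background:
        raise ValueError("mask needs at least one background pixel")
    # For every pixel, take the minimum distance over all background pixels.
    return [
        [min(math.hypot(r - br, c - bc) for br, bc in background) for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical 3x3 mask with background pixels in two opposite corners.
mask = [
    [0, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
]
dist = brute_force_edt(mask)
```

The cost grows with the number of pixels times the number of background pixels, which is the kind of overhead that sub-block partitioning and reuse between adjacent rows are meant to avoid.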

36 pages, 35595 KB

Open Access Article

Robust ISAR Autofocus for Maneuvering Ships Using Centerline-Driven Adaptive Partitioning and Resampling

by Wenao Ruan, Chang Liu and Dahu Wang

Remote Sens. 2026, 18(1), 105; https://doi.org/10.3390/rs18010105 - 27 Dec 2025

Viewed by 651

Abstract

Synthetic aperture radar (SAR) is a critical enabling technology for maritime surveillance. However, maneuvering ships often appear defocused in SAR images, posing significant challenges for subsequent ship detection and recognition. To address this problem, this study proposes an improved iteration phase gradient resampling autofocus (IIPGRA) method. First, we extract the defocused ships from SAR images, followed by azimuth decompression and translational motion compensation. Subsequently, a centerline-driven adaptive azimuth partitioning strategy is proposed: the geometric centerline of the vessel is extracted from coarsely focused images using an enhanced RANSAC algorithm, and the target is partitioned into upper and lower sub-blocks along the azimuth direction to maximize the separation of rotational centers between sub-blocks, establishing a foundation for the accurate estimation of spatially variant phase errors. Next, phase gradient autofocus (PGA) is employed to estimate the phase errors of each sub-block and compute their differential. Then, resampling the original echoes based on this differential phase error linearizes non-uniform rotational motion. Furthermore, this study introduces the Rotational Uniformity Coefficient (β) as the convergence criterion. This coefficient can stably and reliably quantify the linearity of the rotational phase, thereby ensuring robust termination of the iterative process. Simulation and real airborne SAR data validate the effectiveness of the proposed algorithm.

(This article belongs to the Special Issue Advances in Imaging Radar Signal Processing, Target Feature Extraction and Recognition)

26 pages, 14895 KB

Open Access Article

Robust Watermarking Algorithm Based on QGT and Neighborhood Coefficient Statistical Features

by Junlin Ouyang, Ruijie Wang and Tingjian Shi

Electronics 2025, 14(22), 4494; https://doi.org/10.3390/electronics14224494 - 18 Nov 2025

Viewed by 665

Abstract

The exponential advancement of the Internet of Things and artificial intelligence technologies has significantly accelerated digital content generation and dissemination, intensifying challenges in copyright protection, identity theft, and privacy breaches. Traditional digital watermarking techniques, constrained by vulnerabilities to geometric attacks and perceptual distortions, fail to meet the demands of modern complex application scenarios. To address these limitations, this paper proposes a robust watermarking algorithm based on quaternion Gyrator transform and neighborhood coefficient statistical features, designed to enhance copyright protection efficacy. The methodology involves three key innovations: (1) the host image is partitioned into non-overlapping sub-blocks, with an inhomogeneity metric calculated from local texture and edge characteristics to optimize the embedding order; (2) the quaternion Gyrator transform is applied to each sub-block, where the real component of the transformed coefficients is used as the feature carrier, harnessing the geometric invariance of quaternion transformations to mitigate distortions induced by rotational attacks; (3) an Improved Uniform Log-Polar Mapping algorithm is integrated to embed synchronization markers, reinforcing resistance to geometric attacks by preserving structural consistency under affine transformations. Prior to embedding, dynamic statistical analysis of neighborhood coefficients adjusts watermark intensity, ensuring compatibility with human visual system masking properties. Experimental results demonstrate dual advantages: the PSNR of the proposed method is 41.4921, indicating good invisibility, and the average NC value remains at around 0.9, indicating good robustness. These results verify the effectiveness and practicability of the algorithm in a complex attack environment.
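
The two figures of merit quoted in the abstract, PSNR for invisibility and NC for robustness, can be illustrated with a short sketch. The abstract does not give the exact formulas used, so the definitions below are common textbook ones and should be read as assumptions: PSNR with an 8-bit peak of 255, and NC as the normalized inner product of the original and extracted watermark sequences. All data values are made up for the example.

```python
import math

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    flat_o = [p for row in original for p in row]
    flat_d = [p for row in distorted for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_d)) / len(flat_o)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def normalized_correlation(w, w_extracted):
    """NC as a normalized inner product (one of several definitions in use)."""
    num = sum(a * b for a, b in zip(w, w_extracted))
    den = math.sqrt(sum(a * a for a in w)) * math.sqrt(sum(b * b for b in w_extracted))
    return num / den if den else 0.0

# Hypothetical 2x2 host image and its watermarked version.
host = [[100, 100], [100, 100]]
marked = [[101, 99], [100, 100]]
quality_db = psnr(host, marked)

# Hypothetical watermark bits before embedding and after an attack.
nc = normalized_correlation([1, 0, 1, 1], [1, 0, 1, 0])
```

A higher PSNR means the watermarked image is closer to the host, and an NC close to 1 means the extracted watermark closely matches the embedded one.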

19 pages, 4966 KB

Open Access Article

A Study on Geometrical Consistency of Surfaces Using Partition-Based PCA and Wavelet Transform in Classification

by Vignesh Devaraj, Thangavel Palanisamy and Kanagasabapathi Somasundaram

AppliedMath 2025, 5(4), 134; https://doi.org/10.3390/appliedmath5040134 - 3 Oct 2025

Cited by 1 | Viewed by 786

Abstract

The proposed study explores the consistency of the geometrical character of surfaces under scaling, rotation and translation. In addition to its mathematical significance, it also offers advantages in image processing and economic applications. In this paper, the authors used partition-based principal component analysis similar to two-dimensional Sub-Image Principal Component Analysis (SIMPCA), along with a suitably modified atypical wavelet transform in the classification of 2D images. The proposed framework is further extended to three-dimensional objects using machine learning classifiers. To strengthen fairness, we benchmarked against both Random Forest (RF) and Support Vector Machine (SVM) classifiers using nested cross-validation, showing consistent gains when TIFV is included. In addition, we carried out a robustness analysis by introducing Gaussian noise to the intensity channel, confirming that TIFV degrades much more gracefully than traditional descriptors. Experimental results demonstrate that the method achieves improved performance compared to traditional hand-crafted descriptors such as measured values and histogram of oriented gradients. The proposed algorithm is also capable of establishing consistency locally, which is not possible without partitioning, and it reduces computational complexity by a reasonable amount. We note that comparisons with deep learning baselines are beyond the scope of this study, and our contribution is positioned within the domain of interpretable, affine-invariant descriptors that enhance classical machine learning pipelines.

25 pages, 5278 KB

Open Access Article

Developing a Quality Flag for SAR Ocean Wave Spectrum Partitioning with Machine Learning

by Amine Benchaabane, Romain Husson, Muriel Pinheiro and Guillaume Hajduch

Remote Sens. 2025, 17(18), 3191; https://doi.org/10.3390/rs17183191 - 15 Sep 2025

Cited by 2 | Viewed by 1183

Abstract

Synthetic Aperture Radar (SAR) is one of the few instruments capable of providing high-resolution global two-dimensional (2D) measurements of ocean waves. Since 2014 and 2016, respectively, the Sentinel-1A and Sentinel-1B satellites, whenever operating in a specific wave mode (WV), have been providing ocean swell spectrum data as Level-2 (L2) OCeaN products (OCN), derived through a quasi-linear inversion process. The WV mode acquires small SAR images of 20 × 20 km footprints alternating between two sub-beams, WV1 and WV2, with incidence angles of approximately 23° and 36°, respectively, to capture ocean surface dynamics. The SAR imaging process is influenced by various modulations, including hydrodynamic, tilt, and velocity bunching. While hydrodynamic and tilt modulations can be approximated as linear processes, velocity bunching introduces significant distortion due to the satellite’s relative motion with respect to the ocean surface and leads to constructive but also destructive effects on the wave imaging process. Due to the associated azimuth cut-off, the quasi-linear inversion primarily detects ocean swells with, on average, wavelengths longer than 200 m in the SAR azimuth direction, limiting the resolution of smaller-scale wave features in azimuth but reaching 10 m resolution along range. The 2D spectral partitioning technique used in the Sentinel-1 WV OCN product separates different swell systems, known as partitions, based on their frequency, directional, and spectral characteristics. The accuracy of these partitions can be affected by several factors, including non-linear effects, large-scale surface features, and the relative direction of the swell peak to the satellite’s flight path. To address these challenges, this study proposes a novel quality control framework using a machine learning (ML) approach to develop a quality flag (QF) parameter associated with each swell partition provided in the OCN products. By pairing collocated data from Sentinel-1 (S1) and WaveWatch III (WW3) partitions, the QF parameter assigns each SAR-derived swell partition one of five quality levels: “very good,” “good,” “medium,” “low,” or “poor”. This ML-based method enhances the accuracy of wave partitions, especially in cases where non-linear effects or large-scale oceanic features distort the data. The proposed algorithm provides a robust tool for filtering out problematic partitions, improving the overall quality of ocean wave measurements obtained from SAR. Moreover, the variability in the accuracy of swell partitions, depending on the swell direction relative to the satellite’s flight heading, is effectively addressed, enabling more reliable data for oceanographic studies. This work contributes to a better understanding of ocean swell dynamics derived from SAR observations and supports the numerical swell modeling community by aiding in the refinement of models and their integration into operational systems, thereby advancing both theoretical and practical aspects of ocean wave forecasting.

(This article belongs to the Special Issue Calibration and Validation of SAR Data and Derived Products)

21 pages, 11254 KB

Open Access Article

Research on Two-Dimensional Linear Canonical Transformation Series and Its Applications

by Weikang Zhao, Huibin Luo, Guifang Zhang and KinTak U

Fractal Fract. 2025, 9(9), 596; https://doi.org/10.3390/fractalfract9090596 - 12 Sep 2025

Viewed by 1086

Abstract

In light of the computational efficiency bottleneck and inadequate regional feature representation in traditional global data approximation methods, this paper introduces the concept of non-uniform partition to transform global continuous approximation into multi-region piecewise approximation. Additionally, we propose an image representation algorithm based on linear canonical transformation and non-uniform partitioning, which enables the regional representation of sub-signal features while reducing computational complexity. The algorithm first demonstrates that the two-dimensional linear canonical transformation series has a least squares solution within each region. Then, it adopts the maximum likelihood estimation method and the scale transformation characteristics to achieve conversion between the nonlinear and linear expressions of the two-dimensional linear canonical transformation series. It then uses the least squares method and the recursive method to convert the image information into mathematical expressions, realize image vectorization, and solve the approximation coefficients in each region more quickly. The proposed algorithm better represents complex image texture areas while reducing image quality loss, effectively retains high-frequency details, and improves the quality of reconstructed images.

24 pages, 7521 KB

Open Access Article

High-Resolution High-Squint Large-Scene Spaceborne Sliding Spotlight SAR Processing via Joint 2D Time and Frequency Domain Resampling

by Mingshan Ren, Heng Zhang and Weidong Yu

Remote Sens. 2025, 17(1), 163; https://doi.org/10.3390/rs17010163 - 6 Jan 2025

Cited by 2 | Viewed by 1875

Abstract

This paper proposes a frequency-domain imaging algorithm, featuring joint two-dimensional (2D) time and frequency domain resampling, for high-resolution high-squint large-scene (HHL) spaceborne sliding spotlight synthetic aperture radar (SAR) processing. Due to the nonlinear beam rotation during HHL data acquisition, the Doppler centroid varies nonlinearly with azimuth time, and both traditional sub-aperture approaches and the two-step approach fail to remove the inertial Doppler aliasing of spaceborne sliding spotlight SAR data. In addition, the curved orbit effect and long synthetic aperture time make the range histories difficult to model and introduce space variance in both range and azimuth. In this paper, we use azimuth deramping and 2D time-domain azimuth resampling, collectively referred to as preprocessing, to eliminate the aliasing in the Doppler domain and correct the range-dependent azimuth variance of the range histories. After preprocessing, the squint sliding spotlight SAR data can be treated as equivalent broadside strip-map SAR data during processing. Frequency-domain focusing, which mainly involves phase multiplication and resampling in the 2D frequency and RD domains, is then applied to compensate for the residual space variance and focus the SAR data. Moreover, to accommodate higher-resolution and larger-scene cases, the combination of the proposed algorithm with a partitioning strategy is also discussed. Processing results of simulation data and Gaofen-3 experimental data are presented to demonstrate the feasibility of the proposed methods.

(This article belongs to the Special Issue Processing Methods and Techniques of Spaceborne SAR with Ultra-High Resolution)

18 pages, 8899 KB

Open Access Article

Feature Coding and Graph via Transformer: Different Granularities Classification for Aircraft

by Jianghao Rao, Senlin Qin, Zongyan An, Jianlin Zhang, Qiliang Bao and Zhenming Peng

Aerospace 2024, 11(12), 976; https://doi.org/10.3390/aerospace11120976 - 26 Nov 2024

Viewed by 1353

Abstract

Against the background of the sky, imaging and perception of aircraft are crucial for various vision applications. Thanks to the ever-evolving nature of the convolutional neural network (CNN), it has become easier to distinguish and recognize different types of aircraft. Nevertheless, accurate classification for sub-categories of aircraft still poses great challenges. On one hand, fine-grained recognition focuses on exploring and studying such problems. On the other hand, aircraft under different sub-categories and granularities place higher demands on feature representation for classification, which led us to rethink the in-depth application of features. We noticed that information in the Swin Transformer effectively represents the features in neural network layers, fully showcasing encoding and indexing for information. Through further research based on this, we proposed a better understanding of encoding and reuse for features, and innovatively performed feature coding graphically for classification. In this paper, our approach shows the effects on aircraft feature representation and classification, manifested in flexible recognition at different aircraft category granularities, and outperforms other well-known fine-grained classification models on this vision task. Not only did the approach we proposed demonstrate adaptability to aircraft at different classification granularities, but it also revealed the mechanisms and characteristics of feature encoding under different sample space partitions for classification. The relationship between the oriented representation of aircraft features and various classification granularities, which is manifested through different classification criteria, shows that feature coding and graph construction via the transformer opens a new door for specific defined classification tasks where objects are divided under various partition criteria, and provides another perspective on calculation and feature extraction in fine-grained classification.

(This article belongs to the Section Aeronautics)

30 pages, 9836 KB

Open Access Article

Comparing Three Freeze-Thaw Schemes Using C-Band Radar Data in Southeastern New Hampshire, USA

by Mahsa Moradi, Simon Kraatz, Jeremy Johnston and Jennifer M. Jacobs

Remote Sens. 2024, 16(15), 2784; https://doi.org/10.3390/rs16152784 - 30 Jul 2024

Cited by 2 | Viewed by 2519

Abstract

Soil freeze-thaw (FT) cycles over agricultural lands are of great importance due to their vital role in controlling soil moisture distribution, nutrient availability, health of microbial communities, and water partitioning during flood events. Active microwave sensors such as C-band Sentinel-1 synthetic aperture radar (SAR) can serve as powerful tools to detect field-scale soil FT state. Using Sentinel-1 SAR observations, this study compares the performance of two FT detection approaches, a commonly used seasonal threshold approach (STA) and a computationally inexpensive general threshold approach (GTA) at an agricultural field in New Hampshire, US. It also explores the applicability of an interferometric coherence approach (ICA) for FT detection. STA and GTA achieved 85% and 78% accuracy, respectively, using VH polarization. We find a marginal degradation in the performance of STA (82%) and GTA (76%) when employing VV-polarized data. While there was approximately a 6 percentage point difference between STA’s and GTA’s overall accuracy, we recommend GTA for FT detection using SAR images at sub-field-scale over extended regions because of its higher computational efficiency. Our analysis shows that interferometric coherence is not suitable for detecting FT transitions under mild and highly dynamic winter conditions. We hypothesize that the relatively mild winter conditions and therefore the subtle FT transitions are not able to significantly reduce the correlation between the phase values. Also, the ephemeral nature of snowpack in our study area, further compounded by frequent rainfall, could cause decorrelation of SAR images even in the absence of an FT transition. We conclude that despite Sentinel-1’s ~80% mapping accuracy at a mid-latitude site, understanding the cause of misclassification remains challenging, even when detailed ground data are readily available and employed in error attribution efforts.

(This article belongs to the Section Earth Observation Data)

14 pages, 2421 KB

Open Access Article

Optimization of Memristor Crossbar’s Mapping Using Lagrange Multiplier Method and Genetic Algorithm for Reducing Crossbar’s Area and Delay Time

by Seung-Myeong Cho, Rina Yoon, Ilpyeong Yoon, Jihwan Moon, Seokjin Oh and Kyeong-Sik Min

Information 2024, 15(7), 409; https://doi.org/10.3390/info15070409 - 15 Jul 2024

Cited by 6 | Viewed by 3060

Abstract

Memristor crossbars offer promising low-power and parallel processing capabilities, making them efficient for implementing convolutional neural networks (CNNs) in terms of delay time, area, etc. However, mapping large CNN models like ResNet-18, ResNet-34, VGG-Net, etc., onto memristor crossbars is challenging due to the line resistance problem limiting crossbar size. This necessitates partitioning full-image convolution into sub-image convolution. To do so, an optimized mapping of memristor crossbars should be considered to divide full-image convolution into multiple crossbars. With limited crossbar resources, especially in edge devices, it is crucial to optimize the crossbar allocation per layer to minimize the hardware resources in terms of crossbar area, delay time, and area–delay product. This paper explores three optimization scenarios: (1) optimizing total delay time under a crossbar’s area constraint, (2) optimizing total crossbar area with a crossbar’s delay time constraint, and (3) optimizing a crossbar’s area–delay-time product without constraints. The Lagrange multiplier method is employed for the constrained cases 1 and 2. For the unconstrained case 3, a genetic algorithm (GA) is used to optimize the area–delay-time product. Simulation results demonstrate that the optimization yields significant improvements over the unoptimized results. When VGG-Net is simulated, the optimization shows about a 20% reduction in delay time for case 1 and a 22% area reduction for case 2. Case 3 highlights the benefits of optimizing the crossbar utilization ratio for minimizing the area–delay-time product. The proposed optimization strategies can substantially enhance the neural network performance of memristor crossbar-based processing-in-memory architectures, especially for resource-constrained edge computing platforms.

(This article belongs to the Special Issue Neuromorphic Engineering and Machine Learning)
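
To illustrate how a Lagrange multiplier turns case 1 (minimum delay under an area budget) into a closed-form allocation, the sketch below uses a deliberately simplified cost model that is an assumption and not the paper's: layer i has workload w_i, receives n_i crossbars, and contributes delay w_i / n_i, with the total number of crossbars fixed at A. Setting the derivative of Σ w_i/n_i + λ(Σ n_i − A) to zero gives n_i = A·√w_i / Σ√w_j. The allocation is a continuous relaxation, so real crossbar counts would still need rounding.

```python
import math

def allocate_crossbars(workloads, area_budget):
    """Closed-form minimizer of sum(w_i / n_i) subject to sum(n_i) == area_budget.

    Stationarity of the Lagrangian gives n_i proportional to sqrt(w_i).
    The delay model w_i / n_i is a hypothetical simplification.
    """
    roots = [math.sqrt(w) for w in workloads]
    total = sum(roots)
    return [area_budget * r / total for r in roots]

def total_delay(workloads, allocation):
    """Total delay under the assumed per-layer model w_i / n_i."""
    return sum(w / n for w, n in zip(workloads, allocation))

# Hypothetical per-layer workloads and a budget of 12 crossbars.
workloads = [1.0, 4.0, 9.0]
area_budget = 12.0

optimal = allocate_crossbars(workloads, area_budget)
uniform = [area_budget / len(workloads)] * len(workloads)

optimal_delay = total_delay(workloads, optimal)
uniform_delay = total_delay(workloads, uniform)
```

Under this toy model, giving heavier layers proportionally more crossbars lowers the total delay compared with an even split of the same budget.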

21 pages, 13934 KB

Open Access Article

A Robust Tie-Points Matching Method with Regional Feature Representation for Synthetic Aperture Radar Images

by Yifan Zhang, Yan Zhu, Liqun Liu, Xun Du, Kun Han, Junhui Wu, Zhiqiang Li, Lingshuai Kong and Qiwei Lin

Remote Sens. 2024, 16(13), 2491; https://doi.org/10.3390/rs16132491 - 8 Jul 2024

Cited by 3 | Viewed by 2343

Abstract

Precise tie-points (TPs) on synthetic aperture radar (SAR) images are a critical cornerstone in the global digital elevation model (DEM) and digital ortho map (DOM) production process. While there are abundant studies on SAR TP matching, improvement opportunities persist in large areas. The correspondences have pixel-level errors during geocoding, which result in misalignment between global products. Consequently, this paper proposes a robust method for SAR image TP matching, which consists of three key steps: (1) interest point extraction based on the dynamic Harris area entropy (DHAE) grid; (2) adaptive determination of template size; (3) normalized cross correlation (NCC) template matching. DHAE is a regional texture information grid based on the SAR-Harris map, and it is achieved through dynamic block division. Generating the DHAE grid over SAR images enables the extraction of interest points that have regional feature representation and distribution uniformity. A variable-size matching template is adaptively determined based on DHAE to enhance template quality while maintaining computational efficiency. Subsequently, the NCC algorithm is employed to find subpixel-precise correspondences. The proposed method is applied to TP matching in 57 Terra-SAR images, which cover a large geographical area. Furthermore, the overlapping area is partitioned into five segments according to different coverage types. The experimental results demonstrate that the proposed method outperforms other template matching methods. For all coverage types, the proposed method exhibits high-precision sub-pixel results that reach up to 38.64% in terms of the relative positioning error (RPE), particularly in texture-weak and large areas.

(This article belongs to the Special Issue Advances and Innovative Applications in Multi-temporal InSAR Technology)
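
Step (3) of the method above relies on normalized cross correlation. The following minimal sketch shows plain zero-mean NCC template matching at integer pixel positions on a tiny grid. It does not include the DHAE grid, the adaptive template sizing, or the subpixel refinement described in the abstract, and the function names and pixel values are assumptions chosen for illustration.

```python
import math

def ncc(patch, template):
    """Zero-mean normalized cross correlation of two equally sized 2D patches."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0  # flat patches carry no texture to correlate

def match_template(image, template):
    """Slide the template over the image and return the best (row, col) and score."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, None  # NCC is bounded to [-1, 1]
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Hypothetical 4x4 image containing the 2x2 template at row 1, column 1.
image = [
    [1, 1, 1, 1],
    [1, 5, 9, 1],
    [1, 2, 7, 1],
    [1, 1, 1, 1],
]
template = [[5, 9], [2, 7]]
position, score = match_template(image, template)
```

Because NCC subtracts the mean and normalizes by the standard deviation, the score is insensitive to linear changes in brightness and contrast, which is one reason it is widely used for matching between separately acquired images.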

12 pages, 3266 KB

Open Access Feature Paper Article

Sequential Learning of Flame Objects Sorted by Size for Early Fire Detection in Surveillance Videos

by Widia A. Samosir, Duy B. Nguyen and Seong G. Kong

Electronics 2024, 13(12), 2232; https://doi.org/10.3390/electronics13122232 - 7 Jun 2024

Viewed by 1791

Abstract

This paper presents a sequential learning method aimed at improving the performance of a lightweight deep learning model used for detecting fires at an early stage in surveillance video streams. The proposed approach involves a sequence of supervised learning steps, wherein the entire training dataset is partitioned into multiple sub-datasets based on the size of fire objects. The size of fire objects is measured by object size ratio, which is the ratio of the bounding box area of the detected fire flame object relative to the entire image area. The initial training sub-dataset contains the largest-sized fire objects, progressing to the final sub-dataset containing the smallest-sized fire objects. The objective is to employ sequential learning to enhance the detection of small-sized fire objects relative to the image area using a lightweight model suitable for edge computing devices. Experimental results demonstrate that a deep learning fire detection model trained sequentially with a descending order of object size can effectively detect small flame objects with an object size ratio less than 0.006, achieving an F1 score of 93.1%, representing a 27% improvement compared to traditional supervised learning with no sequential learning steps. Additionally, performance in detecting tiny flame objects with an object size ratio less than 0.0016 achieves an F1 score of 94.5%, showing a 17.5% increase compared to the baseline without sequential learning.

(This article belongs to the Special Issue AI Security and Safety)
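
The size-based partitioning described in the abstract above can be sketched as follows. This is a minimal illustration rather than the authors' code: the `Box` type, the function names, and the use of 0.006 and 0.0016 as stage boundaries are assumptions (the abstract reports those two values only as evaluation thresholds).

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical bounding box, sizes in pixels."""
    width: float
    height: float


def object_size_ratio(box, image_width, image_height):
    """Bounding-box area divided by the full image area."""
    return (box.width * box.height) / (image_width * image_height)


def partition_by_size(samples, boundaries):
    """Group samples into training stages, largest objects first.

    samples    : iterable of (sample_id, ratio) pairs
    boundaries : ratio thresholds in descending order (assumed values)
    Returns a list of len(boundaries) + 1 stages; stage 0 holds the
    largest objects and the last stage holds the smallest.
    """
    stages = [[] for _ in range(len(boundaries) + 1)]
    for sample_id, ratio in samples:
        # Each boundary the ratio falls below moves the sample one stage later.
        stage = sum(ratio < b for b in boundaries)
        stages[stage].append(sample_id)
    return stages


if __name__ == "__main__":
    samples = [("a", 0.05), ("b", 0.003), ("c", 0.001)]
    for index, stage in enumerate(partition_by_size(samples, (0.006, 0.0016))):
        print(f"stage {index}: {stage}")
```

A sequential training loop would then fit the model on stage 0, continue from those weights on stage 1, and so on. The number of stages and where to cut them are design choices the abstract does not specify.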

13 pages, 21597 KB

Open AccessArticle

SGNet: Efficient Snow Removal Deep Network with a Global Windowing Transformer

by Lie Shan, Haoxiang Zhang and Bodong Cheng

Mathematics 2024, 12(10), 1424; https://doi.org/10.3390/math12101424 - 7 May 2024

Cited by 5 | Viewed by 2251

Abstract

Image restoration under adverse weather conditions is a challenging task. Previous research has predominantly focused on removing rain and fog from images. However, snow, another common atmospheric phenomenon, also significantly impacts high-level computer vision tasks such as object detection and semantic segmentation. Recently, there has been a surge of methods specifically targeting snow removal, the majority of which employ visual Transformers as the backbone network to enhance restoration effectiveness. Nevertheless, the quadratic computation that Transformers require to model long-range dependencies significantly increases the time and space consumption of deep learning models. To address this issue, this paper proposes an efficient snow removal Transformer with a global windowing network (SGNet). This method forgoes the localized windowing strategy of previous visual Transformers, opting instead to partition the image into multiple low-resolution subimages containing global information using wavelet sampling, thereby ensuring higher performance while reducing computational overhead. Extensive experiments demonstrate that the approach achieves outstanding performance across a wide range of benchmark datasets and can rival CNN-based methods in terms of computational cost.

(This article belongs to the Special Issue Mathematical Techniques and Artificial Intelligence in Image Processing)
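
To make the idea of wavelet sampling in the abstract above concrete, the sketch below splits an image into four half-resolution subimages with a one-level 2D Haar transform and then reassembles it. The choice of the Haar wavelet and the function names `haar_split` and `haar_merge` are assumptions made for illustration; the abstract does not say which wavelet or how many levels SGNet uses.

```python
import numpy as np


def haar_split(image):
    """One-level 2D Haar transform of an (H, W) array with even H and W.

    Every output has shape (H/2, W/2) and is built from pixels spread
    over the whole image, so each subimage keeps the global layout.
    """
    a = image[0::2, 0::2]  # top-left pixel of each 2x2 cell
    b = image[0::2, 1::2]  # top-right
    c = image[1::2, 0::2]  # bottom-left
    d = image[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0     # local average (low-pass)
    diff_cols = (a - b + c - d) / 2.0  # difference across columns
    diff_rows = (a + b - c - d) / 2.0  # difference across rows
    diff_diag = (a - b - c + d) / 2.0  # diagonal difference
    return approx, diff_cols, diff_rows, diff_diag


def haar_merge(approx, diff_cols, diff_rows, diff_diag):
    """Exact inverse of haar_split."""
    h, w = approx.shape
    image = np.empty((2 * h, 2 * w), dtype=float)
    image[0::2, 0::2] = (approx + diff_cols + diff_rows + diff_diag) / 2.0
    image[0::2, 1::2] = (approx - diff_cols + diff_rows - diff_diag) / 2.0
    image[1::2, 0::2] = (approx + diff_cols - diff_rows - diff_diag) / 2.0
    image[1::2, 1::2] = (approx - diff_cols - diff_rows + diff_diag) / 2.0
    return image


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((8, 12))
    bands = haar_split(img)
    print([band.shape for band in bands])
    print(np.allclose(haar_merge(*bands), img))
```

Because the split is invertible, a network can in principle process the four quarter-size subimages and merge the results without losing information. Self-attention over each subimage involves a quarter as many positions as attention over the full-resolution image, which is the kind of saving the abstract alludes to.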

17 pages, 26503 KB

Open AccessArticle

A Robust Zero-Watermarking Scheme in Spatial Domain by Achieving Features Similar to Frequency Domain

by Musrrat Ali and Sanoj Kumar

Electronics 2024, 13(2), 435; https://doi.org/10.3390/electronics13020435 - 20 Jan 2024

Cited by 10 | Viewed by 3366

Abstract

In recent years, there has been a substantial surge in the application of image watermarking, which has evolved into an essential tool for identifying multimedia material, ensuring security, and protecting copyright. Singular value decomposition (SVD) and discrete cosine transform (DCT) are widely utilized in digital image watermarking despite the considerable computational burden they involve. By combining block-based direct current (DC) values with a matrix norm, this article presents a novel, robust zero-watermarking approach that generates a zero-watermark without modifying the contents of the image. The image is partitioned into non-overlapping blocks, and DC values are computed without applying DCT. The resulting sub-image of DC values is further partitioned into non-overlapping blocks, and the maximum singular value of each block is calculated by a matrix norm instead of SVD to obtain the binary feature matrix. A piecewise linear chaotic map encryption technique is used to improve the security of the watermark image. The feature image is then created by an XOR operation between the encrypted watermark image and the binary feature matrix. The proposed scheme is tested against a variety of distortion attacks, including noise, filter, geometric, and compression attacks. It is also compared with other relevant image watermarking methods and outperforms them in most cases.

(This article belongs to the Special Issue Recent Developments and Applications of Image Watermarking, 2nd Edition)
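
The pipeline outlined in the abstract above can be sketched roughly as follows. This is an illustrative reconstruction under several assumptions, not the authors' implementation: the block sizes `n` and `m`, the median threshold used to binarize the norms, the way the chaotic map is turned into a bit stream, and the key values are all hypothetical. Note also that NumPy's `np.linalg.norm(block, 2)` returns the largest singular value but evaluates it through singular values internally, so it shows the quantity involved rather than the cheaper computation the paper refers to.

```python
import numpy as np


def block_dc(image, n):
    """DC value of each non-overlapping n x n block, without running a DCT.

    For an orthonormal 2D DCT-II the DC coefficient equals block sum / n.
    """
    h, w = image.shape
    blocks = image[:h - h % n, :w - w % n].reshape(h // n, n, w // n, n)
    return blocks.sum(axis=(1, 3)) / n


def binary_features(dc, m):
    """One bit per m x m block of the DC map, based on its spectral norm."""
    rows, cols = dc.shape[0] // m, dc.shape[1] // m
    norms = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            block = dc[i * m:(i + 1) * m, j * m:(j + 1) * m]
            norms[i, j] = np.linalg.norm(block, 2)  # largest singular value
    # Assumed binarization rule: compare each norm with the median.
    return (norms > np.median(norms)).astype(np.uint8)


def pwlcm_bits(count, x0, p):
    """Bit stream from a piecewise linear chaotic map (0 < x0 < 1, 0 < p < 0.5)."""
    x = x0
    bits = np.empty(count, dtype=np.uint8)
    for k in range(count):
        y = x if x < 0.5 else 1.0 - x  # the map is symmetric about 0.5
        x = y / p if y < p else (y - p) / (0.5 - p)
        bits[k] = 1 if x >= 0.5 else 0  # assumed bit-extraction rule
    return bits


def make_zero_watermark(image, watermark, n=8, m=4, key=(0.37, 0.29)):
    """XOR the encrypted watermark with the image's binary features."""
    features = binary_features(block_dc(image.astype(float), n), m)
    if features.shape != watermark.shape:
        raise ValueError("watermark shape must match the feature matrix")
    keystream = pwlcm_bits(watermark.size, *key).reshape(watermark.shape)
    return features ^ (watermark ^ keystream)


def recover_watermark(image, zero_watermark, n=8, m=4, key=(0.37, 0.29)):
    """Invert make_zero_watermark using the (possibly attacked) image."""
    features = binary_features(block_dc(image.astype(float), n), m)
    keystream = pwlcm_bits(zero_watermark.size, *key).reshape(zero_watermark.shape)
    return (features ^ zero_watermark) ^ keystream


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    host = rng.integers(0, 256, size=(128, 128))
    mark = rng.integers(0, 2, size=(4, 4), dtype=np.uint8)
    zero_mark = make_zero_watermark(host, mark)
    print(np.array_equal(recover_watermark(host, zero_mark), mark))
```

The host image is never altered: only the zero-watermark and the key are stored. Robustness then depends on how stable the binary features are when the image is distorted, which is what the attack experiments mentioned in the abstract evaluate.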
