Search Results (43)

Search Parameters:
Keywords = quad tree

20 pages, 23718 KB  
Article
A Mamba-Based Hierarchical Partitioning Framework for Upper-Level Wind Field Reconstruction
by Wantong Chen, Yifan Zhang, Ruihua Liu, Shuguang Sun and Qing Feng
Aerospace 2025, 12(9), 842; https://doi.org/10.3390/aerospace12090842 - 18 Sep 2025
Viewed by 332
Abstract
An accurate perception of upper-level wind fields is essential for improving civil aviation safety and route optimization. However, the sparsity of observational data and the structural complexity of wind fields make reconstruction highly challenging. To address this, we propose QuadMamba-WindNet (QMW-Net), a structure-enhanced deep neural network that integrates a hierarchical state-space modeling framework with a learnable quad-tree-based regional partitioning mechanism, enabling multi-scale adaptive encoding and efficient dynamic modeling. The model is trained end-to-end on ERA5 reanalysis data and validated with simulated flight trajectory observation masks, allowing the reconstruction of complete horizontal wind fields at target altitude levels. Experimental results show that QMW-Net achieves a mean absolute error (MAE) of 1.62 m/s and a mean relative error (MRE) of 6.68% for wind speed reconstruction at 300 hPa, with a mean directional error of 4.85° and an R² of 0.93, demonstrating high accuracy and stable error convergence. Compared with Physics-Informed Neural Networks (PINNs) and Gaussian Process Regression (GPR), QMW-Net delivers superior predictive performance and generalization across multiple test sets. The proposed model provides refined wind field support for civil aviation forecasting and trajectory planning, and shows potential for broader applications in high-dynamic flight environments and atmospheric sensing.
(This article belongs to the Section Air Traffic and Transportation)

19 pages, 2675 KB  
Article
Fast Intra-Coding Unit Partitioning for 3D-HEVC Depth Maps via Hierarchical Feature Fusion
by Fangmei Liu, He Zhang and Qiuwen Zhang
Electronics 2025, 14(18), 3646; https://doi.org/10.3390/electronics14183646 - 15 Sep 2025
Viewed by 443
Abstract
As a new generation 3D video coding standard, 3D-HEVC offers highly efficient compression. However, its recursive quadtree partitioning mechanism and frequent rate-distortion optimization (RDO) computations lead to a significant increase in coding complexity. Particularly, intra-frame coding in depth maps, which incorporates tools like depth modeling modes (DMMs), substantially prolongs the decision-making process for coding unit (CU) partitioning, becoming a critical bottleneck in compression encoding time. To address this issue, this paper proposes a fast CU partitioning framework based on hierarchical feature fusion convolutional neural networks (HFF-CNNs). It aims to significantly accelerate the overall encoding process while ensuring excellent encoding quality by optimizing depth map CU partitioning decisions. This framework synergistically captures CU’s global structure and local details through multi-scale feature extraction and channel attention mechanisms (SE module). It introduces the wavelet energy ratio designed for quantifying the texture complexity of depth map CU and the quantization parameter (QP) that reflects the encoding quality as external features, enhancing the dynamic perception ability of the model from different dimensions. Ultimately, it outputs depth-corresponding partitioning predictions through three fully connected layers, strictly adhering to HEVC’s quad-tree recursive segmentation mechanism. Experimental results demonstrate that, across eight standard test sequences, the proposed method achieves an average encoding time reduction of 48.43%, significantly lowering intra-frame encoding complexity with a BDBR increment of only 0.35%. The model exhibits outstanding lightweight characteristics with minimal inference time overhead. Compared with the representative methods under comparison, this method achieves a better balance between cross-resolution adaptability and computational efficiency, providing a feasible optimization path for real-time 3D-HEVC applications.

19 pages, 4574 KB  
Article
A WebGL-Based Interactive Visualization Framework for Large-Scale Urban Seismic Simulations with a Dual Multi-LOD Strategy
by Jinping Wang, Zekun Xu and Yang Li
Buildings 2025, 15(16), 2916; https://doi.org/10.3390/buildings15162916 - 18 Aug 2025
Viewed by 803
Abstract
The effective visualization of urban-scale earthquake simulations is pivotal for disaster assessment but presents significant challenges in terms of computational performance and accessibility. This paper introduces a lightweight, browser-based visualization framework that leverages Web Graphics Library (WebGL) to provide real-time, interactive 3D rendering without requiring specialized software. The proposed framework implements a novel dual multi-level-of-detail (LOD) strategy that optimizes both data representation and rendering performance. At the data level, urban buildings are classified into simplified or detailed geometric and computational models based on structural importance. At the rendering level, a dynamic graphics LOD approach adjusts visual complexity based on camera proximity. To realistically reproduce dynamic behaviors of complex structures, skeletal animation is introduced, while a quad tree-based spatial index ensures efficient object culling. The framework’s scalability and efficacy were validated by successfully visualizing the seismic response of approximately 100,000 buildings in New York City. Experimental results demonstrate that the proposed strategy maintains interactive frame rates (>24 frames per second) for views containing up to 4000 detailed buildings undergoing simultaneous and dynamic seismic behaviors. This approach significantly reduces rendering latency and proves extensible to other urban regions. The source code supporting this study is available from the corresponding author upon reasonable request.

22 pages, 10717 KB  
Article
Interpretable Multi-Sensor Fusion of Optical and SAR Data for GEDI-Based Canopy Height Mapping in Southeastern North Carolina
by Chao Wang, Conghe Song, Todd A. Schroeder, Curtis E. Woodcock, Tamlin M. Pavelsky, Qianqian Han and Fangfang Yao
Remote Sens. 2025, 17(9), 1536; https://doi.org/10.3390/rs17091536 - 25 Apr 2025
Cited by 1 | Viewed by 3353
Abstract
Accurately monitoring forest canopy height is crucial for sustainable forest management, particularly in southeastern North Carolina, USA, where dense forests and limited accessibility pose substantial challenges. This study presents an explainable machine learning framework that integrates sparse GEDI LiDAR samples with multi-sensor remote sensing data to improve both the accuracy and interpretability of forest canopy height estimation. This framework incorporates multitemporal optical observations from Sentinel-2; C-band backscatter and InSAR coherence from Sentinel-1; quad-polarization L-Band backscatter and polarimetric decompositions from the Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR); texture features from the National Agriculture Imagery Program (NAIP) aerial photography; and topographic data derived from an airborne LiDAR-based digital elevation model. We evaluated four machine learning algorithms, K-nearest neighbors (KNN), random forest (RF), support vector machine (SVM), and eXtreme gradient boosting (XGB), and found consistent accuracy across all models. Our evaluation highlights our method’s robustness, evidenced by closely matched R² and RMSE values across models: KNN (R² of 0.496, RMSE of 5.13 m), RF (R² of 0.510, RMSE of 5.06 m), SVM (R² of 0.544, RMSE of 4.88 m), and XGB (R² of 0.548, RMSE of 4.85 m). The integration of comprehensive feature sets, as opposed to subsets, yielded better results, underscoring the value of using multisource remotely sensed data. Crucially, SHapley Additive exPlanations (SHAP) revealed the multi-seasonal red-edge spectral bands of Sentinel-2 as dominant predictors across models, while volume scattering from UAVSAR emerged as a key driver in tree-based algorithms. This study underscores the complementary nature of multi-sensor data and highlights the interpretability of our models. By offering spatially continuous, high-quality canopy height estimates, this cost-effective, data-driven approach advances large-scale forest management and environmental monitoring, paving the way for improved decision-making and conservation strategies.

16 pages, 433 KB  
Article
A Fast Coding Unit Partitioning Decision Algorithm for Versatile Video Coding Based on Gradient Feedback Hierarchical Convolutional Neural Network and Light Gradient Boosting Machine Decision Tree
by Fangmei Liu, Jiyuan Wang and Qiuwen Zhang
Electronics 2024, 13(24), 4908; https://doi.org/10.3390/electronics13244908 - 12 Dec 2024
Viewed by 1102
Abstract
Video encoding technology is a foundational component in the advancement of modern technological applications. The latest standard in universal video coding, H.266/VVC, features a quad-tree with nested multi-type tree (QTMT) partitioning structure, which represents an improvement over its predecessor, High-Efficiency Video Coding (H.265/HEVC). This configuration facilitates adaptable block segmentation, albeit at the cost of heightened encoding complexity. In view of the aforementioned considerations, this paper puts forth a deep learning-based approach to facilitate CU partitioning, with the aim of supplanting the intricate CU partitioning process observed in the Versatile Video Coding Test Model (VTM). We begin by presenting the Gradient Feedback Hierarchical CNN (GFH-CNN) model, an advanced convolutional neural network derived from the ResNet architecture, enabling the extraction of features from 64 × 64 coding unit (CU) blocks. Following this, a hierarchical network diagram (HND) is crafted to depict the delineation of partition boundaries corresponding to the various levels of the CU block’s layered structure. This diagram maps the features extracted by the GFH-CNN model to the partitioning at each level and boundary. In conclusion, a LightGBM-based decision tree classification model (L-DT) is constructed to predict the corresponding partition structure based on the prediction vector output from the GFH-CNN model. Subsequently, any errors in the partitioning results are corrected in accordance with the encoding constraints specified by the VTM, which ultimately determines the final CU block partitioning. The experimental results demonstrate that, in comparison with VTM-10.0, the proposed algorithm achieves a 48.14% reduction in complexity with only a 0.83% increase in bitrate under the top-three configuration, which is negligible. In comparison, the top-two configuration resulted in a higher complexity reduction of 63.78%, although this was accompanied by a 2.08% increase in bitrate. These results demonstrate that, in comparison to existing solutions, our approach provides an optimal balance between encoding efficiency and computational complexity.

23 pages, 5821 KB  
Article
Optimizing Charging Pad Deployment by Applying a Quad-Tree Scheme
by Rei-Heng Cheng, Chang-Wu Yu and Zuo-Li Zhang
Algorithms 2024, 17(6), 264; https://doi.org/10.3390/a17060264 - 14 Jun 2024
Cited by 4 | Viewed by 1217
Abstract
The recent advancement in wireless power transmission (WPT) has led to the development of wireless rechargeable sensor networks (WRSNs), since this technology provides a means to replenish sensor nodes wirelessly, offering a solution to the energy challenges faced by WSNs. Most recent work has focused on charging sensor nodes using wireless charging vehicles (WCVs) equipped with high-capacity batteries and WPT devices. In these schemes, a vehicle can move close to a sensor node and wirelessly charge it without physical contact. While these schemes can mitigate the energy problem to some extent, they overlook two primary challenges of applied WCVs: off-road navigation and vehicle speed limitations. To overcome these challenges, previous work proposed a new WRSN model equipped with one drone coupled with several pads deployed to charge the drone when it cannot reach the subsequent stop. This wireless charging pad deployment aims to deploy the minimum number of pads so that at least one feasible routing path from the base station can be established for the drone to reach every SN in a given WRSN. The major weakness of previous studies is that they only consider deploying a wireless charging pad at the locations of the wireless sensor nodes. Their schemes are limited and constrained because usually every point in the deployed area can be considered to deploy a pad. Moreover, the deployed pads suggested by these schemes may not be able to meet the connectivity requirements due to sparse environments. In this work, we introduce a new scheme that utilizes the Quad-Tree concept to address the wireless charging pad deployment problem and reduce the number of deployed pads at the same time. Extensive simulations were conducted to illustrate the merits of the proposed schemes by comparing them with different previous schemes on maps of varying sizes. In the case of large maps, the proposed schemes surpassed all previous works, indicating that our approach is more suitable for large-scale network environments.
(This article belongs to the Collection Feature Paper in Algorithms and Complexity Theory)

16 pages, 5431 KB  
Article
A Fast Detection Algorithm for Change Detection in National Forestland “One Map” Based on NLNE Quad-Tree
by Fei Gao, Xiaohui Su, Yuling Chen, Baoguo Wu, Yingze Tian, Wenjie Zhang and Tao Li
Forests 2024, 15(4), 646; https://doi.org/10.3390/f15040646 - 2 Apr 2024
Cited by 1 | Viewed by 1758
Abstract
The National Forestland “One Map” applies the boundaries and attributes of sub-elements to mountain plots by means of spatial data to achieve digital management of forest resources. The change detection and analysis of forest space and property is the key to determining the change characteristics, evolution trend and management effectiveness of forest land. The existing spatial overlay method, rasterization method, object matching method, etc., cannot meet the requirements of high efficiency and high precision at the same time. In this paper, we investigate a fast algorithm for the detection of changes in “One Map”, taking Sichuan Province as an example. The key spatial characteristic extraction method is used to uniquely determine the sub-compartments. We construct an unbalanced quadtree based on the number of maximum leaf node elements (NLNE Quad-Tree) to narrow down the query range of the target sub-compartments and quickly locate the sub-compartments. Based on NLNE Quad-Tree, we establish a change detection model for “One Map” (NQT-FCDM). The results show that the spatial feature combination of barycentric coordinates and area can ensure the spatial uniqueness of 44.45 million sub-compartments in Sichuan Province with 1 m~0.000001 m precision. The NQT-FCDM constructed with 1000–6000 as the maximum number of leaf nodes has the best retrieval efficiency in the range of 100,000–500,000 sub-compartments. The NQT-FCDM shortens the time by about 75% compared with the traditional spatial union analysis method, shortens the time by about 50% compared with the normal quadtree and effectively solves the problem of generating a large amount of intermediate data in the spatial union analysis method. The NQT-FCDM proposed in this paper improves the efficiency of change detection in “One Map” and can be generalized to other industries applying geographic information systems to carry out change detection, providing a basis for the detection of changes in vector spatial data.
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
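
The capacity-driven splitting described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' NQT-FCDM: the class name `QuadNode`, the `capacity` parameter (standing in for the maximum number of leaf node elements), and the use of one representative point per sub-compartment are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class QuadNode:
    """Unbalanced point quadtree: a leaf splits only when it holds too many items."""
    x0: float
    y0: float
    x1: float
    y1: float
    capacity: int                      # assumed stand-in for the max leaf element count
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def _child_for(self, x, y):
        # Children are ordered SW, SE, NW, NE around the cell midpoint.
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        return self.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]

    def insert(self, x, y, item):
        if self.children:
            self._child_for(x, y).insert(x, y, item)
            return
        self.points.append((x, y, item))
        # The width guard stops endless splitting on coincident points.
        if len(self.points) > self.capacity and (self.x1 - self.x0) > 1e-9:
            self._split()

    def _split(self):
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        c = self.capacity
        self.children = [
            QuadNode(self.x0, self.y0, mx, my, c),
            QuadNode(mx, self.y0, self.x1, my, c),
            QuadNode(self.x0, my, mx, self.y1, c),
            QuadNode(mx, my, self.x1, self.y1, c),
        ]
        pending, self.points = self.points, []
        for x, y, item in pending:
            self._child_for(x, y).insert(x, y, item)

    def leaf_for(self, x, y):
        """Descend to the leaf whose cell contains (x, y)."""
        node = self
        while node.children:
            node = node._child_for(x, y)
        return node
```

A lookup then only has to compare the target against the few items stored in `leaf_for(x, y).points` rather than against every item, which is the sense in which such a tree narrows the query range.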

13 pages, 3387 KB  
Technical Note
Polarimetric Measures in Biomass Change Prediction Using ALOS-2 PALSAR-2 Data
by Henrik J. Persson and Ivan Huuva
Remote Sens. 2024, 16(6), 953; https://doi.org/10.3390/rs16060953 - 8 Mar 2024
Cited by 2 | Viewed by 2216
Abstract
The use of multiple synthetic aperture radar polarizations can improve biomass estimations compared to using a single polarization. In this study, we compared predictions of aboveground biomass change from ALOS-2 PALSAR-2 backscatter using linear regression based on (1) the cross-polarization channels, (2) the co- and cross-polarizations from fully polarimetric SAR, (3) Freeman–Durden polarimetric decomposition, and (4) the polarimetric radar vegetation index (RVI). Additionally, the impact of forest structure on the sensitivity of the polarimetric backscatter to AGB and AGB change was assessed. The biomass consisted of mainly coniferous trees at the hemi-boreal test site Remningstorp, located in southern Sweden. We found some improvements in the predictions when quad-polarized data (RMSE = 79.4 tons/ha) were used instead of solely cross-polarized data (RMSE = 84.9 tons/ha). However, when using Freeman–Durden decomposition, the prediction accuracy improved further (RMSE = 69.7 tons/ha), and the highest accuracy was obtained with the radar vegetation index (RMSE = 50.4 tons/ha). The corresponding R² values ranged from 0.45 to 0.82. The bias was less than 1 t/ha for all models. An analysis of forest variables showed that the sensitivity to AGB was reduced for high values of basal-area-weighted mean height, basal area, and stem density when predicting absolute AGB, but the best change prediction model was sensitive to changes larger than the apparent saturation point for AGB state estimates. We conclude that by using fully polarimetric SAR images, forest biomass changes can be estimated more accurately compared to using single- or dual-polarization images. The results were improved the most (in terms of RMSE and R²) by using the Freeman–Durden decomposition model or the RVI, which captured especially the large changes better.
(This article belongs to the Special Issue SAR for Forest Mapping III)
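
For orientation, one widely used definition of the radar vegetation index for quad-polarized backscatter is given below. The abstract does not state which formulation the authors used, so this should be read as an assumed, commonly cited form rather than as their exact model; the symbols \sigma^0_{HH}, \sigma^0_{VV}, and \sigma^0_{HV} denote the co- and cross-polarized backscatter coefficients in linear power units.

```latex
\mathrm{RVI} = \frac{8\,\sigma^0_{HV}}{\sigma^0_{HH} + \sigma^0_{VV} + 2\,\sigma^0_{HV}}
```

Under this definition the index approaches 0 for smooth bare surfaces, where cross-polarized return is small, and grows as volume scattering from vegetation increases.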

17 pages, 5884 KB  
Article
Quad-Rotor Unmanned Aerial Vehicle Path Planning Based on the Target Bias Extension and Dynamic Step Size RRT* Algorithm
by Haitao Gao, Xiaozhu Hou, Jiangpeng Xu and Banggui Guan
World Electr. Veh. J. 2024, 15(1), 29; https://doi.org/10.3390/wevj15010029 - 16 Jan 2024
Cited by 10 | Viewed by 2881
Abstract
For the path planning of quad-rotor UAVs, the traditional RRT* algorithm has weak exploration ability, low planning efficiency, and a poor planning effect. A TD-RRT* algorithm based on target bias expansion and dynamic step size is proposed herein. First, random-tree expansion is combined with the target bias strategy to remove the blindness of the random tree, and we assign different weights to the sampling point and the target point so that the target point can be quickly approached and the search speed can be improved. Then, the dynamic step size is introduced to speed up the search, effectively solving the problem of invalid expansion in the process of trajectory generation. We then adjust the step length required for the expansion tree and obstacles in real time, solve the opposition between smoothness and real time in path planning, and improve the algorithm’s search efficiency. Finally, the cubic B-spline interpolation method is used to modify the local inflection point of the path of the improved RRT* algorithm to smooth the path. The simulation results show that compared with the traditional RRT* algorithm, the number of iterations of path planning of the TD-RRT* algorithm is reduced, the travel distance from the starting position to the end position is shortened, the time consumption is reduced, the path route is smoother, and the path optimization effect is better. The TD-RRT* algorithm based on target bias expansion and dynamic step size significantly improves the planning efficiency and planning effect of quad-rotor UAVs in a three-dimensional-space environment.

25 pages, 5910 KB  
Article
High-Capacity Reversible Data Hiding in Encrypted Images Based on Pixel Prediction and QuadTree Decomposition
by Muhannad Alqahtani and Atef Masmoudi
Appl. Sci. 2023, 13(23), 12706; https://doi.org/10.3390/app132312706 - 27 Nov 2023
Cited by 2 | Viewed by 1817
Abstract
Over the past few years, a considerable number of researchers have shown great interest in reversible data hiding for encrypted images (RDHEI). One popular category among various RDHEI methods is the reserving room before encryption (RRBE) approach, which leverages data redundancy in the original image before encryption to create space for data hiding and to achieve high embedding rates (ERs). This paper introduces an RRBE-based RDHEI method that employs pixel prediction, quadtree decomposition, and bit plane reordering to provide high embedding capacity and error-free reversibility. Initially, the content owner predicts the error image using a prediction method, then maps it to a new error image with positive pixel values and generates a compressed binary label map for overhead pixels. Subsequently, quadtree decomposition is applied to each bit plane of the mapped prediction error image to identify homogeneous blocks, which are then reordered to create room for data embedding. After generating the encrypted image with the encryption key, the data hider employs the data hiding key to embed the data based on the auxiliary information added at the beginning of each embeddable bit plane. Finally, the receiver is able to retrieve the secret message without any error, decrypt the image, and restore it without any loss or distortion. The experimental results demonstrate that the proposed RDHEI method achieves significantly higher ERs than previous competitors, with an average ER exceeding 3.6 bpp on the BOSSbase and BOWS-2 datasets.
(This article belongs to the Topic Trends and Prospects in Security, Encryption and Encoding)
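
The step of applying quadtree decomposition to a bit plane to find homogeneous blocks can be illustrated with a short sketch. It assumes a square image whose side is a power of two, stored as a list of lists of non-negative integers; the function names `bit_plane` and `quadtree_blocks` and the `min_size` parameter are hypothetical, and the prediction, reordering, and encryption stages of the paper are not modeled here.

```python
def bit_plane(image, k):
    """Extract bit k (0 = least significant) of every pixel as a 0/1 matrix."""
    return [[(pixel >> k) & 1 for pixel in row] for row in image]


def quadtree_blocks(plane, r=0, c=0, size=None, min_size=1):
    """Split a square binary plane into homogeneous blocks.

    Returns a list of (row, col, size, value) tuples. `value` is the bit shared
    by the whole block, or None for a mixed block that reached `min_size`.
    """
    if size is None:
        size = len(plane)
    first = plane[r][c]
    uniform = all(
        plane[i][j] == first
        for i in range(r, r + size)
        for j in range(c, c + size)
    )
    if uniform or size <= min_size:
        return [(r, c, size, first if uniform else None)]
    half = size // 2
    blocks = []
    # Recurse into the four quadrants in row-major order.
    for dr in (0, half):
        for dc in (0, half):
            blocks += quadtree_blocks(plane, r + dr, c + dc, half, min_size)
    return blocks
```

Large uniform blocks can be described by a single value plus their position in the tree, which is why homogeneous regions are natural candidates for freeing up space.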

18 pages, 4394 KB  
Article
Efficient CU Decision Algorithm for VVC 3D Video Depth Map Using GLCM and Extra Trees
by Fengqin Wang, Zhiying Wang and Qiuwen Zhang
Electronics 2023, 12(18), 3914; https://doi.org/10.3390/electronics12183914 - 17 Sep 2023
Cited by 2 | Viewed by 1804
Abstract
The new generation of 3D video is an international frontier research hotspot. However, the large amount of data and high complexity are core problems to be solved urgently in 3D video coding. The latest video coding standard, versatile video coding (VVC), adopts the quad-tree with nested multi-type tree (QTMT) partition structure, and its coding efficiency is much higher than that of other coding standards. However, little of the current research on VVC addresses 3D video. In light of this context, we propose a fast coding unit (CU) decision algorithm based on the gray level co-occurrence matrix (GLCM) and Extra trees for the characteristics of the depth map in 3D video. In the first stage, we introduce an edge detection algorithm using GLCM to classify the CU in the depth map into smooth and complex edge blocks based on the extracted features. Subsequently, the extracted features from the CUs, classified as complex edge blocks in the first stage, are fed into the constructed Extra trees model to make a fast decision on the partition type of that CU and avoid calculating unnecessary rate-distortion cost. Experimental results show that the overall algorithm can effectively reduce the coding time by 36.27–51.98%, while the Bjøntegaard delta bit rate (BDBR) is only increased by 0.24% on average, which is negligible, all reflecting the superior performance of our method. Moreover, our algorithm can effectively ensure video quality while saving much encoding time compared with other algorithms.

20 pages, 6779 KB  
Article
Fast CU Partition Algorithm for Intra Frame Coding Based on Joint Texture Classification and CNN
by Ting Wang, Geng Wei, Huayu Li, ThiOanh Bui, Qian Zeng and Ruliang Wang
Sensors 2023, 23(18), 7923; https://doi.org/10.3390/s23187923 - 15 Sep 2023
Cited by 3 | Viewed by 1899
Abstract
High-efficiency video coding (HEVC/H.265) is one of the most widely used video coding standards. HEVC introduces a quad-tree coding unit (CU) partition structure to improve video compression efficiency. The determination of the optimal CU partition is achieved through the brute-force search rate-distortion optimization method, which may result in high encoding complexity and hardware implementation challenges. To address this problem, this paper proposes a method that combines convolutional neural networks (CNN) with joint texture recognition to reduce encoding complexity. First, a classification decision method based on the global and local texture features of the CU is proposed, efficiently dividing the CU into smooth and complex texture regions. Second, for the CUs in smooth texture regions, the partition is determined by terminating early. For the CUs in complex texture regions, a proposed CNN is used for predictive partitioning, thus avoiding the traditional recursive approach. Finally, combined with texture classification, the proposed CNN achieves a good balance between the coding complexity and the coding performance. The experimental results demonstrate that the proposed algorithm reduces computational complexity by 61.23%, while only increasing BD-BR by 1.86% and decreasing BD-PSNR by just 0.09 dB.
(This article belongs to the Section Sensor Networks)

22 pages, 4694 KB  
Article
Reducing Video Coding Complexity Based on CNN-CBAM in HEVC
by Huayu Li, Geng Wei, Ting Wang, ThiOanh Bui, Qian Zeng and Ruliang Wang
Appl. Sci. 2023, 13(18), 10135; https://doi.org/10.3390/app131810135 - 8 Sep 2023
Cited by 4 | Viewed by 2035
Abstract
High-efficiency video coding (HEVC) outperforms H.264 in coding efficiency. However, the rate–distortion optimization (RDO) process in coding tree unit (CTU) partitioning requires an exhaustive exploration of all possible quad-tree partitions, resulting in high encoding complexity. To simplify this process, this paper proposes a convolutional neural network (CNN)-based optimization algorithm combined with a hybrid attention mechanism module. Firstly, we designed a CNN compatible with the current coding unit (CU) size to accurately predict the CU partitions. In addition, we also designed a convolution block to enhance the information interaction between CU blocks. Then, we introduced the convolutional block attention module (CBAM) into the CNN, called CNN-CBAM. This module concentrates on important regions in the image and attends to the target object correctly. Finally, we integrated the CNN-CBAM into the HEVC coding framework for CU partition prediction in advance. The proposed network was trained, validated, and tested using a large-scale dataset covering various scenes and objects, which provides extensive samples for intra-frame CU partition prediction in HEVC. The experimental findings demonstrate that our scheme can reduce the coding time by up to 64.05% on average compared to a traditional HM16.5 encoder, with only 0.09 dB degradation in BD-PSNR and a 1.94% increase in BD-BR.
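
To give a sense of why exhaustive quad-tree search is costly, the number of distinct partition patterns of one CTU can be counted recursively: a block is either left whole or split into four sub-blocks that are each partitioned independently. The sketch below assumes a 64 × 64 CTU and an 8 × 8 minimum CU, a common HEVC configuration that the abstract itself does not state; the function name `count_partitions` is hypothetical.

```python
def count_partitions(size, min_size=8):
    """Count quad-tree partition patterns of a size x size block.

    A block at the minimum size cannot split (1 pattern). A larger block is
    either kept whole (1 pattern) or split into four independently
    partitioned quadrants (patterns(size / 2) ** 4).
    """
    if size <= min_size:
        return 1
    return 1 + count_partitions(size // 2, min_size) ** 4


if __name__ == "__main__":
    for block in (8, 16, 32, 64):
        print(block, count_partitions(block))
    # prints: 8 1, 16 2, 32 17, 64 83522 (one pair per line)
```

Practical encoders evaluate the tree recursively rather than enumerating every pattern, but the count shows how quickly the decision space grows with CTU size, which is the cost that learned partition predictors try to avoid.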

18 pages, 20818 KB  
Article
A Visual Odometry Pipeline for Real-Time UAS Geopositioning
by Jianli Wei and Alper Yilmaz
Drones 2023, 7(9), 569; https://doi.org/10.3390/drones7090569 - 5 Sep 2023
Cited by 3 | Viewed by 3553
Abstract
The state of the art in geopositioning is the Global Navigation Satellite System (GNSS), which relies on a satellite constellation to provide positioning, navigation, and timing services. While the Global Positioning System (GPS) is widely used to position an Unmanned Aerial System (UAS), it is not always available and can be jammed, introducing operational liabilities. When the GPS signal is degraded or denied, the UAS navigation solution cannot rely on the incorrect positions that GPS provides, resulting in potential loss of control. This paper presents a real-time pipeline for geopositioning functionality using a down-facing monocular camera. The proposed approach is deployable using only a few initialization parameters, the most important of which is the map of the area covered by the UAS flight plan. Our pipeline consists of an offline geospatial quad-tree generation step for fast information retrieval, a choice from a selection of landmark detection and matching schemes, and an attitude control mechanism that improves the matching of reference images to acquired images. To evaluate our method, we collected several image sequences using various flight patterns with seasonal changes. The experiments demonstrate high accuracy and robustness to seasonal changes.
(This article belongs to the Special Issue Advances in AI for Intelligent Autonomous Systems)
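As a rough illustration of how a geospatial quad-tree supports fast retrieval, the sketch below stores points in a region quad-tree and answers rectangular range queries. It is a generic textbook structure, not the pipeline from the article: the class name `QuadTree`, the `capacity` and `max_depth` parameters, and the string payloads are all assumptions.

```python
class QuadTree:
    """Point quad-tree over the half-open box [x0, x1) x [y0, y1)."""

    def __init__(self, x0, y0, x1, y1, capacity=4, depth=0, max_depth=8):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.points = []      # (x, y, payload) tuples held by a leaf
        self.children = None  # four sub-trees once the node has split

    def insert(self, x, y, payload):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x < x1 and y0 <= y < y1):
            return False  # point lies outside this node
        if self.children is None:
            # Leaves at max depth accept any number of points to bound recursion.
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y, payload))
                return True
            self._split()
        return any(child.insert(x, y, payload) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(x0, y0, mx, my, *args), QuadTree(mx, y0, x1, my, *args),
            QuadTree(x0, my, mx, y1, *args), QuadTree(mx, my, x1, y1, *args),
        ]
        # Push the points held by this leaf down into the new children.
        for px, py, payload in self.points:
            any(child.insert(px, py, payload) for child in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        """Return all stored points inside the half-open query box."""
        x0, y0, x1, y1 = self.bounds
        if qx1 <= x0 or qx0 >= x1 or qy1 <= y0 or qy0 >= y1:
            return []  # no overlap: the whole subtree is skipped
        if self.children is None:
            return [p for p in self.points
                    if qx0 <= p[0] < qx1 and qy0 <= p[1] < qy1]
        found = []
        for child in self.children:
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found


tree = QuadTree(0, 0, 100, 100, capacity=2)
for x, y, name in [(10, 10, "a"), (20, 20, "b"), (80, 80, "c"), (15, 85, "d")]:
    tree.insert(x, y, name)
print(sorted(p[2] for p in tree.query(0, 0, 50, 50)))
```

The speed-up comes from the overlap test in `query`, which discards whole subtrees that cannot contain a match instead of scanning every stored point.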

16 pages, 5003 KB  
Article
A Method to Reduce the Intra-Frame Prediction Complexity of HEVC Based on D-CNN
by Ting Wang, Geng Wei, Huayu Li, ThiOanh Bui, Qian Zeng and Ruliang Wang
Electronics 2023, 12(9), 2091; https://doi.org/10.3390/electronics12092091 - 4 May 2023
Cited by 7 | Viewed by 2213
Abstract
Among the video coding standards jointly developed by ITU-T VCEG and ISO/IEC MPEG, high-efficiency video coding (HEVC) is one of the most widely used today. Therefore, it is still necessary to further reduce the coding complexity of HEVC. The HEVC standard adopts a flexible partitioning procedure called “quad-tree partition”, which significantly improves coding efficiency but leads to high coding complexity. To reduce the coding complexity of intra-frame prediction, this paper proposes a scheme based on a densely connected convolutional neural network (D-CNN) to predict the partition of coding units (CUs). Firstly, a densely connected block was designed to improve the efficiency of the CU partition by fully extracting the pixel features of the CTU. Then, efficient channel attention (ECA) and adaptive convolution kernel size were applied to fast CU partitioning for the first time to capture the information of the D-CNN convolution channels. Finally, a threshold optimization strategy was formulated to select the best threshold for each depth to further balance the computational complexity of video coding and the rate–distortion (RD) performance. The experimental results show that the proposed method reduces the encoding time of HEVC by 60.14%, with a negligible reduction in RD performance, which is better than the existing fast partitioning methods.