Article

Research on Density-Adaptive Feature Enhancement and Lightweight Spectral Fine-Tuning Algorithm for 3D Point Cloud Analysis

by
Wenquan Huang
1,2,*,
Teng Li
1,
Qing Cheng
1,
Ping Qi
3 and
Jing Zhu
4
1
School of Artificial Intelligence, Anhui University, Hefei 230601, China
2
School of Intelligent Manufacturing, Anhui Wenda University of Information Engineering, Hefei 231201, China
3
School of Artificial Intelligence, Tongling University, Tongling 244061, China
4
College of Art and Design, Nanning University, Nanning 530200, China
*
Author to whom correspondence should be addressed.
Information 2026, 17(2), 184; https://doi.org/10.3390/info17020184
Submission received: 5 January 2026 / Revised: 30 January 2026 / Accepted: 7 February 2026 / Published: 11 February 2026
(This article belongs to the Section Artificial Intelligence)

Abstract

To address fragile feature representation in sparse regions and detail loss in occluded scenes caused by uneven sampling density in 3D point cloud semantic segmentation on the SemanticKITTI dataset, this article proposes an innovative framework that integrates density-adaptive feature enhancement with lightweight spectral fine-tuning, in which frequency-domain transformations (e.g., the Fast Fourier Transform) are applied to point cloud features to optimize computational efficiency and enhance robustness in sparse regions. The method begins by accurately calculating each point's local neighborhood density using a KD tree radius search and injecting it as an additional feature channel, enabling the network to adapt to density variations. A density-aware loss function is then employed, dynamically adjusting the classification loss weights (by approximately 40% in low-density areas) to strongly penalize misclassifications and enhance feature robustness at sparse points. Additionally, a multi-view projection fusion mechanism is introduced that projects point clouds onto multiple 2D views, capturing detailed information via mature 2D models. This information is then fused with the original 3D features through backprojection, complementing geometric relationships with texture details to effectively alleviate occlusion artifacts. Experiments on the SemanticKITTI semantic segmentation benchmark show significant performance improvements over the baseline, achieving a Precision of 0.91, Recall of 0.89, and F1-Score of 0.90. In low-density regions, the F1-Score improved from 0.73 to 0.80. Ablation studies highlight the contributions of density feature injection, multi-view fusion, and the density-aware loss, which enhance the F1-Score by 3.8%, 2.5%, and 5.0%, respectively.
This framework offers an effective approach for accurate and robust point cloud analysis through optimized density techniques and spectral domain fine-tuning.

1. Introduction

With the rapid development of 3D data acquisition technologies such as LiDAR and multi-view stereo vision [1], point clouds, as an important carrier of 3D spatial information, have been widely used in many fields such as autonomous driving [2], smart cities [3,4], industrial inspection [5], digital twins [6], and agricultural monitoring [7]. However, due to factors such as physical limitations of sensors, object occlusion, and changes in scanning distance, the actual collected point cloud data often exhibits a high degree of density non-uniformity [6,7,8,9]: the point cloud in the near object area is dense and rich in details, while the point cloud in the far object or occlusion area is sparse and lacks information. This density difference directly leads to a sharp decline in the feature representation ability of traditional point cloud processing methods in sparse areas, causing a series of problems such as detail loss, feature blurring, and even target misidentification [10,11], seriously restricting the reliability and robustness of 3D vision systems in complex real-world scenarios.
Currently, significant progress has been made in research on point cloud analysis, and mainstream methods can be roughly divided into voxel-based, point-based, and projection-based categories. Point-based methods, such as the PointNet++ series, aggregate local features through hierarchical sampling and grouping operations [12]. However, their feature extraction performance heavily depends on the density and distribution quality of local neighborhoods [13], and in sparse areas, they are easily affected by the insufficient number of neighboring points, making it difficult to construct effective geometric contexts. The voxel-based method regularizes irregular point clouds, which is convenient for applying 3D convolution, but inevitably introduces quantization errors, resulting in loss of details, and faces huge challenges in computing and memory consumption when processing large-scale scenes [14]. To balance efficiency and detail, projection-based methods project 3D point clouds onto 2D image planes (such as range views, bird’s-eye views, etc.), and then use mature 2D convolutional neural networks for feature extraction [15]. Although this type of method improves computational efficiency, it introduces information loss during the reverse projection process and is sensitive to occlusion and viewpoint changes. Its performance largely depends on the integrity and quality of the projected view [16].
Although the above methods have promoted the development of the field at different levels, there are still significant shortcomings in addressing the fundamental challenge of uneven density. Firstly, most existing network structures are “density blind”, meaning that their original design did not explicitly consider the spatial variation of point cloud density, and they lack perception and adaptive mechanisms for local density information [17]. This makes it difficult for the model to distinguish between reasonable sparsity caused by distance and information loss caused by occlusion, and it cannot provide sufficient attention and compensation to information-deficient areas during feature learning. Secondly, at the feature representation level, a single data representation form (pure 3D points, voxels, or 2D projections) can hardly capture all beneficial information in complex scenes comprehensively [18]. For example, 3D point clouds can accurately express geometric structures but fall short on texture details, whereas 2D images are rich in texture and edge information but lose some three-dimensional spatial relationships. How to effectively integrate multi-source complementary information, especially using 2D visual priors to enhance the representation of 3D sparse regions, is still a key issue that has not been fully explored [19]. Furthermore, from the perspective of loss function design, the standard cross-entropy loss treats all sample points equally and fails to weight them according to the density (i.e., reliability) of their respective regions, resulting in insufficient attention to low-density, high-uncertainty regions during model optimization and further exacerbating the risk of misclassification in sparse regions [20].
In recent years, some studies have begun to attempt to address related challenges. For example, some studies have explored simulating different densities through data augmentation [21] or implicitly learning density changes using attention mechanisms [22], but these methods often fail to embed density as a clear and quantifiable feature signal into the network. In terms of multimodal fusion, the combination of BIM models and point clouds provides a new approach for building reconstruction [3,4], while knowledge-enhanced domain-adaptive learning attempts to improve the model’s generalization ability under different data distributions [23]. In specific tasks such as point cloud registration [24], compression [25], shape completion [26], SLAM [27], and object detection [28], researchers are increasingly focusing on the robustness of local features [29]. In addition, the importance of high-quality point cloud processing has been highlighted in research applications such as rapid volume calculation from point clouds [21,22,23], modeling [24,25], animation evaluation [26], and even computing holograms [27,28], as well as in computer graphics applications. A systematic review [9] pointed out the many challenges faced by 3D point cloud deep learning, including noise, density variations, computational efficiency, and so on. However, a unified framework that can systematically and explicitly model density changes and synergistically utilize multidimensional information for feature enhancement and efficient optimization is still lacking in current research.
In summary, existing 3D point cloud analysis methods have significant limitations in density perception, multi-source information fusion, and targeted optimization of loss functions when dealing with scenes with uneven density. To overcome these shortcomings, this paper proposes an innovative framework that integrates density-adaptive feature enhancement and lightweight spectral fine-tuning, aiming to comprehensively improve the feature representation robustness and classification accuracy of the model in challenging scenarios such as sparsity and occlusion by explicitly injecting density information, fusing multi-view complementary features, and introducing density-aware optimization objectives.

2. Design of Density-Adaptive Feature Enhancement Algorithm

2.1. Overall Algorithm Framework

The overall framework of the density-adaptive feature enhancement and lightweight spectral fine-tuning algorithm proposed in this article is an end-to-end deep learning model. Its core design idea is to explicitly model and utilize the local density information of point clouds through multi-stage, multimodal collaborative processing to enhance the robustness of feature representation, especially in sparse and occluded areas. Figure 1 illustrates the overall workflow, in which a spectral fine-tuning module follows the multi-view fusion step.
Based on the above framework, the algorithm constructed in this article mainly consists of three core modules in sequence: a point cloud density feature extraction module, multi-view projection fusion module, and density-aware loss function optimization module. The entire processing flow begins with the input of the original 3D point cloud. Firstly, the point cloud density feature extraction module utilizes the KD tree radius search algorithm to analyze the input point cloud point by point, accurately calculate its local neighborhood density, and concatenate this density value as a new feature channel with the original coordinates (x, y, z) of the points to form density-enhanced point cloud features. This move aims to inject key density prior information into the network, enabling subsequent feature learning processes to have “density-aware” capabilities. Subsequently, the density-enhanced point cloud features are simultaneously fed into two parallel branches: the backbone 3D feature learning network (such as PointNet++) and the multi-view projection fusion module. The multi-view projection fusion module projects a 3D point cloud onto multiple predefined 2D views (such as top view, front view, and side view), uses pretrained 2D convolutional neural networks (such as ResNet) to extract 2D features rich in texture and edge details, and then maps these 2D features back to the 3D point space through precise backprojection operations. Ultimately, the intrinsic features from the 3D backbone network are effectively fused with complementary features from the 2D view (such as through feature addition or concatenation) to generate a more robust joint feature representation for downstream tasks such as classification or segmentation. 
In the model optimization stage, the density-aware loss function dynamically adjusts its weight in the overall classification loss based on the local density of each point, imposing greater penalties for misclassification of low-density area points, thereby guiding the model to focus on learning robust features from sparse data. The entire framework achieves adaptive and enhanced processing of density-uneven point clouds through closed-loop collaboration of density feature injection, multi-view information complementarity, and targeted loss optimization.
The lightweight spectral fine-tuning component applies a frequency-domain transformation, such as Fast Fourier Transform (FFT), to the point cloud features after multi-view fusion. This operation is implemented in a dedicated layer with parameters including a scaling factor of 0.1 for spectral adjustment, and it adds minimal computational cost (approximately 5% overhead) due to its lightweight design. The fine-tuning enhances feature robustness by filtering high-frequency noise.
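As a minimal sketch of the spectral fine-tuning step described above (the channel-wise FFT, the 0.5 frequency cutoff, and the damping of high-frequency bins by the 0.1 scaling factor are assumptions about implementation details not fully specified in the text):

```python
import numpy as np

def spectral_finetune(features, scale=0.1, cutoff=0.5):
    """Lightweight spectral fine-tuning sketch: take the FFT of each point's
    feature vector, damp the high-frequency bins (treated as noise) by
    `scale`, and transform back.

    features: (N, C) array of fused per-point features.
    """
    spec = np.fft.rfft(features, axis=1)       # channel-wise spectrum, (N, C//2+1)
    n_bins = spec.shape[1]
    hi = int(np.ceil(cutoff * n_bins))         # bins above `cutoff` count as high frequency
    spec[:, hi:] *= scale                      # damp high-frequency noise
    return np.fft.irfft(spec, n=features.shape[1], axis=1)
```

Because only a fraction of the spectrum is modified and the transform is linear, the extra cost stays small relative to the backbone network, consistent with the roughly 5% overhead reported.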

2.2. Point Cloud Density Feature Extraction Module

In the process of point cloud density feature extraction, this article transforms the density information of local neighborhoods from an implicit prior knowledge that is difficult for the network to directly utilize into an explicit and quantifiable feature signal, thereby guiding the network to perform adaptive learning. Unlike existing methods [11,17] that implicitly address density changes through data augmentation or complex network structures, this module injects density as a key feature into the network through a direct and physically meaningful mechanism. The input of this module is the original point cloud as follows:
$$ P = \{\, p_i \mid p_i \in \mathbb{R}^3,\ i = 1, 2, \ldots, N \,\} $$
The output is a density enhanced point cloud, as follows:
$$ P_{\text{density}} = \{\, (p_i, d_i) \mid p_i \in \mathbb{R}^3,\ d_i \in \mathbb{R},\ i = 1, 2, \ldots, N \,\} $$
In the above equation, d i denotes the density feature of point p i .
The above module first utilizes the KD tree data structure to efficiently index the point cloud spatially. Subsequently, for each query point p i , the module performs a radius search to find all neighboring points within a fixed spherical neighborhood around it. This design achieves point cloud computing by quantifying local density but does not simply use the number of points in the neighborhood as the density value, as this value is greatly affected by the global point cloud density and lacks absolute physical meaning. To solve these problems, this paper further proposes a density quantization function based on average inverse distance:
$$ d_i = \frac{1}{K} \sum_{p_j \in \mathcal{N}(p_i, r)} \frac{1}{\lVert p_j - p_i \rVert_2 + \epsilon} $$
In the above formula, $\epsilon$ is a very small positive number (e.g., $10^{-8}$) used to prevent the denominator from being zero, and $K$ is the number of neighboring points found within the radius $r$. This formulation not only takes into account the number of points in the neighborhood (through the summation term) but, more importantly, incorporates the tightness of their spatial distribution: when neighboring points lie close to the query point, their inverse-distance values are larger and contribute more to the final density, which more accurately reflects the “crowding” degree of the local point set. Meanwhile, by taking the average (dividing by $K$), the density value becomes insensitive to changes in the neighborhood size $K$ to a certain extent, giving it a consistent representation ability in both sparse and dense regions and avoiding feature instability caused by sharp fluctuations in the neighbor count.
After calculating the density features of each point, this article further uses feature concatenation to fuse them as additional feature channels with the original coordinates of the points (and other features such as color, normal vector, etc.), providing a foundation for achieving “density adaptation” and ensuring that the module can directly present local density information to subsequent feature learning layers (such as the multi-layer perceptron MLP in PointNet++), so that the network can clearly “know” the reliability of the current region’s information when extracting local geometric features.
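The density extraction steps above can be sketched as follows (a brute-force neighbor search is used here for self-containment; the paper's KD tree radius search, e.g., `scipy.spatial.cKDTree.query_ball_point`, would replace the O(N²) loop at scale, and excluding the query point itself from its own neighborhood is an assumption):

```python
import numpy as np

def density_features(points, radius=0.1, eps=1e-8):
    """Average-inverse-distance density per point, concatenated as a fourth
    feature channel (x, y, z, d).

    points: (N, 3) array -> returns (N, 4) density-augmented points."""
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)               # (N, N) pairwise distances
    density = np.zeros(n)
    for i in range(n):
        mask = (dists[i] <= radius) & (np.arange(n) != i)  # radius search, excluding self
        if mask.any():
            density[i] = np.mean(1.0 / (dists[i, mask] + eps))  # average inverse distance
    return np.concatenate([points, density[:, None]], axis=1)
```

A tightly clustered point thus receives a much larger density value than an isolated one, which is exactly the signal the subsequent layers consume.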

2.3. Multi-View Projection Fusion Module

The multi-view projection fusion module is the core innovative design of this framework to achieve complementary multi-source information and alleviate the vulnerability of sparse features in 3D point clouds. Unlike existing projection-based methods [15] that typically use a single view (such as a range view or bird’s-eye view), the core of this module is to propose a symmetric multi-view projection and feature consistency fusion mechanism, aimed at capturing complementary 2D texture and structural information from multiple optimal views and achieving high-fidelity fusion with 3D point cloud features through a differentiable backprojection operation, effectively combating occlusion and enhancing detail representation of sparse areas.
The input of the module is the density-enhanced point cloud $P_{\text{density}}$. Firstly, this module adopts an adaptive viewpoint selection strategy based on principal component analysis (PCA) of the point cloud instead of fixed predefined viewpoints. By computing the eigenvectors of the covariance matrix of the point cloud, the planes spanned by the two eigenvectors with the largest eigenvalues are used as the main projection planes (e.g., approximated as top and side views), ensuring that the projection views cover the maximum variance of the point cloud and thereby maximize information capture. The key hyperparameters are the neighborhood radius of 0.1 m and the number of viewpoints $K = 3$, both determined via grid search, together with the visibility-score scaling factor $\gamma = 1.0$. For each selected viewpoint $V_k$ $(k = 1, 2, \ldots, K)$, each 3D point $p_i = (x_i, y_i, z_i, d_i)$ is transformed into 2D image coordinates $(u_{i,k}, v_{i,k})$ through parallel projection:
$$ \begin{bmatrix} u_{i,k} \\ v_{i,k} \end{bmatrix} = \frac{R_k\, p_i^{(3D)} + t_k}{\text{scale}} $$
Among them, $R_k$ and $t_k$, respectively, represent the rotation matrix and translation vector corresponding to the viewpoint $V_k$, $p_i^{(3D)} = (x_i, y_i, z_i)$ are the three-dimensional coordinates of the point, and scale is the factor that maps projected coordinates to the resolution of the projected image $I_k$. The density feature $d_i$ is rasterized into $I_k$ as an additional channel, which enables subsequent 2D CNNs to also perceive the density information of the three-dimensional points behind each pixel when extracting features.
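A small illustration of the parallel projection equation above (a sketch: the rounding and clipping to the 256 × 256 image are assumptions, and `R_top` is a hypothetical viewpoint matrix for a top view):

```python
import numpy as np

def project_points(points_3d, R, t, scale=0.05, res=256):
    """Parallel projection of 3D points into 2D pixel coordinates.

    points_3d: (N, 3) coordinates; R: (2, 3) viewpoint rotation rows;
    t: (2,) translation; scale maps metric units to pixels."""
    uv = (points_3d @ R.T + t) / scale        # continuous image coordinates
    uv = np.round(uv).astype(int)             # snap to the nearest pixel
    return np.clip(uv, 0, res - 1)            # keep pixels inside the view

# Hypothetical top view: keep x and y, drop z
R_top = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
```

In the full pipeline the density channel $d_i$ would be splatted into the image at these pixel locations before the 2D CNN runs.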
After completing the above processing, each density-aware feature map I k is fed into a shared-weight feature extraction backbone of a pretrained 2D CNN (such as ResNet-18) to extract multi-scale 2D features F k l , where l denotes the feature layer level. The key innovation of this article lies in the backprojection and feature fusion steps. To avoid the information loss and edge blurring caused by simple interpolation, we design a differentiable backprojection mechanism based on bidirectional nearest-neighbor search. For each three-dimensional point p i , the deep two-dimensional features F k L p i ( L is the selected feature layer) corresponding to its projection point u i , k , v i , k are extracted at each viewpoint. However, direct backprojection results in each 3D point obtaining K potentially different 2D features. To integrate these multi-perspective features and maintain consistency, we propose adaptive visibility-weighted aggregation of the per-view features:
$$ F_{2D \to 3D}(p_i) = \sum_{k=1}^{K} \alpha_{i,k}\, F_k^L(p_i), \qquad \alpha_{i,k} = \frac{\exp\!\left(\gamma \cdot \mathrm{VisScore}(p_i, V_k)\right)}{\sum_{j=1}^{K} \exp\!\left(\gamma \cdot \mathrm{VisScore}(p_i, V_j)\right)} $$
Among them, α i , k is the adaptively computed weight and γ is the scaling factor. VisScore p i , V k is the visibility score of point p i under viewpoint V k , determined by the angle between the point and the viewing direction, as well as whether the point is occluded in the projected image. The strength of this formulation lies in allowing the network to adaptively select the most reliable and informative viewpoint features for weighted fusion at each point, rather than treating all viewpoints equally. For example, for a partially occluded point, the weight of the occluded viewpoint automatically decreases while the weight of the unobstructed viewpoint increases, significantly improving the robustness of the fused features.
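The adaptive aggregation above is a softmax over visibility scores; a minimal sketch (the array layout of the per-view features is an assumption):

```python
import numpy as np

def fuse_view_features(view_feats, vis_scores, gamma=1.0):
    """Visibility-weighted multi-view feature aggregation.

    view_feats: (K, N, C) backprojected 2D features, one slice per view.
    vis_scores: (K, N) visibility score of each point in each view.
    Returns (N, C) fused features."""
    logits = gamma * vis_scores
    logits -= logits.max(axis=0, keepdims=True)     # numerical stability
    alpha = np.exp(logits)
    alpha /= alpha.sum(axis=0, keepdims=True)       # softmax over the K views
    return (alpha[..., None] * view_feats).sum(axis=0)
```

When one view dominates the visibility score (e.g., the only unoccluded view of a point), its weight approaches 1 and the fused feature reduces to that view's feature, as described in the text.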
Finally, the fused two-dimensional features $F_{2D \to 3D}$ are combined with the features $F_{3D}$ extracted by the backbone 3D network through channel concatenation followed by convolution, generating the final enhanced joint feature representation: $F_{\text{final}} = \mathrm{Conv}\left(\mathrm{Concat}(F_{3D}, F_{2D \to 3D})\right)$. This design effectively complements the three-dimensional geometric features with two-dimensional texture details at the point level, in particular injecting rich contextual information from other perspectives into sparse or occluded areas of the original point cloud, fundamentally improving its feature representation ability.

2.4. Design of Density Sensing Loss Function

The density-aware loss function module is a key innovative link in this framework to achieve targeted optimization and improve the robustness of sparse region feature learning. Unlike the traditional cross-entropy loss function that treats all sample points equally [11], this paper proposes a dynamic weight adjustment mechanism based on local density priors, which allows the model to explicitly focus on the learning difficulties of low-density regions during training, thereby forcing the network to extract more robust feature representations from sparse points.
This loss function is based on the standard cross-entropy loss but introduces a density-adaptive dynamic weight term. There are N points in the point cloud, each with a true label y i . The probability distribution predicted by the model is y ˆ i and its local density feature is d i (calculated in Section 2.2). The basic cross-entropy loss is
$$ L_{CE} = -\frac{1}{N} \sum_{i=1}^{N} y_i \log \hat{y}_i $$
Meanwhile, by introducing a density-aware weight term $w_i$, a new loss function is constructed:
$$ L_{\text{Density-Aware}} = -\frac{1}{N} \sum_{i=1}^{N} w_i\, y_i \log \hat{y}_i $$
The design of weight terms w i is the core innovation of this module. We propose a nonlinear mapping function based on the reciprocal of density:
$$ w_i = 1 + \beta \left( 1 - \sigma(d_i) + \epsilon \right) $$
Among them, $\beta$ is the hyperparameter that controls the penalty intensity (set to 0.4 in the experiments, corresponding to the roughly 40% weight increase mentioned in the abstract), $\epsilon$ is a small stabilizing constant, and $\sigma(\cdot)$ is the sigmoid function used to normalize the density value so that $\sigma(d_i) \in (0, 1)$:
$$ \sigma(d_i) = \frac{1}{1 + e^{-(d_i - \mu_d)/\sigma_d}} $$
Among them, μ d and σ d are the mean and standard deviation of the density values of all points in the training set, respectively. The depth of this design lies in its mathematical properties: firstly, sigmoid normalization ensures the scale consistency of density values across different point cloud datasets, avoiding training instability caused by differences in absolute density values. Secondly, the weight w i decreases as the density value d i increases, which means that points in low-density areas (smaller d i ) receive greater loss weights.
From a probabilistic perspective, points in low-density areas have higher uncertainty in their feature representation and a greater risk of misclassification due to the lack of neighborhood information. This loss function provides a clear indication of the learning direction during optimization by increasing the gradient backpropagation strength of these points: the model must strengthen the learning of discriminative features in sparse regions. When a point p i lies in a low-density area ( d i → 0), the weight approaches 1 + β, strengthening the penalty for classification errors at that point. Conversely, in high-density areas (larger d i ), the weight approaches 1 and the standard cross-entropy loss is recovered.
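Putting the pieces of this section together, a minimal sketch of the density-aware loss (standardizing the raw density with training-set statistics before the sigmoid, as described above; the per-batch estimation of those statistics here is a simplification):

```python
import numpy as np

def density_aware_ce(probs, labels, density, beta=0.4):
    """Density-aware cross-entropy: w_i = 1 + beta * (1 - sigmoid(z_i)),
    where z_i is the standardized density of point i.

    probs: (N, C) predicted class probabilities; labels: (N,) int class ids;
    density: (N,) raw density features from the extraction module."""
    z = (density - density.mean()) / (density.std() + 1e-8)
    sig = 1.0 / (1.0 + np.exp(-z))             # normalize density into (0, 1)
    w = 1.0 + beta * (1.0 - sig)               # up-weight sparse (low-density) points
    ce = -np.log(probs[np.arange(len(labels)), labels] + 1e-12)
    return np.mean(w * ce)
```

A misclassified point in a sparse region thus contributes more to the loss than the same error in a dense region, which is the intended optimization pressure.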
This design is consistent with human visual cognitive mechanisms—it will focus more on identifying and judging areas with incomplete information. In this design, the loss function forms a closed-loop synergy with the density feature injection and multi-view fusion mechanisms mentioned earlier: density features help the network “perceive” the reliability of the region, multi-view fusion “supplements” cross-modal information to sparse points, and density perception loss “forces” the network to effectively utilize this information from the optimization objective level. The combined effect of the three will significantly improve the performance of the model in challenging scenarios.

3. Experimental Simulation and Performance Evaluation

3.1. Experimental Environment and Dataset Configuration

To comprehensively evaluate the effectiveness of the density-adaptive feature enhancement algorithm proposed in this paper, experiments were conducted on a high-performance computing platform equipped with dual NVIDIA RTX 4090 GPUs (24 GB of video memory each), Intel Xeon Gold 6348 processors, and 256 GB of memory. The software environment was based on the Ubuntu 20.04 operating system, PyTorch 2.0 deep learning framework, and CUDA 12.0 acceleration library.
In terms of datasets, this article uses the publicly available SemanticKITTI point cloud semantic segmentation dataset as the main validation benchmark. The dataset is sourced from real LiDAR scans of large-scale autonomous driving scenes and includes sequences 00–10, totaling over 22,000 frames of point clouds. Sequences 00–07 are used for training (19,130 frames in total) and sequences 08–10 for testing (4071 frames in total), strictly following the official split to ensure comparability of results. The core feature of this dataset is the significant spatial non-uniformity of its point cloud density: nearby objects (such as vehicles and pedestrians) have dense point clouds (more than 10^4 points per square meter), while distant backgrounds (such as building tops and distant vegetation) have sparse point clouds (as low as fewer than 100 points per square meter), and there are a large number of missing areas due to occlusion, which fits the motivation of this study on the challenge of uneven density.
In the data preprocessing stage, voxel downsampling (grid size 0.05 m) is performed on the original point cloud to control the data size while preserving key geometric structures. For the point cloud density feature extraction module, the neighborhood radius of the KD tree radius search is set to 0.1 m after grid search optimization, balancing the capture of local geometric details and computational efficiency. The smoothing factor in the density quantization function is set to ε = 10^-8. In the multi-view projection fusion module, adaptive viewpoint selection based on PCA generates K = 3 orthogonal views (corresponding to the three main directions of the point cloud), and the projection image resolution is uniformly set to 256 × 256 pixels. The 2D feature extraction network uses ResNet-18 pretrained on ImageNet, and its fourth-layer feature map (dimension 512) is taken for backprojection fusion. The model is trained using the Adam optimizer with an initial learning rate of 0.001, weight decay of 0.01, a batch size of 8, and 100 training epochs in total; the learning rate is dynamically adjusted with a cosine annealing schedule. The density-aware weight hyperparameter β in the loss function is set to 0.4 through cross-validation, corresponding to an increase of approximately 40% in the penalty weight for misclassification in low-density regions. The sigmoid normalization mean and standard deviation are computed from the training set (standard deviation 0.08) to ensure the distribution adaptability of density values.
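The cosine annealing schedule mentioned above can be sketched as follows (a sketch of the standard formula; the minimum learning rate of 0 is an assumption, and in a PyTorch pipeline `torch.optim.lr_scheduler.CosineAnnealingLR` would normally be used instead):

```python
import math

def cosine_annealing_lr(epoch, total_epochs=100, lr_init=1e-3, lr_min=0.0):
    """Cosine annealing: decay smoothly from lr_init at epoch 0 to lr_min at
    total_epochs, following half a cosine period."""
    return lr_min + 0.5 * (lr_init - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))
```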
Based on the experimental plan and dataset selection above, and to ensure the accuracy of the experimental results, all experiments were repeated three times and the average of each metric was taken to eliminate the influence of randomness, providing a reliable basis for the subsequent performance evaluation.

3.2. Comparison of Algorithms and Selection of Performance Evaluation Indicators

This article adopts a multidimensional quantitative evaluation strategy and selects representative benchmark methods for systematic comparison. The comparative algorithms include (1) traditional PointNet++ as the baseline model, to demonstrate basic point cloud processing capability; (2) an improved model that only integrates density-adaptive features; (3) a comparison model that only introduces multi-view projection fusion; and (4) the fully proposed density-adaptive feature enhancement and lightweight spectral fine-tuning algorithm. The performance evaluation indicators cover three dimensions: accuracy, robustness, and efficiency. The specific calculation formulas are as follows:
The accuracy index adopts weighted F1-Score as the core evaluation standard:
$$ F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} $$
Among them, $\text{Precision} = \frac{TP}{TP + FP}$ and $\text{Recall} = \frac{TP}{TP + FN}$, where $TP$, $FP$, and $FN$ denote the numbers of true-positive, false-positive, and false-negative samples, respectively. To meet the special requirements of uneven-density scenarios, a density-weighted accuracy is introduced:
$$ \text{DensityWeightedAccuracy} = \frac{\sum_{i=1}^{N} w_i\, \mathbb{1}(y_i = \hat{y}_i)}{\sum_{i=1}^{N} w_i} $$
The weight $w_i = 1 + \beta \left( 1 - \sigma(d_i) + \epsilon \right)$ is dynamically adjusted based on the local density $d_i$, increasing the weight of samples in low-density areas.
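The accuracy metrics above can be sketched in one helper (a sketch for the binary case with positive class 1; the standardization of raw density before the sigmoid weighting is an assumption carried over from the loss design):

```python
import numpy as np

def segmentation_metrics(pred, labels, density, beta=0.4):
    """Precision, Recall, F1 (binary, positive class = 1), plus the
    density-weighted accuracy defined above.

    pred, labels: (N,) int arrays; density: (N,) raw density features."""
    tp = np.sum((pred == 1) & (labels == 1))
    fp = np.sum((pred == 1) & (labels == 0))
    fn = np.sum((pred == 0) & (labels == 1))
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    z = (density - density.mean()) / (density.std() + 1e-8)
    w = 1.0 + beta * (1.0 - 1.0 / (1.0 + np.exp(-z)))   # up-weight sparse points
    dwa = np.sum(w * (pred == labels)) / np.sum(w)
    return precision, recall, f1, dwa
```

Note that an error at a sparse point pulls the density-weighted accuracy below the plain accuracy, which is exactly the sensitivity this metric is designed to have.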
The robustness evaluation adopts sparse region-specific indicators:
$$ \text{SparseRegionF1} = \frac{2 \times P_{\text{sparse}} \times R_{\text{sparse}}}{P_{\text{sparse}} + R_{\text{sparse}}} $$
This specifically measures performance in regions where the density value is more than one standard deviation below the global mean. A multi-view consistency metric is used simultaneously:
$$ \text{ViewConsistency} = \frac{1}{K} \sum_{k=1}^{K} \text{IoU}\!\left( M_{3D},\, P\!\left( M_{2D}^{(k)} \right) \right) $$
This measures the average intersection-over-union between the predicted results backprojected from the different viewpoints and the 3D ground truth.
The efficiency indicators include the training convergence time $T_{\text{converge}}$ and the inference latency $T_{\text{inference}}$, which are analyzed jointly with time complexity and space complexity. The computational efficiency ratio is specially introduced as follows:
$$ \text{EfficiencyRatio} = \frac{\Delta F1}{\Delta T_{\text{training}}} \times 100\% $$
This quantifies the balance between the quantitative performance improvement and the increased computational cost. All experiments were repeated three times and the average taken; confidence intervals were calculated from the standard deviation to ensure statistical significance.

3.3. Comparison Experiment of Benchmark Methods

This article reports a systematic benchmark method comparison experiment to verify the effectiveness of the density-adaptive feature enhancement algorithm proposed in this paper. The experiment compared six methods: the traditional baseline method (Baseline), PointNet, KPConv, the improved model that only integrates density-adaptive features (Density-only), the comparison model that only introduces multi-view projection fusion (MultiView-only), and the fully proposed algorithm (Proposed). This ensures a comprehensive comparison with state-of-the-art methods under the same evaluation protocol.
As shown in Table 1, in terms of Precision, the complete algorithm achieved 0.91, an improvement of 11.0% over the baseline model's 0.82. Recall increased from 0.78 to 0.89 (a 14.1% gain), and F1-Score from 0.80 to 0.90 (a 12.5% gain). These improvements fully demonstrate the effectiveness of the density-adaptive mechanism and the multi-view fusion strategy.
Figure 2 compares the algorithms in terms of point cloud density distribution, significance detection results, and multi-view fusion effect. The complete algorithm proposed in this paper achieves the most uniform point cloud density distribution, with a density variance of only 0.08, 46.7% lower than the baseline's 0.15. In terms of significance detection accuracy, where 'significance' is defined as the importance weight derived from the semantic labels in SemanticKITTI (obtained via confidence scores of the ground-truth annotations), the algorithm reaches 89.3% in low-density areas, significantly higher than the baseline's 73.2%. For the multi-view fusion effect in particular, the algorithm achieves a multi-view consistency index of 0.91 through its adaptive viewpoint weighting mechanism, 23.8% higher than the baseline. These results validate the density-adaptive feature enhancement mechanism and are consistent with the KD tree radius-search density feature extraction of Section 2.2 and the multi-view projection fusion design of Section 2.3.
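The KD tree radius-search density extraction of Section 2.2 can be sketched as below, assuming `scipy` is available. The radius of 0.3 is taken from the parameter study in Section 3.4; the function names and the sphere-volume normalization are our own illustrative choices.

```python
import numpy as np
from scipy.spatial import cKDTree

def local_density(points, radius=0.3):
    """Per-point neighborhood density via a KD-tree radius search.

    points: (N, 3) array. Returns neighbor counts normalized by the
    search-sphere volume, i.e. points per cubic meter.
    """
    tree = cKDTree(points)
    neighbor_lists = tree.query_ball_point(points, r=radius)
    counts = np.array([len(idx) for idx in neighbor_lists])
    volume = 4.0 / 3.0 * np.pi * radius ** 3
    return counts / volume

def with_density_channel(points, radius=0.3):
    """Inject density as an extra feature channel: (N, 3) -> (N, 4)."""
    d = local_density(points, radius)
    return np.hstack([points, d[:, None]])
```

The (N, 4) output corresponds to the paper's idea of injecting local neighborhood density as an additional feature channel so the network can adapt to density variations.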
We also used 3D visualization to analyze the density distribution, significance prediction, and errors of the baseline algorithm; the results are shown in Figure 3. The baseline model reaches a prediction accuracy of 85.2% in regions of extremely high density (>0.25 points per cubic meter) but drops sharply to 62.3% in regions of extremely low density (<0.05 points per cubic meter), a fluctuation of 22.9 percentage points. In contrast, the proposed algorithm uses the density-aware loss function designed in Section 2.4 to increase the weight in low-density areas by about 40%, maintaining accuracy above 80.1% with fluctuations within 5.1 percentage points. Error analysis shows that the mean absolute error of our algorithm is 0.08, 46.7% lower than the baseline's 0.15, further verifying the effectiveness of the dynamic weight adjustment mechanism.
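The density-aware weighting described above can be sketched as follows: points below a density threshold receive a roughly 40% larger misclassification penalty. The threshold of 0.2 follows the low-density definition used in Section 3.5; the exact functional form of the paper's loss is not given, so this simple weighted cross-entropy is an illustrative assumption.

```python
import numpy as np

def density_aware_weights(densities, low_density_threshold=0.2, boost=1.4):
    """Per-point loss weights: points below the density threshold get a
    ~40% larger penalty (boost = 1.4); all others get weight 1."""
    w = np.ones_like(densities, dtype=float)
    w[densities < low_density_threshold] = boost
    return w

def density_aware_cross_entropy(probs, labels, densities):
    """Density-weighted cross-entropy over N points.

    probs: (N, C) predicted class probabilities; labels: (N,) class ids.
    """
    w = density_aware_weights(densities)
    nll = -np.log(probs[np.arange(len(labels)), labels] + 1e-12)
    return float(np.sum(w * nll) / np.sum(w))
```

Because the weights upscale errors on sparse points, gradient descent is pushed to learn more robust features exactly where the baseline degrades.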

3.4. Analysis of Ablation Experiment

To gain a deeper understanding of the contribution of each algorithm component, this article designed a systematic ablation experiment with five configurations: the baseline model, +density feature, +multi-view fusion, +density-aware loss, and the full model.
The results of the ablation experiment are shown in Table 2, and all components contribute positively to the final performance. Density feature injection brought about a 3.8% improvement in F1-Score, multi-view fusion contributed about 2.5% improvement, and density-aware loss function contributed about 5.0% improvement. The complete model achieved an overall improvement of 12.5% compared to the baseline, demonstrating the effectiveness of collaborative work among all components.
Furthermore, this article comprehensively analyzed and compared the ablation results from four aspects: training loss, training accuracy convergence trend, parameter sensitivity, and point density distribution. The results are shown in Figure 4.
The training loss and accuracy curves in Figure 4, smoothed with a moving-average filter to reduce noise, show that the algorithm achieves optimal performance with a radius parameter of 0.3 and 6 views. Validation curves show consistent trends, with training and validation F1-Scores plateauing after 50 epochs, supporting the reliability of these results. The component-contribution analysis indicates that the density-adaptive mechanism is the main source of performance improvement, especially when dealing with sparse point clouds, while the multi-view fusion mechanism effectively supplements texture information and improves detail preservation.
The ablation experiment also evaluated the contribution of each component module of the algorithm; the results are shown in Table 3.
According to the component contribution analysis in Table 3, the density feature injection module brings a 3.8% F1-Score improvement, verifying the effectiveness of the KD tree radius-search density feature extraction: injecting local neighborhood density into the network as an explicit feature channel gives the model density perception capability. The multi-view fusion module contributes a 2.5% improvement, demonstrating the advantages of the symmetric multi-view projection and feature-consistency fusion mechanism designed in this paper, especially the role of the adaptive view-feature weight aggregation strategy in alleviating occlusion. The density-aware loss function contributes the most, a 5.0% improvement, fully demonstrating the effectiveness of the dynamic weight adjustment mechanism: the misclassification penalty for low-density regions is increased by about 40%, forcing the model to learn more robust feature representations from sparse points. Working together, the three modules achieve a 12.5% overall performance improvement, forming a complete density-adaptive optimization loop.

3.5. Performance Verification in Different Scenarios

To comprehensively evaluate the applicability of the algorithm, we conducted performance verification experiments under various point cloud density distributions and image types, covering both indoor and outdoor scenes as well as a range of test image types.
Firstly, a comparative analysis was conducted on the performance of the algorithm in different-density regions, including high-, medium-, low-, and very-low-density scenarios. The results are shown in Table 4.
From Table 4, it can be seen that the proposed algorithm maintains stable performance across density regions. Even in the extremely challenging very-low-density region, the F1-Score still reaches 0.73, a significant improvement over the baseline model's 0.65 in the same region.
Secondly, we analyzed the point cloud computation results and density distributions obtained from PNG images in different scenarios. Figure 1, Figure 2 and Figure 3 show ordinary indoor scenes and complex outdoor road scenes, and circular test images were selected for comparative analysis. The results are shown in Figure 5.
According to the experimental results in Figure 5, the point cloud density distribution uniformity of our algorithm in complex outdoor road scenes reaches 0.92, which is 8.2% higher than the 0.85 in indoor scenes. For the circular test images, the multi-view projection fusion mechanism achieves an edge feature retention rate of 95.3%, 23.6% higher than traditional methods. In the occluded areas of outdoor road scenes in particular, the KD tree density feature extraction constructed in this paper raises the completion accuracy of missing point cloud regions to 87.5%, verifying the robustness of the algorithm in complex environments. Since one of the scenes in Figure 1 and Figure 2 is an indoor environment and the other is a complex outdoor road environment, and their point cloud results differ significantly, we further conducted a comprehensive comparative analysis and visualization of the two scenarios; the results are shown in Figure 6.
The experimental results in Figure 6 show that the completeness of point cloud reconstruction in outdoor scenes reaches 91.2%, slightly lower than the 94.5% in indoor scenes, but outdoor scenes perform better in occlusion handling, with a point cloud recovery rate of 85.3% in occluded areas. This is due to the adaptive view-feature weight aggregation mechanism designed in Section 2.3, which achieves a view consistency index of 0.89 for complex outdoor scenes. In terms of computational efficiency, the processing time for outdoor scenes is only 15.7% longer than for indoor scenes, demonstrating the algorithm's good scalability.
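The multi-view pipeline discussed above can be illustrated with a toy version: project the cloud onto symmetric axis-aligned 2D views, then fuse per-view point features back onto the 3D points with view weights. This is a deliberate simplification of the paper's adaptive view weighting (weights are uniform here), and all names and the orthographic projection choice are illustrative.

```python
import numpy as np

def orthographic_views(points, n_views=6):
    """Project (N, 3) points onto axis-aligned 2D views by dropping one
    coordinate at a time; each axis also gets a mirrored opposite-side view.
    A simple stand-in for the paper's symmetric multi-view projection."""
    views = []
    for axis in range(3):
        keep = [a for a in range(3) if a != axis]
        views.append(points[:, keep])         # view looking along `axis`
        views.append(points[:, keep] * -1.0)  # mirrored opposite-side view
    return views[:n_views]

def fuse_view_features(view_feats, weights=None):
    """Fuse per-view point features back onto the 3D points by a weighted
    average over views (uniform weights here; the paper adapts them).

    view_feats: list of K arrays of shape (N, F)."""
    if weights is None:
        weights = np.ones(len(view_feats))
    weights = np.asarray(weights, dtype=float)
    stacked = np.stack(view_feats)            # (K, N, F)
    return np.tensordot(weights / weights.sum(), stacked, axes=1)
```

In the full method, the per-view features would come from a 2D backbone applied to each projection, and the weights would be learned per view to down-weight occluded perspectives.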
Considering that practical applications mostly involve indoor and outdoor point cloud computation for semantic segmentation, this study focuses solely on the SemanticKITTI benchmark for clarity. The comparison results are shown in Figure 7. The conversion algorithm maintains the intensity distribution with 93.8% similarity, and the depth distribution error is kept within 0.05. The 2D-projection feature retention rate is 96.2% for indoor scenes and 92.7% for outdoor scenes, verifying the effectiveness of the multi-view fusion mechanism of Section 2.3. For texture detail conversion in particular, the algorithm raises the edge information retention rate of the 2D images to 89.4%, 31.2% higher than the baseline method.
We also conducted an in-depth analysis of the relationship between density and significance, with 'significance' explicitly defined as a measure based on the semantic segmentation labels of SemanticKITTI, calculated from point-wise confidence values, revealing the behavior of the algorithm under different density conditions. The experiments show that the density-aware mechanism effectively balances detection accuracy across density regions and avoids a sharp decline in performance in sparse regions. The results are shown in Figure 8. As the point cloud density increases from 0.1 to 0.5, significance detection accuracy rises approximately linearly from 73.2% to 91.5%. This validates the density-aware loss function of Section 2.4: after the 40% weight increase for low-density areas (<0.2), the model's classification accuracy there rose from 65.3% to 78.9%. Error analysis shows that the correlation coefficient between density and significance reaches 0.87, indicating that the proposed density-adaptive mechanism effectively guides feature learning.
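The density-significance analysis above amounts to a Pearson correlation plus a per-density-bin accuracy curve, which can be sketched as follows (illustrative helper names, not the paper's code):

```python
import numpy as np

def density_significance_correlation(density, significance):
    """Pearson correlation coefficient between density and significance."""
    return float(np.corrcoef(density, significance)[0, 1])

def accuracy_by_density_bin(density, correct, bin_edges):
    """Mean prediction accuracy within each density bin (as in Figure 8).

    correct: 0/1 array marking whether each point was classified correctly.
    """
    idx = np.digitize(density, bin_edges)
    return {int(b): float(correct[idx == b].mean()) for b in np.unique(idx)}
```

Binning by density and plotting the per-bin accuracies reproduces the kind of curve behind the reported 73.2% to 91.5% trend.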
Finally, this article analyzed the processing performance and computational complexity of algorithms for different image types, and the results are shown in Table 5 and Table 6.
Table 5 compares the Processing Time, Precision, Recall, and F1-Score across the different image types (1000-1500 points each).
The results in Table 5 show that the proposed algorithm performs well across image types. On gradient images, it achieves a precision of 89% and a recall of 87% (F1-Score 88%), mainly due to the adaptability of the density feature extraction module to continuously varying features. Circular images are processed best, with an F1-Score of up to 91%, owing to the multi-view fusion mechanism's strength on regular geometric shapes. An F1-Score of 83% is still maintained on noisy images, showing that the density-aware mechanism effectively suppresses noise interference. In complex scenes, the F1-Score reaches 90%, reflecting the collaborative optimization of the three core modules, among which the density-adaptive loss function contributes significantly to the handling of complex boundaries.
Table 6 compares the Time Complexity, Space Complexity, Training Time, and Inference Time of the baseline and proposed algorithms.
The computational complexity analysis in Table 6 shows that the proposed algorithm achieves a good balance between performance improvement and computational cost. Although the Time Complexity increases from O(n) to O(n log n), the actual Training Time only rises from 120 s to 185 s, an increase of 54.2%, an acceptable cost given the 12.5% performance improvement it buys. The Space Complexity increases from O(n) to O(n + v), where v is the number of views; through shared-weight design and selective feature-layer optimization, this additional overhead is kept within an acceptable range. The Inference Time increases from 5 ms to 8 ms (60%), but still meets real-time requirements in practical applications. This efficiency benefits from the lightweight design of the overall framework, particularly the feature sharing and selective backprojection strategies adopted in the multi-view processing stage. Overall, the performance gains justify the moderate increase in computational resource consumption.
In summary, the above systematic experiments demonstrate the effectiveness, robustness, and practicality of the proposed density-adaptive feature enhancement algorithm across different scenarios, providing reliable technical support for point cloud analysis in complex environments.

4. Conclusions

The density-adaptive feature enhancement and lightweight spectral fine-tuning algorithm proposed in this article effectively addresses the key technical problems caused by uneven density in point cloud analysis by systematically integrating three core mechanisms: density perception, multi-view complementarity, and loss function optimization. The framework innovatively injects local density information into the network as explicit features, giving the model the ability to adaptively perceive and respond to density changes. The multi-view projection fusion mechanism fully exploits 2D visual priors to enhance the representation of sparse 3D regions, and the density-aware loss function achieves targeted optimization of low-density areas. Experimental results show that the method significantly improves the accuracy and robustness of point cloud analysis while maintaining high computational efficiency, especially in complex scenarios involving sparsity and occlusion. Future research could further explore dynamic density-threshold adjustment, deep multimodal data fusion strategies, and lightweight deployment schemes for edge computing, providing more solid technical support for deploying 3D vision systems in practical applications.

Author Contributions

Conceptualization, W.H.; methodology, W.H.; software, T.L.; validation, J.Z., T.L. and Q.C.; formal analysis, P.Q.; investigation, Q.C.; resources, T.L.; data curation, P.Q.; writing—original draft preparation, W.H.; writing—review and editing, W.H.; visualization, J.Z.; supervision, T.L.; project administration, W.H.; funding acquisition, Q.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Excellent Research and Innovation Teams Project of Universities in Anhui Province (Grant No. 2024AH010030) and the Key Scientific Research Project of Universities in Anhui Province (Grant No. 2025AHGXZK30878).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code is available at https://github.com/hwqwlsu/Point-cloud-processing- (accessed on 1 January 2026).

Conflicts of Interest

The authors declare that they have no affiliations with or involvement in any organization or entity with any financial interest in the subject matter or materials discussed in this manuscript.

Figure 1. Overall framework design of algorithm.
Figure 2. Comprehensive performance comparison of benchmark methods (point cloud density distribution, significance detection results, multi-view fusion effect).
Figure 3. Baseline Algorithm 3D Visualization Analysis Results.
Figure 4. Comprehensive comparative analysis of ablation experiment results.
Figure 5. Comparison of point cloud computing performance and density distribution calculation results in different scenarios.
Figure 6. Comparison results of comprehensive performance of 3D point cloud computing in indoor and outdoor scenes.
Figure 7. Comparison of the conversion process from PNG images to point clouds in indoor and outdoor scenes.
Figure 8. In-depth analysis results of the relationship between density and significance.
Table 1. Performance Comparison Results of Benchmark Methods.

Method | Precision | Recall | F1-Score | Accuracy | Training Time (s)
Baseline | 0.82 | 0.78 | 0.80 | 0.81 | 120
Density-only | 0.87 | 0.83 | 0.85 | 0.86 | 145
MultiView-only | 0.85 | 0.81 | 0.83 | 0.84 | 160
Proposed | 0.91 | 0.89 | 0.90 | 0.92 | 185
KPConv | 0.83 | 0.80 | 0.81 | 0.82 | 130
PointNet | 0.80 | 0.76 | 0.78 | 0.79 | 110
Table 2. Results of ablation experiment.

Configuration | Precision | Recall | F1-Score | Improvement
Baseline | 0.82 | 0.78 | 0.80 | —
+Density Feature | 0.85 | 0.81 | 0.83 | +0.03
+Multi-View | 0.84 | 0.80 | 0.82 | +0.02
+Density-Aware Loss | 0.86 | 0.83 | 0.84 | +0.04
Full Model | 0.91 | 0.89 | 0.90 | +0.10
Table 3. Analysis of Contribution of Algorithm Components.

Component | Precision | Recall | F1-Score | Improvement
Baseline | 0.82 | 0.78 | 0.80 | —
+Density Feature | 0.86 | 0.82 | 0.84 | +0.04
+Multi-View | 0.85 | 0.81 | 0.83 | +0.03
+Density-Aware Loss | 0.84 | 0.80 | 0.82 | +0.02
Full Model | 0.91 | 0.89 | 0.90 | +0.10
Table 4. Performance analysis of different-density regions.

Density Region | Precision | Recall | F1-Score | Point Count
Very Low | 0.75 | 0.72 | 0.73 | 156
Low | 0.82 | 0.79 | 0.80 | 234
Medium | 0.88 | 0.86 | 0.87 | 312
High | 0.85 | 0.82 | 0.83 | 278
Very High | 0.83 | 0.80 | 0.81 | 220
Table 5. Processing Performance of Different Image Types.

Image Type | Point Count | Processing Time (s) | Precision | Recall | F1-Score
Gradient | 1000 | 2.3 | 0.89 | 0.87 | 0.88
Circle | 1000 | 2.1 | 0.92 | 0.90 | 0.91
Noise | 1000 | 2.5 | 0.85 | 0.82 | 0.83
Stripe | 1000 | 2.2 | 0.88 | 0.86 | 0.87
Complex | 1500 | 3.1 | 0.91 | 0.89 | 0.90
Table 6. Computational complexity analysis.

Metric | Baseline | Proposed
Time Complexity | O(n) | O(n log n)
Space Complexity | O(n) | O(n + v)
Training Time | 120 s | 185 s
Inference Time | 5 ms | 8 ms
