Article

A Local Discrete Feature Histogram for Point Cloud Feature Representation

1 Xi’an Key Laboratory of Active Photoelectric Imaging Detection Technology, Xi’an Technological University, Xi’an 710021, China
2 School of Opto-Electronical Engineering, Xi’an Technological University, Xi’an 710021, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(5), 2367; https://doi.org/10.3390/app15052367
Submission received: 16 January 2025 / Revised: 18 February 2025 / Accepted: 21 February 2025 / Published: 22 February 2025

Abstract: Local feature description is a critical problem in computer vision, and most current approaches struggle to balance descriptiveness, robustness, compactness, and efficiency. This paper proposes the local discrete feature histogram (LDFH), a novel local feature descriptor, as a solution to this problem. The LDFH descriptor is constructed on a robust local reference frame (LRF). It partitions the local space by radial distance and, in each subspace, computes three geometric features: the normal deviation angle, the polar angle, and the normal lateral angle. These features are then discretized to generate three statistical feature histograms, which are combined using a weighted fusion strategy to produce the final LDFH descriptor. Experiments on public datasets demonstrate that, compared with existing methods, LDFH strikes an excellent balance between descriptiveness, robustness, compactness, and efficiency, making it suitable for various scenes and sensor datasets.

1. Introduction

Image processing [1,2] is a fundamental aspect of computer vision, and as applications have expanded from 2D images to 3D spaces, computer vision tasks have become increasingly complex. Local feature descriptors are a central topic in 3D computer vision and are used extensively in 3D point cloud registration [3,4], 3D object recognition [5,6], and 3D shape retrieval [7,8]. Typically, they are created by converting the spatial distribution and geometric attributes of a local surface into a feature vector. Descriptiveness and robustness are the two most crucial attributes of a local feature descriptor: descriptiveness is the capacity to provide sufficient detail to differentiate between local surfaces, while robustness is the capacity to remain stable under a variety of disturbances, such as noise and changes in mesh resolution [9]. Compactness and efficiency are important characteristics as well, especially in real-time applications.
In previous research, many local feature descriptors have been put forth, generally classified into three types [10,11]: those based on the local reference frame (LRF), those based on the local reference axis (LRA), and those without any local reference system. Descriptors without a local reference system usually use only geometric features and lack spatial distribution information, so their descriptive ability is weak. The descriptiveness of a descriptor can be significantly enhanced by combining geometric features with spatial information. Descriptors based on the LRA or LRF generate spatial information by partitioning the 3D space. An LRF comprises three mutually orthogonal and rotation-invariant coordinate axes, which provide comprehensive local spatial details, such as the radial, azimuth, and elevation directions. In contrast, LRA-based methods utilize a single axis, representing only radial and elevation information. Recent evaluations on standard datasets have shown performance advantages for LRF-based descriptors [10]. However, achieving a satisfactory balance between descriptiveness, robustness, compactness, and efficiency remains a challenge for most feature descriptors currently in use [9,12].
Therefore, we propose a novel local feature descriptor called the local discrete feature histogram (LDFH). The LDFH descriptor is constructed on the LRF: it divides the local space by radial distance, computes three geometric features (the normal deviation angle, polar angle, and normal lateral angle) in each subspace, and discretizes them into histograms. Finally, the three histograms are combined by weighted fusion to form the final descriptor.
The remainder of this paper is organized as follows: Section 2 introduces the related work on local feature descriptors. Section 3 provides a detailed explanation of the LDFH method proposed in this paper. Section 4 presents the experimental results of our LDFH descriptor and other state-of-the-art algorithms on the B3R dataset and Kinect dataset. Section 5 includes conclusions and some future directions.

2. Literature Review

Among LRF-based descriptors, the signature of histograms of orientations (SHOT) descriptor was introduced by Tombari et al. [13]; it separates the local space into multiple subspaces along the radial, azimuthal, and elevation directions in polar coordinates and computes deviation-angle statistics within each subspace. The rotational projection statistics (RoPS) descriptor was introduced by Guo et al. [14]; it repeatedly rotates the local surface within the LRF, projects the rotated surface, and computes statistics of the projected point density, finally concatenating these statistics to obtain the rotational projection statistical feature. The triple orthogonal local depth images (TOLDI) descriptor was introduced by Yang et al. [15]; it generates a feature vector by concatenating three local depth images captured from the three orthogonal view planes of the LRF. Du et al. [16] generated the multi-view depth and contour signatures (MDCS) descriptor by concatenating the depth and contour data obtained from three orthogonal view planes within the LRF into a single vector. The trigonometric projection statistics histograms (TPSH) descriptor was introduced by Liu et al. [17]; it constructs the LRF using a multi-attribute weighting strategy and generates geometric and spatial distribution images via a triangular projection mechanism, which are encoded as statistical histograms. The histograms of the dual deviation angle feature (HDDAF) descriptor was introduced by Shi et al. [18]; it connects the centroid of the local spherical neighborhood to the LRF origin and builds two deviation-angle histograms from the angles between this line and the LRF axes and between this line and the normal vector.
In LRA-based descriptors, a local feature statistics histogram (LFSH) descriptor was proposed by Yang et al. [4], which generates the LFSH descriptor by encoding the statistical characteristics of the local depth, horizontal projection distance, and normal deviation angle. The statistic of deviation angles on subdivided space (SDASS) descriptor was introduced by Zhao et al. [19], which proposes the local minimum axis (LMA) instead of the normal to reduce the impact of interference, uses height and projected radial directions to divide local space, and calculates the deviation angle between the LMA and LRA in each subspace. A divisional local feature statistics (DLFS) descriptor was introduced by Zhao and Xi [20], which partitions the local space along the projected radial direction and computes statistical operations on three geometric attributes and one spatial attribute within each partition.
Among descriptors without any local reference system, the point feature histograms (PFH) descriptor was proposed by Rusu et al. [21]; it encodes local surface shape information and is highly discriminative but computationally costly. To address this, Rusu et al. [22] used a simplified version of the PFH to generate the fast point feature histogram (FPFH). The histograms of point pair features (HoPPF) descriptor was developed by Zhao et al. [23]; it partitions the local point pair set of each key point into eight regions and generates a sub-feature from the point pair distribution in each region, then concatenates these sub-features into a vector. The point pair transformation feature histograms (PPTFH) descriptor was introduced by Wu et al. [24]; it divides the point pair set into four subsets, creates three feature histograms for each subset using point pair transformation features, and concatenates the histograms from all subsets into a feature vector.

3. Methods

3.1. LRF and LMA

The performance of feature descriptors depends on an efficient and reliable LRF. Nowadays, there are two primary categories of LRF construction methods [10]: covariance analysis-based and geometric attribute-based approaches. We use the stable LRF approach [15] to make the proposed descriptor invariant under rigid transformations. Additionally, we construct the LRF in the following manner to increase its efficiency and repeatability.
The mathematical representation of LRF at key point p is
$\mathrm{LRF}_p = \{x_p,\; x_p \times z_p,\; z_p\}$, (1)
where $x_p$ and $z_p$ refer to the x and z axes of $\mathrm{LRF}_p$.
To compute the z-axis, we first obtain the set of local neighborhood points $Q = \{q_1, q_2, \ldots, q_k\}$ within a support radius $R$ of $p$. We use all local neighborhood points to compute the z-axis. Specifically, we perform covariance matrix analysis on $Q$, and the covariance matrix $\mathrm{Cov}(Q)$ is expressed as follows:
$\mathrm{Cov}(Q) = \frac{1}{k}\sum_{i=1}^{k}(q_i - \bar{q})(q_i - \bar{q})^{T}$, (2)
where $k$ is the number of neighboring points around key point $p$ and $\bar{q}$ denotes the centroid of $Q$. The eigenvector $v_p$ corresponding to the minimum eigenvalue of $\mathrm{Cov}(Q)$ is computed. To resolve sign ambiguity, all neighboring points within the support radius are used to determine the sign of the z-axis:
$z_p = \begin{cases} v_p, & \text{if } v_p \cdot \sum_{i=1}^{k}(q_i - p) \ge 0\\ -v_p, & \text{otherwise} \end{cases}$, (3)
where $q_i - p$ refers to the direction vector from $p$ to $q_i$, and “$\cdot$” represents the dot product between vectors.
The construction of the x-axis follows a similar approach, as described in reference [15]. Specifically, the x-axis is computed as
$x_p = \sum_{i=1}^{k} w_{i1} w_{i2} v_i \Big/ \left\| \sum_{i=1}^{k} w_{i1} w_{i2} v_i \right\|$, (4)
where $v_i$ denotes the projection of each neighboring point $q_i$ onto a tangent plane $T$ at the key point, i.e., $v_i = (q_i - p) - \left[(q_i - p) \cdot z_p\right] z_p$; $w_{i1}$ is the weight associated with the distance of $q_i$ to $p$, given by $w_{i1} = \left(R - \|q_i - p\|\right)^2$; and $w_{i2}$ is another weight based on the projection distance of $q_i$ onto $T$, given by $w_{i2} = \left[(q_i - p) \cdot z_p\right]^2$.
Finally, by calculating the cross-product of the z-axis and x-axis, the y-axis is obtained, and construction of the LRF is complete.
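As a concrete illustration, the LRF construction above can be sketched in NumPy. This is a minimal sketch under stated assumptions: the function name `build_lrf` and array layout are our own, the covariance is taken in its unnormalized matrix form (the eigenvectors are unchanged by the 1/k factor), and the y-axis is computed as z × x as described in the text.

```python
import numpy as np

def build_lrf(p, Q, R):
    # Hypothetical helper sketching Equations (1)-(4): p is the (3,) key point,
    # Q is a (k, 3) array of neighbors within support radius R.
    # z-axis: eigenvector of Cov(Q) with the smallest eigenvalue.
    centered = Q - Q.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(centered.T @ centered)
    v = eigvecs[:, 0]                       # eigh sorts eigenvalues ascending
    # Sign disambiguation (Equation (3)): align v with the sum of (q_i - p).
    if v @ (Q - p).sum(axis=0) < 0:
        v = -v
    z = v
    # x-axis (Equation (4)): weighted sum of tangent-plane projections v_i.
    d = Q - p
    proj = d - np.outer(d @ z, z)           # v_i: projection onto tangent plane
    w1 = (R - np.linalg.norm(d, axis=1)) ** 2   # distance weight w_i1
    w2 = (d @ z) ** 2                           # projection-distance weight w_i2
    x = (w1 * w2) @ proj
    x /= np.linalg.norm(x)
    # y-axis: cross-product of the z-axis and x-axis completes the LRF.
    y = np.cross(z, x)
    return np.stack([x, y, z])
```

Because the returned rows form an orthonormal basis, transforming the neighborhood into the LRF is then a single matrix product, `(Q - p) @ lrf.T`.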
For the calculation of the local minimum axis (LMA), its direction and sign are determined by Equations (2) and (3). Following the validation in [19], the support radius used for computing and disambiguating the LMA is uniformly set to 7 times the mesh resolution (mr), where the mesh resolution is generally defined as the average distance from each point to its nearest neighbor in a point cloud. Compared with traditional normal vector computation, the LMA offers two major advantages [20]. First, the support radius for an LMA calculation is larger: normal vectors calculated using covariance matrix methods typically use a radius smaller than 3 mr, whereas the LMA computed with a 7 mr radius (as defined in [19]) exhibits higher repeatability under noise and variations in mesh resolution. Second, traditional normal vector disambiguation generally depends on viewpoint information and must be handled manually when the viewpoint is uncertain, while the disambiguation of the LMA is realized via Equation (3), which is more efficient and does not depend on viewpoint information.

3.2. Generation of the LDFH Descriptor

After the LRF is constructed, the LDFH descriptor is built on it. The process of generating the LDFH is illustrated in Figure 1.
First, the local surface $Q$ is transformed into the LRF to ensure rotational invariance, and the rotated surface is expressed as $Q' = \{q'_1, q'_2, \ldots, q'_k\}$ (for brevity, the transformed points are still denoted $q_i$ below). Next, the spatial and geometric information on the local surface is encoded. Specifically, we use the radial distance to encode the local spatial information, and the normal deviation angle $\theta$, polar angle $\psi$, and normal lateral angle $\phi$ to encode the local surface geometry. They are computed as follows:
$$
\begin{aligned}
r_i &= \sqrt{q_{ix}^{2} + q_{iy}^{2} + q_{iz}^{2}},\\
\theta_i &= \arccos\left(\frac{\mathrm{LMA}(q_i) \cdot z_p}{\left\|\mathrm{LMA}(q_i)\right\| \times \left\|z_p\right\|}\right),\\
\psi_i &= \arccos\left(\frac{q_{iz}}{r_i}\right),\\
\phi_i &= \arccos\left(\frac{\mathrm{LMA}(q_i) \cdot y_p}{\left\|\mathrm{LMA}(q_i)\right\| \times \left\|y_p\right\|}\right),
\end{aligned}
\tag{5}
$$
where r i is the radial distance of the neighboring point q i from the key point p , with a range of [0,R]. q i x , q i y , and q i z are the x, y, z-coordinates of q i . θ i is the deviation angle between the LMA of q i and the normal z p of the key point, with a range of [0,π]. ψ i is the polar angle between q i and the z-axis, with a range of [0,π]. ϕ i is the normal lateral angle between the LMA of q i and the y-axis of the LRF, with a range of [0,π].
The number of partitions along the radial distance is $N_r$. Three geometric features are computed in each subspace, and three corresponding feature histograms $H_\theta$, $H_\psi$, $H_\phi$ are generated. During this statistical process, continuous feature values are discretized into a fixed number of bins to capture their spatial distribution characteristics. If the ranges of the three geometric features are divided into $N_\theta$, $N_\psi$, and $N_\phi$ bins, then the lengths of the corresponding feature histograms are $N_r \times N_\theta$, $N_r \times N_\psi$, and $N_r \times N_\phi$, respectively. Each of the three histograms is normalized to sum to 1, which provides robustness against changes in point density. Additionally, considering the different contributions of the three features to the descriptiveness and robustness of the LDFH, three weighting coefficients $\lambda_1$, $\lambda_2$, $\lambda_3$ are introduced to perform a weighted fusion of the feature histograms, resulting in $f = \{\lambda_1 H_\theta, \lambda_2 H_\psi, \lambda_3 H_\phi\}$.
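The feature statistics of Equation (5) and the weighted fusion can be sketched as follows. This is a simplified illustration, not the authors' code: the function and argument names are our own, the neighbors are assumed to be already transformed into the LRF, and each neighbor's unit LMA vector is passed in precomputed. The default bin counts and weights are the values selected in Section 3.3.

```python
import numpy as np

def ldfh_histograms(Qp, lma_q, z_p, y_p, R,
                    Nr=8, Nth=9, Npsi=14, Nphi=2, lams=(1.5, 1.2, 0.7)):
    # Qp: (k, 3) neighbors in LRF coordinates; lma_q: (k, 3) unit LMA vectors;
    # z_p, y_p: unit LRF axes expressed in the same coordinates.
    r = np.linalg.norm(Qp, axis=1)                        # radial distance in [0, R]
    dot = lambda a, b: np.clip(a @ b, -1.0, 1.0)
    theta = np.arccos(dot(lma_q, z_p))                    # normal deviation angle
    psi = np.arccos(np.clip(Qp[:, 2] / np.maximum(r, 1e-12), -1.0, 1.0))  # polar
    phi = np.arccos(dot(lma_q, y_p))                      # normal lateral angle

    shell = np.minimum((r / R * Nr).astype(int), Nr - 1)  # radial partition index
    hists = []
    for feat, nb in ((theta, Nth), (psi, Npsi), (phi, Nphi)):
        b = np.minimum((feat / np.pi * nb).astype(int), nb - 1)
        h = np.zeros((Nr, nb))
        np.add.at(h, (shell, b), 1)                       # discretize per subspace
        h = h.ravel()
        hists.append(h / max(h.sum(), 1.0))               # normalize to sum 1
    # Weighted fusion f = {λ1·Hθ, λ2·Hψ, λ3·Hφ}; length is Nr·(Nθ+Nψ+Nφ).
    return np.concatenate([lam * h for lam, h in zip(lams, hists)])
```

With the default parameters, the resulting descriptor has length 8 × (9 + 14 + 2) = 200.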
The procedure for generating the LDFH descriptor is described in Algorithm 1.
Algorithm 1: LDFH Descriptor Generation
Input: P (Point Cloud), K (Key Points), R, N r , N θ , N ψ , N ϕ , λ 1 , λ 2 , λ 3 , LRF, LMA
Output: LDFH (LDFH descriptors)
1. For each key point, p ∈ K:
2.   Extract local surface around the key point p ;
3.   Construct LRF at the key point p ;
4.   Transform local surface of p to LRF;
5.   Partition local space around p into radial bins N r ;
6.   For each neighboring point q in support radius:
7.     Compute geometric attributes θ , ψ , ϕ based on LRF;
8.   Generate three feature histograms H θ , H ψ , H ϕ ;
9.   Apply a weighted fusion to generate the final LDFH descriptor λ 1 H θ , λ 2 H ψ , λ 3 H ϕ ;
10.   Store the LDFH descriptor for key point p ;
11. Return LDFH (set of all LDFH descriptors for K).

3.3. Parameter Settings

The LDFH descriptor requires the configuration of 8 parameters: the support radius $R$, the number of radial partitions $N_r$, the numbers of bins for the three geometric attributes $N_\theta$, $N_\psi$, $N_\phi$, and the three weights $\lambda_1$, $\lambda_2$, $\lambda_3$. A critical parameter for local feature descriptors is the support radius: a large value increases computation time and sensitivity to clutter and occlusion, while a small value may reduce descriptiveness. Based on recommendations in [19,20], the support radius $R$ is set to 20 mr.
The remaining 7 parameters are detailed below. To set them, we use a scenario from the B3R dataset (see Section 4.1) that combines 1/4 mesh decimation with 0.3 mr Gaussian noise. The recall vs. 1-precision curve (RPC) is a widely adopted approach for evaluating descriptor performance [9,15]. For compactness, we assess the performance of LDFH using the area under the RPC curve (denoted AUCpr). Each parameter is tuned by fixing the others and gradually increasing its value within a specific range: $N_r$, $N_\theta$, $N_\psi$, and $N_\phi$ range from 2 to 20 with a step size of 1, and $\lambda_1$, $\lambda_2$, and $\lambda_3$ range from 0.2 to 2.4 with a step size of 0.1. Referring to the parameter settings in [19], the initial values of $N_\theta$, $N_\psi$, and $N_\phi$ are all set to 15, and the initial values of $\lambda_1$, $\lambda_2$, and $\lambda_3$ are all set to 1. Once a parameter has been tested, its best value is fixed and used in subsequent tuning. Table 1 displays the parameter tuning process for LDFH, and Figure 2 shows the corresponding AUCpr results. Based on the test results and a trade-off between the descriptiveness, robustness, and compactness of the descriptor, the values of $N_r$, $N_\theta$, $N_\psi$, $N_\phi$, $\lambda_1$, $\lambda_2$, and $\lambda_3$ are set to 8, 9, 14, 2, 1.5, 1.2, and 0.7, respectively.
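The one-at-a-time tuning procedure described above can be sketched generically. The callback-based interface (`evaluate_auc`, dict-valued parameters) is our own assumption for illustration, not the authors' implementation.

```python
def tune_parameters(evaluate_auc, init, grids):
    # Coordinate-descent style tuning: sweep one parameter at a time over its
    # grid, keep the value giving the best AUCpr, fix it, then tune the next.
    # evaluate_auc(params) -> AUCpr score; init: starting parameter dict;
    # grids: {name: iterable of candidate values}, tuned in insertion order.
    params = dict(init)
    for name, grid in grids.items():
        best_val, best_auc = params[name], float('-inf')
        for v in grid:
            auc = evaluate_auc({**params, name: v})
            if auc > best_auc:
                best_val, best_auc = v, auc
        params[name] = best_val
    return params
```

For the LDFH setup this would be called with grids such as `{'Nr': range(2, 21), ...}` and an evaluator that builds descriptors on the validation scenario and returns the area under the RPC.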

4. Experimental Results

This section provides an overview of the two standard datasets and the evaluation criteria applied in the experiments. The performance of the proposed LDFH descriptor is then evaluated through comparisons with several classical descriptors. All experiments in this paper are conducted on a computer equipped with an Intel Core i7-7700 3.60 GHz CPU and 16 GB of RAM.

4.1. Datasets

We selected two datasets for the experiments: the Bologna 3D retrieval (B3R) dataset [25] and the Kinect dataset [13]. Figure 3 displays two example models and scenes from each dataset.
The B3R dataset is a synthetic dataset primarily used for 3D shape retrieval scenarios. It contains high-quality data, consisting of 6 models and 18 scenes. The models come from the Stanford 3D Scanning Repository [26]. The scenes are constructed by applying random rigid transformations to the models and adding Gaussian noise at three different scales. To assess the impact of mesh decimation on local descriptors, we generated four additional sets of scenes: two with different levels of mesh decimation (1/4 and 1/8 of the original mesh density) and two combining mesh decimation with Gaussian noise (1/4 mesh decimation with 0.3 mr Gaussian noise, and 1/8 mesh decimation with 0.5 mr Gaussian noise).
The Kinect dataset consists of 6 models and 16 scenes, obtained using the Microsoft Kinect sensor (manufactured by Microsoft Corporation, headquartered in Redmond, WA, USA). It features 2.5D models and scenes with relatively low grid quality, affected by real noise, occlusions, and clutter. This makes the Kinect dataset highly challenging for use in object recognition.

4.2. Evaluation Criteria

The performance of the proposed descriptor is evaluated in this paper using the RPC. The RPC is computed from a model point cloud, a scene point cloud, and the ground truth transformation between them provided by the dataset publisher. Its formulas are as follows:
$\text{recall} = \dfrac{\text{the number of correct matches}}{\text{the number of corresponding features}}$, (6)
$1 - \text{precision} = \dfrac{\text{the number of false matches}}{\text{the total number of matches}}$. (7)
The nearest and second-nearest scene features are identified by matching each model feature against all scene features. A match is regarded as a candidate if the ratio of the nearest distance to the second-nearest distance is smaller than a specified threshold, and a candidate match is considered correct only if its spatial distance is sufficiently small (0.5R in this paper). A well-performing feature descriptor yields high precision and recall, so its RPC lies toward the upper-left corner of the plot. In the experiments of this paper, 1000 points are chosen at random as key points from each model, and their corresponding key points are extracted from the scene using the ground truth transformation matrix provided by the publisher [9].
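The ratio-test matching and the counts behind Equations (6) and (7) can be sketched as follows. This is a simplified illustration under stated assumptions: the ratio threshold value is hypothetical, and the convention that the i-th model key point corresponds to the i-th (ground-truth-transformed) scene key point is our own simplification.

```python
import numpy as np

def match_and_score(model_f, scene_f, scene_kp_gt, ratio=0.8, tol=1.0):
    # model_f, scene_f: (n, d) descriptor arrays; scene_kp_gt: (n, 3)
    # ground-truth scene positions of the key points, i-th model point
    # corresponding to i-th scene point. `tol` plays the role of the 0.5R check.
    correct = false = 0
    for i, f in enumerate(model_f):
        dist = np.linalg.norm(scene_f - f, axis=1)
        j1, j2 = np.argsort(dist)[:2]
        if dist[j1] < ratio * dist[j2]:              # candidate match (ratio test)
            if np.linalg.norm(scene_kp_gt[j1] - scene_kp_gt[i]) < tol:
                correct += 1                         # spatially close: correct
            else:
                false += 1
    recall = correct / len(model_f)                          # Equation (6)
    one_minus_precision = false / max(correct + false, 1)    # Equation (7)
    return recall, one_minus_precision
```

Sweeping the ratio threshold and plotting recall against 1-precision traces out the RPC for a descriptor.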

4.3. Performance Evaluation of the LDFH Descriptor

We compare our LDFH descriptor with several representative descriptors, including FPFH [22], SHOT [13], TOLDI [15], and SDASS [19]. Additionally, to investigate the influence of different geometric features on the LDFH performance, we design two variants of LDFH. The first variant replaces the normal lateral angle ϕ with the angle between the LMA of neighboring points and the x-axis of the LRF, named LDFH-X. The second variant replaces the normal lateral angle ϕ with the azimuth angle, named LDFH-AZ. The rest of the computation process remains the same as in LDFH. According to the LDFH parameter setting process, the values of the LDFH-X parameters N r , N θ , N ψ , N ϕ , λ 1 , λ 2 , and λ 3 are set to 8, 9, 13, 2, 1.8, 1.4, and 0.7, and the values of the LDFH-AZ parameters N r , N θ , N ψ , N ϕ , λ 1 , λ 2 , and λ 3 are set to 7, 12, 11, 5, 2, 1.9, and 1. Table 2 displays the parameter settings for the seven descriptors.

4.3.1. Descriptiveness and Robustness

The primary goal of the tests on the B3R dataset is to verify the descriptors’ robustness to noise and variations in mesh resolution. We generate RPC curves for each descriptor following the procedure described in Section 4.2, and the results are displayed in Figure 4. As can be seen from Figure 4a,b, the performance of the descriptors differs significantly under various noise levels. First, LDFH, LDFH-X, and LDFH-AZ demonstrate similar noise robustness and exhibit a significant advantage. Second, SDASS also performs well, whereas FPFH shows the weakest robustness to noise.
Regarding robustness to changes in mesh resolution, as displayed in Figure 4c,d, LDFH, LDFH-X, and LDFH-AZ outperform the other descriptors, followed by SDASS. The performance of all descriptors declines noticeably as the mesh resolution decreases, because geometric edges become less distinct when the point cloud becomes overly sparse. As displayed in Figure 4e,f, when both Gaussian noise and mesh decimation are present, the performance advantages of LDFH, LDFH-X, and LDFH-AZ become even more pronounced.
The Kinect dataset has relatively low mesh quality. Figure 5 displays the RPC curves for this experiment. As can be seen, SHOT and LDFH perform best, followed closely by LDFH-X and LDFH-AZ, all of which hold a significant performance advantage over TOLDI and SDASS. This indicates that the LDFH descriptor exhibits higher stability when handling cluttered and occluded data.
In summary, the LDFH descriptor demonstrates strong robustness to noise, different mesh resolutions, and clutter. Additionally, LDFH achieves high RPC values across both datasets, indicating its high descriptiveness. This is because the method encodes spatial and geometric information while using the LMA instead of the normal, further improving its performance.

4.3.2. Compactness

A descriptor’s compactness is an important property that directly affects its computational and storage efficiency. Following [16], we calculate compactness by dividing the mean AUCpr by the descriptor’s length, as shown in Equation (8). Table 3 presents the AUCpr results of all RPC curves in Section 4.3.1, where GN denotes Gaussian noise and MD denotes mesh decimation in the B3R dataset; bold text indicates the best results. Table 2 lists the lengths of all descriptors, and Figure 6 shows the compactness of each descriptor. LDFH, LDFH-X, and LDFH-AZ achieve the best compactness. FPFH comes next owing to its very low dimensionality (33), while TOLDI has the lowest compactness because of its high dimensionality (1200).
$\text{compactness} = \dfrac{\text{the mean value of AUCpr}}{\text{the length of the descriptor}}$. (8)
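For illustration, Equation (8) is a one-line computation; the AUCpr values used in any call would come from Table 3 (the numbers in the test below are placeholders, not the paper's results).

```python
def compactness(aucpr_values, length):
    # Equation (8): mean AUCpr across all test scenarios, divided by the
    # descriptor length (its dimensionality).
    return sum(aucpr_values) / len(aucpr_values) / length
```

This makes explicit why a 33-dimensional descriptor like FPFH can score well on compactness despite a modest mean AUCpr, while a 1200-dimensional descriptor like TOLDI is penalized.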

4.3.3. Efficiency

To evaluate the time performance of the descriptors, we randomly select 1000 key points from each model in the B3R dataset and describe the local surfaces of their surrounding regions under different support radii, recording the average computation time for each feature descriptor. The support radius R is extended from 5 mr to 30 mr at 5 mr intervals, and Figure 7 displays the results. SHOT has the lowest computation time, followed by TOLDI. The computation time of LDFH is comparable to SHOT and TOLDI at smaller support radii and gradually increases with the support radius, but it remains efficient compared with the other descriptors. Because FPFH computes simplified point feature histograms for every point in the neighborhood, it has the highest computation time.

5. Conclusions

In this work, we propose a novel local feature descriptor, LDFH, which encodes the spatial and geometric features of the neighborhood point cloud around a given key point by constructing an LRF. The method determines the z-axis direction of the LRF using all neighboring points within the support radius, resolving the sign ambiguity. By using the LMA instead of the normal to encode local geometric information, LDFH effectively improves robustness under various interference conditions, such as noise, varying mesh resolutions, and clutter. Additionally, the proposed method discretizes point cloud features and introduces a weighted fusion strategy to comprehensively capture and encode geometric and spatial information: three geometric features (the normal deviation angle, polar angle, and normal lateral angle) are computed in each subspace, then discretized and fused with weights. The experiments demonstrate that, compared with existing descriptors, LDFH strikes an excellent balance between descriptiveness, robustness, compactness, and efficiency. Furthermore, the experimental results validate LDFH’s effectiveness across various scenes and sensor data, with outstanding performance on both the B3R and Kinect datasets, highlighting its broad applicability to diverse tasks.
Future work can be explored in several directions. First, the construction of the local reference frame can be further optimized by exploring more stable and efficient computational approaches, enhancing the robustness of LDFH in challenging environments with high noise and occlusion. Second, deep learning methods could be integrated to further improve the performance of LDFH.

Author Contributions

Conceptualization, L.J. and X.L.; methodology, L.J. and C.W.; software, C.L. and G.X.; validation, G.X. and D.X.; formal analysis, X.L. and C.L.; writing—original draft preparation, L.J.; writing—review and editing, L.J. and C.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data are contained within the article.

Acknowledgments

This work was supported by the Xi’an Key Laboratory of Active Photoelectric Imaging Detection Technology.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Yan, H.; Zhang, J.X.; Zhang, X. Injected Infrared and Visible Image Fusion via L1 Decomposition Model and Guided Filtering. IEEE Trans. Comput. Imaging 2022, 8, 162–173.
2. Zhang, X.; Liu, R.; Ren, J.; Gui, Q. Adaptive fractional image enhancement algorithm based on rough set and particle swarm optimization. Fractal Fract. 2022, 6, 100.
3. Dong, Z.; Liang, F.X.; Yang, B.S.; Xu, Y.S.; Zang, Y.F.; Li, J.P.; Wang, Y.; Dai, W.X.; Fan, H.C.; Hyyppä, J.; et al. Registration of large-scale terrestrial laser scanner point clouds: A review and benchmark. ISPRS-J. Photogramm. Remote Sens. 2020, 163, 327–342.
4. Yang, J.Q.; Cao, Z.G.; Zhang, Q. A fast and robust local descriptor for 3D point cloud registration. Inf. Sci. 2016, 346, 163–179.
5. Mian, A.S.; Bennamoun, M.; Owens, R. Three-dimensional model-based object recognition and segmentation in cluttered scenes. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 1584–1601.
6. Guo, Y.L.; Bennamoun, M.; Sohel, F.; Lu, M.; Wan, J.W. 3D Object Recognition in Cluttered Scenes with Local Surface Features: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 2270–2287.
7. Bronstein, A.M.; Bronstein, M.M.; Guibas, L.J.; Ovsjanikov, M. Shape Google: Geometric Words and Expressions for Invariant Shape Retrieval. ACM Trans. Graph. 2011, 30, 20.
8. Gao, Y.; Dai, Q.H. View-Based 3D Object Retrieval: Challenges and Approaches. IEEE Multimed. 2014, 21, 52–57.
9. Guo, Y.L.; Bennamoun, M.; Sohel, F.; Lu, M.; Wan, J.W.; Kwok, N.M. A Comprehensive Performance Evaluation of 3D Local Feature Descriptors. Int. J. Comput. Vis. 2016, 116, 66–89.
10. Yang, J.Q.; Xiao, Y.; Cao, Z.G. Toward the Repeatability and Robustness of the Local Reference Frame for 3D Shape Matching: An Evaluation. IEEE Trans. Image Process. 2018, 27, 3766–3781.
11. Ghorbani, F.; Ebadi, H.; Sedaghat, A.; Pfeifer, N. A Novel 3-D Local DAISY-Style Descriptor to Reduce the Effect of Point Displacement Error in Point Cloud Registration. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2022, 15, 2254–2273.
12. Yang, J.Q.; Quan, S.W.; Wang, P.; Zhang, Y.N. Evaluating Local Geometric Feature Representations for 3D Rigid Data Matching. IEEE Trans. Image Process. 2020, 29, 2522–2535.
13. Tombari, F.; Salti, S.; Di Stefano, L. Unique signatures of histograms for local surface description. In Proceedings of the Computer Vision—ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, 5–11 September 2010; Part III; pp. 356–369.
14. Guo, Y.L.; Sohel, F.; Bennamoun, M.; Lu, M.; Wan, J.W. Rotational Projection Statistics for 3D Local Surface Description and Object Recognition. Int. J. Comput. Vis. 2013, 105, 63–86.
15. Yang, J.Q.; Zhang, Q.; Xiao, Y.; Cao, Z.G. TOLDI: An effective and robust approach for 3D local shape description. Pattern Recognit. 2017, 65, 175–187.
16. Du, Z.H.; Zuo, Y.; Qiu, J.F.; Li, X.; Li, Y.; Guo, H.X.; Hong, X.B.; Wu, J. MDCS with fully encoding the information of local shape description for 3D Rigid Data matching. Image Vis. Comput. 2022, 121, 104421.
17. Liu, X.S.; Li, A.H.; Sun, J.F.; Lu, Z.Y. Trigonometric projection statistics histograms for 3D local feature representation and shape description. Pattern Recognit. 2023, 143, 109727.
18. Shi, C.H.; Wang, C.Y.; Liu, X.L.; Sun, S.Y.; Xi, G.; Ding, Y.Y. Point cloud object recognition method via histograms of dual deviation angle feature. Int. J. Remote Sens. 2023, 44, 3031–3058.
19. Zhao, B.; Le, X.Y.; Xi, J.T. A novel SDASS descriptor for fully encoding the information of a 3D local surface. Inf. Sci. 2019, 483, 363–382.
20. Zhao, B.; Xi, J.T. Efficient and accurate 3D modeling based on a novel local feature descriptor. Inf. Sci. 2020, 512, 295–314.
21. Rusu, R.B.; Blodow, N.; Marton, Z.C.; Beetz, M. Aligning point cloud views using persistent feature histograms. In Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, 22–26 September 2008; pp. 3384–3391.
22. Rusu, R.B.; Blodow, N.; Beetz, M. Fast point feature histograms (FPFH) for 3D registration. In Proceedings of the 2009 IEEE International Conference on Robotics and Automation, Kobe, Japan, 12–17 May 2009; pp. 3212–3217.
23. Zhao, H.; Tang, M.J.; Ding, H. HoPPF: A novel local surface descriptor for 3D object recognition. Pattern Recognit. 2020, 103, 107272.
24. Wu, L.; Zhong, K.; Li, Z.W.; Zhou, M.; Hu, H.B.; Wang, C.J.; Shi, Y.S. PPTFH: Robust Local Descriptor Based on Point-Pair Transformation Features for 3D Surface Matching. Sensors 2021, 21, 3229.
25. Tombari, F.; Salti, S.; Di Stefano, L. Performance evaluation of 3D keypoint detectors. Int. J. Comput. Vis. 2013, 102, 198–220.
26. Curless, B.; Levoy, M. A volumetric method for building complex models from range images. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 4–9 August 1996; pp. 303–312.
Figure 1. An illustration of the LDFH descriptor. (a) 3D model. (b) Extracting the local surface (blue) around the keypoint (red). (c) LRF construction at the keypoint. (d) Dividing the local space along the radial distance (for clarity, 4 radial partitions are shown). (e) Calculating three geometric attributes in each subspace. (f) Generating three feature statistical histograms. (g) Generating the weighted feature histogram.
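The pipeline in Figure 1 can be sketched in code. The sketch below is illustrative only: the function name `ldfh_sketch` and the exact angle formulas for `theta`, `psi`, and `phi` are our assumptions, not the authors' definitions; the bin counts follow the final settings from Table 1, and the weight for the third histogram is a placeholder since its tuned value is reported only graphically in Figure 2.

```python
import numpy as np

def ldfh_sketch(points, normals, keypoint, key_normal, lrf,
                radius=1.0, n_r=8, n_theta=9, n_psi=14, n_phi=2,
                weights=(1.5, 1.2, 1.0)):
    """Illustrative sketch of the LDFH pipeline (Figure 1a-g).
    `lrf` is a 3x3 rotation matrix whose rows are the LRF axes.
    Angle definitions are placeholders, not the published formulas."""
    # (b) extract the local surface: neighbours within the support radius
    d = np.linalg.norm(points - keypoint, axis=1)
    mask = (d > 0) & (d <= radius)
    p, n, d = points[mask], normals[mask], d[mask]

    # (c) express neighbours in the local reference frame
    local = (p - keypoint) @ lrf.T

    # (d) radial partition index for each neighbour
    r_bin = np.minimum((d / radius * n_r).astype(int), n_r - 1)

    # (e) three per-point angles (assumed definitions):
    theta = np.arccos(np.clip(n @ key_normal, -1, 1))      # normal deviation angle
    psi = np.arccos(np.clip(local[:, 2] / d, -1, 1))       # polar angle vs. LRF z-axis
    phi = np.arctan2(n @ lrf[1], n @ lrf[0]) % (2 * np.pi)  # normal lateral angle

    # (f) one histogram per attribute, accumulated per radial shell
    def hist(vals, bins, hi):
        h = np.zeros((n_r, bins))
        b = np.minimum((vals / hi * bins).astype(int), bins - 1)
        np.add.at(h, (r_bin, b), 1.0)
        return h / max(len(vals), 1)

    h_theta = hist(theta, n_theta, np.pi)
    h_psi = hist(psi, n_psi, np.pi)
    h_phi = hist(phi, n_phi, 2 * np.pi)

    # (g) weighted fusion into the final descriptor
    w1, w2, w3 = weights
    return np.concatenate([w1 * h_theta, w2 * h_psi, w3 * h_phi], axis=1).ravel()
```

With the default bin counts the output has N_r × (N_θ + N_ψ + N_φ) = 8 × 25 = 200 dimensions, matching the LDFH entry in Table 2.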
Figure 2. The parameter settings for the LDFH descriptor. Solid markers indicate the selected parameter values.
Figure 3. Experimental datasets: two model examples and two scene examples, displayed from left to right.
Figure 4. The RPC results on the B3R dataset.
Figure 5. The RPC results on the Kinect dataset.
Figure 6. The compactness of selected descriptors.
Figure 7. The average computation time of descriptors under different support radii.
Table 1. Parameter settings for the LDFH descriptor.
| Setting | N_r | N_θ | N_ψ | N_φ | λ1 | λ2 | λ3 | R (mr) |
|---|---|---|---|---|---|---|---|---|
| N_r | 2–20 | 15 | 15 | 15 | 1 | 1 | 1 | 20 |
| N_θ | 8 | 2–20 | 15 | 15 | 1 | 1 | 1 | 20 |
| N_ψ | 8 | 9 | 2–20 | 15 | 1 | 1 | 1 | 20 |
| N_φ | 8 | 9 | 14 | 2–20 | 1 | 1 | 1 | 20 |
| λ1 | 8 | 9 | 14 | 2 | 0.2–2.4 | 1 | 1 | 20 |
| λ2 | 8 | 9 | 14 | 2 | 1.5 | 0.2–2.4 | 1 | 20 |
| λ3 | 8 | 9 | 14 | 2 | 1.5 | 1.2 | 0.2–2.4 | 20 |
Table 2. Parameter settings for the seven feature descriptors.
| Descriptor | Support Radius (mr) | Dimensionality | Length |
|---|---|---|---|
| FPFH | 20 | 3 × 11 | 33 |
| SHOT | 20 | 8 × 2 × 2 × 11 | 352 |
| TOLDI | 20 | 3 × 20 × 20 | 1200 |
| SDASS | 20 | 15 × 5 × 5 | 345 |
| LDFH-AZ | 20 | 7 × (12 + 11 + 5) | 196 |
| LDFH-X | 20 | 8 × (9 + 13 + 2) | 192 |
| LDFH | 20 | 8 × (9 + 14 + 2) | 200 |
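The descriptor lengths in Table 2 follow directly from each binning scheme by multiplying out the factors; for LDFH this is N_r × (N_θ + N_ψ + N_φ) = 8 × (9 + 14 + 2) = 200. A quick sanity check (the helper name `descriptor_length` is ours):

```python
def descriptor_length(factors):
    """Multiply out a descriptor binning scheme, e.g. (3, 11) -> 33."""
    n = 1
    for f in factors:
        n *= f
    return n

# lengths from Table 2
assert descriptor_length((3, 11)) == 33             # FPFH
assert descriptor_length((8, 2, 2, 11)) == 352      # SHOT
assert descriptor_length((3, 20, 20)) == 1200       # TOLDI
assert descriptor_length((8, 9 + 14 + 2)) == 200    # LDFH
```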
Table 3. The AUCpr results of the selected descriptors on the B3R dataset (with Gaussian noise and mesh decimation) and the Kinect dataset. GN denotes Gaussian noise and MD denotes mesh decimation. Bold text indicates the best result in each column.
| Descriptor | 0.3 mr GN | 0.5 mr GN | 1/4 MD | 1/8 MD | 1/4 MD + 0.3 mr GN | 1/8 MD + 0.5 mr GN | Kinect |
|---|---|---|---|---|---|---|---|
| FPFH | 0.2250 | 0.1154 | 0.1424 | 0.0749 | 0.0898 | 0.0488 | 0.0889 |
| SHOT | 0.6945 | 0.6560 | 0.5716 | 0.2395 | 0.4510 | 0.1818 | **0.3030** |
| TOLDI | 0.9002 | 0.8401 | 0.5930 | 0.3206 | 0.4915 | 0.1990 | 0.1936 |
| SDASS | 0.9689 | 0.9349 | 0.8790 | 0.5959 | 0.8415 | 0.4630 | 0.1612 |
| LDFH-AZ | 0.9687 | 0.9386 | 0.9101 | 0.7022 | 0.8641 | 0.5321 | 0.2550 |
| LDFH-X | 0.9676 | 0.9392 | 0.9136 | 0.7097 | 0.8685 | 0.5483 | 0.2661 |
| LDFH | **0.9731** | **0.9528** | **0.9308** | **0.7198** | **0.8872** | **0.5532** | 0.2929 |

Share and Cite

MDPI and ACS Style

Jia, L.; Li, C.; Xi, G.; Liu, X.; Xie, D.; Wang, C. A Local Discrete Feature Histogram for Point Cloud Feature Representation. Appl. Sci. 2025, 15, 2367. https://doi.org/10.3390/app15052367


