Research on Complexity Quantification Method for Multibeam Point Clouds Based on Feature Joint Entropy

Liang, Dekun; Cui, Yang; Jin, Shaohua; Wei, Yuan; Tan, Jichuan

doi:10.3390/jmse14090824


by Dekun Liang ¹, Yang Cui ¹, Shaohua Jin ¹,*, Yuan Wei ¹ and Jichuan Tan ²

¹ Department of Oceanography and Hydrography, Dalian Naval Academy, Dalian 116018, China

² Chart Information Center, Tianjin 300450, China

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2026, 14(9), 824; https://doi.org/10.3390/jmse14090824

Submission received: 25 March 2026 / Revised: 24 April 2026 / Accepted: 25 April 2026 / Published: 29 April 2026

(This article belongs to the Section Geological Oceanography)


Abstract

This study addresses the challenge of simplifying massive multibeam seafloor topographic point cloud datasets featuring significant spatial heterogeneity. We propose a feature joint entropy-based quantification method for seafloor terrain complexity, which provides a foundation for the adaptive and differentiated simplification of point clouds. In this method, the elevation and slope features of point clouds are treated as two-dimensional random variables that describe terrain morphology; we estimate the Shannon entropy of their joint distribution by constructing a two-dimensional adaptive histogram and use the entropy value to quantify the topographic information content and complexity of local regions. To overcome the parameter sensitivity and subjective dependence inherent in traditional fixed-bin methods, we incorporate the Minimum Description Length (MDL) principle to guide binning optimization, taking the sum of stochastic complexity and model coding length as the evaluation criterion. A dimension-alternating optimization strategy combining dynamic programming and an iterative greedy algorithm is adopted to solve for the optimal binning structure, thus achieving data-driven adaptive binning. To ensure the fairness and reliability of quantification, we adopt a fixed-point number partitioning strategy to decompose the point cloud into several independent analysis nodes and determine the minimum sample size supporting the stable estimation of entropy values through convergence analysis. Experimental results demonstrate that the proposed method, as a consistent and data-driven complexity metric, can reliably reflect the relative complexity of different seafloor terrain regions, thereby providing an objective quantitative basis for subsequent differentiated point cloud simplification.

Keywords: multibeam bathymetry; complexity; joint entropy; Minimum Description Length (MDL)

1. Introduction

Seafloor topographic surveying is a fundamental task in marine surveying and mapping, and its accurate detection is crucial for marine engineering construction, scientific research, and ship navigation safety [1]. As a mainstream tool for current seafloor topographic surveying, multibeam bathymetry systems leverage the advantages of full coverage, high resolution, and high efficiency to directly acquire massive high-density 3D point-cloud data, greatly enhancing the ability to finely characterize seafloor landforms. However, this high-efficiency measurement method is accompanied by a rapid increase in data volume, imposing a heavy burden on subsequent data storage, processing, and analysis [2]. Consequently, point-cloud simplification has become a key preprocessing step. Yet, traditional simplification algorithms often rely on preset, globally uniform sampling rules or geometric thresholds, which constitute an undifferentiated processing strategy. This one-size-fits-all strategy cannot adapt to the spatial heterogeneity of seafloor terrain: in complex areas, excessive simplification may lose key features that reflect geomorphic details, degrading model reconstruction accuracy; in flat areas, overly conservative sampling retains a large number of redundant points, unnecessarily increasing subsequent processing burdens. Therefore, there is an urgent need to effectively quantify the intrinsic complexity of terrain to guide differentiated and adaptive point-cloud simplification.

Existing approaches for quantifying terrain complexity fall into three main categories. The first is the terrain factor-based approach, which characterizes the topographic morphology of point clouds by extracting terrain feature factors. Commonly used factors include elevation, height difference, surface area, slope, aspect, maximum curvature, curvature, relief, and topographic roughness [3,4,5,6,7,8]. Such approaches have clear physical meanings and a relatively mature methodology, but a single factor captures only one aspect of the terrain and therefore cannot comprehensively characterize its multi-dimensional complexity. To address this drawback, researchers have proposed multi-factor fusion methods [8,9,10,11,12,13]. For instance, Lu et al. developed a Composite Terrain Complexity Index (CTCI) through the equal-weight fusion of multiple terrain factors to overcome the limitations of a single perspective; Zhang et al. used Principal Component Analysis (PCA) to assign weights to terrain factors based on their variance contribution rates; and Tang et al. incorporated the CRITIC weighting method into the complexity calculation to determine the contribution of each factor objectively. Although multi-factor approaches describe complexity more completely, they are computationally expensive and cannot efficiently process extremely large point cloud datasets.

The second is the information entropy-based approach. From a statistical perspective, this approach converts the local features of point clouds into probability distributions based on the concept of Shannon entropy and quantifies point cloud complexity using the calculated entropy values, thereby providing a novel quantitative perspective for point cloud complexity assessment [14,15,16,17,18,19]. However, the density entropy-based method commonly adopted in existing studies has notable limitations for seafloor point clouds acquired by multibeam systems: the point density of multibeam data varies with water depth and is therefore markedly non-uniform, and density-based entropy often misinterprets this non-uniformity as a variation in terrain complexity, thus overestimating the disorder of the point cloud.

The third is the semantic description approach [20]. It characterizes terrain complexity using descriptive language and is therefore not suitable for the quantitative expression of topographic features.

In view of these limitations and the inherent characteristics of multibeam point clouds, this study proposes a feature joint entropy-based quantification method for point cloud complexity. In this method, the elevation and slope features of point clouds are treated as two-dimensional random variables that characterize seafloor topographic morphology, and the Shannon entropy of their joint distribution is calculated to quantify the topographic complexity of local regions. On this basis, the Minimum Description Length (MDL) principle is incorporated to determine the optimal binning structure automatically in a data-driven manner, which overcomes the parameter sensitivity and subjective dependence associated with traditional fixed-binning methods. The method thus provides an objective and robust quantitative basis for the differentiated and adaptive simplification of multibeam point clouds.

2. Methods

2.1. Shannon Entropy

The concept of Shannon entropy [21] was first proposed by the American mathematician Claude E. Shannon in 1948, and it serves as the core measure in information theory for quantifying the uncertainty of random variables. For a discrete random variable $X$, let its possible values be $\{x_{1}, x_{2}, \dots, x_{n}\}$ with corresponding probabilities $\{p_{1}, p_{2}, \dots, p_{n}\}$. The Shannon entropy is then defined as follows:

$$H(X) = -\sum_{i=1}^{n} p_{i} \log p_{i} \tag{1}$$

The entropy value $H(X)$ quantitatively characterizes the average information content or degree of uncertainty contained in a discrete random variable $X$: when its values follow a uniform distribution, the entropy value reaches its maximum, and the system exhibits the highest degree of uncertainty; when the state is deterministic, the entropy value is zero, indicating no uncertainty in the system. Thus, Shannon entropy provides a universal mathematical framework for measuring the intrinsic “degree of disorder” of a system. When introducing this theoretical framework into the field of terrain analysis, the point cloud attributes that characterize terrain morphology can be regarded as a set of random variables, and their probability distributions are used to describe the statistical laws of topographic features in the region. When the terrain undulates sharply and attribute changes are disordered, their distributions tend to be scattered, resulting in a higher entropy value, which indicates a complex terrain morphology with rich information; conversely, when the terrain is flat and regular and attributes are concentrated, the entropy value is lower, indicating a simple terrain structure with little information.
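As a quick numerical check of Equation (1), the following minimal sketch computes the Shannon entropy of a discrete distribution. The function name `shannon_entropy` and the default of base-2 logarithms (bits) are illustrative assumptions; the text does not fix the logarithm base.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Shannon entropy H(X) = -sum(p_i * log p_i); terms with p_i = 0 contribute 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # four equally likely states
certain = [1.0, 0.0, 0.0, 0.0]       # deterministic state

print("uniform:", shannon_entropy(uniform))
print("certain:", shannon_entropy(certain))
```

With four states, the uniform case attains the maximum of 2 bits, while the deterministic case has zero entropy, matching the two limiting situations described above.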

However, one-dimensional Shannon entropy can only quantify the degree of disorder in the distribution of a single terrain attribute and fails to characterize the correlation between attributes. Consequently, it has limitations in distinguishing terrains with similar one-dimensional distributions but distinct spatial combination patterns. For instance, a regularly undulating hillside (with concentrated slope values and significant elevation variation) and a complexly undulating area (with scattered slope values and equally significant elevation variation) may have similar one-dimensional elevation entropy. Nevertheless, the former features a simple terrain structure while the latter exhibits rich morphological characteristics, leading to significant differences in their actual complexity. Relying solely on one-dimensional entropy values may result in the bias of “misclassifying simple terrain as complex.” To address this limitation, this study extends the measurement of terrain complexity from one-dimensional distribution entropy to two-dimensional joint distribution entropy. By simultaneously considering the joint probability distribution of two feature attributes—elevation and slope—this approach achieves more comprehensive and reliable quantification of the richness and complexity of terrain morphology.
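The hillside example can be made concrete with a small sketch. It assumes the samples have already been discretized into hypothetical elevation and slope classes (fixed classes are used purely for illustration, not the adaptive binning introduced in Section 2.2), and the names `entropy_bits`, `hillside`, and `rugged` are invented for this example.

```python
import math
from collections import Counter

def entropy_bits(labels):
    """Plug-in Shannon entropy (bits) of a list of hashable bin labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

# Hypothetical discretized samples: one (elevation_class, slope_class) pair per point.
hillside = [(z, 0) for z in range(4) for _ in range(4)]   # varied elevation, a single slope class
rugged = [(z, s) for z in range(4) for s in range(4)]     # varied elevation, varied slope

for name, pts in (("hillside", hillside), ("rugged", rugged)):
    h_elev = entropy_bits([z for z, _ in pts])   # one-dimensional elevation entropy
    h_joint = entropy_bits(pts)                  # two-dimensional joint entropy
    print(name, "H(Z) =", h_elev, "H(Z, S) =", h_joint)
```

Both synthetic terrains have the same elevation entropy, but only the joint entropy separates them: the single-slope hillside stays at the elevation entropy, whereas the rugged area is higher because its slope classes are spread out as well.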

2.2. MDL-Based Construction of Two-Dimensional Adaptive Histogram

The histogram method is a commonly used non-parametric density estimation method for calculating information entropy. Its basic idea is to discretize the continuous sample space into non-overlapping intervals and approximate the probability density function by counting the sample frequency within each interval. This method is simple to implement and efficient in computation, making it a widely adopted approach for entropy estimation in the field of information theory. However, traditional fixed-binning methods have significant limitations: firstly, the determination of the optimal number of bins lacks adaptability and often relies on empirical rules, while the number of bins has a significant impact on the estimation results—excessively fine binning tends to introduce noise and overfitting, whereas overly coarse binning smooths out distribution details, inducing estimation biases and underfitting. Secondly, uniform binning struggles to adapt to non-uniform data distributions: it may lead to data overload in high-density regions and empty bins in sparse regions, resulting in an imbalance between estimation efficiency and accuracy. In particular, these issues become more pronounced in the estimation of two-dimensional or higher-dimensional joint distributions, directly restricting the reliability of entropy measurement and its comparability across different scenarios.

To address the shortcomings of fixed-binning methods, this study introduces the Minimum Description Length (MDL) principle [22] to guide the construction of the two-dimensional histogram. The MDL principle is a general framework for model selection in statistical modeling, whose core idea is that the optimal description of data should achieve an optimal balance between model simplicity and fitting performance to the data. Specifically, for a given dataset $D$ and a set of candidate models $M$, the MDL principle selects the model that minimizes the total description length $L(D, M)$:

$$L(D, M) = L(M) + L(D \mid M) \tag{2}$$

where $L(M)$ denotes the amount of information required to encode the model itself, and $L(D \mid M)$ denotes the amount of information required to encode the data given the model. This framework not only pursues a good fit to the current data but also penalizes models that are prone to overfitting due to complex structures, thereby theoretically guaranteeing the robustness and generalization ability of the selection. When this framework is applied to histogram binning, each binning scheme can be regarded as a candidate model. The optimal binning scheme is the one that minimizes the total description length $L(D, M)$, which essentially achieves an automatic trade-off between data fitting and model complexity. Therefore, this method can adaptively determine the optimal binning structure and number of bins matching the distribution pattern in a completely data-driven manner, without relying on manually preset parameters.

In this study, the core of the MDL-based two-dimensional adaptive histogram construction method lies in discretizing the continuous feature space into several candidate cut points and determining the optimal binning scheme through iterative search. To address the challenge of cut point combination explosion in the two-dimensional feature space, this study adopts a hybrid optimization strategy combining dynamic programming and iterative greedy algorithm: in each iteration, the binning scheme of one feature dimension is fixed, and dynamic programming is used to solve for the optimal cut point combination of the other dimension; subsequently, the optimization direction is determined according to the improvement of the two-dimensional MDL score, and the iteration is performed until the stopping criterion is met. This strategy transforms the complex two-dimensional binning problem into efficiently solvable one-dimensional subproblems, retaining both the theoretical rigor of the MDL principle and the practical operability. Figure 1 illustrates the overall workflow of the method, which mainly includes three key stages: precision setting and candidate cut point generation, dynamic programming solution, and iterative greedy optimization.

(1): Precision Setting and Candidate Cut Point Generation

In constructing the adaptive histogram, the precision parameter $\varepsilon$ defines the minimum unit for discretizing continuous feature values, and it directly determines the density of candidate cut points and the granularity of the final binning. An appropriate value of $\varepsilon$ has dual significance: on the one hand, it constrains the minimum size of bins, effectively preventing overfitting caused by excessively fine binning; on the other hand, it limits the number of candidate cut points, indirectly controlling the computational complexity of the subsequent dynamic programming solution. Typically, the precision $\varepsilon$ can be determined in two ways: one relies on the inherent measurement accuracy of the data acquisition equipment, reflecting the reliable resolution limit at the physical level of the data; the other is based on the statistical characteristics of the data itself, achieving fully data-driven parameter configuration. The former has a clear physical meaning, while the latter is more compatible with the true distribution characteristics of the data.

The multibeam data studied in this paper are characterized by diverse sources and various types of acquisition equipment, resulting in inconsistencies in data measurement accuracy. To ensure the effectiveness and adaptability of the method, this study adopts a hybrid strategy of “Data-driven + Cut-point Selection”, which aligns with the data distribution characteristics while balancing the computational efficiency of the algorithm. Specifically, the precision $\varepsilon$ is determined by calculating the minimum non-zero difference of feature values in each dimension, which serves as the minimum unit for discretization to adapt to the true distribution characteristics of the data; for each data point $x_{j}$ in each dimension of the dataset, two candidate cut points, $x_{j} - \frac{\varepsilon}{2}$ and $x_{j} + \frac{\varepsilon}{2}$, are generated, thereby constructing the initial candidate cut point set for each dimension:

$$\tilde{C} = \left\{ x_{j} - \frac{\varepsilon}{2} \right\} \cup \left\{ x_{j} + \frac{\varepsilon}{2} \right\}, \quad x_{j} \in x^{n} \tag{3}$$

Subsequently, by counting the actual number of data points in each adjacent cut-point interval, the redundant cut points corresponding to empty intervals with no data are removed to form an initial pruned cut-point set. If the size of the resulting cut-point set exceeds what is computationally practical, the set can be subsampled at regular intervals as needed, eventually yielding a compact candidate cut-point set that meets the computational requirements.
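The candidate generation steps above can be sketched for a single feature dimension as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `candidate_cuts`, the rule of dropping the right end of an empty interval, and the `max_candidates` subsampling parameter are all hypothetical choices.

```python
import math
from bisect import bisect_right

def candidate_cuts(values, max_candidates=None):
    """Return (eps, cuts) for one feature dimension."""
    xs = sorted(values)
    distinct = sorted(set(xs))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct values")
    # Precision: minimum non-zero difference between feature values.
    eps = min(b - a for a, b in zip(distinct, distinct[1:]))
    # Two candidates per data value: x - eps/2 and x + eps/2.
    raw = sorted({x - eps / 2 for x in distinct} | {x + eps / 2 for x in distinct})

    def count(lo, hi):
        """Number of data points in the interval (lo, hi]."""
        return bisect_right(xs, hi) - bisect_right(xs, lo)

    cuts = [raw[0]]
    for c in raw[1:]:
        if count(cuts[-1], c) > 0:      # skip cut points that would close an empty interval
            cuts.append(c)
    # Optional thinning (assumed scheme): keep every step-th cut plus the outer boundary.
    if max_candidates and len(cuts) > max_candidates:
        step = math.ceil(len(cuts) / max_candidates)
        thinned = cuts[::step]
        if thinned[-1] != cuts[-1]:
            thinned.append(cuts[-1])
        cuts = thinned
    return eps, cuts

eps, cuts = candidate_cuts([1.0, 2.0, 4.0, 2.0, 1.0])
print(eps, cuts)
```

In this toy input the gap between 2.0 and 4.0 produces one empty interval, so one of its bounding cut points is discarded while every remaining interval still contains data.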

(2): Dynamic Programming for Optimal Binning

In the two-dimensional feature space, exhaustive combination of all candidate cut-points to minimize stochastic complexity will encounter the problem of combination explosion, rendering it computationally infeasible. To address this issue, this study draws on the dynamic programming algorithm proposed by Kontkanen et al. [23], which is applicable to one-dimensional data, and extends it to two-dimensional scenarios. This approach efficiently optimizes the cut-point selection process and obtains the globally optimal binning scheme within an acceptable computation time, thereby balancing theoretical advantages and practical operability.

The one-dimensional dynamic programming algorithm proposed by Kontkanen et al. aims to minimize the MDL score $B(x^{n} \mid E, K, C)$, which is composed of stochastic complexity and model coding length. Its recurrence formula is as follows:

$$
\begin{aligned}
\hat{B}_{1,e} &= -n_{e} \cdot \left( \log(\varepsilon \cdot n_{e}) - \log\left( \left( \tilde{c}_{e} - \left( x_{\min} - \frac{\varepsilon}{2} \right) \right) \cdot n \right) \right) \\
\hat{B}_{K,e} &= \min_{e'} \left\{ \hat{B}_{K-1,e'} - (n_{e} - n_{e'}) \cdot \left( \log\left( \varepsilon \cdot (n_{e} - n_{e'}) \right) - \log\left( (\tilde{c}_{e} - \tilde{c}_{e'}) \cdot n \right) \right) + \log \frac{R_{h_{K}}^{n_{e}}}{R_{h_{K-1}}^{n_{e'}}} + \log \frac{E - K + 2}{K - 1} \right\}
\end{aligned}
\tag{4}
$$

where $n$ denotes the total number of samples, $n_{e}$ is the number of samples falling in the interval $[x_{\min}, \tilde{c}_{e}]$, $\varepsilon$ represents the precision parameter, $E$ stands for the total number of candidate cut-points, $K$ is the number of bins, $C$ denotes the cut-point set, and $R_{h_{K}}^{n}$ is the parametric complexity under $K$ bins. The state $\hat{B}_{K,e}$ indicates the minimum MDL score achievable when partitioning the data within the interval $[x_{\min}, \tilde{c}_{e}]$ into $K$ bins using the first $e$ candidate cut-points (where $\tilde{c}_{e}$ is the value of the $e$-th candidate cut-point). The core of this algorithm lies in leveraging the optimal substructure property of the binning problem: it decomposes the global problem of partitioning a sequence with $E$ candidate cut-points into $K$ optimal bins into two subproblems, namely the “$K - 1$ optimal bins for the interval corresponding to the first $e'$ candidate cut-points” and the “local optimization of the $K$-th bin within the remaining interval”, which are solved by means of the recurrence formula. Through the dynamic programming recurrence, the optimal number of bins and the corresponding cut-point set can be determined simultaneously, thereby avoiding exhaustive search.
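To make the recurrence more tangible, here is a compact sketch of a one-dimensional dynamic program of this kind. It is written under several assumptions that are not taken from the paper: the helper names (`log_param_complexity`, `mdl_histogram_1d`) are hypothetical, the last candidate is treated as the outer right boundary, each bin with $m$ points and width $w$ contributes $-m(\log(\varepsilon m) - \log(w n))$ to the score, and the parametric complexity is obtained from the known recurrence $R^{n}_{K+2} = R^{n}_{K+1} + \frac{n}{K} R^{n}_{K}$. It is intended for small inputs only.

```python
import math
from bisect import bisect_right

def log_param_complexity(n_max, k_max):
    """Table logR[k][m] = log R^m_k of the multinomial parametric complexity."""
    R = [[1.0] * (n_max + 1) for _ in range(k_max + 1)]   # R[1][m] = 1, R[k][0] = 1
    for m in range(1, n_max + 1):
        if k_max >= 2:
            total = 0.0
            for h in range(m + 1):
                t = math.lgamma(m + 1) - math.lgamma(h + 1) - math.lgamma(m - h + 1)
                if h > 0:
                    t += h * math.log(h / m)
                if h < m:
                    t += (m - h) * math.log((m - h) / m)
                total += math.exp(t)
            R[2][m] = total
        for k in range(3, k_max + 1):
            R[k][m] = R[k - 1][m] + m / (k - 2) * R[k - 2][m]
    return [[math.log(v) for v in row] for row in R]

def mdl_histogram_1d(xs, cuts, eps, k_max):
    """Return (score, inner_cuts). `cuts` is sorted; its last entry is the right boundary."""
    xs = sorted(xs)
    n, E = len(xs), len(cuts)
    inner = E - 1                                  # number of candidate inner cut points
    k_max = min(k_max, E)
    left = xs[0] - eps / 2
    cnt = [bisect_right(xs, c) for c in cuts]      # n_e: points in [x_min, c_e]
    logR = log_param_complexity(n, k_max)

    def nll(m, width):
        """Negative log-likelihood of one bin holding m points."""
        return 0.0 if m == 0 else -m * (math.log(eps * m) - math.log(width * n))

    INF = float("inf")
    B = [[INF] * E for _ in range(k_max + 1)]
    back = [[-1] * E for _ in range(k_max + 1)]
    for e in range(E):
        B[1][e] = nll(cnt[e], cuts[e] - left)
    for K in range(2, k_max + 1):
        model = math.log((inner - K + 2) / (K - 1))
        for e in range(K - 1, E):
            for ep in range(K - 2, e):
                cand = (B[K - 1][ep] + nll(cnt[e] - cnt[ep], cuts[e] - cuts[ep])
                        + logR[K][cnt[e]] - logR[K - 1][cnt[ep]] + model)
                if cand < B[K][e]:
                    B[K][e], back[K][e] = cand, ep
    best_K = min(range(1, k_max + 1), key=lambda K: B[K][E - 1])
    chosen, e = [], E - 1
    for K in range(best_K, 1, -1):                 # backtrack the selected cut points
        e = back[K][e]
        chosen.append(cuts[e])
    return B[best_K][E - 1], sorted(chosen)

# Two well-separated clusters: the empty gap should be worth at least one cut.
data = [i / 50 for i in range(50)] + [10 + i / 50 for i in range(50)]
score, inner_cuts = mdl_histogram_1d(data, [0.5, 1.5, 5.0, 9.5, 10.5, 11.5], eps=0.02, k_max=6)
print(score, inner_cuts)
```

The triple loop costs on the order of $E^{2} K$ evaluations, which is why the candidate set is pruned and, if necessary, subsampled before this step.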

Based on this one-dimensional dynamic programming framework, this study extends it to the two-dimensional feature space. However, if the one-dimensional recurrence formula is directly extended to two-dimensional joint optimization, its state space and computational complexity will still increase drastically with the rise in dimensions. To address this problem, this study draws on the research idea proposed by Alexander Marx et al. [24] and adopts a dimensional alternating iterative optimization strategy, which decomposes the complex two-dimensional global optimization problem into one-dimensional dynamic programming subproblems with conditional constraints. The core idea of this strategy is as follows: in each iteration, the current binning scheme of one feature dimension is fixed, and under this condition, the binning optimization for the other dimension is transformed into a one-dimensional dynamic programming problem that aims to minimize the two-dimensional joint MDL score. This problem can still be solved using the aforementioned one-dimensional recurrence formula, but the calculation of the MDL score increment must incorporate the constraint effect of the fixed-dimensional binning structure. In addition, all key computational metrics including likelihood, the total number of bins, and others must be derived from the corresponding statistical values under the two-dimensional joint distribution, so as to ensure that the optimization objective is consistently oriented toward the minimization of the description length of the overall two-dimensional joint distribution.

(3): Greedy Iterative Optimization

On the basis of dynamic programming, this study adopts a greedy iterative strategy to approach the global optimal solution for two-dimensional binning. This strategy compares the MDL score improvement margins of the two optimization directions and selects the one with a larger improvement for update, thereby avoiding invalid computations and accelerating the convergence of the algorithm.

The two-dimensional joint MDL score is composed of the stochastic complexity and the model coding length, and is defined as follows:

$$
B = -\sum_{j=1}^{K} c_{j} \log \frac{c_{j}}{n \cdot v(B_{j})} + \log \left( \sum_{c_{1} + c_{2} + \dots + c_{K} = n} \frac{n!}{c_{1}! \cdots c_{K}!} \prod_{i=1}^{K} \left( \frac{c_{i}}{n} \right)^{c_{i}} \right) + \log \binom{E_{Z}}{K_{Z} - 1} + \log \binom{E_{S}}{K_{S} - 1}
\tag{5}
$$

Here $c_{j}$ is the number of points in the $j$-th two-dimensional bin $B_{j}$, $v(B_{j})$ is the area of that bin, $K$ is the total number of two-dimensional bins, and $E_{Z}$, $K_{Z}$ and $E_{S}$, $K_{S}$ are the numbers of candidate cut-points and bins in the elevation and slope dimensions, respectively. The first term is the negative log-likelihood, the second term is the parametric complexity, and the last two terms are the model coding lengths for the elevation and slope dimensions, respectively. A lower score indicates that the binning scheme describes the data more efficiently.
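The three parts of the joint score can be evaluated for a fixed grid with a short sketch. The function names, the argument layout (bin edges that include both outer boundaries), and the use of the recurrence $R^{n}_{K+2} = R^{n}_{K+1} + \frac{n}{K} R^{n}_{K}$ for the parametric complexity are assumptions made for illustration; natural logarithms are used throughout.

```python
import math
from bisect import bisect_right

def log_parametric_complexity(n, k):
    """log R^n_k for a k-cell multinomial, built up from R_1 = 1 and R_2."""
    if n == 0 or k == 1:
        return 0.0
    r_prev, r_curr = 1.0, 0.0                     # R_1 and (accumulating) R_2
    for h in range(n + 1):
        t = math.lgamma(n + 1) - math.lgamma(h + 1) - math.lgamma(n - h + 1)
        if h > 0:
            t += h * math.log(h / n)
        if h < n:
            t += (n - h) * math.log((n - h) / n)
        r_curr += math.exp(t)
    for j in range(1, k - 1):                     # R_{j+2} = R_{j+1} + (n / j) * R_j
        r_prev, r_curr = r_curr, r_curr + n / j * r_prev
    return math.log(r_curr)

def mdl_score_2d(z, s, z_edges, s_edges, e_z, e_s):
    """Joint score of a fixed grid; e_z and e_s are the candidate cut-point counts."""
    n = len(z)
    k_z, k_s = len(z_edges) - 1, len(s_edges) - 1
    counts = {}
    for zi, si in zip(z, s):
        i = min(max(bisect_right(z_edges, zi) - 1, 0), k_z - 1)
        j = min(max(bisect_right(s_edges, si) - 1, 0), k_s - 1)
        counts[(i, j)] = counts.get((i, j), 0) + 1
    nll = 0.0
    for (i, j), c in counts.items():              # empty cells contribute nothing
        area = (z_edges[i + 1] - z_edges[i]) * (s_edges[j + 1] - s_edges[j])
        nll -= c * math.log(c / (n * area))
    return (nll + log_parametric_complexity(n, k_z * k_s)
            + math.log(math.comb(e_z, k_z - 1)) + math.log(math.comb(e_s, k_s - 1)))

z = [0.1, 0.2, 0.8, 0.9]
s = [0.5, 0.5, 0.5, 0.5]
print(mdl_score_2d(z, s, [0.0, 1.0], [0.0, 1.0], e_z=3, e_s=3))
print(mdl_score_2d(z, s, [0.0, 0.5, 1.0], [0.0, 1.0], e_z=3, e_s=3))
```

For this evenly spread toy sample, splitting the elevation axis does not improve the likelihood, so the extra parametric and coding cost makes the two-bin grid score worse than the single bin, which is the trade-off the score is designed to express.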

The iterative process is as follows. First, both dimensions are initialized to a single bin, and the initial two-dimensional joint MDL score $L_{0}$ is calculated. Subsequently, optimization is carried out in two directions: (i) fix the current elevation binning and optimize the slope cut-points using the two-dimensional dynamic programming method to obtain scheme $S_{1}$ and its corresponding score $L_{1}$; (ii) fix the current slope binning and optimize the elevation cut-points to obtain scheme $S_{2}$ and its corresponding score $L_{2}$. The score changes $\Delta L_{1} = L_{1} - L_{0}$ and $\Delta L_{2} = L_{2} - L_{0}$ are then compared, and the scheme with the larger improvement, that is, the larger reduction in the MDL score, is selected to update the current binning. Iteration continues until the score change between two consecutive iterations is less than the threshold or the maximum number of iterations is reached, and the converged binning scheme output at this point is taken as the optimal binning scheme.
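The control flow of this loop can be sketched independently of the scoring details. In the sketch below, `score`, `optimize_z`, and `optimize_s` are hypothetical callables standing in for the joint MDL score and the two conditional one-dimensional optimizations; the toy objective at the end is invented solely to exercise the loop and has no connection to real bathymetric data.

```python
def greedy_alternating(score, optimize_z, optimize_s, z_cuts, s_cuts,
                       max_iter=10, tol=1e-4):
    """Alternate between the two dimensions, always taking the better single move."""
    current = score(z_cuts, s_cuts)                     # L_0
    for _ in range(max_iter):
        s_new = optimize_s(z_cuts, s_cuts)              # elevation fixed, slope optimized
        z_new = optimize_z(z_cuts, s_cuts)              # slope fixed, elevation optimized
        l1, l2 = score(z_cuts, s_new), score(z_new, s_cuts)
        best = min(l1, l2)
        if current - best <= tol * abs(current):        # relative improvement too small
            break
        if l1 <= l2:
            s_cuts = s_new
        else:
            z_cuts = z_new
        current = best
    return z_cuts, s_cuts, current

# Toy stand-ins: the "score" is lowest with 2 elevation cuts and 3 slope cuts.
toy_score = lambda z, s: (len(z) - 2) ** 2 + (len(s) - 3) ** 2
add_z = lambda z, s: z + [len(z)] if len(z) < 2 else z
add_s = lambda z, s: s + [len(s)] if len(s) < 3 else s

print(greedy_alternating(toy_score, add_z, add_s, [], []))
```

Only one dimension is updated per iteration, so each accepted step lowers the score, and the loop stops as soon as neither direction yields a meaningful reduction.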

Through the alternating iterative optimization of “fixing one dimension and optimizing the other”, this strategy can gradually approach the optimal binning scheme in the two-dimensional feature space, and confine the overall computational complexity within a manageable range while maintaining the theoretical consistency of the MDL principle.

2.3. Method Implementation and Point Cloud Partitioning Strategy

The multibeam point cloud complexity quantification method proposed in this paper adopts the core logic of “data preprocessing—feature calculation—data partitioning—adaptive binning—entropy calculation—result output”, thus achieving the objective quantification of topographic complexity.

In this workflow, to balance computational efficiency for large-scale point clouds against the fairness of complexity quantification, this study adopts a fixed-point number partitioning strategy, decomposing the point cloud into several independent nodes for local complexity calculation. The strategy rests on two considerations. First, the entropy estimation accuracy of the MDL-based adaptive histogram depends directly on the sample size: if the number of points differs significantly among nodes, the estimation accuracy will differ even when the underlying feature distributions are identical, and entropy values will no longer be comparable across nodes. Second, the stable construction of the adaptive histogram and the accurate estimation of entropy values require a sufficient sample size. If the point cloud were instead partitioned into cells of fixed spatial size, the density heterogeneity of multibeam point clouds would leave some nodes with fewer points than the minimum needed for stable estimation, so the reliability of the entropy estimate could not be ensured. In summary, the fixed-point number partitioning strategy ensures that all nodes optimize their binning under unified statistical conditions, thereby guaranteeing the fairness and reliability of entropy value calculation.
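The text does not specify how nodes with a fixed number of points are formed, so the following is only one possible sketch: a recursive split along the longer horizontal extent, with the split index chosen as a multiple of the node size so that every node except possibly the last holds exactly that many points. The function name `partition_fixed_count` and the splitting rule are assumptions.

```python
import math

def partition_fixed_count(points, n_per_node):
    """Split (x, y, z) points into nodes of n_per_node points (the last may hold fewer)."""
    if len(points) <= n_per_node:
        return [points]
    blocks = math.ceil(len(points) / n_per_node)
    spans = [max(p[a] for p in points) - min(p[a] for p in points) for a in (0, 1)]
    axis = 0 if spans[0] >= spans[1] else 1         # cut across the longer horizontal extent
    ordered = sorted(points, key=lambda p: p[axis])
    k = (blocks // 2) * n_per_node                  # left part contains whole nodes only
    return (partition_fixed_count(ordered[:k], n_per_node)
            + partition_fixed_count(ordered[k:], n_per_node))

cloud = [(float(i), float(i % 3), -1500.0 - i) for i in range(10)]   # synthetic points
nodes = partition_fixed_count(cloud, 3)
print([len(node) for node in nodes])
```

Because the split is driven by point counts rather than by cell size, nodes in sparse regions simply cover a larger area, which is consistent with the argument that the entropy depends on the feature distribution and not on spatial coverage.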

The fixed-point number partitioning strategy is also well aligned with the complexity quantification objective of this study. On the premise of a fixed sample size, the more abundant the topographic relief variations within a node are, the more dispersed the combination of elevation and slope in the feature space is. After binning via the MDL-based adaptive histogram, the more two-dimensional bins the data cover, and the higher the calculated entropy value is. From the perspective of information theory, entropy represents the average amount of information carried by a randomly sampled point from the node. A higher calculated entropy value indicates a higher information content per point within the node, meaning that a higher proportion of points should be retained or a more sophisticated feature extraction method adopted to characterize the topographic features within the node during point cloud simplification. Although the density inhomogeneity of multi-beam point cloud data leads to variations in the spatial span of each node, the Shannon entropy in this study is calculated based on the distribution pattern of data in the feature space and is independent of spatial coverage. Such variations in the spatial coverage of nodes neither alter the richness of topographic features within the node nor affect the information content it contains, and thus have no impact on the discrimination of topographic complexity in the application scenario of this study.

3. Materials and Experiments

3.1. Experimental Data and Setup

The experimental data of this study were acquired from a multibeam bathymetry survey cruise in a certain sea area of the Western Pacific Ocean, and the data acquisition and quality control were strictly implemented in accordance with the IHO Standard [25] for Hydrographic Surveys (S-44). To fully verify the quantification capability of the proposed method for terrains with varying complexity levels, a representative region with prominent topographic relief and distinct complexity gradient characteristics was selected as the experimental sample in this sea area. The water depth of this region ranges from 1529 m to 5370 m, and the terrain presents a typical seamount geomorphology as a whole, covering diverse topographic forms including seamount summits, steep slopes, gentle slopes, and surrounding flat areas, which provides complete complexity gradient samples for this study.

The post-processing of the experimental data was conducted using the professional software CARIS HIPS and SIPS (Version 11.3). Standardized processing procedures, including sound velocity correction, tide correction, and gross error elimination, were sequentially carried out. Finally, a total of 394,468 valid bathymetric data points were obtained, and the 3D point cloud visualization of these data is presented in Figure 2.

For the selection of the calculation method, Principal Component Analysis (PCA) is adopted to perform local plane fitting on the neighborhood point cloud for slope estimation. For an arbitrary central point $p_{0}$, we select its k-nearest neighbor set $N_{k} = \{p_{1}, p_{2}, \dots, p_{k}\}$ (where $k = 20$ in this study) and calculate the covariance matrix $C = \frac{1}{k} \sum_{i=1}^{k} (p_{i} - \bar{p})(p_{i} - \bar{p})^{T}$ of the neighborhood point set, where $\bar{p}$ denotes the centroid of the neighborhood point set. Eigenvalue decomposition is performed on $C$ to obtain three eigenvalues sorted in descending order ($\lambda_{1} \geq \lambda_{2} \geq \lambda_{3}$) and their corresponding eigenvectors $v_{1}, v_{2}, v_{3}$. The directions corresponding to $\lambda_{1}$ and $\lambda_{2}$ span the tangent plane of the local terrain, and the direction corresponding to $\lambda_{3}$ is the normal vector. Accordingly, the slope value is calculated as the included angle between the normal vector and the vertical direction.

To verify the reliability of the local plane fitting, the variance explained ratio is introduced as a quality criterion in this study. The variance explained ratio of the $i$-th principal component is defined as $\eta_{i} = \lambda_{i} / (\lambda_{1} + \lambda_{2} + \lambda_{3})$. Among them, $\eta_{3}$ reflects the relative contribution of the normal component: when $\eta_{3}$ approaches 0, the neighborhood point cloud fits the 2D plane more closely, and the estimation of the normal vector and slope becomes more reliable. Statistical results show that among the valid points after preprocessing in this study, $\eta_{3} < 0.05$ for 96.45% of the points and $\eta_{3} < 0.1$ for 99.07% of the points, which verifies the robustness and reliability of the adopted slope calculation method from a data perspective.
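The slope and quality criterion described above can be sketched for a single neighborhood in plain Python. This is an illustrative sketch rather than the authors' code: the function name `slope_and_eta3` is hypothetical, the smallest eigenvalue of the symmetric 3×3 covariance matrix is obtained from the standard trigonometric closed form, and the normal is recovered as a cross product of two rows of $C - \lambda_{3} I$. The neighbor search itself is assumed to have been done already.

```python
import math

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(v):
    return math.sqrt(sum(c * c for c in v))

def slope_and_eta3(neighbors):
    """Return (slope in degrees, eta_3) for a list of (x, y, z) neighbors."""
    k = len(neighbors)
    mean = [sum(p[i] for p in neighbors) / k for i in range(3)]
    C = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in neighbors) / k
          for j in range(3)] for i in range(3)]
    trace = C[0][0] + C[1][1] + C[2][2]
    off = C[0][1] ** 2 + C[0][2] ** 2 + C[1][2] ** 2
    if off == 0.0:                                   # already diagonal
        lam3 = min(C[0][0], C[1][1], C[2][2])
    else:                                            # closed-form smallest eigenvalue
        q = trace / 3
        p = math.sqrt((sum((C[i][i] - q) ** 2 for i in range(3)) + 2 * off) / 6)
        B = [[(C[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
        det = (B[0][0] * (B[1][1] * B[2][2] - B[1][2] * B[2][1])
               - B[0][1] * (B[1][0] * B[2][2] - B[1][2] * B[2][0])
               + B[0][2] * (B[1][0] * B[2][1] - B[1][1] * B[2][0]))
        phi = math.acos(max(-1.0, min(1.0, det / 2))) / 3
        lam3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    # Rows of (C - lam3 * I) are orthogonal to the normal; their cross product spans it.
    M = [[C[i][j] - (lam3 if i == j else 0.0) for j in range(3)] for i in range(3)]
    normal = max((_cross(M[a], M[b]) for a, b in ((0, 1), (0, 2), (1, 2))), key=_norm)
    length = _norm(normal)
    if trace == 0.0 or length == 0.0:
        raise ValueError("degenerate neighborhood: plane normal is not unique")
    slope_deg = math.degrees(math.acos(min(1.0, abs(normal[2]) / length)))
    eta3 = max(0.0, lam3) / trace
    return slope_deg, eta3

tilted = [(float(x), float(y), float(x)) for x in range(5) for y in range(5)]  # plane z = x
print(slope_and_eta3(tilted))
```

For points lying exactly on the plane $z = x$, the estimated slope is 45 degrees up to rounding and $\eta_{3}$ is essentially zero, which corresponds to the ideal case in which the neighborhood is perfectly planar.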

For parameter settings, the following hyperparameters are configured for the construction of the MDL-based adaptive histogram:

(1): Maximum number of bins: $5 \log n$. According to the research by Kontkanen et al., the optimal number of bins for a one-dimensional MDL histogram grows asymptotically as $O(\log n)$. The accuracy of entropy estimation can be guaranteed only when the number of bins grows much more slowly than the sample size $n$. This study adopts the upper-bound parameter proposed by Marx et al. to constrain the binning scale, striking a balance between estimation accuracy and computational efficiency.
(2): Maximum number of iterations and convergence threshold. These settings are determined based on the statistical results of preliminary experiments. Preliminary tests were conducted on multiple nodes with 2500 points in the experimental area, and the MDL score converged within 5–8 iterations. Accordingly, the maximum number of iterations is set to 10, which fully satisfies the convergence requirement of the algorithm. The convergence threshold is set to 1 × 10⁻⁴, which matches the calculation precision of the entropy value. The algorithm terminates early when the relative change in the MDL score between two consecutive iterations is less than this threshold, balancing computational stability and operational efficiency.
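These settings can be collected in a small configuration sketch. The helper names are hypothetical, and the logarithm base in $5 \log n$ is not stated in the text, so it is exposed as a parameter here (base 2 is only an assumed default).

```python
import math

MAX_ITER = 10          # maximum number of greedy iterations
CONV_TOL = 1e-4        # threshold on the relative change of the MDL score

def max_bins(n, factor=5.0, base=2.0):
    """Upper bound on the number of bins, factor * log_base(n), at least 1."""
    return max(1, math.floor(factor * math.log(n, base)))

def has_converged(prev_score, new_score, tol=CONV_TOL):
    """True when the relative change of the score between iterations is below tol."""
    if prev_score == 0.0:
        return new_score == 0.0
    return abs(new_score - prev_score) / abs(prev_score) < tol

print(max_bins(2500), max_bins(2500, base=math.e))
print(has_converged(100.0, 100.005), has_converged(100.0, 101.0))
```

For a node of 2500 points the bound differs noticeably between base 2 and the natural logarithm, so the base should be fixed explicitly when reproducing the experiments.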

3.2. Experimental Design

To systematically verify the effectiveness of the multi-beam point cloud complexity quantification method based on joint feature entropy proposed in this study, three sets of experiments are designed focusing on two core aspects: parameter determination and performance verification. The detailed experimental design and operation protocols are elaborated as follows.

3.2.1. Experimental Design for Determination of Sample Size Threshold

This study adopts a fixed-point number strategy to partition the point cloud into independent nodes for complexity quantification, and thus it is necessary to determine a reasonable node point number threshold to ensure that the two-dimensional joint entropy calculated for each node has sufficient accuracy and stability. In accordance with the Law of Large Numbers, as the sample size increases, the histogram-based estimation of the probability distribution will gradually approach the true distribution, and the calculation results of the two-dimensional joint entropy will consequently tend to be stable. In practical applications, an excessively small sample size is prone to causing a large entropy estimation bias, whereas an excessively large sample size will increase the computational burden and yield no significant improvement in estimation accuracy. Therefore, it is essential to strike a balance between estimation reliability and computational efficiency, and determine the minimum sample size $N_{\min}$ that enables the entropy estimation to reach a stable state.

Since the elevation-slope joint distribution of seafloor topographic data is unknown, the theoretical true value of entropy cannot be directly calculated, which renders the traditional evaluation approach based on “true value verification” inapplicable. To address this issue, this experiment abandons the direct solution for the true entropy value and instead adopts a verification strategy based on convergence and consistency analysis. Kernel Density Estimation (KDE) is introduced as a comparative benchmark, and the minimum number of points required to support the stable calculation of entropy values by the MDL-based adaptive histogram is determined by analyzing the differences between the entropy estimation results of the two methods and their variation patterns with sample size.

The specific experimental design is as follows: a series of gradient sample sizes N are set using the method of random sampling with replacement within the selected experimental area. For each sample size, 30 independent point cloud blocks are randomly extracted, and the two-dimensional joint entropy of each block is calculated by both the KDE method and the algorithm proposed in this study. Subsequently, the mean and variance of the differences between the entropy estimation results of the two methods are statistically computed, and their variation curves with sample size N are plotted. The sample size threshold $N_{\min}$ for the experimental area is determined by analyzing the variation trends of the curves.
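The protocol above can be sketched as a small harness. Neither the MDL-based histogram nor KDE is implemented here: two fixed-bin histogram estimators serve purely as stand-ins so that the example runs, and the synthetic Gaussian cloud, the function names, the reading of a block as N rows drawn with replacement, and the use of the sample variance (`ddof=1`) are all assumptions.

```python
import numpy as np

def fixed_hist_entropy(points: np.ndarray, bins: int) -> float:
    """Plug-in Shannon entropy (bits) of a fixed-bin 2D histogram (stand-in estimator)."""
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def difference_stats(cloud, sizes, estimator_a, estimator_b, n_blocks=30, seed=0):
    """For each sample size, return (N, mean, variance) of estimator_a - estimator_b."""
    rng = np.random.default_rng(seed)
    rows = []
    for n in sizes:
        diffs = []
        for _ in range(n_blocks):
            # Draw one block of n points by sampling row indices with replacement.
            block = cloud[rng.integers(0, len(cloud), size=n)]
            diffs.append(estimator_a(block) - estimator_b(block))
        diffs = np.asarray(diffs)
        rows.append((n, float(diffs.mean()), float(diffs.var(ddof=1))))
    return rows

# Synthetic (elevation, slope) pairs; real data would replace this array.
cloud = np.random.default_rng(1).normal(size=(20000, 2))
rows = difference_stats(
    cloud, sizes=[100, 500, 2500],
    estimator_a=lambda b: fixed_hist_entropy(b, 16),
    estimator_b=lambda b: fixed_hist_entropy(b, 8),
)
```

Passing the two estimators as callables keeps the sampling loop independent of how each entropy value is obtained, so the actual estimators can be substituted without changing the harness.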

3.2.2. Experimental Design for Discrimination Ability of Topographic Complexity

Since the numerical results of the two-dimensional joint entropy depend on the selection of binning schemes and thus have no absolute true value, this experiment is not intended to verify the absolute accuracy of entropy calculation, but to demonstrate that the proposed method, as a consistent and data-driven metric for seafloor topographic complexity quantification, can yield entropy values that reliably reflect the relative complexity of different seafloor topographic areas.

To verify the discrimination ability of the proposed method for topographic complexity, three sub-regions with typical topographic features (Sample A, Sample B, and Sample C) were selected from the experimental area, representing three levels of topographic complexity (simple, moderate, and complex), respectively. A total of 10,000 valid points were extracted from each sub-region (exceeding the predetermined sample size threshold from the experiment in Section 3.2.1), which were treated as independent analysis nodes for the calculation of two-dimensional joint entropy.

To quantitatively verify that the entropy values calculated in this experiment can reliably reflect the relative complexity levels of Samples A, B, and C, a verification experiment based on simplification fidelity is further designed in this section. At present, a unified ground truth benchmark and standard metric system have not been established in the field of seafloor point cloud terrain complexity analysis. Different complexity quantification methods fundamentally differ in computational units, output dimensions, and quantification logics, which makes it impossible to carry out direct and unified numerical horizontal cross-comparisons. Therefore, an indirect verification strategy based on simplification fidelity is adopted in this study. According to the objective law of terrain simplification, the higher the terrain complexity, the more significant the accuracy degradation under the same point cloud retention ratio. In other words, to maintain equivalent reconstruction accuracy, nodes with higher complexity need to retain more point cloud data. Accordingly, the point cloud simplification algorithm proposed by Nie et al. [14] is reproduced in this paper to simplify Samples A, B, and C respectively. The retention ratios are set to 100%, 60%, 40%, and 20%, and the Root Mean Square Error (RMSE) at each retention ratio is calculated via the checkpoint method (the average value is obtained from 10 repeated calculations). Since the multibeam bathymetry accuracy decreases with the increase in water depth, there exist discrepancies in the accuracy benchmarks of different nodes. Thus, the relative change rate of RMSE is adopted as the primary quantitative analysis indicator in this study, and its calculation formula is as follows:

$$\delta = \frac{\mathrm{RMSE}_{r\%} - \mathrm{RMSE}_{100\%}}{\mathrm{RMSE}_{100\%}} \times 100\% \tag{6}$$
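As a worked instance of Equation (6), the following sketch evaluates the relative change rate from two RMSE values. The function name `relative_change_rate` is illustrative; the input values are the Sample A and Sample C entries at 100% and 20% retention reported in Table 3.

```python
def relative_change_rate(rmse_r: float, rmse_full: float) -> float:
    """Eq. (6): percentage increase of RMSE at retention ratio r relative to 100% retention."""
    return (rmse_r - rmse_full) / rmse_full * 100.0

# RMSE values taken from Table 3.
delta_a_20 = relative_change_rate(1.012221, 0.857865)    # Sample A, 20% retention
delta_c_20 = relative_change_rate(10.519674, 7.638752)   # Sample C, 20% retention
```

The two results reproduce the 17.99% and 37.71% entries of Table 3 to the reported precision. Because the rate is normalized by each node's own full-resolution RMSE, nodes with very different absolute accuracy levels remain comparable.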

By analyzing the relationship between the relative change rate and the two-dimensional joint entropy of each sample under different retention ratios, the discrimination capability of the complexity metric proposed in this paper can be quantitatively verified.

3.2.3. Experimental Design for Response to Continuous Variation in Topographic Complexity

The experiment on the discrimination ability of topographic complexity in Section 3.2.2 has verified the capability of the proposed method to distinguish topographic complexity across typical terrain categories. However, actual seafloor terrain exhibits the characteristics of continuously gradational spatial heterogeneity, and this experiment is therefore intended to verify the responsiveness of the proposed method to the gradient variation in topographic complexity. First, the point cloud of the experimental area is partitioned to form several independent point cloud blocks as the basic analysis units. Subsequently, the MDL-based adaptive histogram method proposed in this study is adopted to calculate the two-dimensional joint entropy of each block one by one, which is used as the quantitative index for the topographic complexity of the corresponding node. The calculation results of all point cloud blocks are mapped to a spatial grid to generate a thermal distribution map of topographic complexity. Second, five typical entropy nodes are selected at approximately equal intervals along the color band gradient of this distribution map, corresponding to five entropy intervals (low, low-medium, medium, medium-high, and high) respectively, and the node point cloud data corresponding to each color block are randomly extracted as experimental samples. Finally, the proposed method is used to recalculate the two-dimensional joint entropy of each node, and the optimal MDL binning scheme for each node is recorded to analyze the variation law of entropy values with the gradient of topographic complexity. For the five selected typical entropy nodes, the simplification fidelity method presented in Section 3.2.2 is synchronously adopted to perform quantitative verification in the subsequent analysis.
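The node selection step can be sketched as follows, under a simplifying assumption: instead of randomly drawing a node within each color band as described above, the snippet deterministically picks the node whose entropy is closest to each of five equally spaced target values. The function name and the example entropy values are hypothetical.

```python
import numpy as np

def pick_representative_nodes(entropies, k: int = 5):
    """Return indices of k distinct nodes nearest to k equally spaced entropy targets."""
    entropies = np.asarray(entropies, dtype=float)
    if k > len(entropies):
        raise ValueError("k must not exceed the number of nodes")
    targets = np.linspace(entropies.min(), entropies.max(), k)
    chosen = []
    for target in targets:
        # Walk candidates from nearest to farthest and take the first unused one.
        order = np.argsort(np.abs(entropies - target))
        chosen.append(next(int(i) for i in order if int(i) not in chosen))
    return chosen

# Hypothetical per-node entropies (bits) from a partitioned point cloud.
node_entropy = [4.0, 4.8, 5.5, 6.2, 7.0, 5.0, 6.6]
selected = pick_representative_nodes(node_entropy, k=5)
```

A deterministic nearest-target rule makes the selection reproducible; a random draw within each band, as in the experiment, would additionally require a fixed seed to be repeatable.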

4. Results and Discussion

4.1. Experimental Results and Analysis for the Determination of Sample Size Threshold

The detailed statistical results and variation curves of the mean and variance of the differences in two-dimensional entropy values calculated by the two methods are presented in Table 1 and Figure 3. In terms of the estimation regularity, with a small sample size, the entropy estimations of both methods deviate from the true distribution, accompanied by a large variance of the differences. As the sample size increases gradually, the estimation results of the two methods converge gradually, approaching their respective asymptotic estimation values. Since the magnitude relationship between the asymptotic estimation values of the two methods and the theoretical true value, as well as their convergence rates, cannot be determined in advance, the variation trend of the mean of the differences with the sample size is also difficult to predict a priori. When the sample size is sufficiently large to support the stable estimation of both methods, the mean of the differences stabilizes around a certain systematic difference (a systematic deviation exists between the asymptotic estimation values due to the differences in methodological principles and implementation approaches of the two methods). Meanwhile, the variance of the differences decreases to a low level and stabilizes accordingly, exhibiting a variation characteristic of a gradual decrease followed by entering a plateau phase. Therefore, when both curves enter the plateau phase, the corresponding sample size can be regarded as the one that ensures the stability and reliability of the entropy estimation of the proposed method, which is the sample size threshold to be determined in this experiment.

Table 1 and Figure 3 show that both the mean and the variance of the differences between the two entropy estimates change in distinct phases as the sample size increases. At small sample sizes (N < 1500), the mean of the differences fluctuates strongly, while the variance of the differences remains high with an overall downward trend. The variance rebounds briefly at 700 points. This may be because the random sampling at this sample size drew more hard samples with complex distribution patterns, which are more difficult for entropy estimation; with so few points, the two estimates then diverge further, and the variance of the differences rises temporarily. As the sample size increases to the range of 1500–2500 points, the fluctuation range of the mean of the differences narrows progressively and is essentially stable from 2000 points onward; the variance of the differences continues to decrease and enters a relatively stable plateau at 2500 points, with no significant decrease observed thereafter. In summary, at 2500 points the entropy estimates of both methods have converged stably, and further increasing the sample size yields limited gains in estimation accuracy. Balancing estimation reliability against computational efficiency, 2500 points is selected as the target number of points for node partitioning in the subsequent experiments.
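The threshold in this study was determined by inspecting the curves. One possible way to make the plateau criterion explicit is sketched below using the values of Table 1; the rule itself (a sample size is on the plateau if all later values stay within a relative band) and the 10% tolerance are assumptions introduced for illustration, not the authors' procedure.

```python
# Values copied from Table 1.
sizes = [100, 200, 300, 500, 700, 900, 1100, 1300, 1500,
         1700, 2000, 2300, 2500, 2700, 3000, 3500, 4000]
mean_diff = [0.17696, 0.15281, 0.1779, 0.12552, 0.07762, 0.09389, 0.08009,
             0.05063, 0.10771, 0.07694, 0.09984, 0.09972, 0.1006, 0.1031,
             0.09942, 0.09534, 0.10413]
var_diff = [0.080728, 0.037621, 0.035259, 0.034692, 0.050108, 0.035141,
            0.03943, 0.03718, 0.022066, 0.015899, 0.011255, 0.009567,
            0.007556, 0.007602, 0.007745, 0.007165, 0.007179]

def plateau_start(sizes, values, rel_tol=0.10):
    """Smallest sample size from which max/min of all remaining values stays within rel_tol."""
    for i, n in enumerate(sizes):
        tail = values[i:]
        if max(tail) / min(tail) - 1.0 <= rel_tol:
            return n
    return sizes[-1]

mean_plateau = plateau_start(sizes, mean_diff)
var_plateau = plateau_start(sizes, var_diff)
n_min = max(mean_plateau, var_plateau)   # both curves must be on their plateau
```

With a 10% band, this rule places the mean on its plateau from 2000 points and the variance from 2500 points, which agrees with the visual reading of Figure 3; a tighter or looser tolerance would shift these values.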

4.2. Experimental Results and Analysis for the Discrimination Ability of Topographic Complexity

In this experiment, Sample A, Sample B and Sample C, which represent three typical terrain types (simple, moderate and complex), were selected to conduct the verification of topographic complexity discrimination. The detailed calculation results of each sample are presented in Table 2, and the corresponding 3D point cloud visualization and optimal MDL binning schemes are shown in Figure 4. The experimental results show that Sample A, the simple terrain, is a flat area with the data distribution characteristics of minimal elevation fluctuations and slope values close to zero; the algorithm met the convergence criterion after 5 iterations, and the calculated entropy value was 5.3238 bits. Sample B, the moderately complex terrain, is a hillside terrain with significant macroscopic elevation fluctuations and an elevation range of approximately 2200 m with a relatively uniform distribution; however, due to the relatively concentrated distribution of slope values, the topographic morphology is relatively monotonous, resulting in a moderate level of complexity, and the entropy value was 6.4467 bits after 8 iterations. Sample C, the complex terrain, features rich topographic morphology and diverse structures, where the distribution of both elevation and slope features exhibits the strongest discreteness, and the entropy value calculated after 7 iterations was 6.9559 bits. The two-dimensional entropy values of the three typical terrain types exhibit distinct gradient differences, and the quantitative results are highly consistent with the manual interpretation results of seafloor topographic complexity. This indicates that the adaptive binning rule based on the MDL principle can achieve data-driven optimal binning matching according to the data distribution characteristics of different terrains. 
Moreover, the total number of bins in the optimal binning scheme increases with terrain complexity (12 × 11 = 132, 16 × 15 = 240, and 24 × 13 = 312 bins for Samples A, B, and C, respectively; Table 2), which supports the use of the entropy values output by the proposed method as a quantitative index of topographic complexity.
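A simple consistency check relates the entropy values in Table 2 to their binning schemes. Assuming that the reported value is the plug-in Shannon entropy of the bin probabilities (an assumption, since a density-normalized estimate would not obey this bound), a histogram with K cells can carry at most $\log_{2} K$ bits. The sketch below compares each sample against that ceiling; the variable and function names are illustrative.

```python
import math

# (binning scheme, entropy in bits) copied from Table 2.
table2 = {
    "Sample A": ((12, 11), 5.3238),
    "Sample B": ((16, 15), 6.4467),
    "Sample C": ((24, 13), 6.9559),
}

def entropy_ceiling_bits(n_rows: int, n_cols: int) -> float:
    """Maximum discrete entropy of an n_rows x n_cols histogram (uniform cell occupancy)."""
    return math.log2(n_rows * n_cols)

summary = {
    name: {
        "total_bins": bins[0] * bins[1],
        "ceiling_bits": entropy_ceiling_bits(*bins),
        "entropy_bits": h,
    }
    for name, (bins, h) in table2.items()
}
```

All three samples lie below their ceilings, so the gap between each entropy and its ceiling reflects how unevenly the points occupy the cells rather than the number of cells alone.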

To further quantitatively verify the discrimination capability of the aforementioned entropy values for the complexity of typical terrain samples, the simplification fidelity verification is performed on Samples A, B, and C in this section in accordance with the experimental design in Section 3.2.2, and the results are shown in Table 3. At a retention ratio of 20%, the relative change rates of Samples A, B, and C are 17.99%, 20.48%, and 37.71%, respectively, which are positively correlated with their two-dimensional joint entropy values (5.32, 6.45, and 6.96 bits). At retention ratios of 60% and 40%, the relative change rates also maintain the order of C > B > A. Combined with manual interpretation, this quantitative result objectively verifies the effectiveness and reliability of the complexity quantification metric proposed in this paper.

4.3. Experimental Results and Analysis for Response to Continuous Variation in Topographic Complexity

Based on the experimental design described in Section 3.2.3, the point cloud of the experimental area was subjected to node partitioning and two-dimensional joint entropy calculation, with the results mapped to a spatial grid to generate a thermal distribution map of topographic complexity, as shown in Figure 5. A gradient color spectrum from cool to warm colors is adopted in the figure to characterize the variation trend of entropy values from low to high, which intuitively reflects the spatial distribution characteristics of seafloor topographic complexity in the experimental area. This distribution map can be used to guide subsequent point cloud simplification: for complex terrain areas with high entropy values, more feature points are retained to preserve topographic details; for simple terrain areas with low entropy values, high-intensity simplification can be performed to achieve effective compression of data volume on the premise of ensuring the expression quality of topographic features. In terms of algorithm efficiency, the average processing time for a single point cloud block (2500 points) in this experiment is approximately 22.53 s. Since the calculation of each point cloud block is independent of one another, this process can be implemented with parallel processing, and the overall processing efficiency can meet the engineering processing requirements of multi-beam point cloud data.
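To illustrate what block-level independence implies for throughput, the following sketch estimates the processing time from the reported 22.53 s per 2500-point block. The one-million-point cloud, the worker count, and the neglect of scheduling and input/output overhead are assumptions chosen for illustration.

```python
import math

def estimate_wall_time(n_points: int,
                       points_per_block: int = 2500,
                       seconds_per_block: float = 22.53,
                       workers: int = 1):
    """Return (number of blocks, estimated wall time in seconds) for independent blocks."""
    n_blocks = math.ceil(n_points / points_per_block)
    rounds = math.ceil(n_blocks / workers)   # blocks are processed in batches of `workers`
    return n_blocks, rounds * seconds_per_block

# Hypothetical survey of one million valid points.
blocks, serial_seconds = estimate_wall_time(1_000_000)
_, parallel_seconds = estimate_wall_time(1_000_000, workers=8)
```

Under these assumptions, 400 blocks take about 9012 s serially and about 1126.5 s on eight workers, so the achievable speed-up scales with the number of workers until other overheads dominate.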

Based on the topographic complexity distribution map, five typical entropy nodes were selected at approximately equal intervals along the color band gradient, and the point cloud data corresponding to the respective color blocks were extracted (Figure 6a–e). The two-dimensional joint entropy of each node was calculated using the proposed method. As shown in Table 4, the entropy values of the five nodes were 4.67, 5.39, 5.80, 6.27, and 6.64 bits in sequence, exhibiting a monotonically increasing trend. The corresponding 3D point cloud visualization results indicated that with the increase in entropy value, the topographic morphology transitioned gradually from homogeneous and simple (a) to locally undulating but globally regular (b), then to undulations with relatively regular patterns and monotonous morphological variation (c), followed by interlaced undulations across multiple regions and diverse morphologies (d), and finally to rich and diverse morphologies (e). This trend was consistent with the results of manual interpretation of the gradient differences in topographic complexity. As shown in Table 5, the relative change rates of RMSE for the five nodes at a 40% retention ratio are 4.88%, 5.06%, 11.92%, 14.98%, and 22.65%, respectively. This indicator also increases monotonically with the entropy values, which further verifies, from a quantitative perspective, that the proposed method responds reliably to the continuous gradient variation in topographic complexity. Meanwhile, the total number of bins of the MDL adaptive binning increased gradually from 81 to 195, and the binning resolution was positively correlated with the gradient variation in topographic complexity. This reflects the adaptive characteristic of the proposed method to dynamically adjust binning according to the differences in topographic complexity. The results confirm that the entropy values output by the proposed method can stably respond to the gradient variation in topographic complexity, and can thus serve as a reliable quantitative basis for differentiated point cloud simplification.
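The monotonic relationships described above can be checked directly from the tabulated values. The sketch below copies the entropy values and bin counts from Table 4 and the relative change rates from Table 5; the helper name `strictly_increasing` is illustrative.

```python
# Nodes a-e, values copied from Table 4.
entropy_bits = [4.6723, 5.3930, 5.8001, 6.2682, 6.6374]
elevation_bins = [9, 12, 13, 13, 15]
slope_bins = [9, 10, 10, 12, 13]
total_bins = [81, 120, 130, 156, 195]

# Relative change rate of RMSE (%) per retention ratio, copied from Table 5.
relative_change = {
    60: [1.0981, 3.9317, 8.9238, 9.4290, 11.6847],
    40: [4.8758, 5.0638, 11.9158, 14.9844, 22.6526],
    20: [16.5218, 21.1173, 30.6435, 36.5004, 37.1588],
}

def strictly_increasing(values) -> bool:
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))

bins_consistent = [e * s for e, s in zip(elevation_bins, slope_bins)] == total_bins
monotone = {
    "entropy": strictly_increasing(entropy_bits),
    "total_bins": strictly_increasing(total_bins),
    **{f"delta_{r}": strictly_increasing(v) for r, v in relative_change.items()},
}
```

Every sequence is strictly increasing from node a to node e, including the 60% and 20% retention ratios that are not quoted in the text, and the total bin counts equal the products of the elevation and slope bin counts.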

5. Conclusions

Aiming at the simplification challenge caused by the massive volume and uneven spatial distribution of multi-beam seafloor topographic point cloud data, this paper proposes a seafloor topographic complexity quantification method based on feature joint entropy combined with MDL-based adaptive binning. This method regards the elevation and slope features of the point cloud as two-dimensional random variables describing topographic morphology, estimates the Shannon entropy of their joint distribution by constructing an adaptive histogram based on the Minimum Description Length (MDL) principle, and uses the entropy value to measure the topographic information content and complexity of local regions. In the process of method construction, a dimension-alternating optimization strategy combining dynamic programming and greedy iteration is adopted to solve the optimal binning structure, which realizes data-driven adaptive binning and effectively overcomes the parameter sensitivity and subjective dependence of traditional fixed binning methods. In addition, the minimum sample size supporting the stable estimation of entropy values is determined through convergence analysis, which ensures the statistical reliability and inter-regional comparability of the complexity quantification results. Experimental results show that the two-dimensional joint entropy can effectively distinguish seafloor terrains of different complexity levels, and the generated thermal distribution map of topographic complexity can reflect the spatial distribution characteristics of seafloor terrain, providing an objective quantitative basis for subsequent differentiated point cloud simplification. On this basis, future research can be further deepened in the following directions: first, further optimize the algorithm efficiency to promote the engineering application of the method in the processing of large-scale seafloor topographic data; second, optimize spatial continuity. 
Although the current fixed-point number node partitioning strategy ensures computational reliability, it severs the spatial correlation between nodes. In the future, we will combine point cloud simplification algorithms to optimize the continuity of feature extraction, so as to achieve effective integration of complexity estimation and simplification processing; third, by optimizing computing resources or introducing efficient approximation algorithms, we will further incorporate multi-dimensional topographic features such as curvature and terrain roughness index into the joint entropy framework to construct a high-dimensional complexity quantification model, so as to improve the comprehensiveness and refinement of terrain complexity representation; and fourth, we will explore the applicability of the proposed method to other types of point cloud data and investigate improved strategies adapted to different data characteristics, so as to further expand the application boundaries and generalizability of the method.

Author Contributions

Conceptualization, Y.C. and S.J.; methodology, D.L.; software, D.L., J.T. and Y.W.; writing—original draft preparation, D.L. and Y.W.; writing—review and editing, D.L., S.J. and Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data used in this study cannot be shared publicly, as it contains real geographic and bathymetric information.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhao, J.H.; Ouyang, Y.Z.; Wang, A.X. Status and Development Tendency for Seafloor Terrain Measurement Technology. Acta Geod. Cartogr. Sin. 2017, 46, 1786–1794.
2. Cao, H.B.; Zhang, L.H.; Zhu, M.H.; Zhao, W. Comparison and Analysis of Thinning Methods for Mass Data Acquired by Multibeam Sounders. Hydrogr. Surv. Chart. 2010, 30, 81–83.
3. Tang, G.A.; Liu, X.J.; Lv, G.N. Digital Elevation Model and Geo-Science Analysis: Principles and Methods; Science Press: Beijing, China, 2005.
4. Zhou, Q.M.; Liu, X.J. Digital Terrain Analysis, 1st ed.; Science Press: Beijing, China, 2006.
5. Zhang, L. Spatial Pattern of Loess Landform Morphology Based on Core Topographic Factor Analysis. Master’s Thesis, Nanjing Normal University, Nanjing, China, May 2013.
6. Zhou, T.; Long, Y.; Tang, G.A.; Yang, X. A Fractal Method to Describe the Terrain Complexity Reflected by the Raster DEM. Geogr. Geo-Inf. Sci. 2006, 22, 26–30.
7. Zhang, C.; Chen, B.X.; Wu, L. Geographic Information System; Higher Education Press: Beijing, China, 1999.
8. Zhang, Q.N. A New Simplification Method Based on Terrain Complexity for LiDAR Point Cloud. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, May 2016.
9. Lu, H.X.; Liu, X.J.; Tang, G.A. Terrain Complexity Assessment Based on Multivariate Analysis. J. Mt. Sci. 2012, 30, 616–621.
10. Tang, Y.H. A Thin-Method Based on Local Terrain Complexity Index for LiDAR Bare Earth Surface Point Cloud. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, May 2019.
11. Li, J.Y.; Zhang, J.X.; Wang, B.; Du, M.D.; Zhang, Y.H.; Yang, F.L.; Luan, Z.D. Quantitative Analysis on the Seabed Terrain Complexity of Submarine Canyons of the South China Sea Continental Slope. Mar. Geol. Front. 2024, 40, 84–92.
12. Chen, J.Y.; Bu, X.H.; Chen, D.C. A Thinning Algorithm of Multibeam Point Cloud Considering Weight of Terrain Complexity Factor. J. Shandong Univ. Sci. Technol. Nat. Sci. 2022, 41, 21–29.
13. Li, S.X.; Ge, W.; Cheng, Y.; Zhang, J.L. Quantitative Study of a Generic Model of Complexity Taking into Account Topographic Features. Hydrogr. Surv. Chart. 2024, 44, 72–77.
14. Nie, X. A LiDAR-Based DEM Thinning Algorithm Under Precision Constraints. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, May 2014.
15. Hu, C. Thinning Algorithm of LiDAR Bare Earth Surface Point Cloud Under the Restriction of Precision. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, May 2015.
16. Mahdaoui, A.; Sbai, E.H. 3D Point Cloud Simplification Based on k-Nearest Neighbor and Clustering. Adv. Multimed. 2020, 2020, 8825205.
17. Mahdaoui, A.; Bouazi, A.; Marhraoui Hsaini, A.; Sbai, E.H. Comparison of K-Means and Fuzzy C-Means Algorithms on Simplification of 3D Point Cloud Based on Entropy Estimation. Adv. Sci. Technol. Eng. Syst. J. 2017, 2, 38–44.
18. Zhai, J.S.; Zhang, C.; Li, Z.X.; Zhang, L. Representation and Calculation of Submarine Landform Complexity. Period. Ocean Univ. China 2019, 49, 143–147.
19. Li, M.F.; Li, L.Y.; Zhao, X.Y.; Lu, H.F. Application of Slope Entropy of Points in Terrain Simplification. Bull. Surv. Mapp. 2019, 11, 109–113.
20. Ma, J.J. Research on Quantifying Terrain Complexity. Master’s Thesis, Nanjing Normal University, Nanjing, China, May 2012.
21. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–656.
22. Rissanen, J. Modeling By Shortest Data Description. Automatica 1978, 14, 465–471.
23. Kontkanen, P.; Myllymäki, P. MDL histogram density estimation. In Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, San Juan, PR, USA, 21–24 March 2007.
24. Marx, A.; Yang, L.; van Leeuwen, M. Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms. arXiv 2021, arXiv:2101.05009v1.
25. S-44 Edition 6.1.0; IHO Standards for Hydrographic Surveys. IHO: Monaco, Monaco, 2022.

Figure 1. Workflow of MDL-based Two-Dimensional Adaptive Histogram Construction.

Figure 2. 3D Point Cloud Visualization of the Experimental Area.

Figure 3. Variation Curves of the Mean and Variance of Differences in Two-Dimensional Entropy Values Calculated by the Two Methods.

Figure 4. 3D point cloud visualization and optimal MDL binning schemes of three typical terrain samples. (a) Sample A; (b) Sample B; (c) Sample C; (d) Binning scheme of Sample A; (e) Binning scheme of Sample B; (f) Binning scheme of Sample C.

Figure 5. Node Partitioning Results (Left) and Topographic Complexity Distribution Map (Right).

Figure 6. 3D Point Cloud Visualization Results of Nodes (a–e).

Table 1. Statistical Table of the Mean and Variance of Differences in Two-Dimensional Entropy Values Calculated by the Two Methods.

Sample Size	Mean of the Differences	Variance of the Differences
100	0.17696	0.080728
200	0.15281	0.037621
300	0.1779	0.035259
500	0.12552	0.034692
700	0.07762	0.050108
900	0.09389	0.035141
1100	0.08009	0.03943
1300	0.05063	0.03718
1500	0.10771	0.022066
1700	0.07694	0.015899
2000	0.09984	0.011255
2300	0.09972	0.009567
2500	0.1006	0.007556
2700	0.1031	0.007602
3000	0.09942	0.007745
3500	0.09534	0.007165
4000	0.10413	0.007179

Table 2. Calculation Results of Two-Dimensional Joint Entropy in the Experimental Area.

Experimental Block	Number of Iterations	Binning Scheme	Two-Dimensional Entropy Value
Sample A	5	12 × 11	5.3238
Sample B	8	16 × 15	6.4467
Sample C	7	24 × 13	6.9559

Table 3. Simplification Fidelity Verification Results of Samples A, B, and C.

Retention Ratio	Sample A		Sample B		Sample C
Retention Ratio	RMSE	Relative Change Rate	RMSE	Relative Change Rate	RMSE	Relative Change Rate
100%	0.857865	——	17.377362	——	7.638752	——
60%	0.86819	1.2035%	18.012107	3.6527%	7.976996	4.4280%
40%	0.913218	6.4524%	18.589163	6.9734%	8.327113	9.0114%
20%	1.012221	17.993%	20.93588	20.4778%	10.519674	37.7145%

Table 4. Comparison of Topographic Features and Binning Schemes for Nodes with Different Entropy Values.

Node ID	Entropy Interval	Entropy Value (bit)	Number of Elevation Bins	Number of Slope Bins	Total Number of Bins	Topographic Description
a	Low	4.6723	9	9	81	Homogeneous and simple
b	Low-Medium	5.3930	12	10	120	Locally undulating, globally regular
c	Medium	5.8001	13	10	130	Relatively regular undulations, monotonous morphological variation
d	Medium-High	6.2682	13	12	156	Interlaced undulations in multiple regions, diverse morphologies
e	High	6.6374	15	13	195	Rich and diverse morphologies

Table 5. Simplification Fidelity Verification Results of Five Typical Entropy Nodes (a–e).

Retention Ratio	a		b		c		d		e
Retention Ratio	RMSE	Relative Change Rate	RMSE	Relative Change Rate	RMSE	Relative Change Rate	RMSE	Relative Change Rate	RMSE	Relative Change Rate
100%	8.971422	——	11.014302	——	16.174115	——	6.788884	——	13.937556	——
60%	9.06994	1.0981%	11.447358	3.9317%	17.617475	8.9238%	7.429014	9.4290%	15.566122	11.6847%
40%	9.408851	4.8758%	11.572046	5.0638%	18.101392	11.9158%	7.806162	14.9844%	17.094787	22.6526%
20%	10.453663	16.5218%	13.340235	21.1173%	21.130446	30.6435%	9.266855	36.5004%	19.116594	37.1588%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
