Article

Comparative Evaluation of Nine Machine Learning Models for Target and Background Noise Classification in GM-APD LiDAR Signals Using Monte Carlo Simulations

1 National Key Laboratory of Laser Spatial Information, Harbin Institute of Technology, Harbin 150001, China
2 Zhengzhou Research Institute of Harbin Institute of Technology, Zhengzhou 450000, China
3 Research Center for Space Optical Engineering, Harbin Institute of Technology, Harbin 150001, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(21), 3597; https://doi.org/10.3390/rs17213597
Submission received: 27 August 2025 / Revised: 27 October 2025 / Accepted: 28 October 2025 / Published: 30 October 2025

Highlights

What are the main findings?
  • A complete data-processing framework for GM-APD LiDAR echo signals was established, enabling systematic comparison of nine machine-learning models derived from six baseline algorithms.
  • NN-BP-based models, especially the proposed ResNet extension, achieved the highest classification accuracy and robustness under low-SNR and multi-frame conditions.
What are the implications of the main findings?
  • The study confirms the feasibility and advantages of applying machine learning to GM-APD LiDAR signal classification, providing a benchmark for future algorithm evaluation.
  • The results offer practical guidance for balancing detection accuracy, computational efficiency, and hardware deployability in real-world GM-APD LiDAR systems.

Abstract

This study proposes a complete data-processing framework for Geiger-mode avalanche photodiode (GM-APD) light detection and ranging (LiDAR) echo signals. It investigates the feasibility of classifying target and background noise using machine learning. Four feature processing schemes were first compared, among which the PNT strategy (Principal Component Analysis without tail features) was identified as the most effective and adopted for subsequent analysis. Based on this framework, nine models derived from six baseline algorithms—Decision Trees (DTs), Support Vector Machines (SVMs), Backpropagation Neural Networks (NN-BPs), Linear Discriminant Analysis (LDA), Logistic Regression (LR), and k-Nearest Neighbors (KNN)—were systematically assessed under Monte Carlo simulations with varying echo signal-to-noise ratio (ESNR) and statistical frame number (SFN) conditions. Model performance was evaluated using eight metrics: accuracy, precision, recall, FPR, FNR, F1-score, Kappa coefficient, and relative change percentage (RCP). Monte Carlo simulations were employed to generate datasets, and Principal Component Analysis (PCA) was applied for feature extraction in the machine learning training process. The results show that LDA achieves the shortest training time (0.38 s at SFN = 20,000), DT maintains stable accuracy (0.7171–0.8247) across different SFNs, and NN-BP models perform optimally under low-SNR conditions. Specifically, NN-BP-3 achieves the highest test accuracy of 0.9213 at SFN = 20,000, while NN-BP-2 records the highest training accuracy of 0.9137. Regarding stability, NN-BP-3 exhibits the smallest RCP value (0.0111), whereas SVM-3 yields the largest (0.1937) at the same frame count. In conclusion, NN-BP-based models demonstrate clear advantages in classifying sky-background noise. Building on this, we design a ResNet based on NN-BP, which achieves further accuracy gains over the best baseline at 400, 2000, and 20,000 frames—12.5% (400), 9.16% (2000), and 2.79% (20,000)—clearly demonstrating the advantage of NN-BP for GM-APD LiDAR signal classification. This research thus establishes a novel framework for GM-APD LiDAR signal classification, provides the first systematic comparison of multiple machine learning models, and highlights the trade-off between accuracy and computational efficiency. The findings confirm the feasibility of applying machine learning to GM-APD data and offer practical guidance for balancing detection performance with real-time requirements in field applications.

1. Introduction

Highly sensitive Geiger-mode avalanche photodiode (GM-APD) single-photon light detection and ranging (LiDAR) is extensively utilized for three-dimensional depth reconstruction of targets in fields such as autonomous driving [1], environmental monitoring [2], and topographic mapping [3]. Its single-photon-level sensitivity enables the detection of extremely weak target returns, extending the effective detection range and enabling earlier target awareness, which grants downstream systems a larger response budget. In low-altitude UAV operations, earlier recognition of power lines expands the decision and maneuver window, reducing collision risk and associated economic losses. In homeland-security contexts, earlier detection of UAVs or missiles affords additional warning and preparation time, strengthening overall defense readiness. Consequently, deploying GM-APD LiDAR for target detection against sky backgrounds can help mitigate economic loss and enhance national security. However, during daytime detection, the pixel signals of small targets closely resemble sky-background noise, which makes detection significantly more challenging.
Researchers have proposed numerous methods to distinguish target pixels from the sky background in GM-APD data, and these methods fall into two categories. The first class reconstructs an image and then segments it by features (e.g., texture, brightness), as adopted in [4,5,6]; however, such methods are less effective on lower-resolution GM-APD images and may lead to the loss of signal features. The second class directly classifies the time-domain GM-APD signals, as in [7,8,9]; however, because GM-APD returns contain rich structure, approaches that rely on a limited set of handcrafted features have constrained applicability.
In comparison, machine learning (ML) can mine data features in depth and is particularly effective at classifying highly similar signals [10,11,12,13,14]. However, applying ML to classify GM-APD signals remains a relatively unexplored field. ML can automatically learn the characteristics of high-dimensional time-domain array GM-APD signals, surpassing traditional noise segmentation, feature extraction, and reconstruction methods. Our team therefore uses ML to classify GM-APD data.
In this study, we explored the performance of six ML algorithms: Decision Trees (DTs) [15], Logistic Regression (LR) [16], Support Vector Machines (SVMs) [17], K-Nearest Neighbors (KNN) [18], Linear Discriminant Analysis (LDA) [19], and Backpropagation Neural Networks (NN-BPs) [20]. Specifically, we considered three SVM kernel functions: linear (SVM-L), quadratic (SVM-2), and cubic (SVM-3). We also employed NN-BP with two-layer (NN-BP-2) and three-layer (NN-BP-3) architectures. The six baseline algorithms thus yielded nine model variants. We first simulated GM-APD signals using the Monte Carlo (MC) method; dimensionality reduction and feature extraction were then conducted through PCA. These features were fed into the various ML models for training, with five-fold cross-validation [21] employed to ensure robustness. We evaluated the models using accuracy, precision, recall, false positive rate (FPR), false negative rate (FNR), F1-score, Kappa coefficient [22], and relative change percentage (RCP) [23]. Furthermore, we investigated the classification performance of the models under different statistical frame numbers (SFNs) and signal-to-noise ratios (SNRs). Our contributions are listed as follows:
  • Framework and methodology for GM-APD signal classification: This study proposes a systematic framework for classifying target and background noise signals in GM-APD LiDAR returns. The framework integrates MC-based photon-level simulation, PCA-driven feature extraction, and a comparative evaluation of multiple ML algorithms. This is the first work to explicitly structure and assess a complete classification pipeline designed for GM-APD data.
  • Performance insights across models: By evaluating nine model variants derived from six baseline algorithms under varying SFNs and SNRs, we reveal distinctive performance characteristics of each model. NN-BP-3 achieves the highest test accuracy (0.9213) and the lowest RCP (0.0111) at SFN = 20,000, LDA records the shortest training time (0.38 s), and DT maintains robust accuracy (0.7171–0.8247) across SFNs, providing practical guidance for selecting optimal models in different operational conditions.

2. Materials and Methods

2.1. Imaging Principle of GM-APD LiDAR

As illustrated in Figure 1, GM-APD LiDAR emits laser pulses at 1064 nm, which are scattered when encountering small targets such as drones, and the GM-APD subsequently collects the scattered photons. During a single detection, the system operates in time-correlated single-photon counting (TCSPC) mode, as shown in Process A. The laser and the GM-APD are synchronized under the time-to-digital converter (TDC) clock. Simultaneously with pulse emission, the GM-APD opens its gate after a delay $t_d$. Operating in synchronized mode, the detector remains active for a gating duration $t_g$, while the TDC provides a temporal resolution of $t_b$. The TDC records the corresponding arrival time when a return photon triggers an avalanche event. Process B records the time-of-flight (TOF) values from 200 consecutively acquired frames. The vertical axis represents the index of consecutive frames, and the horizontal axis denotes the TOF of the current frame. Process C illustrates how repeated gating accumulates the arrival times of return photons, thereby generating statistical echo signals. A histogram is constructed by collecting a fixed number of frames, where the horizontal axis represents the time bin within the gating window and the vertical axis denotes the photon counts. Processing this histogram enables the extraction of key target information, including depth and relative reflectivity. The depth is derived from the timing of photon arrivals, which reflects the distance between the detector and the target. In contrast, reflectivity is a relative measure indicating the strength of the scene’s photon reflection capability.
For a single detection of the individual pixel, assuming the impulse response function (IRF) is Equation (1),
$$g(t) = N_0 \frac{t}{\tau^2} \exp\left(-\frac{t}{\tau}\right)$$
where $N_0$ denotes the number of photons emitted by the laser, and $\tau$ represents the laser pulse width.
After a delay $t_d$, the gate of duration $t_g$ opens, awaiting the arrival of photons. The probability of the $n$th bin being triggered under the long dead-time mode is [4,9,24]
$$h_m[n] \propto \varphi\!\left(n_t \,\middle|\, \eta \int_{n\Delta t}^{(n+1)\Delta t} r \cdot g\!\left(t - \frac{2z}{c}\right)\mathrm{d}t + n_t\right)$$
$$n_t = \eta n_a + \eta n_n + n_{da}$$
In Equations (2) and (3), $\eta$ represents the photon transmission efficiency, $r$ is the reflectivity, $n$ is the laser pulse number, $\varphi$ denotes a Poisson distribution, $z$ is the target distance, $c$ is the speed of light, $n_a$ represents the ambient-noise photon number, $n_n$ denotes the neighboring-noise photon number, and $n_{da}$ is the dark count.
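To make the model above concrete, the following Python sketch simulates a single-pixel histogram under the long dead-time mode: photon events are drawn per bin from the IRF of Equation (1) plus a uniform background, the first triggered bin is recorded in each frame, and untriggered frames fall into the last (tail) bin. All parameter names and default values here are illustrative assumptions, not the exact simulator used in this work.

```python
import numpy as np

def simulate_histogram(n_frames, n_bins=1000, n_signal=5.0, n_noise=0.5,
                       target_bin=450, tau=20.0, dark=0.01, rng=None):
    """Monte Carlo sketch of one GM-APD pixel (assumed parameterization)."""
    rng = rng or np.random.default_rng()
    t = np.arange(n_bins, dtype=float)
    # Equation (1): g(t) = N0 * t / tau^2 * exp(-t / tau), shifted to the target bin
    dt = t - target_bin
    irf = np.where(dt >= 0, dt / tau**2 * np.exp(-dt / tau), 0.0)
    rate = n_signal * irf / max(irf.sum(), 1e-12)     # mean signal photons per bin
    rate += (n_noise + dark) / n_bins                 # uniform background + dark counts
    hist = np.zeros(n_bins, dtype=int)
    for _ in range(n_frames):
        counts = rng.poisson(rate)                    # photon events in each bin
        fired = np.flatnonzero(counts)
        idx = fired[0] if fired.size else n_bins - 1  # first trigger wins; else tail bin
        hist[idx] += 1
    return hist
```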

2.2. PCA Algorithm Principle

During feature preprocessing, PCA was employed for dimensionality reduction. PCA, as a fundamental data processing technique in LiDAR signal analysis, has been extensively applied for feature extraction and dimensionality reduction. Previous studies have shown that incorporating PCA into LiDAR SLAM can significantly enhance mapping accuracy and overall robustness while maintaining real-time performance [25]. In addition, adaptive PCA-based clustering methods have been proposed to project 3D LiDAR point clouds into lower-dimensional subspaces, enabling efficient noise filtering and fine structural detail preservation, thereby improving both precision and computational efficiency [26]. Furthermore, PCA-based data fusion approaches have proven effective in integrating LiDAR structural features with multispectral information, achieving up to 95% classification accuracy in forest disturbance assessment using UAV LiDAR and multispectral datasets [27]. In GM-APD single-photon LiDAR applications, PCA is commonly employed for data preprocessing. By extracting principal components and analyzing the directional features of target echo signals, PCA enables accurate attitude estimation and dimensionality reduction, thereby improving tracking stability and computational efficiency [28].
In this paper, the MC simulations used a gating window of 1 μs with a temporal resolution of 1 ns, yielding 1000 bins. After excluding the final bin (tail data bin) in the gating window, the original feature dimensionality was 999. PCA was then applied to this feature space, with the number of components determined by preserving 95% of the cumulative explained variance. The retained dimensionalities under different SFNs are summarized in Table 1. As shown, fewer statistical frames required more components to achieve the same explained variance, indicating that sparser data contain less effective feature information. No additional hyperparameter optimization, such as cross-validation for component selection, was conducted at this stage and will be considered in future work.
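A minimal scikit-learn sketch of this preprocessing step is shown below; the stand-in data and the per-sample normalization are assumptions for illustration, while the tail-bin removal and the 95% variance criterion follow the text.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).poisson(1.0, size=(1000, 1000)).astype(float)  # stand-in histograms
X = X[:, :-1]                                        # drop the tail (last) bin -> 999 features
X /= np.maximum(X.sum(axis=1, keepdims=True), 1.0)   # per-sample normalization (assumed)
pca = PCA(n_components=0.95)                         # keep 95% cumulative explained variance
X_pca = pca.fit_transform(X)
print(pca.n_components_)                             # cf. Table 1 (e.g., 63 at SFN = 20,000)
```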

2.3. ML Model Principle

To investigate the applicability and performance of ML algorithms in classifying targets and background noise within GM-APD echo signals, this study evaluates six representative classifiers: DT, LDA, NN-BP, SVM, KNN, and LR. These algorithms exhibit respective advantages in feature extraction, nonlinear modeling, and generalization capabilities, making them suitable for complex scenarios characterized by high noise levels, nonlinear feature distributions, and limited training samples, which are commonly observed in GM-APD signal echoes. Their classification performance is evaluated through experiments, providing insights into algorithm selection and system design for GM-APD-based applications. A detailed description of each algorithm is presented as follows.

2.3.1. DT

The DT algorithm constructs a hierarchical tree structure by recursively partitioning the feature space based on information gain or Gini impurity. Each internal node represents a decision rule on a feature, and each leaf node corresponds to a predicted class. DT can model nonlinear decision boundaries and capture feature interactions without requiring data normalization. Its structure makes it suitable for analyzing the discriminative characteristics of GM-APD echo signals. Previous studies have successfully applied the DT algorithm to airborne LiDAR target extraction tasks, demonstrating its efficiency and interpretability in complex scenes [29]. However, it may suffer from overfitting when applied to small or noisy datasets. Information Gain is a fundamental criterion for determining the optimal attribute to split nodes when constructing a DT.
$$IG(D, A) = \mathrm{Ent}(D) - \sum_{v \in \mathrm{Values}(A)} \frac{|D_v|}{|D|} \cdot \mathrm{Ent}(D_v)$$
where $\mathrm{Ent}(D) = -\sum_{k=1}^{K} p_k \log_2 p_k$ is the entropy of dataset $D$, and $D_v$ is the subset where attribute $A$ takes value $v$. This formula is used for attribute selection in DT.
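A short sketch of this criterion, assuming discrete attribute values:

```python
import numpy as np

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(y, a):
    """IG(D, A) = Ent(D) - sum_v |D_v|/|D| * Ent(D_v) over values v of attribute a."""
    ig = entropy(y)
    for v in np.unique(a):
        mask = a == v
        ig -= mask.mean() * entropy(y[mask])
    return ig

y = np.array([1, 1, 0, 0, 1, 0])   # toy labels
a = np.array([0, 0, 0, 1, 1, 1])   # toy binary attribute
print(information_gain(y, a))
```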

2.3.2. LDA

LDA is a linear classification method that projects high-dimensional data onto a lower-dimensional space where class separability is maximized. It assumes that the data from each class follows a Gaussian distribution with the same covariance matrix. LDA is particularly effective when class distributions are approximately linear, and the number of training samples is limited. Previous studies have demonstrated that LDA is effective for LiDAR-based target classification tasks. For instance, it has been successfully applied to distinguish buildings from non-building planar surfaces in vegetated urban areas, achieving an accuracy of up to 95% [30]. In GM-APD signal classification, LDA can reduce feature dimensionality while preserving class-discriminative information, thus improving computational efficiency and classification robustness. The principle of LDA is based on maximizing the ratio of between-class scatter to within-class scatter.
$$w^* = \arg\max_{w} \frac{w^{T} S_B w}{w^{T} S_W w}$$
where $S_B$ and $S_W$ represent the between-class and within-class scatter matrices, respectively. This objective is used in LDA to maximize class separability.
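For the two-class case relevant here, the maximizer has the closed form $w^* \propto S_W^{-1}(m_1 - m_0)$; a minimal sketch with toy data follows (the small ridge term is an assumption for numerical stability):

```python
import numpy as np

def fisher_direction(X, y):
    """Two-class Fisher LDA: w* proportional to Sw^{-1} (m1 - m0)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) * (len(X0) - 1) + np.cov(X1, rowvar=False) * (len(X1) - 1)
    w = np.linalg.solve(Sw + 1e-6 * np.eye(Sw.shape[0]), m1 - m0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(2, 1, (50, 3))])
y = np.repeat([0, 1], 50)
print(fisher_direction(X, y))   # projection direction maximizing class separability
```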

2.3.3. NN-BP

The NN-BP model is a multilayer feed-forward network trained using error backpropagation. It consists of input, hidden, and output layers and uses gradient descent to minimize the loss function. NN-BPs have been increasingly applied to LiDAR signal processing, particularly for photon-counting point cloud denoising and classification. Recent studies have demonstrated that NN-BP-based models can effectively suppress background noise and improve the accuracy of interpreting single-photon LiDAR data, achieving F-scores up to 0.977 under strong noise conditions [31].
NN-BP can approximate complex nonlinear relationships between features and class labels, making it suitable for GM-APD signals with highly nonlinear characteristics and noise interference. Its performance largely depends on network structure, learning rate, and training sample size. As noted in Section 1, we used NN-BP with two-layer (NN-BP-2) and three-layer (NN-BP-3) architectures. The core update rule of the error backpropagation algorithm, typically implemented via gradient descent, is formulated as follows:
$$w_{ij}(t+1) = w_{ij}(t) - \eta \cdot \frac{\partial E}{\partial w_{ij}}$$
This is the weight update rule in NN-BP networks, where $\eta$ is the learning rate and $\partial E / \partial w_{ij}$ is the gradient of the loss function with respect to weight $w_{ij}$.
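One gradient step of this rule, for a one-hidden-layer network with a sigmoid output and cross-entropy loss, can be sketched as follows (the architecture and sizes are illustrative assumptions):

```python
import numpy as np

def bp_step(W1, W2, x, y, lr=0.01):
    """One update w <- w - eta * dE/dw for a one-hidden-layer net (sketch)."""
    h = np.tanh(W1 @ x)                     # hidden activation
    p = 1.0 / (1.0 + np.exp(-(W2 @ h)))     # sigmoid output probability
    d2 = p - y                              # dE/d(logit) for cross-entropy loss E
    d1 = (W2.T @ d2) * (1.0 - h**2)         # backpropagated through tanh
    W2 -= lr * np.outer(d2, h)              # w_ij(t+1) = w_ij(t) - eta * dE/dw_ij
    W1 -= lr * np.outer(d1, x)
    return W1, W2

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 4)), rng.normal(size=(1, 8))
W1, W2 = bp_step(W1, W2, x=rng.normal(size=4), y=1.0)
```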

2.3.4. SVM

SVM is a supervised learning model that seeks to find the optimal hyperplane that maximizes the margin between different classes. For nonlinearly separable data, SVM utilizes kernel functions to map the input space into a higher-dimensional space where linear separation is possible. SVM performs well on high-dimensional and small-sample-size datasets and is robust to noise and outliers. In single-photon LiDAR applications, SVM-based classifiers have been employed to distinguish target and background photons, effectively reducing boundary blur and improving depth reconstruction accuracy [32]. These characteristics make it an appropriate choice for classifying GM-APD echo signals with limited annotated data and complex background interference. In this paper, we considered three kernel functions for SVM-L (Linear), SVM-2 (quadratic), and SVM-3 (cubic). The optimal hyperplane in SVM is obtained by solving the following constrained optimization problem:
$$\min_{w, b} \; \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i\left(w^{T} x_i + b\right) \geq 1$$
This is the primal form of the SVM optimization problem for finding the maximum-margin hyperplane.
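In practice the dual of this problem is solved with a kernel; the three kernels used in this paper map directly onto scikit-learn settings, as in this sketch with stand-in data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 5)), rng.integers(0, 2, 200)   # stand-in features/labels
models = {"SVM-L": SVC(kernel="linear"),                    # linear kernel
          "SVM-2": SVC(kernel="poly", degree=2),            # quadratic kernel
          "SVM-3": SVC(kernel="poly", degree=3)}            # cubic kernel
for name, m in models.items():
    print(name, m.fit(X[:160], y[:160]).score(X[160:], y[160:]))
```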

2.3.5. KNN

KNN is a non-parametric, instance-based learning algorithm that classifies a sample based on the majority class of its K nearest neighbors in the feature space. It does not require a training phase, which makes it simple to implement. KNN is effective when the samples of each class are locally clustered, and the feature space is well-represented. In photon-counting LiDAR, KNN-based methods distinguish target photons from background noise by analyzing local Euclidean distances, achieving F-scores of 0.97–0.99 across varying noise levels [33]. In the context of GM-APD signal classification, KNN can leverage the similarity of local feature patterns to distinguish between target and background responses. The Euclidean distance formula, commonly used in KNN, is expressed as:
$$d(x, x_i) = \sqrt{\sum_{j=1}^{n} \left(x_j - x_{ij}\right)^2}$$
This is the Euclidean distance used in the KNN algorithm to measure the similarity between a test sample $x$ and a training sample $x_i$.
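A direct sketch of this rule (majority vote over the k smallest Euclidean distances):

```python
import numpy as np

def knn_predict(x, X_train, y_train, k=5):
    """Classify x by majority vote among its k nearest training samples."""
    d = np.sqrt(((X_train - x) ** 2).sum(axis=1))   # Euclidean distances d(x, x_i)
    nearest = y_train[np.argsort(d)[:k]]
    return np.bincount(nearest).argmax()

X_tr = np.array([[0.0], [0.1], [1.0], [1.1]])
y_tr = np.array([0, 0, 1, 1])
print(knn_predict(np.array([0.9]), X_tr, y_tr, k=3))   # -> 1
```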

2.3.6. LR

LR is a linear probabilistic model used for binary classification. It models the relationship between input features and the probability of a class using the logistic sigmoid function. Despite its simplicity, LR provides robust performance when the data exhibits a linear decision boundary. Additionally, LR outputs class probabilities, which can be helpful for confidence-based post-processing. Recent studies have shown that LR can effectively classify geomorphological features such as sinkholes from LiDAR-derived elevation data, achieving an AUC of 0.90 and demonstrating strong reliability in complex terrain analysis [34]. In GM-APD signal classification, LR is a lightweight baseline method with fast training speed and high interpretability. The sigmoid function is used in LR to model the probability that a given input belongs to the positive class, as defined by:
$$P(y = 1 \mid x) = \frac{1}{1 + \exp\left(-\left(w^{T} x + b\right)\right)}$$
This is the logistic function used in LR to compute the probability of the positive class.
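Equivalently, in code (the weights here are arbitrary, for illustration only):

```python
import numpy as np

def predict_proba(X, w, b):
    """P(y=1|x) = 1 / (1 + exp(-(w^T x + b))); threshold at 0.5 for the class label."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

w, b = np.array([2.0, -1.0]), 0.5
X = np.array([[1.0, 0.0], [0.0, 3.0]])
print(predict_proba(X, w, b))
```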

2.4. GM-APD Simulated Dataset

Based on Equation (1), MC simulation was employed to generate photon-level datasets for target and noise pixels. Unlike the fixed SNR configuration used in [5,35], approximately 8000 target echoes and 8000 noise echoes were produced with randomly varying SNR values determined by signal intensity, target distance, and ambient light intensity. These data were split into training and testing sets at a 9:1 ratio. The simulated scenario corresponds to sky-background observation, where detector echoes consist solely of two classes: target echoes and sky-background noise echoes. Table 2 summarizes the simulation parameters, which include photon counts for background noise and laser echoes, target location, temporal characteristics of the laser pulse, and the dark count rate within the time-gating window. Different configurations are specified for target and noise pixels to reproduce signal and background conditions realistically under natural environments. In the present simulation, only clear-weather conditions are considered; factors such as cloud cover and fog, which can affect laser transmission, are not included. The simulation process models the laser and background-light echoes arriving at the target surface, independent of target reflectivity, and generates target laser echoes with varying SNRs.
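Reusing the simulate_histogram sketch from Section 2.1, the dataset assembly described above can be approximated as follows; the uniform ranges mirror Table 2, but the frame count and the split call are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
samples, labels = [], []
for label, is_target in ((1, True), (0, False)):
    for _ in range(8000):                       # ~8000 echoes per class
        h = simulate_histogram(
            n_frames=2000,
            n_signal=rng.uniform(0.01, 10.01) if is_target else 0.0,
            n_noise=rng.uniform(0.01, 1.01),
            target_bin=int(rng.integers(1, 900)),
            rng=rng)
        samples.append(h)
        labels.append(label)
X, y = np.asarray(samples), np.asarray(labels)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, stratify=y)  # 9:1 split
```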

2.5. Temporal Tail Data

When the GM-APD operates in the long dead-time mode, especially under low-light conditions, many frames contain no triggered event, which produces a tail peak in the accumulated histogram.
For a specified number of frames, histograms are constructed by accumulating photon-triggered events. Owing to the inherent triggering characteristics of GM-APD detectors, if no avalanche event occurs within a frame, the recorded value defaults to the last time bin of the gating window. Because non-triggering events occur frequently, the accumulated histogram—where the horizontal axis denotes the time bins within the gating window and the vertical axis represents the counts in each bin—exhibits a distinct tail peak at the end of the window. This study defines these data as tail data confined to the final few bins. In the MC simulations, this corresponds to the last bin of the gating window. As illustrated in Figure 2b,d, both background noise and target echoes produce such tail peaks at the end of their histograms. From a detector perspective, this phenomenon is further influenced by the readout circuitry, which forces untriggered events into the final bins. Consequently, the tail data possess highly distinctive characteristics, making them relatively easy to identify and filter.
Rather than discarding the tail peak, we treat it as an explicit feature and examine its impact on classification results.

2.6. Algorithm Principle

As shown in Figure 3, the simulated dataset is produced by a three-stage pipeline (processes A–C) and then used for model development (processes D–E). In process A, scene parameters and photon statistics are sampled to synthesize target and background echoes. Process B aggregates consecutively acquired frames to form per-frame histograms/features, and process C assigns labels and compiles the final training/testing sets. Before feeding data to the classifiers, we apply PCA to extract low-dimensional, decorrelated features and mitigate noise. Model training (process D) is conducted with stratified five-fold cross-validation to avoid data leakage and to obtain robust estimates. Each fold uses identical preprocessing parameters learned from the training split only, and the procedure is repeated across all folds. Process E performs final fitting on the complete training data and evaluates the selected model on the held-out test set, ensuring a fair, reproducible assessment under the same preprocessing and hyperparameter settings.
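The leakage-free arrangement described above corresponds to placing PCA inside a cross-validated pipeline, so the projection is refit on each training fold only; a scikit-learn sketch with stand-in data:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = rng.normal(size=(400, 999)), rng.integers(0, 2, 400)   # stand-in data
pipe = Pipeline([("pca", PCA(n_components=0.95)),             # fit on training folds only
                 ("clf", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300))])
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipe, X, y, cv=cv)                   # stratified five-fold CV
print(scores.mean(), scores.std())
```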

2.7. Model Evaluation Metrics

This study employs several metrics to comprehensively evaluate and compare model performance and accuracy, including accuracy, precision, recall, FPR, FNR, F1-score, Kappa coefficient, and RCP. The specific calculations are as follows:
$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Pre} = \frac{TP}{TP + FP}$$
$$\mathrm{Rec} = \frac{TP}{TP + FN}$$
$$\mathrm{FPR} = \frac{FP}{FP + TN}$$
$$\mathrm{FNR} = \frac{FN}{FN + TP}$$
$$F_1 = \frac{2 \cdot \mathrm{Pre} \times \mathrm{Rec}}{\mathrm{Pre} + \mathrm{Rec}}$$
$$\mathrm{Kappa} = \frac{\mathrm{Acc} - p_e}{1 - p_e}$$
$$p_e = \frac{a_1 \cdot b_1 + a_2 \cdot b_2}{N \cdot N}$$
$$\delta_{i,N} = \frac{\mathrm{Acc}_{\mathrm{test}} - \mathrm{Acc}_{\mathrm{train}}}{\mathrm{Acc}_{\mathrm{train}}} \times 100\%$$
The physical meanings of $TP$, $FP$, $FN$, and $TN$ are provided in Table 3. The actual numbers of samples in the two classes are $a_1$ and $a_2$, the predicted numbers are $b_1$ and $b_2$, and $N$ is the total number of samples.
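All eight metrics follow directly from the confusion-matrix entries; a compact sketch (the counts in the usage line are illustrative):

```python
def metrics(tp, fp, fn, tn, acc_train=None):
    n = tp + fp + fn + tn
    acc = (tp + tn) / n
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    fpr = fp / (fp + tn)
    fnr = fn / (fn + tp)
    f1 = 2 * pre * rec / (pre + rec)
    a1, a2, b1, b2 = tp + fn, fp + tn, tp + fp, fn + tn   # actual / predicted class totals
    pe = (a1 * b1 + a2 * b2) / (n * n)                    # chance agreement
    kappa = (acc - pe) / (1 - pe)
    rcp = None if acc_train is None else (acc - acc_train) / acc_train * 100  # RCP in %
    return acc, pre, rec, fpr, fnr, f1, kappa, rcp

print(metrics(80, 5, 10, 65, acc_train=0.93))
```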

3. Experimental and Results Analysis

3.1. Performance Under Different Feature Extraction Methods

This study conducts an in-depth investigation into the impact of four feature-processing strategies, defined by whether PCA is applied (P vs. NP) and whether tail-end features are retained (T vs. NT). Accordingly, four data-preprocessing schemes are formed: PT, NPT, PNT, and NPNT. Figure 4 provides a detailed representation of the ML models’ performance under different feature processing strategies and SFNs. First, from an accuracy perspective, as the SFN exceeds 2000 frames, the accuracy of models processed with the PNT gradually surpasses that of others. Second, regarding training time, appropriate strategies like PNT can effectively reduce the duration, as depicted in Figure 5, where models based on the PNT feature processing method tend to have shorter training times across different SFNs.
Empirical observations indicate that the tail data do not yield a meaningful improvement in classification performance. The underlying reason is that the discriminative power of GM-APD echoes primarily lies in the local waveform characteristics around the target-return region. In this study, we adopt per-sample normalization to emphasize these local patterns. However, when the tail values are disproportionately large, they shift the global scaling and cause the normalized distribution to be dominated by tail features, thereby suppressing the representation of other salient cues. Consequently, the model’s feature extraction becomes constrained and fails to capture the informative characteristics of the target-return segment effectively. Hence, although the tail segment can be regarded as an independent feature, it does not substantively enhance the overall classification capability.
In summary, PNT emerges as the optimal feature-processing strategy. Thus, we mainly focus on it in the subsequent analysis.

3.2. Performance Analysis Under Different SFNs

In the subsequent research, we explore ML algorithms’ accuracy under different SFNs. SFN significantly impacts the sparsity of GM-APD signals. As illustrated in Table 4, we utilize the average density of non-zero elements [36] (ADNZE) to evaluate the sparsity of the echo signal. When the SFN is 100, the ADNZE is 0.0791, increasing to 0.8409 when the SFN rises to 20,000. As the SFN grows, the feature information of the GM-APD signals becomes progressively richer.
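ADNZE is simply the fraction of non-zero bins averaged over samples; a one-line sketch with stand-in histograms:

```python
import numpy as np

def adnze(H):
    """Average density of non-zero elements of histograms H (n_samples, n_bins)."""
    return float((H != 0).mean())

H = np.random.default_rng(0).poisson(0.1, size=(100, 1000))   # stand-in sparse histograms
print(adnze(H))   # cf. Table 4: 0.0791 at SFN = 100 up to 0.8409 at SFN = 20,000
```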
Combining the data analysis of Figure 4 and Table 5, we observe that as SFN increases, the accuracy of most models shows an upward trend. For instance, when the SFN is 100, NN-BP-2 achieves the highest training accuracy of 0.6259; when the SFN increases to 20,000, its training accuracy rises to 0.9137, again the highest. Furthermore, as observed in Table 6 and Table 7, the evaluation metrics of most algorithms improve with increasing SFN. For example, NN-BP-2’s precision increases from 0.5613 to 0.9384 and its recall from 0.5662 to 0.8962; simultaneously, both the FNR and FPR decrease, while the F1-score and Kappa coefficient grow, all indicating a significant improvement in model performance. In addition, combined with Table 5, the analysis shows that DT consistently performs well in terms of stability and efficiency across different SFNs.
We analyzed F1-score and Kappa across models using the raw F1/Kappa matrices (Table 6 and Table 7) and summarized model-wise means with 95% confidence intervals (Table 8 and Table 9). We applied non-parametric tests because the sample size per model is small (eight SFN levels) and normality is not guaranteed. A Friedman test with SFN as blocks revealed significant overall differences for both F1 ($\chi^2(8) = 18.83$, $p = 0.0158$; Kendall’s $W = 0.294$) and Kappa ($\chi^2(8) = 24.14$, $p = 0.0022$; $W = 0.377$). The confidence interval results further quantified performance, showing that NN-BP-2, NN-BP-3, and LR achieved the highest mean F1 values (≈0.72–0.73), while DT and SVM-2 performed best on Kappa (≈0.47–0.53).
To further examine specific model contrasts, pairwise Wilcoxon signed-rank tests (Table 10) indicated that NN-BP-3 vs. KNN yielded p = 0.0078 for both F1 and Kappa before adjustment. Although conservative Holm corrections attenuated these differences (adjusted p > 0.05), bootstrap analysis confirmed their robustness: NN-BP-3 exceeded KNN by 0.285 in F1 [0.200, 0.364] and by 0.141 in Kappa [0.104, 0.181]. These complementary tests ensure that the observed differences are not incidental but reflect meaningful and statistically supported performance gaps. Overall, the results highlight that NN-BP-based models offer a clear and practically relevant advantage over baseline methods such as KNN across SFNs, even if some pairwise contrasts do not survive stringent multiplicity corrections.

3.3. Robustness Analysis Under Echo SNR

The target echo signal-to-noise ratio (ESNR) represents the ratio of the target signal to the background signal when the echo arrives at the detector surface. ESNR varies with time and environmental conditions; therefore, evaluating ML performance under different ESNRs is a critical indicator of model classification accuracy and robustness. In this section, we investigate the classification performance of ML models under three representative statistical frame numbers (SFNs: 400, 2000, and 20,000). Here, the number of target echoes and background noise signals is kept consistent, and the statistical distribution of the ESNR of target echoes is used to analyze model performance. The subsequent analysis particularly emphasizes binary classification between low-ESNR target echoes and background noise echoes.
Combining the data analysis of Table 11 and Figure 6, we observe that under low ESNR (0–0.1) conditions, the performance of most algorithms declines. However, NN-BP-based algorithms exhibit exceptional performance within the ESNR range of 0–0.05. Under medium ESNR (0.1–0.5) conditions, the performance of algorithms generally improves, with LR and SVM particularly standing out in the ESNR interval of 0.3–0.5, achieving a value close to 1. In high ESNR (above 0.5) environments, the accuracy of all algorithms approaches or reaches 1, indicating that they can effectively handle high SNR data. However, consistent with the conclusion in Section 3.2, when the SFN is relatively low, the ESNR will decrease further, leading to a decline in the classification accuracy of each model. Overall, the NN-BP-based algorithm demonstrates robust performance under various ESNR conditions.
Figure 7 presents the Acc of different models across varying ESNRs under three representative SFNs: (a) SFN = 400, (b) SFN = 2000, and (c) SFN = 20,000. Overall, all models experience a pronounced drop in accuracy at extremely low ESNRs (e.g., ESNR < 0.01), followed by a gradual recovery as ESNR increases. A key turning point is observed when ESNR exceeds approximately 0.05, where BP-based models (NN-BP-2 and NN-BP-3) outperform other algorithms, particularly under the low-SFN scenario shown in Figure 7a. This superiority arises from the nonlinear fitting capacity and multilayer feature representation of NN-BPs, which enable more effective extraction of weak target signals from noisy data. By contrast, linear models such as LDA and LR exhibit limited adaptability, resulting in severe degradation under low-ESNR conditions. As the SFN increases to 2000 and 20,000 (Figure 7b,c), model performance converges, and the advantage of NN-BP-based models diminishes, indicating that larger statistical frames reduce noise sensitivity and mitigate differences across algorithms. These findings demonstrate the coupled influence of ESNR and SFN on classification performance and highlight the suitability of NN-BPs for low-SNR and small-sample scenarios.

3.4. Model Stability Analysis

Figure 8 illustrates the analysis of training accuracy, testing accuracy, and their relative stability for ML models under three different SFNs (400, 2000, 20,000). The smaller the RCP value, the stronger the model’s stability. The results indicate that the NN-BP-3 model exhibits smaller RCP values across all frame counts, especially when SFN is 20,000, where its RCP value is only 0.0111. Conversely, the SVM-3 model has the most significant RCP value of 0.1937 among all models when SFN is 20,000, suggesting lower stability under this condition.

3.5. Lightweight ResNet on NN-BP Backbone: Gain Verification on GM-APD LiDAR Signals

Building on the preceding results, NN-BP-type networks already exhibit a clear advantage for classifying GM-APD LiDAR data. To further test whether an NN-BP backbone can yield additional gains, we augment the NN-BP architecture with 1-D convolutions and residual connections, forming a lightweight ResNet-style model tailored to the temporal characteristics of GM-APD signals.
The NN-BP-enhanced ResNet is illustrated in Figure 9. To accommodate the long 1D sequences within the GM-APD gating window, the network first applies Conv(7) + Batch Normalization (BN) + Rectified Linear Unit (ReLU) to capture long-range temporal dependencies, followed by a pooling layer for local noise suppression and mild down-sampling. The backbone then stacks five residual blocks (Rs_Block). In each block, the main branch adopts Conv(3)–BN–ReLU–Conv(3)–BN, while the shortcut branch uses Conv(1)–BN for channel/length alignment; the two branches are summed element-wise and passed through ReLU. This design deepens the network, enlarges the receptive field, and mitigates gradient vanishing without incurring excessive optimization burden. After the backbone, a ReLU and global feature aggregation produce a compact representation, which is finally fed to the NN-BP head for classification output.
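A PyTorch sketch of this architecture is given below; the channel widths, the hidden size of the NN-BP head, and the pooling choices are assumptions, while the Conv(7) stem, the five residual blocks with Conv(3)/Conv(1) branches, and the MLP head follow Figure 9.

```python
import torch
import torch.nn as nn

class RsBlock(nn.Module):
    """Residual block: Conv(3)-BN-ReLU-Conv(3)-BN main branch, Conv(1)-BN shortcut."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv1d(c_in, c_out, 3, padding=1), nn.BatchNorm1d(c_out), nn.ReLU(),
            nn.Conv1d(c_out, c_out, 3, padding=1), nn.BatchNorm1d(c_out))
        self.short = nn.Sequential(nn.Conv1d(c_in, c_out, 1), nn.BatchNorm1d(c_out))

    def forward(self, x):
        return torch.relu(self.main(x) + self.short(x))   # element-wise sum, then ReLU

class GMAPDResNet(nn.Module):
    """Conv(7) stem -> pooling -> 5 residual blocks -> global pooling -> NN-BP head."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv1d(1, 16, 7, padding=3), nn.BatchNorm1d(16), nn.ReLU(), nn.MaxPool1d(2))
        self.blocks = nn.Sequential(*[RsBlock(16 if i == 0 else 32, 32) for i in range(5)])
        self.pool = nn.AdaptiveAvgPool1d(1)                 # global feature aggregation
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32, 64), nn.ReLU(),
                                  nn.Linear(64, n_classes))  # NN-BP classification head

    def forward(self, x):                                   # x: (batch, 1, n_bins)
        return self.head(self.pool(torch.relu(self.blocks(self.stem(x)))))

model = GMAPDResNet()
print(model(torch.randn(4, 1, 1000)).shape)                 # torch.Size([4, 2])
```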
Table 12 compares ResNet with the best-performing method in Table 6 and Table 7 at 400, 2000, and 20,000 frames—DT for 400 and 2000 and NN-BP-2 for 20,000. Results show that the NN-BP-based ResNet consistently outperforms the corresponding baselines, with relative improvements of 12.5% (400), 9.16% (2000), and 2.79% (20,000). These findings indicate that introducing convolutional and residual units on top of the NN-BP prior structure captures local and cross-scale patterns in GM-APD signals more effectively, while still providing steady—though diminishing—marginal gains at high frame counts. Accordingly, subsequent work will focus on NN-BP-based architectural optimization, incorporating the physical characteristics of GM-APD signals (e.g., pulse statistics and background-noise signatures) to further enhance robustness in complex spatial backgrounds.

3.6. Model Computational Complexity Analysis

Table 13 reports efficiency and model size under the PNT strategy for three representative accumulation frame counts (400, 2000, and 20,000). All metrics are computed on 4096 samples and include the total test time, the average per-sample latency, and the imaging frame rate converted from the total time. Model size is measured in FP32 (4 bytes per parameter) and includes the PCA projection parameters. Note that ResNet does not use PCA, and thus, its parameter count remains unchanged across frame settings.
In terms of efficiency, NN-BP-based models consistently deliver lower latency and higher frame rates across all three accumulation settings. Taking NN-BP-2 as an example, the average per-sample latency is 0.067, 0.047, and 0.025 ms for SFN = 400, 2000, and 20,000, respectively. Assuming 4096 samples constitute one frame (corresponding to a 64 × 64 pixel GM-APD array), these latencies translate to frame rates of 3.64, 5.15, and 9.69 Hz, respectively. Under the same settings, NN-BP-3 attains 1.17, 4.61, and 6.93 Hz. By contrast, ResNet achieves only 0.14–0.15 Hz, indicating that the current convolution–residual stacking is not amenable to real-time processing.
Regarding model size, increasing the frame count does not enlarge the models. Most PCA-based methods become smaller as the frame count rises. For example, the size of NN-BP-2 decreases from 4.820 MB to 1.619 MB and then to 0.259 MB. This is because higher frame counts produce more stable echo statistics and less sparsity, so fewer principal components are needed and the projection matrix becomes smaller. ResNet does not use PCA, so its size remains 0.380 MB across all frame counts.
Taken together with Table 12 and Table 13, although the NN-BP-based ResNet substantially improves accuracy, its computational cost and parameter footprint also increase, resulting in reduced real-time capability. These observations suggest that subsequent work should prioritize efficiency optimization for NN-BP-type networks, for example, by pursuing light-weight architectures, operator fusion, and low-bit quantization while maintaining accuracy.

4. Conclusions

This study proposed a complete data-processing framework for GM-APD LiDAR echo signals and systematically assessed nine ML models derived from six baseline algorithms. Feature extraction was optimized through PCA, and among the candidate schemes, the PNT strategy emerged as the optimal feature-processing method. This framework provides a novel and feasible approach for GM-APD data classification, which, to our knowledge, has not been explicitly formulated in prior work. Building upon this framework, MC simulations were carried out under varying ESNR and SFN conditions. The results showed that NN-BP-based models (NN-BP-2 and NN-BP-3) performed best in low-SNR and small-sample regimes. LR and LDA were fast but less robust, DT achieved good stability, and SVM models yielded competitive accuracy but at a higher computational cost. Building on these findings, we further introduce an NN-BP-based ResNet that achieves additional test-accuracy gains at typical frame counts. Complemented by bootstrap confidence intervals, statistical significance analysis using Friedman and Wilcoxon tests confirmed that the observed differences among models are statistically meaningful.
In addition to accuracy and robustness, practical deployment must also balance computational efficiency and hardware deployability. Lightweight models (LR/LDA) offer the fastest inference and the smallest parameter footprint, making them suitable for embedded or resource-constrained platforms, albeit with limited noise resilience. NN-BP models provide stronger overall accuracy and stability with an acceptable runtime overhead, fitting scenarios that require both real-time performance and high accuracy. By contrast, SVM and KNN incur higher computational and memory costs. Although ResNet attains the highest accuracy, it bears the most significant computational load and the lowest throughput, thus being more appropriate for offline analysis or high-compute platforms. Moreover, empirical timing across models trained with different SFNs shows that increasing SFN reduces sample sparsity, thereby lowering the number of principal components needed by PCA and, paradoxically, improving efficiency for PCA-based methods; SFN has no noticeable effect on ResNet, which does not rely on PCA. Finally, this study considers clear-sky conditions only; more complex atmospheres (e.g., fog or cloud) would further increase the computational burden, underscoring the need to jointly evaluate algorithmic performance, time complexity, memory footprint, and hardware compatibility in real GM-APD systems.
Looking ahead, methodological generalization to real-scene datasets is planned through a staged route encompassing hardware finalization and calibration, multi-scenario data acquisition with standardized and traceable records, ground-truth construction using surveyed references, and protocol/metric standardization for fair comparison. Further directions include cross-dataset generalization and sim-to-real transfer studies, the inclusion of representative deep-learning baselines (e.g., Transformers, MLPs), and the exploration of lightweight or hardware-adaptive solutions (pruning, quantization, operator fusion, on-device deployment) to jointly optimize accuracy, robustness, and computational efficiency. Extending the simulation suite to atmospheric conditions beyond clear weather (e.g., fog, haze, turbulence) will enhance realism. Collectively, these efforts are expected to advance GM-APD LiDAR signal processing toward high-precision, resource-efficient, and real-time applications.

Author Contributions

Conceptualization, J.S. and H.N.; methodology, H.N. and D.L.; software, H.N.; validation, H.N. and X.Z. (Xin Zhou); formal analysis, H.N. and X.Z. (Xin Zhang); investigation, H.N. and J.C.; data curation, H.N. and W.L.; writing—original draft preparation, H.N.; writing—review and editing, H.N. and S.L.; visualization, H.N. and X.Z. (Xin Zhou); supervision, J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets presented in this article are not readily available because the data are part of an ongoing study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhou, S.L.; Xu, H.; Zhang, G.H.; Ma, T.W.; Yang, Y. Leveraging Deep Convolutional Neural Networks Pre-Trained on Autonomous Driving Data for Vehicle Detection From Roadside LiDAR Data. IEEE Trans. Intell. Transp. Syst. 2022, 23, 22367–22377.
2. Hua, Z.Y.; Xu, S.; Liu, Y.A. Individual Tree Segmentation from Side-View LiDAR Point Clouds of Street Trees Using Shadow-Cut. Remote Sens. 2022, 14, 5742.
3. Li, C.; Cheng, N.; Zhao, H.; Yu, T.C. Multiple-Beam LiDAR Detection Technology. In Proceedings of the Seventh Asia Pacific Conference on Optics Manufacture and 2021 International Forum of Young Scientists on Advanced Optical Manufacturing (APCOM and YSAOM 2021), Hong Kong, 13–16 August 2021; p. 12166.
4. Ma, L.; Sun, J.F.; Jiang, P.; Liu, D.; Zhou, X.; Wang, Q. Signal Extraction Algorithm of Gm-APD LiDAR with Low SNR Return. Optik 2020, 206, 164340.
5. Wang, M.Q.; Sun, J.F.; Li, S.N.; Lu, W.; Zhou, X.; Zhang, H.L. A Photon-Number-Based Systematic Algorithm for Range Image Recovery of GM-APD LiDAR under Few-Frames Detection. Infrared Phys. Technol. 2022, 125, 104267.
6. Zhang, Y.B.; Li, S.N.; Sun, J.F.; Liu, D.; Zhang, X.; Yang, X.H.; Zhou, X. Dual-Parameter Estimation Algorithm for Gm-APD LiDAR Depth Imaging through Smoke. Measurement 2022, 196, 111269.
7. Zhang, X.; Sun, J.; Li, S.; Zhang, Y.; Liu, D.; Zhang, H. Research on the Detection Probability Curve Characteristics of Long-Range Target Based on SPAD Array LiDAR. Infrared Phys. Technol. 2022, 126, 104325.
8. Zhang, X.; Li, S.; Sun, J.; Zhang, Y.; Liu, D.; Yang, X.; Zhang, H. Target Edge Extraction for Array Single-Photon LiDAR Based on Echo Waveform Characteristics. Opt. Laser Technol. 2023, 167, 109736.
9. Liu, D.; Sun, J.F.; Gao, S.; Ma, L.; Jiang, P.; Guo, S.H.; Zhou, X. Single-Parameter Estimation Construction Algorithm for Gm-APD Ladar Imaging through Fog. Opt. Commun. 2021, 482, 126558.
10. Fan, T.; Qiu, S.; Wang, Z.; Zhao, H.; Jiang, J.; Wang, Y.; Zhou, X. A New Deep Convolutional Neural Network Incorporating Attentional Mechanisms for ECG Emotion Recognition. Comput. Biol. Med. 2023, 159, 106938.
11. Khan, F.; Yu, X.; Yuan, Z.; Rehman, A.U. ECG Classification Using 1-D Convolutional Deep Residual Neural Network. PLoS ONE 2023, 18, 284791.
12. Raza, A.; Mehmood, A.; Ullah, S.; Ahmad, M.; Choi, G.S.; On, B.-W. Heartbeat Sound Signal Classification Using Deep Learning. Sensors 2019, 19, 4819.
13. Zhong, M.; Castellote, M.; Dodhia, R.; Lavista Ferres, J.; Keogh, M.; Brewer, A. Beluga Whale Acoustic Signal Classification Using Deep Learning Neural Network Models. J. Acoust. Soc. Am. 2020, 147, 1834–1841.
14. Yang, Y.; Fu, P.; He, Y. Bearing Fault Automatic Classification Based on Deep Learning. IEEE Access 2018, 6, 71540–71554.
15. Gokgoz, E.; Subasi, A. Comparison of Decision Tree Algorithms for EMG Signal Classification Using DWT. Biomed. Signal Process. Control 2015, 18, 138–144.
16. Subasi, A.; Ercelebi, E. Classification of EEG Signals Using Neural Network and Logistic Regression. Comput. Methods Programs Biomed. 2005, 78, 87–99.
17. Raj, S.; Ray, K.C. ECG Signal Analysis Using DCT-Based DOST and PSO Optimized SVM. IEEE Trans. Instrum. Meas. 2017, 66, 470–478.
18. Sha’Abani, M.; Fuad, N.; Jamal, N.; Ismail, M. kNN and SVM Classification for EEG: A Review. In Proceedings of the 5th International Conference on Electrical, Control & Computer Engineering, Kuantan, Malaysia, 29 July 2019; Springer: Singapore, 2020; pp. 555–565.
19. Kim, K.S.; Choi, H.H.; Moon, C.S.; Mun, C.W. Comparison of k-Nearest Neighbor, Quadratic Discriminant and Linear Discriminant Analysis in Classification of Electromyogram Signals Based on the Wrist-Motion Directions. Curr. Appl. Phys. 2011, 11, 740–745.
20. Khandetsky, V.; Antonyuk, I. Signal Processing in Defect Detection Using Back-Propagation Neural Networks. NDT&E Int. 2002, 35, 483–488.
21. Fushiki, T. Estimation of Prediction Error by Using K-Fold Cross-Validation. Stat. Comput. 2011, 21, 137–146.
22. Kraemer, H.C. Kappa Coefficient. In Wiley StatsRef: Statistics Reference Online; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2014; pp. 1–4.
23. Li, X.; Luan, F.; Wu, Y. A Comparative Assessment of Six Machine Learning Models for Prediction of Bending Force in Hot Strip Rolling Process. Metals 2020, 10, 685.
24. Zhou, X.; Sun, J.F.; Jiang, P.; Liu, D.; Shi, X.J.; Wang, Q. Research of Detecting the Laser’s Secondary Reflected Echo from Target by Using Geiger-Mode Avalanche Photodiode. Opt. Commun. 2019, 433, 1–9.
25. Guo, S.Y.; Rong, Z.; Wang, S.; Wu, Y.H. A LiDAR SLAM with PCA-Based Feature Extraction and Two-Stage Matching. IEEE Trans. Instrum. Meas. 2022, 71, 1–11.
26. Duan, Y.; Yang, C.C.; Chen, H.; Yan, W.Z.; Li, H.B. Low-Complexity Point Cloud Denoising for LiDAR by PCA-Based Dimension Reduction. Opt. Commun. 2021, 482, 126567.
27. Iheaturu, C.J.; Hepner, S.; Batchelor, J.L.; Agonvonon, G.A.; Akinyemi, F.O.; Wingate, V.R.; Speranza, C.I. Integrating UAV LiDAR and Multispectral Data to Assess Forest Status and Map Disturbance Severity in a West African Forest Patch. Ecol. Inf. 2024, 84, 102876.
28. Guo, D.F.; Qu, Y.C.; Zhou, X.; Sun, J.F.; Yin, S.W.; Lu, J.; Liu, F. Research on Automatic Tracking and Size Estimation Algorithm of “Low, Slow and Small” Targets Based on GM-APD Single-Photon LiDAR. Drones 2025, 9, 85.
29. Dong, Z.W.; Yan, Y.J.; Jiang, Y.G.; Fan, R.W.; Chen, D.Y. Ground Target Extraction Using Airborne Streak Tube Imaging LiDAR. J. Appl. Remote Sens. 2021, 15, 016509.
30. Yamashita, T.J.; Wester, D.B.; Tewes, M.E.; Young, J.V., Jr.; Lombardi, J.V. Distinguishing Buildings from Vegetation in an Urban–Chaparral Mosaic Landscape with LiDAR-Informed Discriminant Analysis. Remote Sens. 2023, 15, 1703.
31. Yu, S.; Wei, K.; Ma, R.J.; Huang, G.H. Photon-Counting LiDAR Point Cloud Filtering Using a Backpropagation Neural Network. Prog. Laser Optoelectron. 2024, 61, 2415001.
32. Yang, C.C.; Zhang, H.L. Adaptive SVM-Based Pixel Accumulation Technique for a SPAD-Based LiDAR System. Appl. Opt. 2022, 61, 10623–10628.
33. Ma, R.J.; Kong, W.; Chen, T.; Shu, R.; Huang, G.H. KNN-Based Denoising Algorithm for Photon-Counting LiDAR: Numerical Simulation and Parameter Optimization Design. Remote Sens. 2022, 14, 6236.
34. Kim, Y.J.; Nam, B.H.; Youn, H. Sinkhole Detection and Characterization Using LiDAR-Derived DEM with Logistic Regression. Remote Sens. 2019, 11, 1592.
35. Lindell, D.B.; O’Toole, M.; Wetzstein, G. Single-Photon 3D Imaging with Deep Sensor Fusion. ACM Trans. Graph. 2018, 37, 1–12.
36. Rodgers, G.; De Dominicis, C. Density of States of Sparse Random Matrices. J. Phys. A Math. Gen. 1990, 23, 1567.
Figure 1. GM-APD LiDAR echo signal acquisition.
Figure 2. Illustration of target and background noise echo data. (a) Raw data of 2000 frames of background noise echoes. (b) Statistical histogram of background noise echoes. (c) Raw data of 2000 frames of target echoes. (d) Statistical histogram of target echoes.
Figure 3. Training framework diagram for GM-APD signal.
Figure 4. Training accuracy for models with different feature processing methods at different SFNs. (a) SFN is 100, 400, 800, 1000. (b) SFN is 2000, 5000, 10,000, 20,000.
Figure 5. Training time for ML models with different feature processing methods at different SFNs. (a) SFN is 100, 400, 800, 1000. (b) SFN is 2000, 5000, 10,000, 20,000.
Figure 6. Accuracy for different models at different ESNRs. (a) ESNR distribution in noise, (0.00–0.01]. (b) ESNR distribution in (0.01–0.05], (0.05–0.10]. (c) ESNR distribution in (0.10–0.30], (0.30–0.50]. (d) ESNR distribution in (0.50–1.00], (1.00–4.20].
Figure 7. Line chart illustrating the classification accuracy of different models under varying ESNR conditions. (a) Changes in classification accuracy of different models under SFN = 400. (b) Changes in classification accuracy of different models under SFN = 2000. (c) Changes in classification accuracy of different models under SFN = 20,000.
Figure 8. Analysis of model stability under different SFNs.
Figure 9. Schematic of the NN-BP-based ResNet architecture.
Table 1. Dimensionality of features before and after PCA under different SFNs.
Frame Count | 100 | 400 | 800 | 1000 | 2000 | 5000 | 10,000 | 20,000
d = 1000 | 370 | 122 | 17 | 2 | 1 | 1 | 1 | 1
d = 999 | 657 | 596 | 530 | 503 | 400 | 243 | 126 | 63
Table 2. Simulation parameters for target and noise signal generation.
Parameter Category | Parameter Name | Distribution/Range | Description
Target pixels | Noise photons | Uniform [0.01, 1.01] | Simulated background-noise photon count
Target pixels | Laser photons | Uniform [0.01, 10.01] | Simulated laser-echo photon count
Target pixels | Target location | Uniform [1, 900] | Time/space index of the target
Target pixels | Temporal profile of laser pulse | See Equation (1) | Temporal distribution of the laser pulse
Target pixels | Laser pulse width $\tau$ | Fixed at 20 bins | Constant pulse width
Noise pixels | Noise photons | Uniform [0.01, 1.01] | Simulated pure-noise pixel photon count
GM-APD parameters | Dark count rate | Fixed at 0.01 | Dark count rate within the time-gating window
GM-APD parameters | $T_g$ width | 1 μs | Gate width in the synchronous mode of GM-APD
GM-APD parameters | $T_b$ | 1 ns | Time resolution within the GM-APD gate
Table 3. Illustration of the confusion matrix.
 | Positive (Target) | Negative (Background)
Predicted positive | $TP$ | $FP$
Predicted negative | $FN$ | $TN$
Table 4. Sparsity of data at different SFNs.
Frame Count | 100 | 400 | 800 | 1000 | 2000 | 5000 | 10,000 | 20,000
ADNZE | 0.0791 | 0.2270 | 0.3415 | 0.3823 | 0.5137 | 0.6752 | 0.7711 | 0.8409
Table 5. Models with the highest training accuracy and testing accuracy at different SFNs.
Frame Count | 100 | 400 | 800 | 1000 | 2000 | 5000 | 10,000 | 20,000
Train Max | SVM-3 | DT-2 | DT-2 | DT-2 | DT-2 | NN-BP-2 | NN-BP-2 | NN-BP-2
Acc | 0.6241 | 0.7171 | 0.7621 | 0.7715 | 0.7947 | 0.8247 | 0.8540 | 0.9137
Train Min | KNN-4 | KNN-4 | KNN-4 | KNN-4 | KNN-4 | KNN-4 | KNN-4 | LR-4
Acc | 0.5433 | 0.5852 | 0.6070 | 0.6156 | 0.6453 | 0.6962 | 0.7401 | 0.7091
Test Max | SVM-3 | SVM-3 | SVM-3 | DT-2 | DT-2 | DT-2 | NN-BP-2 | NN-BP-3
Acc | 0.6219 | 0.6756 | 0.7194 | 0.7662 | 0.7919 | 0.8056 | 0.8681 | 0.9213
Test Min | KNN-4 | KNN-4 | KNN-4 | KNN-4 | KNN-4 | KNN-4 | KNN-4 | LDA-4
Acc | 0.5300 | 0.5669 | 0.5944 | 0.5981 | 0.6294 | 0.6319 | 0.7281 | 0.7538
Table 6. Corresponding metrics for the test data at lower SFNs (100–1000).
Frame Count | Model | Acc | Pre | Rec | FPR | FNR | F1 | Kappa
100 | NN-BP-3 | 0.5575 | 0.5523 | 0.6075 | 0.4925 | 0.4361 | 0.5786 | 0.1150
100 | NN-BP-2 | 0.5618 | 0.5613 | 0.5663 | 0.4425 | 0.4376 | 0.5638 | 0.1238
100 | DT | 0.5998 | 0.8272 | 0.2513 | 0.0525 | 0.4414 | 0.3854 | 0.1988
100 | KNN | 0.5300 | 1.0000 | 0.0600 | 0.0000 | 0.4845 | 0.1132 | 0.0600
100 | LR | 0.6094 | 0.6122 | 0.6013 | 0.3825 | 0.3924 | 0.6062 | 0.2188
100 | SVM-2 | 0.6131 | 0.6238 | 0.5700 | 0.3438 | 0.3959 | 0.5957 | 0.2263
100 | SVM-3 | 0.6219 | 0.6330 | 0.5800 | 0.3363 | 0.3875 | 0.6053 | 0.2438
100 | SVM-L | 0.6025 | 0.6085 | 0.5750 | 0.3700 | 0.4028 | 0.5913 | 0.2050
100 | LDA | 0.6088 | 0.6107 | 0.6000 | 0.3825 | 0.3931 | 0.6053 | 0.2175
400 | NN-BP-3 | 0.6250 | 0.6312 | 0.6013 | 0.3513 | 0.3807 | 0.6159 | 0.2500
400 | NN-BP-2 | 0.6375 | 0.6455 | 0.6100 | 0.3350 | 0.3697 | 0.6272 | 0.2750
400 | DT | 0.6931 | 0.9452 | 0.4100 | 0.0238 | 0.3767 | 0.5719 | 0.3863
400 | KNN | 0.5669 | 1.0000 | 0.1338 | 0.0000 | 0.4642 | 0.2359 | 0.1338
400 | LR | 0.6638 | 0.6747 | 0.6325 | 0.3050 | 0.3459 | 0.6529 | 0.3275
400 | SVM-2 | 0.6738 | 0.7087 | 0.5900 | 0.2425 | 0.3512 | 0.6439 | 0.3475
400 | SVM-3 | 0.6756 | 0.7069 | 0.6000 | 0.2488 | 0.3474 | 0.6491 | 0.3513
400 | SVM-L | 0.6681 | 0.6881 | 0.6150 | 0.2788 | 0.3480 | 0.6495 | 0.3363
400 | LDA | 0.6662 | 0.6817 | 0.6238 | 0.2913 | 0.3468 | 0.6514 | 0.3325
800 | NN-BP-3 | 0.6525 | 0.6506 | 0.6588 | 0.3538 | 0.3456 | 0.6547 | 0.3050
800 | NN-BP-2 | 0.6769 | 0.6724 | 0.6900 | 0.3363 | 0.3184 | 0.6811 | 0.3538
800 | DT | 0.7563 | 0.9702 | 0.5288 | 0.0163 | 0.3239 | 0.6845 | 0.5125
800 | KNN | 0.5944 | 1.0000 | 0.1888 | 0.0000 | 0.4479 | 0.3176 | 0.1888
800 | LR | 0.7181 | 0.7330 | 0.6863 | 0.2500 | 0.2949 | 0.7088 | 0.4363
800 | SVM-2 | 0.7187 | 0.7589 | 0.6413 | 0.2038 | 0.3106 | 0.6951 | 0.4375
800 | SVM-3 | 0.7194 | 0.7489 | 0.6600 | 0.2213 | 0.3039 | 0.7017 | 0.4388
800 | SVM-L | 0.7119 | 0.7432 | 0.6475 | 0.2238 | 0.3123 | 0.6921 | 0.4238
800 | LDA | 0.7025 | 0.7225 | 0.6575 | 0.2525 | 0.3142 | 0.6885 | 0.4050
1000 | NN-BP-3 | 0.6750 | 0.6724 | 0.6825 | 0.3325 | 0.3223 | 0.6774 | 0.3500
1000 | NN-BP-2 | 0.6613 | 0.6633 | 0.6550 | 0.3325 | 0.3407 | 0.6591 | 0.3225
1000 | DT | 0.7663 | 0.9494 | 0.5625 | 0.0300 | 0.3108 | 0.7064 | 0.5325
1000 | KNN | 0.5981 | 0.9937 | 0.1975 | 0.0013 | 0.4455 | 0.3295 | 0.1963
1000 | LR | 0.7138 | 0.7280 | 0.6825 | 0.2550 | 0.2988 | 0.7045 | 0.4275
1000 | SVM-2 | 0.7150 | 0.7567 | 0.6338 | 0.2038 | 0.3151 | 0.6898 | 0.4300
1000 | SVM-3 | 0.7056 | 0.7031 | 0.6525 | 0.2413 | 0.3141 | 0.6891 | 0.4113
1000 | SVM-L | 0.7150 | 0.7457 | 0.6525 | 0.2225 | 0.3089 | 0.6960 | 0.4300
1000 | LDA | 0.7031 | 0.7254 | 0.6538 | 0.2475 | 0.3151 | 0.6877 | 0.4063
Table 7. Corresponding metrics for the test data at higher SFNs (2000–20,000).

| Frame Count | Model | Acc | Pre | Rec | FPR | FNR | F1 | Kappa |
|---|---|---|---|---|---|---|---|---|
| 2000 | NN-BP-3 | 0.7006 | 0.7004 | 0.7013 | 0.3000 | 0.2991 | 0.7008 | 0.4013 |
| 2000 | NN-BP-2 | 0.7038 | 0.6993 | 0.7150 | 0.3075 | 0.2916 | 0.7070 | 0.4075 |
| 2000 | DT | 0.7919 | 0.9587 | 0.6100 | 0.0263 | 0.2860 | 0.7456 | 0.5838 |
| 2000 | KNN | 0.6294 | 0.9641 | 0.2688 | 0.0100 | 0.4248 | 0.4203 | 0.2588 |
| 2000 | LR | 0.7506 | 0.7684 | 0.7175 | 0.2163 | 0.2649 | 0.7421 | 0.5013 |
| 2000 | SVM-2 | 0.7350 | 0.7866 | 0.6450 | 0.1750 | 0.3008 | 0.7088 | 0.4700 |
| 2000 | SVM-3 | 0.7200 | 0.7596 | 0.6438 | 0.2038 | 0.3091 | 0.6969 | 0.4400 |
| 2000 | SVM-L | 0.7419 | 0.7945 | 0.6525 | 0.1688 | 0.2948 | 0.7165 | 0.4838 |
| 2000 | LDA | 0.7288 | 0.7740 | 0.6463 | 0.1888 | 0.3036 | 0.7044 | 0.4575 |
| 5000 | NN-BP-3 | 0.7844 | 0.7920 | 0.7713 | 0.2025 | 0.2229 | 0.7815 | 0.5688 |
| 5000 | NN-BP-2 | 0.7931 | 0.7950 | 0.7900 | 0.2038 | 0.2087 | 0.7925 | 0.5863 |
| 5000 | DT | 0.8056 | 0.9358 | 0.6563 | 0.0450 | 0.2647 | 0.7715 | 0.6113 |
| 5000 | KNN | 0.6819 | 0.9619 | 0.3788 | 0.0150 | 0.3868 | 0.5435 | 0.3638 |
| 5000 | LR | 0.7925 | 0.8154 | 0.7563 | 0.1713 | 0.2273 | 0.7847 | 0.5850 |
| 5000 | SVM-2 | 0.7619 | 0.8006 | 0.6975 | 0.1738 | 0.2680 | 0.7455 | 0.5238 |
| 5000 | SVM-3 | 0.7438 | 0.7671 | 0.7000 | 0.2125 | 0.2759 | 0.7320 | 0.4875 |
| 5000 | SVM-L | 0.7838 | 0.8834 | 0.6538 | 0.0863 | 0.2748 | 0.7514 | 0.5675 |
| 5000 | LDA | 0.7556 | 0.8190 | 0.6563 | 0.1450 | 0.2868 | 0.7287 | 0.5113 |
| 10,000 | NN-BP-3 | 0.8494 | 0.8625 | 0.8313 | 0.1325 | 0.1628 | 0.8466 | 0.6988 |
| 10,000 | NN-BP-2 | 0.8681 | 0.8870 | 0.8438 | 0.1075 | 0.1490 | 0.8648 | 0.7363 |
| 10,000 | DT | 0.8444 | 0.9599 | 0.7188 | 0.0300 | 0.2248 | 0.8220 | 0.6888 |
| 10,000 | KNN | 0.7281 | 0.8585 | 0.5463 | 0.0900 | 0.3327 | 0.6677 | 0.4563 |
| 10,000 | LR | 0.8031 | 0.8392 | 0.7500 | 0.1438 | 0.2260 | 0.7921 | 0.6063 |
| 10,000 | SVM-2 | 0.8125 | 0.9098 | 0.6938 | 0.0688 | 0.2475 | 0.7872 | 0.6250 |
| 10,000 | SVM-3 | 0.8125 | 0.8776 | 0.7263 | 0.1013 | 0.2335 | 0.7948 | 0.6250 |
| 10,000 | SVM-L | 0.8006 | 0.9168 | 0.6613 | 0.0600 | 0.2649 | 0.7683 | 0.6013 |
| 10,000 | LDA | 0.7781 | 0.8870 | 0.6375 | 0.0813 | 0.2829 | 0.7418 | 0.5563 |
| 20,000 | NN-BP-3 | 0.9213 | 0.9470 | 0.8925 | 0.0500 | 0.1017 | 0.9189 | 0.8425 |
| 20,000 | NN-BP-2 | 0.9188 | 0.9385 | 0.8963 | 0.0588 | 0.0993 | 0.9169 | 0.8375 |
| 20,000 | DT | 0.8681 | 0.9712 | 0.7588 | 0.0225 | 0.1979 | 0.8519 | 0.7363 |
| 20,000 | KNN | 0.8731 | 0.9222 | 0.8150 | 0.0688 | 0.1657 | 0.8653 | 0.7463 |
| 20,000 | LR | 0.7706 | 0.8442 | 0.6638 | 0.1225 | 0.2770 | 0.7432 | 0.5413 |
| 20,000 | SVM-2 | 0.8369 | 1.0000 | 0.6738 | 0.0000 | 0.2460 | 0.8051 | 0.6738 |
| 20,000 | SVM-3 | 0.8488 | 1.0000 | 0.6975 | 0.0000 | 0.2322 | 0.8218 | 0.6975 |
| 20,000 | SVM-L | 0.7675 | 0.9350 | 0.5750 | 0.0400 | 0.3069 | 0.7121 | 0.5350 |
| 20,000 | LDA | 0.7538 | 0.9301 | 0.5488 | 0.0413 | 0.3200 | 0.6903 | 0.5075 |
Table 8. Mean ± 95% CI per model (F1/Kappa).

| Model | Mean (F1) | Mean (Kappa) | SD (F1) | SD (Kappa) | Lower 95% CI (F1) | Lower 95% CI (Kappa) | Upper 95% CI (F1) | Upper 95% CI (Kappa) |
|---|---|---|---|---|---|---|---|---|
| NN-BP-3 | 0.7218 | 0.4414 | 0.1175 | 0.2435 | 0.6236 | 0.2378 | 0.8200 | 0.6450 |
| NN-BP-2 | 0.7266 | 0.4553 | 0.1213 | 0.2434 | 0.6251 | 0.2518 | 0.8280 | 0.6589 |
| DT | 0.6924 | 0.5313 | 0.1514 | 0.1723 | 0.5659 | 0.3872 | 0.8189 | 0.6754 |
| KNN | 0.4366 | 0.3005 | 0.2448 | 0.2196 | 0.2320 | 0.1169 | 0.6413 | 0.4841 |
| LR | 0.7168 | 0.4555 | 0.0634 | 0.1323 | 0.6638 | 0.3449 | 0.7698 | 0.5661 |
| SVM-2 | 0.7089 | 0.4667 | 0.0700 | 0.1441 | 0.6503 | 0.3462 | 0.7674 | 0.5872 |
| SVM-3 | 0.7113 | 0.4619 | 0.0712 | 0.1445 | 0.6518 | 0.3411 | 0.7709 | 0.5827 |
| SVM-L | 0.6971 | 0.4478 | 0.0562 | 0.1304 | 0.6502 | 0.3388 | 0.7441 | 0.5569 |
| LDA | 0.6873 | 0.4242 | 0.0431 | 0.1101 | 0.6512 | 0.3322 | 0.7233 | 0.5163 |
Table 9. Friedman test summary (F1/Kappa).

| | Chi-Square (F1) | Chi-Square (Kappa) | p-Value (F1) | p-Value (Kappa) | Kendall's W (F1) | Kendall's W (Kappa) | N (SFN Levels) | k (Models) |
|---|---|---|---|---|---|---|---|---|
| Value | 18.83 | 24.14 | 0.0158 | 0.00217 | 0.29 | 0.3889 | 8 | 9 |
Table 10. Pairwise Wilcoxon—all model pairs (F1/Kappa).

| Model A | Model B | Mean Diff (F1 / Kappa) | Median Diff (F1 / Kappa) | p (F1 / Kappa) | p_adj Holm (F1 / Kappa) | Effect Size r (F1 / Kappa) |
|---|---|---|---|---|---|---|
| DT | KNN | 0.2558 / 0.2308 | 0.2988 / 0.2500 | 0.0156 / 0.0156 | 0.4531 / 0.4219 | 0.855 / 0.855 |
| DT | LDA | 0.0051 / 0.1070 | 0.0300 / 0.1168 | 0.6406 / 0.0156 | 1 / 0.4688 | 0.165 / 0.855 |
| DT | LR | −0.0244 / 0.0758 | −0.0056 / 0.0793 | 0.7422 / 0.0156 | 1 / 0.4375 | −0.116 / 0.855 |
| DT | SVM-2 | −0.0165 / 0.0645 | 0.0213 / 0.0694 | 0.8438 / 0.0156 | 1 / 0.5156 | 0.070 / 0.855 |
| DT | SVM-3 | −0.0189 / 0.0694 | 0.0222 / 0.0687 | 0.8438 / 0.0391 | 1 / 0.9375 | 0.070 / 0.730 |
| DT | SVM-L | −0.0047 / 0.0834 | 0.0153 / 0.0881 | 0.7422 / 0.0156 | 1 / 0.4531 | 0.116 / 0.855 |
| KNN | LDA | −0.2506 / −0.1237 | −0.3211 / −0.1781 | 0.0234 / 0.1953 | 0.6562 / 1 | −0.801 / −0.458 |
| KNN | LR | −0.2802 / −0.1550 | −0.3484 / −0.2074 | 0.0156 / 0.0547 | 0.4688 / 1 | −0.855 / −0.679 |
| KNN | SVM-2 | −0.2723 / −0.1662 | −0.3244 / −0.1900 | 0.0156 / 0.0156 | 0.4844 / 0.4844 | −0.855 / −0.855 |
| KNN | SVM-3 | −0.2747 / −0.1614 | −0.3181 / −0.1825 | 0.0156 / 0.0156 | 0.5 / 0.5 | −0.855 / −0.855 |
| KNN | SVM-L | −0.2605 / −0.1473 | −0.3313 / −0.2031 | 0.0234 / 0.0781 | 0.6328 / 1 | −0.801 / −0.623 |
| LR | LDA | 0.0295 / 0.0313 | 0.0290 / 0.0326 | 0.0078 / 0.0234 | 0.2734 / 0.6094 | 0.940 / 0.801 |
| LR | SVM-2 | 0.0079 / −0.0112 | 0.0121 / −0.0050 | 0.1953 / 0.5469 | 1 / 1 | 0.458 / −0.213 |
| LR | SVM-3 | 0.0055 / −0.0064 | 0.0055 / −0.0106 | 0.3125 / 0.7422 | 1 / 1 | 0.357 / −0.116 |
| LR | SVM-L | 0.0197 / 0.0077 | 0.0202 / 0.0094 | 0.0078 / 0.0781 | 0.2656 / 1 | 0.940 / 0.623 |
| NN-BP-2 | DT | 0.0342 / −0.0759 | 0.0319 / −0.0931 | 0.25 / 0.1094 | 1 / 1 | 0.407 / −0.566 |
| NN-BP-2 | KNN | 0.2899 / 0.1548 | 0.3081 / 0.1450 | 0.0078 / 0.0078 | 0.2578 / 0.2656 | 0.940 / 0.940 |
| NN-BP-2 | LDA | 0.0393 / 0.0311 | −0.0024 / −0.0506 | 0.6406 / 0.9453 | 1 / 1 | −0.165 / −0.024 |
| NN-BP-2 | LR | 0.0097 / −0.0002 | −0.0267 / −0.0675 | 0.8438 / 0.8438 | 1 / 1 | −0.070 / −0.070 |
| NN-BP-2 | SVM-2 | 0.0177 / −0.0114 | −0.0079 / −0.0675 | 0.7422 / 0.9453 | 1 / 1 | −0.116 / −0.024 |
| NN-BP-2 | SVM-3 | 0.0152 / −0.0066 | −0.0052 / −0.0544 | 0.6406 / 0.9453 | 1 / 1 | −0.165 / −0.024 |
| NN-BP-2 | SVM-L | 0.0294 / 0.0075 | −0.0103 / −0.0656 | 0.7422 / 0.8438 | 1 / 1 | −0.116 / −0.070 |
| NN-BP-3 | DT | 0.0294 / −0.0899 | 0.0173 / −0.1100 | 0.5469 / 0.0781 | 1 / 1 | 0.213 / −0.623 |
| NN-BP-3 | KNN | 0.2852 / 0.1409 | 0.3088 / 0.1294 | 0.0078 / 0.0078 | 0.2812 / 0.2734 | 0.940 / 0.940 |
| NN-BP-3 | LDA | 0.0345 / 0.0172 | −0.0070 / −0.0563 | 0.7422 / 1 | 1 / 1 | −0.116 / −0.000 |
| NN-BP-3 | LR | 0.0050 / −0.0141 | −0.0273 / −0.0775 | 0.7422 / 0.4609 | 1 / 1 | −0.116 / −0.261 |
| NN-BP-3 | NN-BP-2 | −0.0048 / −0.0139 | −0.0086 / −0.0132 | 0.5469 / 0.1484 | 1 / 1 | −0.213 / −0.511 |
| NN-BP-3 | SVM-2 | 0.0129 / −0.0253 | −0.0102 / −0.0743 | 0.8438 / 0.4609 | 1 / 1 | −0.070 / −0.261 |
| NN-BP-3 | SVM-3 | 0.0105 / −0.0205 | −0.0039 / −0.0500 | 0.6406 / 0.7422 | 1 / 1 | −0.165 / −0.116 |
| NN-BP-3 | SVM-L | 0.0247 / −0.0064 | −0.0142 / −0.0813 | 0.9453 / 0.7422 | 1 / 1 | −0.024 / −0.116 |
| SVM-2 | LDA | 0.0216 / 0.0425 | 0.0055 / 0.0193 | 0.25 / 0.0078 | 1 / 0.2812 | 0.407 / 0.940 |
| SVM-2 | SVM-3 | −0.0024 / 0.0048 | −0.0059 / −0.0007 | 0.6406 / 0.7263 | 1 / 1 | −0.165 / −0.124 |
| SVM-2 | SVM-L | 0.0117 / 0.0189 | −0.0013 / 0.0124 | 1 / 0.3627 | 1 / 1 | −0.000 / 0.322 |
| SVM-3 | LDA | 0.0241 / 0.0377 | 0.0024 / 0.0225 | 0.1834 / 0.1094 | 1 / 1 | 0.470 / 0.566 |
| SVM-3 | SVM-L | 0.0142 / 0.0141 | 0.0046 / 0.0150 | 0.6406 / 0.8438 | 1 / 1 | 0.165 / 0.070 |
| SVM-L | LDA | 0.0099 / 0.0236 | 0.0102 / 0.0250 | 0.1094 / 0.0234 | 1 / 0.5859 | 0.566 / 0.801 |
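Tables 9 and 10 can be generated from a matrix of per-SFN scores (8 SFN levels × 9 models). A hedged SciPy sketch, with Kendall's W derived from the Friedman chi-square (tie corrections aside) and a manual Holm step-down adjustment; function and variable names are ours:

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

def friedman_and_pairwise(scores: np.ndarray, names: list[str]):
    """scores: shape (N SFN levels, k models), e.g. per-level F1 values."""
    # Friedman test across all models (one column of scores per model)
    chi2, p = friedmanchisquare(*scores.T)
    n, k = scores.shape
    w = chi2 / (n * (k - 1))            # Kendall's W from the chi-square
    # Pairwise Wilcoxon signed-rank tests
    pairs, pvals = [], []
    for i in range(k):
        for j in range(i + 1, k):
            _, pw = wilcoxon(scores[:, i], scores[:, j])
            pairs.append((names[i], names[j]))
            pvals.append(pw)
    # Holm step-down correction: scale sorted p-values, enforce monotonicity
    m = len(pvals)
    order = np.argsort(pvals)
    p_adj = np.empty(m)
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[idx])
        p_adj[idx] = min(1.0, running_max)
    return chi2, p, w, list(zip(pairs, pvals, p_adj))
```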
Table 11. Distribution of different ESNRs in the test data (calculated under SFN being 20,000).

| ESNR | Number | Percent (%) |
|---|---|---|
| (0.00–0.01] | 170 | 21.25 |
| (0.01–0.05] | 165 | 20.625 |
| (0.05–0.10] | 110 | 13.75 |
| (0.10–0.30] | 199 | 24.875 |
| (0.30–0.50] | 78 | 9.75 |
| (0.50–1.00] | 60 | 7.5 |
| (1.00–4.20] | 18 | 2.25 |
Table 12. Performance comparison at different accumulation frame counts on GM-APD echoes.

| Frame Count | Model | Acc | Pre | Rec | FPR | FNR | F1 | Kappa |
|---|---|---|---|---|---|---|---|---|
| 400 | DT | 0.6931 | 0.9452 | 0.4100 | 0.0238 | 0.3767 | 0.5719 | 0.3863 |
| 400 | ResNet | 0.7800 | 0.8425 | 0.6887 | 0.1288 | 0.3113 | 0.7579 | 0.5600 |
| 2000 | DT | 0.7919 | 0.9587 | 0.6100 | 0.0263 | 0.2860 | 0.7456 | 0.5838 |
| 2000 | ResNet | 0.8644 | 0.9664 | 0.7550 | 0.0262 | 0.2450 | 0.8477 | 0.7288 |
| 20,000 | NN-BP-3 | 0.9213 | 0.9470 | 0.8925 | 0.0500 | 0.1017 | 0.9189 | 0.8425 |
| 20,000 | ResNet | 0.9444 | 0.9696 | 0.9175 | 0.0288 | 0.0825 | 0.9428 | 0.8887 |
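Treating the ResNet's gains as relative accuracy improvements over the strongest baseline in Table 12, the following check computes them directly from the Acc column:

```python
# Relative accuracy gain of the ResNet over the strongest baseline (Table 12)
rows = [(400, 0.6931, 0.7800), (2000, 0.7919, 0.8644), (20000, 0.9213, 0.9444)]
for sfn, baseline, resnet in rows:
    print(f"SFN {sfn}: +{(resnet / baseline - 1) * 100:.2f}%")
```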
Table 13. Run time and model size comparison at different accumulation frame counts.

| Frame Count | Model | Algorithm Time (s) | Avg. Latency (ms) | Rate (Hz) | Params (MB) |
|---|---|---|---|---|---|
| 400 | NN-BP-3 | 0.8536 | 0.2084 | 1.1715 | 2.4100 |
| 400 | NN-BP-2 | 0.2746 | 0.0670 | 3.6417 | 4.8200 |
| 400 | DT | 0.4014 | 0.0980 | 2.4913 | 2.3900 |
| 400 | KNN | 3.9585 | 0.9664 | 0.2526 | 36.8400 |
| 400 | LR | 0.5453 | 0.1331 | 1.8339 | 2.3880 |
| 400 | SVM-2 | 4.2910 | 1.0474 | 0.2331 | 29.7200 |
| 400 | SVM-3 | 4.6212 | 1.1282 | 0.2164 | 31.9170 |
| 400 | SVM-L | 3.7983 | 0.9273 | 0.2633 | 27.9180 |
| 400 | LDA | 0.3543 | 0.0865 | 2.8225 | 3.8140 |
| 400 | ResNet | 6.5561 | 1.6006 | 0.1525 | 0.3800 |
| 2000 | NN-BP-3 | 0.2167 | 0.0529 | 4.6147 | 1.6190 |
| 2000 | NN-BP-2 | 0.1942 | 0.0474 | 5.1493 | 1.6190 |
| 2000 | DT | 0.1323 | 0.0332 | 7.5586 | 1.6070 |
| 2000 | KNN | 2.7283 | 0.6661 | 0.3665 | 24.7610 |
| 2000 | LR | 0.1712 | 0.0418 | 5.8411 | 1.6040 |
| 2000 | SVM-2 | 1.6910 | 0.4129 | 0.5914 | 17.1840 |
| 2000 | SVM-3 | 2.0473 | 0.4998 | 0.4884 | 18.6390 |
| 2000 | SVM-L | 1.7719 | 0.4326 | 0.5644 | 16.4310 |
| 2000 | LDA | 0.1947 | 0.0475 | 5.1361 | 2.2470 |
| 2000 | ResNet | 6.9294 | 1.6918 | 0.1443 | 0.3800 |
| 20,000 | NN-BP-3 | 0.1444 | 0.0352 | 6.9252 | 0.2588 |
| 20,000 | NN-BP-2 | 0.1032 | 0.0252 | 9.6899 | 0.2588 |
| 20,000 | DT | 0.1312 | 0.0320 | 7.6220 | 0.2600 |
| 20,000 | KNN | 0.4405 | 0.1076 | 2.2701 | 4.0000 |
| 20,000 | LR | 0.1711 | 0.0418 | 5.8445 | 0.2560 |
| 20,000 | SVM-2 | 0.3407 | 0.0832 | 2.9351 | 1.7530 |
| 20,000 | SVM-3 | 0.3049 | 0.0735 | 3.2312 | 1.5290 |
| 20,000 | SVM-L | 0.3491 | 0.0852 | 2.8645 | 2.3970 |
| 20,000 | LDA | 0.1241 | 0.0303 | 8.0580 | 0.2274 |
| 20,000 | ResNet | 7.1379 | 1.7426 | 0.1401 | 0.3800 |
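The Rate column is consistent with the reciprocal of the algorithm time (e.g., 1/0.8536 s ≈ 1.1715 Hz for NN-BP-3 at 400 frames), and the latency with the pass time divided by the number of test samples. A minimal timing harness in that spirit; predict_fn and the run count are placeholders:

```python
import time

def benchmark(predict_fn, X, n_runs: int = 10):
    """Average wall-clock time of a full prediction pass over X,
    per-sample latency in ms, and throughput as 1 / pass time."""
    t0 = time.perf_counter()
    for _ in range(n_runs):
        predict_fn(X)
    total = (time.perf_counter() - t0) / n_runs   # seconds per full pass
    latency_ms = total / len(X) * 1e3             # ms per sample
    return total, latency_ms, 1.0 / total         # Rate (Hz)
```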