Article

Research on Low-Altitude Aircraft Point Cloud Generation Method Using Single Photon Counting Lidar

Sino-European Institute of Aviation Engineering, Civil Aviation University of China, Tianjin 300300, China
* Author to whom correspondence should be addressed.
Photonics 2025, 12(3), 205; https://doi.org/10.3390/photonics12030205
Submission received: 23 January 2025 / Revised: 23 February 2025 / Accepted: 26 February 2025 / Published: 27 February 2025
(This article belongs to the Special Issue Recent Progress in Single-Photon Generation and Detection)

Abstract
To address the deficiency of aircraft point cloud training data for low-altitude environment perception systems, a method termed APCG (aircraft point cloud generation) is proposed. APCG can generate aircraft point cloud data in the single photon counting Lidar (SPC-Lidar) system based on information such as aircraft type, position, and attitude. The core of APCG is the aircraft depth image generator, which is obtained through adversarial training of an improved conditional generative adversarial network (cGAN). The training data of the improved cGAN are composed of aircraft depth images formed by spatial sampling and transformation of fine point clouds of 76 types of aircraft and 4 types of drones. The experimental results demonstrate that APCG is capable of efficiently generating diverse aircraft point clouds that reflect the acquisition characteristics of the SPC-Lidar system. The generated point clouds exhibit high similarity to the standard point clouds. Furthermore, APCG shows robust adaptability and stability in response to the variation in aircraft slant range.

1. Introduction

With the rapid expansion of the low-altitude economy, ensuring the flight safety of low-altitude aircraft has emerged as a crucial topic [1]. In this context, single photon counting Lidar (SPC-Lidar), which offers ultra-long-range detection capabilities under eye-safe conditions, is regarded as a potential environmental perception technology [2,3,4]. Through precise segmentation and recognition of the target point cloud data collected by SPC-Lidar, effective perception of targets in the low-altitude airspace can be accomplished. Nonetheless, the application of SPC-Lidar in low-altitude environmental perception systems is still in its early stages, with a lack of sufficient actual target point cloud datasets for training the system [5]. Hence, exploring methods for generating point cloud data of aerial targets has become particularly pressing.
Traditional target point cloud generation techniques mainly include ray tracing methods [6,7], voxel methods [8,9], and physical simulation methods [10,11]. All these methods necessitate elaborate modeling of the target object and its environment. However, the relevant environmental and target parameters are often fixed, and the calculation process is intricate. When the parameters of the target or the environment change, the point cloud data must be regenerated. The point cloud data generated by these methods have limitations in terms of inheritance, portability, and scalability, making it difficult to meet the simulation requirements for aerial targets with six degrees of freedom.
In 2014, Goodfellow et al. put forward a revolutionary machine learning framework—Generative Adversarial Nets (GANs) [12]. This framework can generate simulated data through the mutual game-playing learning process between the generator and the discriminator. With the rapid advancement of GANs in theory and modeling, they have been extensively utilized in multiple domains such as computer vision, natural language processing, and human–computer interaction, and have gradually extended to the field of target point cloud generation. At present, the methods for generating target point clouds based on GANs can be categorized into three main types according to the form of data processing: voxel-based generation methods [13,14], direct point cloud generation methods [15,16], and image-based generation methods [17,18].
In the voxel-based generation methods [13,14], point cloud data are transformed into a three-dimensional voxel grid. These structured data enable the extraction of features using three-dimensional convolutional networks for generating point clouds. Nevertheless, since each voxel merely represents the position at the volume center and does not contain the position information of all points within the volume, the accuracy of the generated point cloud is constrained by the voxel resolution. Although increasing the voxel resolution can enhance the accuracy of the generated point cloud, this significantly increases the computational cost. The direct point cloud generation methods [15,16] consider the disordered nature of point cloud data and pay more attention to the relative positions between points, and can generate dense and uniformly distributed point clouds in the three-dimensional space. However, the number of points in the point cloud data generated by this method is typically fixed, which to a certain extent limits its practical application. The image-based generation methods [17,18] convert each data point in the point cloud into a pixel in a depth or a multi-view image, and extract image features through two-dimensional convolution to guide the generation of point clouds. This approach avoids the accuracy loss in the voxelization process of point clouds and can handle point clouds with different numbers of data points.
In the field of point cloud data generation under the Lidar system, given the non-uniformity of the point cloud data collected by Lidar and the dynamic changes in the number of data points, the current trend is to adopt image-based generation methods [19,20]. Specifically, this approach involves converting the point clouds collected by Lidar into two-dimensional depth images of spherical projection and utilizing these images to train GANs. The merit of this method lies in its ability to retain the spatial position information of the original point cloud data. Additionally, it can flexibly accommodate changes in the number of points, enhancing the adaptability and flexibility of point cloud generation.
This paper focuses on point cloud generation for aircraft in low-altitude airspace under the SPC-Lidar system. Compared with conventional vehicle-mounted Lidar systems, the SPC-Lidar system installed on low-altitude aircraft is required to monitor targets over a broader area. With the variation in target distances, the angular extent of the target in the field of view of the SPC-Lidar changes significantly, thereby resulting in a remarkable difference in the number of point cloud data points collected by the SPC-Lidar for the same target at different slant ranges. Simultaneously, as aircraft in the air possess six degrees of freedom, their point cloud data exhibit more complex spatial characteristics compared to ground scenarios. To improve the controllability and diversity of generating aircraft point clouds, in this paper, a conditional GAN (cGAN) [21] is utilized as the fundamental framework. Using two-dimensional depth images of aircraft point clouds as training data, an improved cGAN through adversarial training is employed to develop a generator capable of generating aircraft depth images under specific conditions. With this depth image generator as the core, a method, named APCG (aircraft point cloud generation), is proposed in this paper. APCG can generate diverse and high-quality aircraft point clouds that conform to the characteristics of the SPC-Lidar system based on input parameters such as the type, position, and attitude of the aircraft.
The remainder of this paper is organized as follows: Section 2 introduces the composition of APCG and the implementation of its core modules; Section 3 analyzes the performance of the improved cGAN and APCG through experiments; Section 4 discusses the advantages and limitations of the proposed method and explores potential directions for future research; and the conclusions are provided in Section 5. For completeness, the quantitative metrics used to evaluate point cloud similarity are introduced in Appendix A.

2. Proposed Method

To address the need for rapid and diverse generation of point clouds of aircraft in low-altitude airspace under the SPC-Lidar system, this section introduces, in turn, APCG, the improved cGAN, and the standard data generator. The relationship among the three is shown in Figure 1. The APCG proposed in this paper can generate point clouds of aircraft of specific types, positions, and attitudes. APCG is built around the Generator module, which is obtained through adversarial training of the improved cGAN. The training data of the improved cGAN originate from the standard data generator. Each of these three parts is introduced in a separate subsection below.

2.1. APCG Method

The basic structure of the APCG method, which is proposed in this paper, is shown in Figure 2. APCG is mainly composed of three modules: Generator, DeNorm., and Im2PC. The core of APCG is the Generator module, namely the aircraft depth image generator. The aircraft depth image generator is obtained through adversarial training of the improved cGAN using aircraft depth image data.
In Figure 2, e is the noise vector. In this paper, this noise vector is a 100-dimensional Gaussian noise vector in the latent space. ν is the conditional vector, which is a 6-dimensional vector constituted by the type, position, and attitude of the aircraft. The noise vector e and the conditional vector ν jointly constitute the input vector of the aircraft depth image generator.
The aircraft depth image generator produces aircraft depth images at specific distances and attitudes in accordance with the input vector. The depth images output by the generator are all mapped to the reference slant range $r_{0,\text{ref}}$; the image of the aircraft at its actual slant range $r_0$ is obtained by scaling the generator output, a task performed by the DeNorm. module in Figure 2.
After processing by the DeNorm. module, the depth image of the aircraft is adjusted to its actual slant range $r_0$. With the center of the aircraft depth image as the origin, $w$ and $h$ denote the horizontal and vertical pixel coordinates relative to the origin, and $d_{w,h}$ represents the pixel value at $(w, h)$ in the input image of the DeNorm. module. The pixel value of the depth image output by the DeNorm. module is
$d'_{w,h} = d_{\lceil w \cdot q \rceil, \lceil h \cdot q \rceil}$ (1)
where $\lceil \cdot \rceil$ denotes the ceiling operation and $q$ is the adjustment ratio
$q = r_0 / r_{0,\text{ref}}$ (2)
with the reference slant range $r_{0,\text{ref}}$ of the target determined by the target slant range $r_0$: if $r_0 > 3$ km, then $r_{0,\text{ref}} = 3$ km; otherwise, $r_{0,\text{ref}} = 0.1$ km. If the coordinates in the generator output corresponding to a pixel of the adjusted image fall outside the image, the pixel value $d'_{w,h}$ of that pixel is set to −1.
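As an illustration, the following NumPy sketch (not taken from the paper; the function and variable names are hypothetical) implements the DeNorm. scaling of Equations (1) and (2): each pixel of the output image at the actual slant range is looked up at the ceiling-scaled coordinates of the generator output, and out-of-range lookups are set to −1.

```python
import numpy as np

def denorm(img_ref, r0, r0_ref):
    """Rescale a generator depth image from the reference slant range r0_ref
    to the actual slant range r0, following Equations (1)-(2).

    img_ref : (H, W) depth image at the reference slant range,
              pixel values in [-1, 1], with -1 meaning "no return".
    """
    H, W = img_ref.shape
    q = r0 / r0_ref                              # Equation (2)
    out = np.full((H, W), -1.0, dtype=img_ref.dtype)
    cy, cx = H // 2, W // 2                      # image centre as the origin
    for h in range(-cy, H - cy):
        for w in range(-cx, W - cx):
            # Equation (1): look up the reference image at the ceiling-scaled
            # coordinates (w*q, h*q), measured from the image centre.
            ws = int(np.ceil(w * q))
            hs = int(np.ceil(h * q))
            if -cx <= ws < W - cx and -cy <= hs < H - cy:
                out[h + cy, w + cx] = img_ref[hs + cy, ws + cx]
            # otherwise the pixel stays at -1 (outside the generated image)
    return out
```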
The adjusted image is fed into the Im2PC module. The Im2PC module recovers the point cloud information of the aircraft from the parameters of the SPC-Lidar system, the position information of the aircraft, and the length $l$ of the aircraft fuselage. The point cloud recovered here is referred to as the generated point cloud. The coordinates of each data point in the generated point cloud are defined in the observation coordinate system, which takes the location of the Lidar as its origin, as shown in Figure 3. The positive direction of the Z-axis is the projection of the line $\overline{OO_p}$, connecting the origin $O$ and the center of mass $O_p$ of the aircraft, onto the horizontal plane. The positive direction of the Y-axis is vertically upward. The coordinate system is left-handed, and the positive direction of the X-axis is perpendicular to the Z-axis within the horizontal plane. In the observation coordinate system, a point in space can be expressed in rectangular coordinates $(x_i, y_i, z_i)$ or the corresponding spherical coordinates $(\alpha_i, \beta_i, r_i)$, where $\alpha_i$, $\beta_i$, and $r_i$ are, respectively, the azimuth angle, elevation angle, and slant range of the point $(x_i, y_i, z_i)$. The azimuth angle $\alpha_i$ is defined as the angle of rotation around the Y-axis, with the right-handed direction positive. The elevation angle $\beta_i$ is defined as the angle with the horizontal plane $XOZ$, positive above the plane.
In APCG, the pixel value range of the depth image is restricted to [−1, 1], with values stored in floating-point form. A pixel with a value of −1 in the depth image indicates that no data points fall in the corresponding resolution unit. Suppose the pixel value at $(w, h)$ in the depth image input to the Im2PC module is greater than −1. Based on the SPC-Lidar system parameters (azimuth resolution $R_\alpha$ and elevation resolution $R_\beta$), the aircraft position information (slant range $r_0$ and elevation angle $\beta_0$), and the fuselage length $l$, the Im2PC module converts this pixel value $d_{w,h}$ into the spherical coordinates of the corresponding data point in the observation coordinate system.
$\alpha_{w,h} = w \cdot R_\alpha$ (3)
$\beta_{w,h} = h \cdot R_\beta + \beta_0$ (4)
$r_{w,h} = d_{w,h} \cdot l / 1.8 + r_0$ (5)
From Equations (3)–(5), it can be observed that the azimuth and elevation angle information of the data point restored based on the pixel of the depth image is determined by the azimuth and elevation angles of the center of the corresponding resolution unit. Hence, the spatial coordinates of the generated data points present more pronounced quantization features, which exhibit certain differences from the coordinates of the actual data points collected.
Based on the spherical coordinates of the acquired data points, the point cloud data of the aircraft can be further converted into the rectangular coordinate representation of the observation coordinate system.
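A minimal sketch of the Im2PC conversion of Equations (3)–(5), followed by one possible spherical-to-rectangular mapping for the left-handed observation frame, is given below (the exact sign convention of the mapping is an assumption; function and parameter names are hypothetical).

```python
import numpy as np

def im2pc(depth_img, r0, beta0_deg, l, R_alpha_deg=0.0040, R_beta_deg=0.0076):
    """Recover a point cloud from a depth image, following Equations (3)-(5).

    depth_img : (H, W) array, values in [-1, 1]; -1 marks empty resolution cells.
    r0, beta0_deg : slant range (m) and elevation angle (deg) of the aircraft centre.
    l : fuselage length (m).
    Returns an (N, 3) array of (x, y, z) points in the observation frame.
    """
    H, W = depth_img.shape
    cy, cx = H // 2, W // 2
    points = []
    for row in range(H):
        for col in range(W):
            d = depth_img[row, col]
            if d <= -1.0:                       # empty resolution cell
                continue
            w, h = col - cx, row - cy           # pixel offsets from the image centre
            alpha = np.deg2rad(w * R_alpha_deg)             # Equation (3)
            beta = np.deg2rad(h * R_beta_deg + beta0_deg)   # Equation (4)
            r = d * l / 1.8 + r0                            # Equation (5)
            # Spherical -> rectangular (one possible convention for the
            # left-handed observation frame: Y up, Z towards the target).
            x = r * np.cos(beta) * np.sin(alpha)
            y = r * np.sin(beta)
            z = r * np.cos(beta) * np.cos(alpha)
            points.append((x, y, z))
    return np.asarray(points)
```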
The aircraft attitude parameters within the condition vector $\nu$ are determined by the pitch angle $\theta_p$, yaw angle $\phi_p$, and roll angle $\gamma_p$. An auxiliary coordinate system is established with the center of mass of the observed aircraft as its origin $O_p$, with coordinate axes parallel to, and opposite in direction to, those of the observation coordinate system, as shown in Figure 3. The pitch angle $\theta_p$, yaw angle $\phi_p$, and roll angle $\gamma_p$ are, respectively, defined as the angles of rotation of the aircraft around the $X_p$-axis, $Y_p$-axis, and $Z_p$-axis, with the right-handed direction positive.

2.2. Improved cGAN

2.2.1. Improved cGAN Based on Depth Image

Aircraft in the air possess six degrees of freedom, and the numerical variation of each degree of freedom will significantly influence the point cloud data of the aircraft collected by SPC-Lidar. Hence, in the process of generating aircraft point clouds, it is essential to explicitly specify the corresponding position and attitude information of the aircraft when generating the point clouds. In other words, the aircraft point cloud generator must precisely generate the corresponding point cloud data based on the specific position and attitude parameters of the aircraft. Based on this demand, choosing a cGAN to train the aircraft point cloud generator is a rational option.
The improved cGAN structure for training the aircraft point cloud generator is shown in Figure 4. To concurrently guarantee the accuracy of point cloud generation and the adaptability to the variations in the number of data points, the improved cGAN in Figure 4 is trained based on images. The generator and discriminator of the improved cGAN are, respectively, constructed by employing transposed convolutional layers and convolutional layers. Such a design enhances the feature representation ability of the generator and the feature extraction ability of the discriminator [22]. Furthermore, in order to enhance the overall quality of the generated images, the self-attention mechanism is incorporated into both the generator and the discriminator, which facilitates the network’s focus on key features [23].
In Figure 4, the conditional vector of the improved cGAN, $\nu$, is a 6-dimensional vector constituted by the type, position parameters, and attitude parameters of the aircraft. This vector is combined with a 100-dimensional Gaussian noise vector $e$ in the latent space to form the input vector of the generator. After being processed by the generator, the corresponding generated data are acquired. The generated data are a depth image of size $W \times H$. The width $W$ and height $H$ of the image are, respectively, determined by the number of resolution units in the azimuth and elevation directions of the SPC-Lidar. Each pixel value of the image represents the slant range difference of the point cloud data within the corresponding resolution unit relative to the center of the aircraft. The depth images output by the generator thus represent the spatial distribution of the data points in the point cloud. After channel expansion of the conditional vector $\nu$, it is combined with the generated data in the form of channel data to form a 7-channel fake data input for the discriminator, i.e.,
$\tilde{u} = (M, 7, W, H)$ (6)
where $M$ represents the number of images input into the network in each batch. The conditional vector $\nu$ also selects the corresponding standard data, which are likewise presented in the form of a depth image. The conditional vector is superimposed onto the standard data in channel form to constitute the real data input of the improved cGAN discriminator, i.e.,
$u = (M, 7, W, H)$ (7)
For the selection of the loss function, the improved cGAN adopts the concept from Reference [24]. Reference [24] indicates that constructing a discriminator with 1-Lipschitz continuity is essential. By computing the Wasserstein distance between the outputs of real and fake data, and then maximizing this distance, the training of the generator can be guaranteed to converge to the global optimum. Hence, the improved cGAN also employs the Wasserstein distance utilized in Reference [24] as the loss function of the discriminator. To ensure that the discriminator meets the 1-Lipschitz continuity, mixed data are introduced [24], i.e.,
$\hat{u} = \varepsilon u + (1 - \varepsilon) \tilde{u}$ (8)
where $\varepsilon$ is a random number drawn from a uniform distribution on $[0, 1]$ and resampled for each training batch. The discriminator produces corresponding outputs $D(u|\nu)$, $D(\tilde{u}|\nu)$, and $D(\hat{u}|\nu)$ for the distinct inputs $u$, $\tilde{u}$, and $\hat{u}$. The larger the output of the discriminator, the closer the distribution of the data to that of the real data; the smaller the output, the closer the distribution of the data to that of the fake data. All the outputs of the discriminator are sent to the loss calculator to obtain the corresponding discriminator loss $L_D$ and generator loss $L_G$. The output $D(\hat{u}|\nu)$ of the discriminator with respect to the mixed data $\hat{u}$ introduces a gradient penalty term into the discriminator loss $L_D$, thereby maintaining the 1-Lipschitz continuity of the discriminator. Based on the discriminator loss $L_D$ and generator loss $L_G$ obtained in the current training batch, the parameters of the discriminator and generator are updated via backpropagation. Through multiple batches of adversarial training of the discriminator and generator, the generator converges to the global optimal solution.

2.2.2. Generator of Improved cGAN

The generator of the improved cGAN is constituted by five transposed convolutional layers, as shown in Figure 5. The input composed of the noise vector e and the conditional vector ν , after undergoing progressive up-sampling, generates the depth image of the aircraft that complies with specific position and attitude conditions, and the size of this image is channels × width × height ( 1 × W × H ).
The 106-dimensional column vector inputted to the generator undergoes reshaping to form a 106 × 1 × 1 tensor. This tensor is processed by five transposed convolutional layers (TConv), after which the spatial dimensions of the feature map gradually increase from 16 × 4 of the first transposed convolutional layer to 256 × 64. Simultaneously, the number of channels of the corresponding feature map gradually decreases from the initial 512 to 1.
The three transposed convolutional layers located in the middle of the generator all incorporate a transposed convolution operation, batch normalization, and the ReLU activation function. Batch normalization is applied to the feature maps of each channel after the transposed convolution. Its purpose is to ensure that the feature data of each channel follow a standard normal distribution. This enhances the stability of training and accelerates the convergence of the model. Nonlinearity is introduced into the model via the activation function ReLU, augmenting the model’s expressive capability.
To enhance the correlation between the feature maps of different channels, a self-attention (SA) layer is incorporated into the first transposed convolutional layer. As the improved cGAN in this paper is an image-based point cloud generation approach, the introduction of a single self-attention layer suffices to meet the requirement of the generator for generating high-quality images.
The final transposed convolutional layer of the generator transforms the feature map into a depth image of the aircraft point cloud. After being processed by the activation function tanh, the output data values are constrained within the range of −1 to 1. In floating-point arithmetic, the range of −1 to 1 can offer superior numerical stability and enhance the convergence rate and performance of the model. Consequently, in this paper, the pixel values of the depth images for training the improved cGAN are all represented in floating-point form within the range of −1 to 1.
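The generator described above can be sketched in PyTorch as follows. The layer hyper-parameters (kernel sizes, strides) and the SAGAN-style self-attention formulation are assumptions chosen to reproduce the stated feature-map sizes (16 × 4 up to 256 × 64) and channel widths (512 down to 1); they are not the authors' exact settings.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """SAGAN-style self-attention over spatial positions (an assumed formulation)."""
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, hh, ww = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)           # (b, hw, c//8)
        k = self.k(x).flatten(2)                           # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)                # (b, hw, hw)
        v = self.v(x).flatten(2)                           # (b, c, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, hh, ww)
        return self.gamma * out + x                        # residual connection

class Generator(nn.Module):
    """Five transposed-convolution layers: 106x1x1 -> 512x4x16 -> ... -> 1x64x256
    (PyTorch uses height x width ordering, so the paper's 16x4 appears as 4x16)."""
    def __init__(self, z_dim=100, cond_dim=6):
        super().__init__()
        self.in_dim = z_dim + cond_dim
        ch = [512, 256, 128, 64]
        self.net = nn.Sequential(
            nn.ConvTranspose2d(self.in_dim, ch[0], kernel_size=(4, 16)),  # -> 4x16
            SelfAttention(ch[0]),                                         # SA in layer 1
            nn.BatchNorm2d(ch[0]), nn.ReLU(True),
            nn.ConvTranspose2d(ch[0], ch[1], 4, 2, 1), nn.BatchNorm2d(ch[1]), nn.ReLU(True),
            nn.ConvTranspose2d(ch[1], ch[2], 4, 2, 1), nn.BatchNorm2d(ch[2]), nn.ReLU(True),
            nn.ConvTranspose2d(ch[2], ch[3], 4, 2, 1), nn.BatchNorm2d(ch[3]), nn.ReLU(True),
            nn.ConvTranspose2d(ch[3], 1, 4, 2, 1), nn.Tanh(),             # -> 1x64x256
        )

    def forward(self, e, v):
        x = torch.cat([e, v], dim=1).view(-1, self.in_dim, 1, 1)
        return self.net(x)

# Example: Generator()(torch.randn(8, 100), torch.randn(8, 6)).shape -> (8, 1, 64, 256)
```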

2.2.3. Discriminator of Improved cGAN

The discriminator consists of 4 convolutional layers, as shown in Figure 4. The input of the discriminator is image data of size 256 × 64 with 7 channels, of which 6 channels carry the channelized conditional information and 1 channel carries the depth image data of the aircraft point cloud. After the input image data undergo feature extraction through the 4 convolutional (Conv) layers step by step, the features are flattened and then weighted and summed by the fully connected (FC) layer to obtain the output of the discriminator. The first 3 convolutional layers of the discriminator each contain a convolution operation and a LeakyReLU activation function. In addition, a self-attention layer is added in the 4th convolutional layer.
To avoid the impact of batch size on the performance of the discriminator, batch normalization is not employed in the convolutional layers. This is beneficial for preserving the original distribution of the input data, reducing overfitting, and enhancing the generalization ability of the model. The adoption of the LeakyReLU activation function in the convolutional layers, on the one hand, retains the nonlinear characteristics of the ReLU activation function, which is conducive to improving the expressive ability of the model; on the other hand, it prevents neurons from becoming inactive with zero output, thereby enhancing the model's generalization ability for the input data. The attention mechanism introduced in the final convolutional layer serves as an enhancement of the data features, helping the discriminator handle adversarial samples better during training and enhancing the robustness of the model.
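A corresponding PyTorch sketch of the discriminator is given below; the channel widths, strides, and self-attention formulation are assumptions, and only the layer pattern (four convolutions, LeakyReLU, no batch normalization, self-attention in the fourth layer, final fully connected layer) is taken from the text.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Same SAGAN-style self-attention module as in the generator sketch."""
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)
        k = self.k(x).flatten(2)
        v = self.v(x).flatten(2)
        attn = torch.softmax(q @ k, dim=-1)
        return self.gamma * (v @ attn.transpose(1, 2)).view(b, c, h, w) + x

class Discriminator(nn.Module):
    """Four convolutional layers without batch normalization; the scalar output
    acts as the Wasserstein critic value D(.|v)."""
    def __init__(self, in_ch=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, 2, 1), nn.LeakyReLU(0.2, True),   # 64x256 -> 32x128
            nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2, True),     # -> 16x64
            nn.Conv2d(128, 256, 4, 2, 1), nn.LeakyReLU(0.2, True),    # -> 8x32
            nn.Conv2d(256, 512, 4, 2, 1), SelfAttention(512),         # -> 4x16, SA in layer 4
            nn.LeakyReLU(0.2, True),
        )
        self.fc = nn.Linear(512 * 4 * 16, 1)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))
```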

2.2.4. Training Process

Assume that the number of images in each batch for training the model is $M$; then, $M$ conditional vectors $\nu$ are selected. The corresponding standard data are selected based on the conditional vectors $\nu$ and combined with the channelized conditional information to form the 7-channel real data $u$. Meanwhile, each conditional vector $\nu$ is combined with a random sample $e$ from the latent space to constitute the input vector for the generator. The generator generates the corresponding point cloud depth image based on the input vector, and this image is combined with the channelized conditional information to form the 7-channel fake data $\tilde{u}$. The real data $u$ and fake data $\tilde{u}$ with the same conditional vector $\nu$ form the corresponding mixed data $\hat{u}$ via Equation (8). The real data $u$, fake data $\tilde{u}$, and mixed data $\hat{u}$ obtained in this way serve as the samples for training the discriminator and generator in this batch.
As previously stated, the improved cGAN adopts the same loss function calculation method as in Reference [24]. To keep this paper self-contained, a brief introduction to the discriminator and generator loss functions is provided below. The outputs of the discriminator for the real data $u$, fake data $\tilde{u}$, and mixed data $\hat{u}$ are, respectively, $D(u|\nu)$, $D(\tilde{u}|\nu)$, and $D(\hat{u}|\nu)$, and these values range over $(-\infty, +\infty)$. When the discriminator judges the input data as real, the output value is positive; when it judges the data as fake, the output value is negative. The discriminator loss is measured by the Wasserstein distance, and its goal is to maximize the distance between the real data and the fake data. The discriminator loss can therefore be expressed as
$L_D = \mathbb{E}[D(\tilde{u}|\nu)] - \mathbb{E}[D(u|\nu)]$ (9)
where $\mathbb{E}[\cdot]$ denotes the expectation. To enforce the 1-Lipschitz continuity of the discriminator more smoothly, a gradient penalty term is incorporated into the discriminator loss, i.e.,
$L_D = \mathbb{E}[D(\tilde{u}|\nu)] - \mathbb{E}[D(u|\nu)] + \lambda \zeta$ (10)
where $\lambda$ is the penalty weight and $\zeta$ is the gradient penalty term, whose specific expression is
$\zeta = \mathbb{E}\left[ \left( \left\| \nabla D(\hat{u}|\nu) \right\| - 1 \right)^2 \right]$ (11)
with $\nabla$ denoting the gradient operator and $\|\cdot\|$ the Euclidean norm. Considering the range of the discriminator output, in the initial stage of training the improved cGAN the discriminator is more prone to judge the real data $u$ as true and the fake data $\tilde{u}$ as false. Hence, $\mathbb{E}[D(u|\nu)] > \mathbb{E}[D(\tilde{u}|\nu)]$, and the discriminator loss $L_D$ is less than zero. As the adversarial training between the discriminator and the generator proceeds, the gap between $\mathbb{E}[D(u|\nu)]$ and $\mathbb{E}[D(\tilde{u}|\nu)]$ gradually narrows, and the discriminator loss $L_D$ approaches zero.
According to the obtained discriminator loss $L_D$, the gradient of the loss function with respect to the discriminator parameters is computed via backpropagation, and the discriminator parameters are then updated. The updated discriminator re-evaluates the samples of this batch and the corresponding discriminator loss $L_D$ is recomputed. This process is repeated until the preset number of training rounds is reached or the convergence criterion is satisfied, completing the training of the discriminator on this batch.
After the training of the discriminator is accomplished, the generator is required to be trained using the samples of this batch. If the fake data output by the generator are successfully recognized by the discriminator, that is, if the fake data are unable to deceive the discriminator, it constitutes a loss for the generator, i.e.,
$L_G = -\mathbb{E}[D(\tilde{u}|\nu)]$ (12)
According to the obtained generator loss $L_G$, a scheme similar to that of the discriminator is employed: the generator parameters are updated via backpropagation, and the training is repeated until the training of the generator with the current batch of samples is completed.
Upon the completion of the training of one batch, new batch samples are required to be selected for the training of the discriminator and the generator of the improved cGAN. The above process is repeated until the model converges.
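For concreteness, one training batch of the adversarial procedure described above can be sketched as follows. This is a hypothetical helper that assumes generator and discriminator modules such as those sketched earlier (G(e, ν) returning a 1-channel depth image, D returning a scalar critic value); the gradient-penalty weight value is an assumption.

```python
import torch

def train_step(G, D, opt_G, opt_D, real_img, cond, lam=10.0, z_dim=100):
    """One adversarial training step following Equations (8)-(12).
    real_img : (M, 1, H, W) standard depth images; cond : (M, 6) condition vectors.
    The 6-dim condition is channel-expanded and concatenated to the image (7 channels)."""
    M, _, H, W = real_img.shape
    cond_map = cond.view(M, 6, 1, 1).expand(M, 6, H, W)      # channelised condition

    # ----- discriminator update -----
    e = torch.randn(M, z_dim, device=real_img.device)
    fake_img = G(e, cond).detach()
    u_real = torch.cat([real_img, cond_map], dim=1)          # real data u
    u_fake = torch.cat([fake_img, cond_map], dim=1)          # fake data u~
    eps = torch.rand(M, 1, 1, 1, device=real_img.device)
    u_hat = (eps * u_real + (1 - eps) * u_fake).requires_grad_(True)   # Eq. (8)

    d_hat = D(u_hat)
    grad = torch.autograd.grad(d_hat.sum(), u_hat, create_graph=True)[0]
    gp = ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()  # Eq. (11)
    loss_D = D(u_fake).mean() - D(u_real).mean() + lam * gp  # Eq. (10)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # ----- generator update -----
    e = torch.randn(M, z_dim, device=real_img.device)
    u_fake = torch.cat([G(e, cond), cond_map], dim=1)
    loss_G = -D(u_fake).mean()                               # Eq. (12)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```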

2.3. Standard Data Generator

As shown in Figure 4, the real data for training the improved cGAN are constructed by integrating the standard data of aircraft and the channelized conditional information. Herein, the standard data refers to the depth images transformed from the point clouds of aircraft collected in the SPC-Lidar system. Evidently, to generate these standard data, it is essential to obtain the point cloud of the target under the SPC-Lidar system initially. Since the aircraft target in the air possesses six degrees of freedom, any variation in the parameters of these degrees of freedom will lead to discrepancies in the point cloud data collected by the SPC-Lidar. Employing the actual SPC-Lidar system for target point cloud collection is not only costly but also operationally complex and has low feasibility. Hence, generating the point cloud of the aircraft under the SPC-Lidar system via simulation and further converting it into standard data is a feasible solution. This approach can effectively reduce costs and enhance the efficiency and operability of data collection.

2.3.1. Parameter Selection of SPC-Lidar

According to the technical parameters of conventional SPC-Lidar, the azimuth resolution $R_\alpha$, elevation resolution $R_\beta$, and range resolution $R_r$ of the SPC-Lidar used for simulation are set to 0.0040°, 0.0076°, and 0.15 m, respectively. To reduce the complexity of the model and improve the quality of the aircraft point clouds generated by the improved cGAN, the generation process of the aircraft standard data does not consider the effects of the cumulative count and dead time of the SPC-Lidar system. Additionally, the improved cGAN employed in this paper disregards the impacts of noise and the environment on the samples during training, with the aim of obtaining a generator capable of generating high-quality aircraft point clouds. The effects of SPC-Lidar system parameters, noise, and the environment on the aircraft point clouds generated by APCG can be introduced through simple subsequent processing such as adding noise and masking.
The angular size of an aircraft in the surveillance area of the SPC-Lidar varies with the distance between the aircraft and the SPC-Lidar: the shorter the slant range of the aircraft, the larger the field of view it occupies, and vice versa. For the flight safety of the aircraft, the distance between the observed aircraft and the low-altitude aircraft carrying the SPC-Lidar should not be less than 3 km. We therefore take 3 km as the reference slant range $r_{0,\text{ref}}$ and consider the field of view occupied by a medium-sized aircraft at this distance. Taking widely used medium-sized civil aircraft such as the A320 and B737 as examples, the fuselage length and wingspan of these aircraft are approximately 45 m, and the height is approximately 12.5 m. At 3 km, the angular ranges occupied by the aircraft in the azimuth and elevation directions are approximately 0.859° and 0.239°, respectively. According to the azimuth and elevation resolutions of the SPC-Lidar, a medium-sized aircraft at 3 km occupies approximately 215 azimuth resolution units and 32 elevation resolution units.
In the low-altitude airspace, drones constitute a rather distinctive type of aircraft, characterized by low altitude, small size, and slow speed. Compared with medium-sized aircraft, the distances at which drones are observed by the SPC-Lidar system and the field of view they occupy are significantly smaller. Taking large quadcopter drones such as the DJI Matrice 300 RTK and the Freefly Systems Alta 8 as examples, their maximum wingspans and heights are approximately 1.5 m and 0.25 m, respectively. At a distance of 0.1 km from the SPC-Lidar, the angular ranges occupied by these drones in the azimuth and elevation directions are approximately 0.85° and 0.143°, respectively, which is comparable to the field of view of a medium-sized aircraft at 3 km from the SPC-Lidar. Hence, for drones, the corresponding reference distance is adjusted to 0.1 km. A low-altitude aircraft equipped with the SPC-Lidar system focuses mainly on drones within 3 km; drones beyond 3 km are nearly undetectable. Hence, the corresponding reference slant range $r_{0,\text{ref}}$ can be determined from the slant range $r_0$ of the target.
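For reference, a short calculation reproduces the resolution-unit counts quoted above (only the dimensions and resolutions are taken from the text; the angular-extent formula and rounding are assumptions).

```python
import math

def angular_extent_deg(size_m, range_m):
    """Angle (degrees) subtended by an object of the given size at the given range."""
    return math.degrees(2 * math.atan(size_m / (2 * range_m)))

R_alpha, R_beta = 0.0040, 0.0076        # azimuth / elevation resolution (deg)

# Medium-sized aircraft (~45 m span, ~12.5 m height) at the 3 km reference range
az = angular_extent_deg(45, 3000)       # ~0.859 deg
el = angular_extent_deg(12.5, 3000)     # ~0.239 deg
print(math.ceil(az / R_alpha), math.ceil(el / R_beta))   # ~215 azimuth, ~32 elevation units

# Large quadcopter drone (~1.5 m span, ~0.25 m height) at the 0.1 km reference range
print(angular_extent_deg(1.5, 100), angular_extent_deg(0.25, 100))   # ~0.86 deg, ~0.14 deg
```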

2.3.2. Simulation of Standard Point Cloud Data of Aircraft

Aircraft standard point clouds refer to point clouds of aircraft simulated through spatial sampling. These point clouds are based on the fine point clouds of aircraft in the acquisition environment shown in Figure 3, and are designed to imitate the point clouds collected by the SPC-Lidar system. The fine point clouds of conventional aircraft involved in this paper are derived from the ModelNet40 dataset [25], while the fine point clouds of drones are fabricated using 3D models from the Free3d website. The ModelNet40 dataset, released by Princeton University, is a point cloud dataset for 3D image classification, encompassing over 100 types of aircraft models. The fine point clouds of the aircraft and drones adopted in this paper are each composed of approximately 10,000 data points, capable of precisely depicting the three-dimensional morphology of the aircraft. Figure 6 presents the top views of 8 types of aircraft and 2 drones. The point clouds of the 8 aircraft are from the ModelNet40 dataset, while the point clouds of the 2 drones are generated using 3D models from the Free3d website. As shown in Figure 7, the point clouds of each aircraft well delineate the key structures of the aircraft.
Taking aircraft 0146 as an example, when the attitude parameters of the aircraft are pitch angle $\theta_p = 2^\circ$, yaw angle $\phi_p = 13^\circ$, and roll angle $\gamma_p = 0^\circ$, with the elevation angle of the aircraft being $\beta_0 = 3.5^\circ$ and the slant range being $r_0 = 3$ km, the distribution of the spatially sampled target point cloud in the observation coordinate system and its projection on the horizontal plane are shown in Figure 8. As shown in Figure 8a, the point cloud of aircraft 0146 after spatial sampling by the SPC-Lidar retains the structural characteristics of the aircraft, and the attitude of the aircraft can be roughly determined from the form of the point cloud. Compared with the fine point cloud of aircraft 0146 shown in Figure 7, the number of data points in Figure 8b is significantly smaller. In Figure 8b, the data points of aircraft 0146 are mainly concentrated at positions directly reachable by the SPC-Lidar; owing to the non-penetrability of the aircraft, no point cloud data can be collected at positions where the light cannot directly reach. As can be observed from Figure 8b, the point cloud of aircraft 0146 obtained by the spatial sampling method conforms to the characteristics of the point clouds collected by SPC-Lidar. Thus, the point clouds of aircraft obtained by this approach can be utilized as the standard point cloud data of aircraft.
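One simple way to realize such spatial sampling is sketched below. This is a hypothetical illustration rather than the authors' Simulator: each fine point is binned into an angular resolution cell, and only the nearest return per cell is kept to model the non-penetrability of the aircraft surface.

```python
import numpy as np

def spatial_sample(points_xyz, R_alpha=0.0040, R_beta=0.0076):
    """Down-sample a fine point cloud onto the SPC-Lidar angular grid.
    For each (azimuth, elevation) resolution cell, only the closest point is kept.
    points_xyz : (N, 3) points (x, y, z) in the observation frame."""
    x, y, z = points_xyz.T
    r = np.linalg.norm(points_xyz, axis=1)
    alpha = np.degrees(np.arctan2(x, z))          # azimuth about the Y-axis
    beta = np.degrees(np.arcsin(y / r))           # elevation from the XOZ plane
    cells = {}
    for i in range(len(r)):
        key = (int(np.floor(alpha[i] / R_alpha)), int(np.floor(beta[i] / R_beta)))
        if key not in cells or r[i] < r[cells[key]]:
            cells[key] = i                        # keep the nearest return per cell
    return points_xyz[list(cells.values())]
```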
As previously stated, aerial targets possess six degrees of freedom. The attitude information of the aircraft will have an impact on the simulation of its standard point cloud data, and the position information of the aircraft will also exert a significant influence on the constitution of the standard point cloud data. Still taking aircraft 0146 as an example, with identical parameter settings as in Figure 8, by only adjusting the slant range r 0 of the target, the three-dimensional point clouds of the target at different slant ranges are obtained, as shown in Figure 9. The point clouds in each subfigure of Figure 9 are presented in the observation coordinate system. By comparing each sub-figure, it can be observed that the spatial dimensions where the three-dimensional point clouds of the target are distributed at different slant ranges are approximately the same, but the point cloud density exhibits obvious variances. As the slant range increases, the density of the aircraft point cloud decreases, and the number of target data points collected decreases significantly. Such a substantial change in the number of data points in the target point cloud resulting from the variation in the slant range will render the training of the generator and discriminator more challenging.

2.3.3. Generation of Depth Image

The aircraft reference point cloud, obtained as described in Section 2.3.2, needs to be converted into a depth image. This depth image forms the standard data shown in Figure 2. Combined with the channelized conditional information, the standard data constitute the real data used for training the improved cGAN.
For SPC-Lidar, the point cloud of the acquired target is typically represented in spherical coordinates. Converting the point cloud of an aircraft into a depth image entails mapping the slant range information of the data points within each resolution cell of the SPC-Lidar to the depth value of the corresponding pixel in the depth image. As stated in Section 2.3.1, the azimuth and elevation angular spans of a medium-sized aircraft at a distance of 3 km are approximately 0.859° and 0.239°, respectively. Based on the azimuth resolution $R_\alpha$ and the elevation resolution $R_\beta$ of the SPC-Lidar, a medium-sized aircraft at 3 km occupies approximately 215 angular resolution units in azimuth and 32 in elevation. Taking the azimuth as the image width and the elevation as the image height, with each pixel representing one angular resolution unit, and allowing an appropriate margin for the increase in angular extent caused by changes in the aircraft's attitude, the width $W$ and height $H$ of the depth image are selected as 256 × 64. An image coordinate system is established with the center of the image as the origin, the rightward direction being the positive width direction and the upward direction being the positive height direction. The azimuth and elevation angular ranges corresponding to the pixel at $(w, h)$ in the image are, respectively,
$[(w - 0.5) R_\alpha, (w + 0.5) R_\alpha]$ (13)
and
$[(h - 0.5) R_\beta + \beta_0, (h + 0.5) R_\beta + \beta_0]$ (14)
If data points exist in the corresponding resolution unit, suppose the spherical coordinates of the data point are $(\alpha_{w,h}, \beta_{w,h}, r_{w,h})$. Considering that the pixel value range of the depth image output by the generator is [−1, 1], and to maintain consistency of the training samples, the pixel value range of the depth image converted from the aircraft point cloud is also set to [−1, 1]. The pixel value of the pixel point is determined by the position of the slant range of the data point in the corresponding resolution unit relative to the center of mass of the aircraft: the closer the data point is to the SPC-Lidar, the smaller the pixel value of the corresponding pixel, and vice versa. The pixel value at $(w, h)$ in the image is
$d_{w,h} = 1.8 (r_{w,h} - r_0) / l$ (15)
The coefficient 1.8 in Equation (15) is chosen manually so that the pixel values of the aircraft point cloud fall approximately within [−0.9, 0.9]. This range covers most of the pixel values in the image while ensuring a clear distinction between target pixels and background pixels, which is crucial for point cloud recovery by the Im2PC module. In the depth image, the resolution units not occupied by the aircraft point cloud would logically be assigned the value 1. However, in that case the unoccupied pixels would have minimal grayscale difference from the target outline, which could hinder the training of the cGAN. To emphasize the pixel value difference between resolution units occupied by the target and unoccupied ones, the pixel values corresponding to the background are set to −1.
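The conversion described by Equations (13)–(15) can be sketched as follows. This is a hypothetical NumPy illustration; the handling of multiple points falling in one resolution cell (keeping the nearest return) and the assumption that the aircraft center lies at azimuth 0 are not specified in the text.

```python
import numpy as np

def pc2im(alpha, beta, r, r0, beta0, l, W=256, H=64,
          R_alpha=0.0040, R_beta=0.0076):
    """Map a sampled point cloud (spherical coordinates: degrees, metres) to a
    W x H depth image, following Equations (13)-(15).
    Background pixels are -1; occupied pixels hold 1.8 * (r - r0) / l."""
    img = np.full((H, W), -1.0)
    # pixel offsets of each point from the image centre (aircraft centre at azimuth 0)
    w = np.rint(alpha / R_alpha).astype(int)
    h = np.rint((beta - beta0) / R_beta).astype(int)
    col, row = w + W // 2, h + H // 2
    keep = (col >= 0) & (col < W) & (row >= 0) & (row < H)
    d = 1.8 * (r - r0) / l                          # Equation (15)
    for c, rr, dd in zip(col[keep], row[keep], d[keep]):
        # if several points fall in one cell, keep the nearest (smallest d)
        if img[rr, c] == -1.0 or dd < img[rr, c]:
            img[rr, c] = dd
    return img
```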
Taking aircraft 0146 as an example, the three-dimensional point cloud of the aircraft shown in each subgraph of Figure 9 is transformed into a depth image of size 256 × 64, and the corresponding results are presented in Figure 10. The fuselage length of aircraft 0146 is $l = 49.02$ m. Comparing the subgraphs in Figure 10, it can be observed that as the slant range of the aircraft increases, the size of the aircraft within the field of view of the SPC-Lidar gradually diminishes.

2.3.4. Normalization of Target Pixel Occupancy

With the increase in the target slant range, the pixel proportion occupied by the target in the depth image gradually decreases. When the target pixel occupancy rate is too low, it is not conducive to extracting the key information representing the target, thereby affecting the training of the improved cGAN. In order to train the improved cGAN more effectively, it is necessary to adjust the pixel occupancy rate of the target in the depth image at different slant ranges to approximately the same level. The standard depth image size for training the improved cGAN is set based on the size of the target at the reference slant range $r_{0,\text{ref}}$. Therefore, when normalizing the target occupancy rate of the depth image at different slant ranges, the depth image at the reference slant range $r_{0,\text{ref}}$ is also taken as the reference. Such regularization is conducive to maintaining the consistency of the training data, thereby enhancing the performance of the improved cGAN model and the quality of the generated point clouds.
Since the slant range of the target is far greater than its size, the scaling of the target in the depth image can be regarded as linear scaling, with the corresponding ratio
$q = r_0 / r_{0,\text{ref}}$ (16)
Therefore, the pixel value at $(w, h)$ in the scaled image is
$d'_{w,h} = d_{\lfloor w / q \rfloor, \lfloor h / q \rfloor}$ (17)
where $\lfloor \cdot \rfloor$ denotes the floor operation.
The subplots in Figure 10 were rescaled in accordance with Equation (17), and the corresponding results are presented in Figure 11. It can be observed from Figure 11 that after normalization, the disparity in the pixel occupancy rates of the target in the depth images at different slant ranges is largely removed. Although the resolution of the target decreases after normalization as the slant range increases, the morphological characteristics of the target become more readily extractable and identifiable. This indicates that the normalization processing preserves the morphological characteristics of the target while having an acceptable influence on resolution.
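The Norm. operation of Equations (16) and (17) mirrors the DeNorm. sketch given earlier, with the floor operation and the inverse coordinate mapping; a hypothetical NumPy illustration follows.

```python
import numpy as np

def normalize_occupancy(img, r0, r0_ref):
    """Rescale the target in a depth image to the pixel occupancy it would have
    at the reference slant range, following Equations (16)-(17)."""
    H, W = img.shape
    q = r0 / r0_ref                                 # Equation (16)
    out = np.full((H, W), -1.0, dtype=img.dtype)
    cy, cx = H // 2, W // 2
    for h in range(-cy, H - cy):
        for w in range(-cx, W - cx):
            ws = int(np.floor(w / q))               # Equation (17)
            hs = int(np.floor(h / q))
            if -cx <= ws < W - cx and -cy <= hs < H - cy:
                out[h + cy, w + cx] = img[hs + cy, ws + cx]
    return out
```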

2.3.5. Aircraft Standard Data Generator

Based on the preceding preparatory work, the structure of the standard data generator for the improved cGAN is shown in Figure 12. The core component of this generator is the Simulator module. The fine point clouds of 76 types of aircraft and 4 types of drones are selected and numbered by type. The fine point cloud of the corresponding aircraft is selected based on the aircraft type, and the point cloud data are sent to the Simulator module. The Simulator module determines the position of the aircraft's center of mass in the observation coordinate system in accordance with the position parameters among the aircraft parameters and constructs the corresponding auxiliary coordinate system. It then determines the distribution of the point cloud data in the auxiliary coordinate system based on the aircraft's attitude parameters. Finally, the Simulator module samples the original point cloud based on the Lidar parameters and the occlusion relationships among the points, thereby obtaining the point cloud data of the aircraft as seen by the Lidar, namely the standard point cloud of the aircraft. These standard point cloud data are then fed into the PC2Im module to generate depth images. The PC2Im module normalizes the data points collected in each resolution unit of the Lidar along the slant range direction, using the aircraft's position as the center and the fuselage length as the reference, following Equation (15), to generate depth image data of standard size. These depth images are then sent to the Norm. module to adjust the pixel proportion of the target within the image. The Norm. module determines the corresponding scaling ratio based on the slant range of the target and assigns values to all the pixels of the new images in accordance with Equation (17), ultimately forming the standard dataset for training the improved cGAN.

3. Experimental Results

This section presents the training results of the improved cGAN and demonstrates, through experiments, the necessity of the improvements made to the cGAN. On this basis, the validity of the proposed APCG is verified by computing the MMD and JSD between the generated point clouds and the standard point clouds under the same conditions.

3.1. Training of Improved cGAN

To train the improved cGAN, sufficient standard data are required. The standard data generator shown in Figure 12 can generate standard point clouds along with their corresponding standard dataset in the form of depth images. To enhance the generalization capability of the model proposed in this article, the fine point clouds of 76 types of aircraft were extracted from the ModelNet40 dataset, and the fine point clouds of 4 types of drones were generated using the 3D models of drones from the Free3D website, thereby forming the basic aircraft fine point cloud dataset. Utilizing this dataset, under specific SPC-Lidar technical parameters and condition parameters, a standard dataset for training the improved cGAN was generated. By increasing the variety of aircraft in the fine point cloud dataset, the types of aircraft encompassed by the model can be further expanded. Although the generation process of the corresponding standard dataset and the training process of the improved cGAN do not undergo essential changes, an excessive number of aircraft types will augment the parameter quantity of the neural network, leading to increased computational complexity during the generation of the standard dataset and prolonged training time for the cGAN.
The range of values for the target’s position and attitude parameters is presented in Table 1. To render the standard data generated more representative, each parameter of the aircraft is uniformly distributed within the designated range. The standard data generator combines the random sample values of each parameter to generate the corresponding standard data samples.
Taking the number of aircraft types into account, the parameters in Table 1 can be combined into $(76 + 4) \times 5 \times 10 \times 10 \times 5 \times 5 = 10^6$ condition vectors. In the experiment, disregarding the influence of noise and the environment, each condition vector generates one aircraft depth image. Consequently, under the above experimental conditions, the number of standard data samples is $10^6$. The standard data are partitioned into standard training data and standard test data at a ratio of 4:1. The $8 \times 10^5$ standard training images are used in batches for training the improved cGAN. The $2 \times 10^5$ standard test images are used to recover the corresponding standard point clouds and condition vectors. Each recovered condition vector is input into APCG to generate the corresponding generated point cloud. The similarity between the generated point clouds and the standard point clouds is compared to assess the performance of the improved cGAN.
The improved cGAN in this paper was constructed and run on the Kaggle GPU P100 platform. Both the generator and discriminator of the improved cGAN were trained with the Adam optimizer, and the parameters were updated based on adaptive estimates of the first and second moments of the gradient. The specific training parameters are presented in Table 2.
The standard training data were fed to the improved cGAN in batches of 32 for training, and the discriminator loss $L_D$ and the generator loss $L_G$ were obtained by Equations (10) and (12), respectively. With the aim of minimizing $|L_D|$ and $L_G$, 1000 epochs of adversarial training were conducted on the improved cGAN. The variation in the absolute value of the discriminator loss $L_D$ and in the generator loss $L_G$ with the training epoch is shown in Figure 13. As can be seen from Figure 13, both $|L_D|$ and $L_G$ took relatively high values and fluctuated strongly in the initial stage of network training. As training proceeded, $|L_D|$ and $L_G$ gradually decreased and tended to stabilize, fluctuating near a relatively low value. This indicates that the improved cGAN continuously learned and adjusted in the adversarial training between the discriminator and the generator until reaching stability.
The training speed of the improved cGAN is assessed by the variation in the similarity between the standard point clouds and the generated point clouds. The main metrics for evaluating the similarity between two point clouds include the Fréchet Point Cloud Distance (FPD), Maximum Mean Discrepancy (MMD), and Jensen–Shannon Divergence (JSD) [19,20]. The concept of FPD stems from the Fréchet Inception Distance (FID) [26]: features of two sets of point clouds are extracted through a pre-trained network, and the distance between them is calculated to evaluate the quality of the point clouds. Unlike FID, there is currently no unanimously acknowledged reliable network for feature extraction in FPD. Different feature extraction networks may differ in their ability to capture the geometric information of point clouds, thereby exerting a significant influence on the FPD results and weakening the objectivity of the evaluation. Hence, in this paper, the average MMD and the average JSD are selected as the quantitative indicators for evaluating the similarity between the standard point cloud set and the generated point cloud set. To keep the focus of this section on the analysis of the experimental results, the definitions and calculation methods of the MMD and JSD averages are given in Appendix A.
The methods compared with the improved cGAN mainly comprise voxel-based point cloud generation methods [14], direct point cloud generation methods [17], and point cloud generation methods based on diffusion models [27]. Voxel-based point cloud generation methods, direct point cloud generation methods, and the improved cGAN method proposed in this paper fall into the category of point cloud generation methods based on GANs. Point cloud generation based on diffusion models has emerged as an important research direction in the field of 3D generation in recent years. Diffusion models generate data through a progressive denoising process. Owing to their high-quality generation capabilities and theoretical advantages, they have gradually become a popular method for point cloud generation. Point cloud generation methods based on diffusion models represent a novel type of method distinct from those based on GANs. In this paper, reference [27] is selected as a representative to be compared with the improved cGAN proposed herein.
Utilizing the standard training dataset of $8 \times 10^5$ samples generated earlier, the different methods were trained, and the variations in the MMD average and JSD average between the generated point cloud set and the standard point cloud set at each training stage were compared, as shown in Figure 14. As evident from Figure 14a, with increasing training time, the MMD average of all methods rapidly decreases and approaches stability. This implies that as training proceeds, the similarity between the generated point cloud dataset obtained by each method and the standard point cloud dataset increases. At the same time, the MMD average of the diffusion-based method is smaller than those of the voxel-based and direct point cloud generation methods, but higher than that of the improved cGAN. It can be observed that the point cloud dataset generated by the improved cGAN is more similar to the standard point cloud dataset, and the improved cGAN reaches its convergence value more rapidly. The JSD average presented in Figure 14b shows similar traits. Hence, the improved cGAN exhibits the fastest convergence rate and the best point cloud generation performance.
The occupancy of GPU resources by the four methods during the training process is shown in Figure 15. As observed from Figure 15, for each method during the training process, the occupancy of GPU resources is relatively stable, fluctuating around a certain memory usage amount. Overall, the GPU resource occupancy rate of the voxel-based point cloud generation method is the highest, and that of the diffusion-based method is marginally lower. The GPU resource occupancy rates of the direct point cloud generation method and the improved cGAN are notably lower than those of the other two methods. Among them, the improved cGAN proposed in this paper has the lowest GPU resource occupancy rate. A comprehensive analysis in conjunction with Figure 14 reveals that the improved cGAN possesses the fastest convergence speed and the smallest GPU resource occupancy rate.

3.2. Ablation Experiment on Improved cGAN

The improved cGAN proposed in this paper mainly makes three improvements: replacing the fully connected layers in the generator and the discriminator with transposed convolutional layers (TConv) and convolutional layers (Conv), respectively; introducing the self-attention layer (SA) in the generator and the discriminator; and using the Wasserstein distance with gradient penalty to represent the discriminator loss.
To analyze in depth the specific contributions of these improvements to the performance of the improved cGAN, we conducted ablation experiments. In the experiments, the effects of different module combinations of the cGAN, after 1000 epochs of adversarial training, on processing the $2 \times 10^5$ standard test images were considered. Through these experiments, the actual impact of each improvement on model performance, as well as its role and importance in the overall architecture, was evaluated.
In the ablation experiments, the Frechet Inception Distance (FID) was used to evaluate the performance of the improved cGAN. FID is used to measure the Frechet distance (also known as Wasserstein-2 distance) between two multivariate Gaussian distributions. By comparing the distribution differences between the generated images of the improved cGAN under different combinations and the test image dataset, FID can quantify the quality and diversity of the generated images. A lower FID value indicates that the generated images are not only visually close to the test images in quality but also similar to the test image distribution in diversity.
Feature vectors are extracted from the standard test images and the corresponding generated images by the pre-trained Inception v3 network, and the FID is calculated from the means and covariance matrices of the feature vectors, i.e.,
$\text{FID} = \| \mu_s - \mu_g \|^2 + \mathrm{Tr}\left( \Sigma_s + \Sigma_g - 2 (\Sigma_s \Sigma_g)^{1/2} \right)$ (18)
where $\mu_s$ and $\mu_g$ are the means of the standard test images and the generated images in the feature space, respectively, $\Sigma_s$ and $\Sigma_g$ are the covariance matrices of the two image sets in the feature space, respectively, and $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix.
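Equation (18) can be evaluated directly from the extracted feature matrices, for example as follows (a hypothetical sketch using SciPy's matrix square root; the Inception v3 feature extraction itself is not shown).

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feat_s, feat_g):
    """Compute the FID of Equation (18) from Inception-v3 feature vectors.
    feat_s, feat_g : (N, D) feature matrices of the standard test images and
    the generated images."""
    mu_s, mu_g = feat_s.mean(axis=0), feat_g.mean(axis=0)
    sigma_s = np.cov(feat_s, rowvar=False)
    sigma_g = np.cov(feat_g, rowvar=False)
    cov_sqrt = sqrtm(sigma_s @ sigma_g)
    if np.iscomplexobj(cov_sqrt):                   # discard numerical round-off
        cov_sqrt = cov_sqrt.real
    return float(np.sum((mu_s - mu_g) ** 2)
                 + np.trace(sigma_s + sigma_g - 2.0 * cov_sqrt))
```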
The results of the ablation experiments are shown in Table 3, where "✓" indicates that the corresponding module was used in the cGAN. As can be seen from Table 3, the cGAN without any improvement had the highest FID, 52.6. The addition of each improvement module reduced the FID to varying degrees. Considered individually, the convolutional module (TConv/Conv) contributed the most and the self-attention module (SA) the least. When two modules were combined, the improvement in cGAN performance was further enhanced; among the pairs, the combination of TConv/Conv + Wasserstein loss had the best effect. When all three improvement modules were applied to the network, the FID of the cGAN was the lowest, reaching 40.6.
The results of the ablation experiments show that each improvement module added in the cGAN in this paper effectively improves the network performance, making the images generated by this network closer to the test images in both quality and diversity.

3.3. Performance of APCG

In this section, the performance of APCG is evaluated through a series of experiments. First, the generation of aircraft point clouds by APCG under varying conditions is examined. Four types of aircraft targets from the ModelNet40 dataset, numbered 0146, 0123, 0155, and 0265, along with one type of drone, are selected as the experimental subjects. All experimental aircraft are assumed to share the same attitude parameters, namely a pitch angle θ_p = 2.3°, a yaw angle φ_p = 13.2°, and a roll angle γ_p = 0.13°. The elevation angles of these targets in the observation coordinate system are also identical, all being β_0 = 3.4°. The slant ranges of the aircraft are investigated in three scenarios, 4.5 km, 9 km, and 15 km, while those of the drone are explored in three cases: 150 m, 210 m, and 550 m. The target point clouds generated by APCG are presented in Figure 16.
Observing the point cloud distribution of each aircraft in Figure 16, it can be seen that the points are mainly concentrated on the side facing the Lidar and are distributed markedly non-uniformly, which is consistent with the distribution characteristics of point clouds collected by an actual Lidar. Comparing the aircraft point clouds at different slant ranges, although the number of data points gradually decreases as the distance increases, the point clouds generated by APCG still preserve the main shape of the aircraft well. Intuitively, therefore, the point clouds generated by APCG not only approach the actual acquisition characteristics of the Lidar but also meet the stability requirements for characterizing aircraft features at different slant ranges.
The average MMD ($\overline{\mathrm{MMD}}$) and average JSD ($\overline{\mathrm{JSD}}$) are used below to quantitatively analyze the quality of the point clouds generated by APCG. The same voxel-based point cloud generation method, direct point cloud generation method, and diffusion-based method used in the preceding experiments are again adopted for comparison. In the experiment, the variations in $\overline{\mathrm{MMD}}$ and $\overline{\mathrm{JSD}}$ between the point cloud sets generated by the different methods and the benchmark point cloud set are compared as functions of the target slant range. Because the slant range intervals of interest differ for aircraft and drones, the standard test data of each target type are analyzed separately. The parameter settings for each target type are identical to those in Section 3.1, where the number of standard point cloud samples is 76 × 5 × 10 × 5 × 5 = 9.5 × 10^4 for aircraft and 4 × 5 × 10 × 5 × 5 = 5 × 10^3 for drones. The MMD and JSD between the generated point clouds and the standard point clouds at the same slant range are calculated, and the corresponding averages are obtained. The variations in $\overline{\mathrm{MMD}}$ and $\overline{\mathrm{JSD}}$ with the target slant range for the different point cloud generation methods are shown in Figure 17.
For the proposed APCG method, Figure 17 shows two curves: APCG (w/ Norm.), which includes the Norm. module, and APCG (w/o Norm.), which omits it. Contrasting the parameter variations of these two configurations allows the necessity of the Norm. module to be assessed. Figure 17a,b show the variations in the quality metrics of the generated aircraft point clouds with the target slant range, while Figure 17c,d show the corresponding variations for the generated drone point clouds. All subfigures in Figure 17 reveal that, for both aircraft and drones and for both $\overline{\mathrm{MMD}}$ and $\overline{\mathrm{JSD}}$, the curves of APCG (w/ Norm.) and APCG (w/o Norm.) are superior to those of the other typical methods, and APCG (w/ Norm.) outperforms APCG (w/o Norm.). Hence, the proposed APCG method surpasses the other classic point cloud generation methods, and the Norm. module enables APCG to achieve superior performance.
Subsequently, the time consumed by APCG in generating different quantities of aircraft point clouds is used to analyze its generation rate. Under the prescribed sample size, the time consumption of APCG is examined for four scenarios: point cloud sets containing two, four, and eight target types with identical sample counts per type, and eight target types with random sample counts, as shown in Figure 18. As is evident from Figure 18, the time consumption of APCG is nearly identical in all four scenarios, and generating 10^6 samples takes less than 6 s. Thus, the time consumption of APCG is unrelated to the number of target types or the non-uniformity of the samples and depends only on the number of samples, and the sample generation rate of APCG is extremely high.
APCG can rapidly generate point clouds of aircraft under the SPC-Lidar system. This is because the trained aircraft depth image generator only needs to perform multiplication and addition operations on the input vector under fixed parameters. The core task of APCG in generating point clouds is to provide training data for low-altitude environment perception systems. The generated aircraft point cloud data can be combined with radar or optical imaging data and used as training data through multi-sensor fusion to further improve the performance of low-altitude environment perception systems.
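The following minimal sketch illustrates why generation reduces to a single forward pass: a batch of noise vectors and condition vectors is pushed once through the trained depth image generator. The function and variable names (generator, conditions, latent_dim) are hypothetical and only indicate the general procedure, not the paper's code.

```python
import torch

@torch.no_grad()
def generate_depth_images(generator, conditions, latent_dim=100, device="cuda"):
    """One forward pass: noise + condition vectors -> a batch of aircraft depth images.

    `generator` is a trained (hypothetical) depth-image generator; `conditions` is a
    tensor encoding aircraft type, position, and attitude with shape (N, C).
    """
    generator.eval()
    n = conditions.size(0)
    z = torch.randn(n, latent_dim, device=device)   # noise vectors
    return generator(z, conditions.to(device))      # (N, 1, H, W) depth images
```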
Finally, we carried out experiments to analyze the error of the aircraft point clouds produced by APCG. As stated in Section 2.1, when the Im2PC module of APCG transforms an aircraft depth image into an aircraft point cloud, it determines the azimuth and elevation angles of the data points from the pixel positions of the image, as expressed in Equations (3) and (4). Since the pixel coordinates are integers, the data points calculated by Equations (3) and (4) lie at the centers of the corresponding resolution cells. In practice, however, a data point may be located anywhere within its resolution cell, which inevitably causes an error between the point cloud generated by APCG and the actual point cloud. To measure the influence of this error, uniformly distributed noise confined to the resolution cell was added to the point cloud generated by APCG, and this noised point cloud was used to simulate the actual point cloud. Using the aircraft data from the standard test dataset obtained in Section 3.1, the corresponding point clouds were generated; the MMD and JSD between each generated point cloud and the corresponding noised point cloud were calculated and recorded as $\overline{\mathrm{MMD}}$ (Noised) and $\overline{\mathrm{JSD}}$ (Noised). Meanwhile, to highlight the changes in $\overline{\mathrm{MMD}}$ and $\overline{\mathrm{JSD}}$ before and after adding noise, the MMD and JSD between each generated point cloud and the corresponding standard point cloud were also calculated and recorded as $\overline{\mathrm{MMD}}$ (Standard) and $\overline{\mathrm{JSD}}$ (Standard). The resulting curves of $\overline{\mathrm{MMD}}$ and $\overline{\mathrm{JSD}}$ as functions of the target slant range are shown in Figure 19.
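A minimal sketch of this noising step is given below: each generated point is jittered uniformly within an assumed angular/range resolution cell. The spherical-coordinate convention and the resolution parameters d_az, d_el, and d_r are illustrative assumptions, not the paper's exact geometry or values.

```python
import numpy as np

def add_resolution_cell_noise(points, d_az, d_el, d_r, rng=None):
    """Uniformly jitter each point inside its (assumed) resolution cell.

    points : (N, 3) Cartesian coordinates in the observation frame.
    d_az, d_el : angular resolution (rad) in azimuth and elevation (assumed values).
    d_r : range resolution (m) (assumed value).
    """
    rng = np.random.default_rng() if rng is None else rng
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    az = np.arctan2(y, x)
    el = np.arcsin(z / r)                      # elevation measured from the horizontal plane
    # Uniform offsets within +/- half a resolution cell in each dimension
    az += rng.uniform(-d_az / 2, d_az / 2, size=az.shape)
    el += rng.uniform(-d_el / 2, d_el / 2, size=el.shape)
    r += rng.uniform(-d_r / 2, d_r / 2, size=r.shape)
    return np.stack([r * np.cos(el) * np.cos(az),
                     r * np.cos(el) * np.sin(az),
                     r * np.sin(el)], axis=1)
```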
As shown in Figure 19, both $\overline{\mathrm{MMD}}$ (Noised) and $\overline{\mathrm{JSD}}$ (Noised) increase gradually with the target slant range, with $\overline{\mathrm{MMD}}$ (Noised) rising significantly faster. As the slant range grows, the physical size of the resolution cell also grows, which widens the range of the noise component added to each data point and thus enlarges the difference between the point cloud generated by APCG and the noised point cloud. Even at large slant ranges, however, $\overline{\mathrm{MMD}}$ (Noised) and $\overline{\mathrm{JSD}}$ (Noised) remain at a relatively low level. Comparison with the $\overline{\mathrm{MMD}}$ (Standard) and $\overline{\mathrm{JSD}}$ (Standard) curves shows that the added noise does not significantly alter how $\overline{\mathrm{MMD}}$ and $\overline{\mathrm{JSD}}$ vary with the slant range. Therefore, although a certain error exists between the point cloud generated by APCG and the actual point cloud, this error does not compromise the usefulness of the generated point clouds in practical environments.

4. Discussion

The core of the APCG method proposed in this paper is the aircraft depth image generator, which is obtained through adversarial training of an improved cGAN. APCG can effectively address the issue of point cloud generation for aircraft under the SPC-Lidar system. The aircraft point cloud dataset generated by APCG is controllable and diverse and can be used directly, or after simple noise contamination, as training data for low-altitude environment perception systems. The main contributions of this paper are as follows:
1.
The improved cGAN possesses superior generalization ability. Convolutional layers, transposed convolutional layers, and self-attention layers are introduced into the basic framework of the cGAN, and the loss function based on Wasserstein distance is adopted to replace the traditional loss function based on binary cross-entropy. The results of the ablation experiments indicate that these improvements all facilitate the enhancement of the quality and diversity of the images generated by the cGAN.
2.
APCG possesses the capability of generating aircraft point clouds efficiently based on specific conditional information. APCG constructs an aircraft point cloud generation model with the aircraft depth image generator as the core. The aircraft depth image generator is the generator in the successfully trained improved cGAN, and the training data of this improved cGAN are the aircraft depth image data converted from the characteristics of the aircraft point clouds collected by the SPC-Lidar system. Hence, APCG can rapidly generate aircraft point clouds that conform to the collection characteristics of the SPC-Lidar system by utilizing the type, position, and attitude information of the aircraft.
Although APCG, whose generator is trained by an improved cGAN on standard aircraft depth images, can quickly produce diverse point cloud data, its input is limited to a noise vector and conditional information. By contrast, the NeRF method trains deep models such as convolutional neural networks directly on images and can generate high-fidelity application scenarios [20]. However, the quality of the point clouds generated by NeRF is limited by the quality and viewpoints of the input images, and its computational complexity is comparatively high.
The diversity of the aircraft point clouds generated by APCG depends on its core depth image generator. Because the improved cGAN is trained under a specific set of SPC-Lidar system parameters, the generator does not adapt to other parameter configurations of the SPC-Lidar system. Improving the adaptability of APCG to different SPC-Lidar parameters is a topic for future research.
The application of SPC-Lidar in low-altitude environmental perception is still in its early stages, lacking sufficient experimental data. This paper verifies the effectiveness of APCG by comparing its generated aircraft point clouds with simulated standard point clouds. While this comparison reflects the performance of APCG, the idealized experimental conditions limit the credibility of the results. As research advances and experimental conditions improve, real low-altitude aircraft target point clouds from SPC-Lidar will become available, enhancing the credibility of APCG evaluations. Therefore, developing an experimental system for low-altitude environmental perception based on SPC-Lidar and using real point cloud data to assess APCG are key future research directions.

5. Conclusions

To address the problem of the scarcity of aircraft point cloud training data for low-altitude environment perception systems based on SPC-Lidar, a point cloud generation method for aircraft with SPC-Lidar, named APCG, is proposed in this paper. APCG adopts the generator in the improved cGAN as the core and constructs a framework that can directly generate point cloud data of specific aircraft types, positions, and attitudes. The improved cGAN is trained using aircraft depth images that reflect the spatial characteristics of aircraft point clouds. These aircraft depth images are generated from the fine point clouds of 80 types of aircraft or drones at different positions and attitudes. Hence, the trained improved cGAN possesses strong generalization ability and enables APCG to generate diverse aircraft point cloud data based on conditions. Experimental results demonstrate that the improved cGAN employed in this paper has a faster convergence rate and superior generalization ability. The APCG constructed using the generator of the improved cGAN can rapidly output diverse aircraft point cloud data. Statistical analysis reveals that the average values of MMD and JSD between the point cloud data generated by APCG and the corresponding standard point clouds are relatively low, indicating that APCG can generate aircraft point clouds that conform to the acquisition characteristics of SPC-Lidar.

Author Contributions

Conceptualization, Z.S.; methodology, Z.S.; software, S.L. and J.H.; validation, Z.S. and J.H.; formal analysis, Z.S. and S.L.; investigation, S.L. and J.H.; resources, Z.S.; data curation, S.L. and B.H.; writing—original draft preparation, Z.S. and S.L.; writing—review and editing, Z.S. and S.L.; supervision, Z.S.; project administration, Z.S.; funding acquisition, Z.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Tianjin Municipal Education Commission under Grant No. 2022KJ059.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
APCG: Aircraft Point Cloud Generation
SPC: Single Photon Counting
cGAN: Conditional Generative Adversarial Network
GAN: Generative Adversarial Network
FPD: Frechet Point Cloud Distance
MMD: Maximum Mean Discrepancy
JSD: Jensen–Shannon Divergence
KL: Kullback–Leibler
FID: Frechet Inception Distance
GPU: Graphics Processing Unit
SA: Self-Attention
FC: Fully Connected

Appendix A. Average Values of MMD and JSD

Appendix A.1. Average Value of MMD

MMD is a statistical measure of the distance between the distributions of two point clouds: the smaller the MMD value, the more similar the two distributions, so MMD can be used to measure the quality of generated point clouds. Suppose two point clouds with $n$ and $m$ data points are represented as $A = \{a_i\}_{i=1}^{n}$ and $B = \{b_j\}_{j=1}^{m}$, where $a_i$ and $b_j$ are the position vectors of the data points in the corresponding point clouds, composed of the rectangular coordinates of the data points in the observation coordinate system. The MMD between the two point clouds is defined as
$$\mathrm{MMD} = \frac{1}{n^2}\sum_{i,j} k(a_i, a_j) + \frac{1}{m^2}\sum_{i,j} k(b_i, b_j) - \frac{2}{nm}\sum_{i,j} k(a_i, b_j) \tag{A1}$$
where $k(a_i, b_j)$ is the Gaussian kernel function, with the specific expression
$$k(a_i, b_j) = \exp\!\left( -\frac{1}{2\sigma^2} \left\| a_i - b_j \right\|^2 \right) \tag{A2}$$
in which $\sigma$ is the kernel bandwidth of the Gaussian kernel, usually taken as the median distance between data points from the two different point clouds, i.e.,
$$\sigma = \operatorname*{Me}_{i,j} \left\| a_i - b_j \right\| \tag{A3}$$
To evaluate the similarity of two point cloud sets $\{A_i\}$ and $\{B_i\}$, the $\mathrm{MMD}_i$ between corresponding point clouds in the two sets is first calculated using Equation (A1), and the average of all $\mathrm{MMD}_i$ is then computed, i.e.,
$$\overline{\mathrm{MMD}} = \frac{1}{N}\sum_i \mathrm{MMD}_i \tag{A4}$$
where $N$ is the number of point clouds in each set. The similarity between two point cloud sets is measured by $\overline{\mathrm{MMD}}$.
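For reference, a minimal NumPy/SciPy sketch of Equations (A1)–(A4) is given below; the function name and array shapes are illustrative choices rather than the paper's implementation.

```python
import numpy as np
from scipy.spatial.distance import cdist

def mmd_gaussian(A, B):
    """MMD between point clouds A (n, 3) and B (m, 3) with a Gaussian kernel whose
    bandwidth is the median inter-cloud distance, as in Equation (A3)."""
    d_ab = cdist(A, B)                       # pairwise distances between the two clouds
    sigma = np.median(d_ab)
    k = lambda D: np.exp(-D**2 / (2.0 * sigma**2))
    k_aa = k(cdist(A, A))
    k_bb = k(cdist(B, B))
    k_ab = k(d_ab)
    # Means implement the 1/n^2, 1/m^2, and 2/(nm) normalizations of Equation (A1)
    return k_aa.mean() + k_bb.mean() - 2.0 * k_ab.mean()

# Average over a set of point-cloud pairs (Equation (A4)):
# mmd_bar = np.mean([mmd_gaussian(A_i, B_i) for A_i, B_i in zip(set_A, set_B)])
```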

Appendix A.2. Average Value of JSD

JSD, as an indicator for measuring the similarity between two probability distributions, can also be used to assess the quality of generated point clouds. JSD is a symmetrized version based on Kullback–Leibler (KL) divergence, i.e.,
$$\mathrm{JSD}(P \,\|\, Q) = \frac{1}{2}\mathrm{KL}(P \,\|\, S) + \frac{1}{2}\mathrm{KL}(Q \,\|\, S) \tag{A5}$$
where $P$ and $Q$ denote the two probability distributions under comparison, $S = (P + Q)/2$ is their average distribution, and $\mathrm{KL}(\cdot)$ computes the KL divergence between two probability distributions, specifically
$$\mathrm{KL}(P \,\|\, Q) = \sum_i P(i) \log_2 \frac{P(i)}{Q(i)} \tag{A6}$$
where $P(i)$ and $Q(i)$ denote the probability values of the distributions $P$ and $Q$ at $i$.
JSD is non-negative, symmetric, and bounded. When JSD is adopted to assess the quality of generated point clouds, the closer JSD is to 0, the higher the similarity between the generated point cloud and the standard point cloud, that is, the better the quality of the generated point cloud; the farther JSD is from 0, the greater the disparity between the two, which implies poorer quality.
The JSD between two point clouds cannot be computed directly from the position vectors of the data points; the spatial distributions of the point clouds must first be converted into probability distributions. Based on the data point distributions of the two point clouds, a cuboid is chosen that encloses all data points of both point clouds, with each edge parallel to the corresponding coordinate axis of the observation coordinate system. The cuboid is uniformly divided into $c$ intervals along each axis, yielding $c^3$ identical small cuboids, which are then numbered. For the $i$-th small cuboid, the probabilities of the two point clouds are, respectively,
$$P_A(i) = \frac{n_i}{n} \tag{A7}$$
and
$$P_B(i) = \frac{m_i}{m} \tag{A8}$$
where $n_i$ and $m_i$ denote the numbers of data points of point clouds $A$ and $B$ contained in the $i$-th small cuboid. The probability values of all small cuboids form the probability distributions $P_A = \{P_A(i)\}_{i=1}^{c^3}$ and $P_B = \{P_B(i)\}_{i=1}^{c^3}$ of the two point clouds. Substituting these distributions into Equation (A5) yields the quality assessment indicator JSD for the two point clouds.
Similar to MMD, the average JSD is introduced to evaluate the similarity between two point cloud sets, i.e.,
$$\overline{\mathrm{JSD}} = \frac{1}{N}\sum_i \mathrm{JSD}_i \tag{A9}$$
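A minimal NumPy sketch of Equations (A5)–(A9) for a pair of point clouds is given below; the voxel count c and the helper names are illustrative assumptions, not values prescribed by the paper.

```python
import numpy as np

def jsd_point_clouds(A, B, c=32, eps=1e-12):
    """JSD between point clouds A and B after voxelising a shared bounding box
    into c**3 cells (Equations (A5)-(A8)); log base 2, so the result lies in [0, 1]."""
    pts = np.vstack([A, B])
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    edges = [np.linspace(lo[k], hi[k], c + 1) for k in range(3)]
    P, _ = np.histogramdd(A, bins=edges)
    Q, _ = np.histogramdd(B, bins=edges)
    P = P.ravel() / P.sum()
    Q = Q.ravel() / Q.sum()
    S = 0.5 * (P + Q)

    def kl(p, q):
        mask = p > 0                          # convention: 0 * log(0/q) = 0
        return np.sum(p[mask] * np.log2(p[mask] / (q[mask] + eps)))

    return 0.5 * kl(P, S) + 0.5 * kl(Q, S)

# Average over a set of point-cloud pairs (Equation (A9)):
# jsd_bar = np.mean([jsd_point_clouds(A_i, B_i) for A_i, B_i in zip(set_A, set_B)])
```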

References

  1. Chu, T.; Wang, L. Research on Flight Safety Assessment Method of UAV in Low Altitude Airspace. In Proceedings of the International Conference on Autonomous Unmanned Systems, Singapore, 23 September 2022; pp. 1299–1307. [Google Scholar]
  2. Rapp, J.; Tachella, J.; Altmann, Y.; McLaughlin, S.; Goyal, V.K. Advances in single-photon lidar for autonomous vehicles: Working principles, challenges, and recent advances. IEEE Signal Process. Mag. 2020, 37, 62–71. [Google Scholar] [CrossRef]
  3. Popescu, S.C.; Zhou, T.; Nelson, R.; Neuenschwander, A.; Sheridan, R.; Narine, L.; Walsh, K.M. Photon counting LiDAR: An adaptive ground and canopy height retrieval algorithm for ICESat-2 data. Remote Sens. Environ. 2018, 208, 154–170. [Google Scholar] [CrossRef]
  4. Ma, Y.; Zhang, W.; Li, S.; Cui, T.; Li, G.; Yang, F. A new wind speed retrieval method for an ocean surface using the waveform width of a laser altimeter. Can. J. Remote Sens. 2017, 43, 309–317. [Google Scholar] [CrossRef]
  5. Wang, Z.; Menenti, M. Challenges and opportunities in Lidar remote sensing. Front. Remote Sens. 2021, 2, 641723. [Google Scholar] [CrossRef]
  6. Wald, I.; Woop, S.; Benthin, C.; Johnson, G.S.; Ernst, M. Embree: A kernel framework for efficient CPU ray tracing. ACM Trans. Graph. 2014, 33, 143. [Google Scholar] [CrossRef]
  7. Su, Z.; Sang, L.; Hao, J.; Han, B.; Wang, Y.; Ge, P. Research on Ground Object Echo Simulation of Avian Lidar. Photonics 2024, 11, 153. [Google Scholar] [CrossRef]
  8. Laine, S.; Karras, T. Efficient Sparse Voxel Octrees—Analysis, Extensions, and Implementation; NVIDIA Corporation: Santa Clara, CA, USA, 2010; Volume 2. [Google Scholar]
  9. Choi, S.; Zhou, Q.Y.; Koltun, V. Robust reconstruction of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5556–5565. [Google Scholar]
  10. Macklin, M.; Müller, M.; Chentanez, N.; Kim, T.Y. Unified particle physics for real-time applications. ACM Trans. Graph. 2014, 33, 153. [Google Scholar] [CrossRef]
  11. Koschier, D.; Bender, J.; Solenthaler, B.; Teschner, M. A survey on SPH methods in computer graphics. Comput. Graph. Forum. 2022, 41, 737–760. [Google Scholar] [CrossRef]
  12. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680. [Google Scholar]
  13. Zhou, Y.; Tuzel, O. VoxelNet: End-to-end learning for point cloud-based 3D object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4490–4499. [Google Scholar]
  14. Wang, X.; Ang, M.H.; Lee, G.H. Voxel-based network for shape completion by leveraging edge generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 13189–13198. [Google Scholar]
  15. Achlioptas, P.; Diamanti, O.; Mitliagkas, I.; Guibas, L. Learning representations and generative models for 3D point clouds. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 40–49. [Google Scholar]
  16. Shu, D.W.; Park, S.W.; Kwon, J. 3D point cloud generative adversarial network based on tree-structured graph convolutions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3859–3868. [Google Scholar]
  17. Fan, H.; Su, H.; Guibas, L.J. A point set generation network for 3D object reconstruction from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 605–613. [Google Scholar]
  18. Mandikal, P.; Navaneet, K.L.; Agarwal, M.; Babu, R.V. 3D-LMNet: Latent embedding matching for accurate and diverse 3D point cloud reconstruction from a single image. In Proceedings of the British Machine Vision, Newcastle upon Tyne, UK, 3–6 September 2018; pp. 3–6. [Google Scholar]
  19. Zyrianov, V.; Zhu, X.; Wang, S. Learning to generate realistic lidar point clouds. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 17–35. [Google Scholar]
  20. Zhang, J.; Zhang, F.; Kuang, S.; Zhang, L. Nerf-lidar: Generating realistic lidar point clouds with neural radiance fields. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 7–14 February 2024; pp. 7178–7186. [Google Scholar]
  21. Mirza, M. Conditional generative adversarial nets. arXiv 2014, arXiv:1411.1784. [Google Scholar]
  22. Radford, A. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv 2015, arXiv:1511.06434. [Google Scholar]
  23. Zhang, H.; Goodfellow, I.; Metaxas, D.; Odena, A. Self-attention generative adversarial networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 7354–7363. [Google Scholar]
  24. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. Adv. Neural Inf. Process. Syst. 2017, 30, 5769–5779. [Google Scholar]
  25. Wu, Z.; Song, S.; Khosla, A.; Yu, F.; Zhang, L.; Tang, X.; Xiao, J. 3D ShapeNets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1912–1920. [Google Scholar]
  26. Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Adv. Neural Inf. Process. Syst. 2017, 30, 6629–6640. [Google Scholar]
  27. Luo, S.; Hu, W. Diffusion probabilistic models for 3d point cloud generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 2837–2845. [Google Scholar]
Figure 1. Relationship among several methods.
Figure 2. Schematic diagram of APCG.
Figure 3. Establishment of the observation coordinate system and auxiliary coordinate system.
Figure 4. Improved cGAN based on depth image.
Figure 5. Structure of generator.
Figure 6. Structure of discriminator.
Figure 7. Top views of the point clouds of several types of aircraft.
Figure 8. Point cloud of aircraft 0146 after spatial sampling: (a) Three-dimensional form of the point cloud. (b) Projection of point cloud on horizontal plane.
Figure 9. Three-dimensional point clouds of aircraft 0146 after spatial sampling at different slant ranges: (a) $r_0$ = 3 km. (b) $r_0$ = 6 km. (c) $r_0$ = 10.5 km. (d) $r_0$ = 16.5 km.
Figure 10. Depth images of aircraft 0146 after spatial sampling at different slant ranges: (a) $r_0$ = 3 km. (b) $r_0$ = 6 km. (c) $r_0$ = 10.5 km. (d) $r_0$ = 16.5 km.
Figure 11. Results after normalization of the target pixel occupancy rate: (a) $r_0$ = 3 km. (b) $r_0$ = 6 km. (c) $r_0$ = 10.5 km. (d) $r_0$ = 16.5 km.
Figure 12. Schematic diagram of standard data generator.
Figure 13. Loss variations with training epoch: (a) $|L_D|$. (b) $L_G$.
Figure 14. Quality variations in generated point clouds of each model with training time: (a) $\overline{\mathrm{MMD}}$. (b) $\overline{\mathrm{JSD}}$.
Figure 15. GPU resource occupancy rates of different point cloud generation methods.
Figure 16. Aircraft point clouds generated by APCG at different slant ranges.
Figure 17. Quality variations in generated point clouds of each model with the slant range: (a) $\overline{\mathrm{MMD}}$ of aircraft. (b) $\overline{\mathrm{JSD}}$ of aircraft. (c) $\overline{\mathrm{MMD}}$ of drones. (d) $\overline{\mathrm{JSD}}$ of drones.
Figure 18. Time consumption for APCG generating aircraft point clouds.
Figure 19. The variation in error with slant distance.
Table 1. Configuration of aircraft position and attitude parameters.
Parameter | Number | Range
$\beta_0$ | 5 | [3°, 4°]
$r_0$ | 10 | [3 km, 18 km] for aircraft; [100 m, 600 m] for drone
$\varphi_P$ | 10 | [−60°, 60°]
$\theta_P$ | 5 | [1°, 5°]
$\gamma_P$ | 5 | [−5°, 5°]
Table 2. Configuration of training parameters.
Parameter | Value
Decay rate for first moment estimate | 0.5
Decay rate for second moment estimate | 0.9
Learning rate of generator | 2 × 10^{-4}
Learning rate of discriminator | 2 × 10^{-4}
Total number of training epochs | 1000
Batch size | 32
Table 3. FID of cGAN in different module combinations (“✓” indicates the module was used).
TConv/Conv | SA | Wasserstein Loss | FID
  |   |   | 52.6
✓ |   |   | 45.6
  | ✓ |   | 48.4
  |   | ✓ | 46.7
✓ | ✓ |   | 44.0
✓ |   | ✓ | 41.9
  | ✓ | ✓ | 44.8
✓ | ✓ | ✓ | 40.6