L-DIG: A GAN-Based Method for LiDAR Point Cloud Processing under Snow Driving Conditions

LiDAR point clouds are significantly impacted by snow in driving scenarios, which introduces scattered noise points and phantom objects, thereby compromising the perception capabilities of autonomous driving systems. Current effective methods for removing snow from point clouds largely rely on outlier filters, which mechanically eliminate isolated points. This research proposes a novel translation model for LiDAR point clouds, 'L-DIG' (LiDAR depth images GAN), built upon refined generative adversarial networks (GANs). This model can not only reduce snow noise in point clouds but also artificially synthesize snow points onto clear data. The model is trained on depth image representations of point clouds derived from unpaired datasets, complemented by customized loss functions for depth images that ensure scale and structural consistency. To amplify the efficacy of snow capture, particularly in the region surrounding the ego vehicle, we have developed a pixel-attention discriminator that operates without downsampling convolutional layers. Concurrently, a second discriminator, equipped with two-step downsampling convolutional layers, has been engineered to effectively handle snow clusters. This dual-discriminator approach ensures robust and comprehensive performance across diverse snow conditions. The proposed model displays a superior ability to capture snow and object features within LiDAR point clouds. A 3D clustering algorithm is employed to adaptively evaluate different levels of snow conditions, including scattered snowfall and snow swirls. Experimental findings demonstrate an evident de-snowing effect and the ability to synthesize snow effects.


Introduction
The presence of adverse weather conditions corrupts the signal quality of autonomous driving system (ADS) sensors and causes perception failures [1]. During the past decade, with the expansion of weather-related datasets [2,3] and the development of sophisticated machine learning techniques, models designed to address adverse weather problems, such as precipitation, in autonomous driving have been widely studied [4]. Progress has been made in weather models for both LiDAR and images, with weather conditions commonly treated as uniformly or normally distributed noise that can be represented by linear or monotone functions [5,6]. Snow is among the conditions that pose tangible threats to the sensors but have received little in-depth research. What makes it even more special is the unexpected, irregular swirl phenomenon brought about by wind or by the motion of the ego vehicle itself or of passing-by vehicles [7], causing not only randomly distributed scattered noise (salt-and-pepper) but also swirl clusters in LiDAR point clouds, as shown in the disordered clusters near the center in the red boxes of Figure 1a. Research on the snow problem in point clouds has focused on k-d-tree-based neighbor-searching outlier filters [8] in recent years, and their de-noising performance has almost reached saturation [9]. Nonetheless, few attempts have been made to apply deep-learning-based models to snow conditions. Unlike filters with limited explainability, learning-based models have the potential to grasp both the surface and hidden features of snow clusters in a given driving scene and to perform snow synthesis on top of snow de-noising.
The development of robust weather models benefits from training on paired data, i.e., a pair of weather-corrupted data and clear data with the rest of the elements identical, commonly obtained by artificially synthesizing realistic weather effects in previously clear driving scene images [10][11][12]. Such an approach has proven highly effective for rain [13,14], fog [15,16], and snow [17] conditions in camera images, as well as for contaminations on the camera lens [18]. However, due to the relatively low data density, realizing weather effects in point clouds still largely depends on collections in weather chambers [19,20] until weather data augmentation in point clouds matures. Although models have been successfully built for point clouds under rain and fog with additional road data [21], the low domain similarity between chambers and real roads still largely limits generality. Moreover, common experimental facilities with controllable precipitation rates can hardly simulate complicated weather conditions such as dynamic snowfall [4]. Therefore, it is necessary to develop a way to work with few paired data or with unpaired data. In terms of disentangled data processing, CycleGAN [22] demonstrates a high ability in style conversion and object generation [23] based on datasets with different backgrounds and from different domains, and its implementation in weather models has been proven feasible [17,24]. In this research, we propose 'L-DIG' (LiDAR depth images GAN), a GAN-based method using depth image priors for LiDAR point cloud processing under various snow conditions. The original data format of the LiDAR point cloud in the spatial dimension does not align with the plane dimension of camera images, whereas a depth image can store the third dimension of a point's spatial coordinates in its pixel value, hence serving as a 2D representation of the LiDAR point cloud and providing
the opportunity to employ GAN-based methods [25]. Our unpaired training datasets are derived from the Canadian Adverse Driving Conditions (CADC) dataset [2], which is known for its authentic winter driving scenarios and snow diversity. The proposed model aims to perform snow de-noising with a deep-level understanding of the snow features related to the driving scene, and inversely to generate fake snow conditions in clear point clouds, exploring the possibility of creating paired synthetic datasets for point clouds with snow conditions. Furthermore, the quantitative evaluation of LiDAR point cloud processing has always been a tricky task. Researchers used to select a certain number of samples, 100 frames for example, and manually determine whether each point is a snow point, in order to calculate the precision and recall of snow noise removal [8,26]. Even though straightforward, this has two downsides. First, manually annotating a whole dataset consumes an enormous amount of time and manpower, while a small portion of samples carries the risk of bias. Second, the accuracy of human annotation on point clouds with over 20,000 points per frame can be as low as 85%, which is not reliable enough to support the subsequent calculations of precision and recall [27]. Given that a LiDAR point cloud fundamentally represents a distribution of points within a three-dimensional space, 3D point clustering algorithms can be effectively used to reflect the physical properties and assess the performance of point cloud processing specifically on snow points [28].
Among various algorithms, the ordering points to identify the clustering structure (OPTICS) algorithm [29] emerges as a preferred choice due to its exceptional capacity for handling clusters of varying densities [30]. Derived from the DBSCAN (density-based spatial clustering of applications with noise) methodology, OPTICS identifies distinct cluster groups by densely associating data points within a predetermined radius [31]. This principle aligns closely with adaptive filters, which are predicated on neighbor searching and have been widely employed in LiDAR point cloud de-noising over recent years.
An example of the OPTICS clustering result is shown in Figure 1b. It can be seen that despite the density variation, everything has been classified into separate groups of clusters, each distinguished by unique colors, while the environmental structures are also well segmented. The conglomeration of snow swirl points, positioned at the lower left of the center, is collectively assigned to large blue and purple clusters. Minor snow clusters, such as those in the immediate right vicinity of the center, along with individual scattered snow points spread across the scene, are categorized into smaller, uniquely colored clusters.
The main contributions of this research are described as follows:

1. We build a deep-learning-based LiDAR point cloud translation model with unpaired data and depth image priors. A new discriminator structure to better remove snow noise and new loss functions, including depth and SSIM (structural similarity index measure) losses to maintain driving scene integrity, have been designed in the proposed model.

2. The proposed model is able to perform LiDAR point cloud translations between snow conditions and clear conditions in driving scenes. The model demonstrates a certain level of understanding of snow features and performs both effective snow de-noising and artificial snow point generation, which could help create paired datasets for training or simulation in autonomous driving applications.

3. We employ the OPTICS clustering algorithm as the quantitative evaluation metric of the snow conditions in LiDAR point clouds. We set adaptive parameters to cluster different forms and levels of snow masses, calculate multiple indexes, and present distribution changes in violin plots to reflect the snow conditions of the whole dataset in a comprehensive way.

Related Works
Our proposed model is a GAN-based deep-learning model for LiDAR point cloud processing, including de-noising. Accordingly, the following sections review related work on adaptive noise removal filters, deep-learning-based LiDAR point cloud processing methods, and GAN-based LiDAR translation methods.

Adaptive Noise Removal Filters in LiDAR Point Cloud Processing
Noise point removal, or de-noising, is one of the most fundamental preliminary tasks of LiDAR point cloud processing to enhance data quality, and is commonly realized by radius outlier removal (ROR) filters. In 3D space, ROR filters methodically analyze each individual point, assessing the neighboring points located within a predefined radius. If the count of these neighboring points is below a certain threshold (k_min), the filter deems the point noise and eliminates it. This approach is notably effective in removing noise generated by small, separate solid entities such as snowflakes. However, it carries a potential drawback, as it might also eliminate distant points in the environment, thus compromising the data integrity of the original point clouds. One of the main reasons for this limitation lies in the mismatch between the fixed predefined search radius and the varying point density, which tends to decrease towards the edges of the point clouds. As a result, there is a compelling demand for an adaptive search radius to rectify this discrepancy.
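The fixed-radius ROR principle described above can be sketched as follows. The radius, k_min value, and choice of k-d-tree library are illustrative assumptions, not the implementation used in the cited works:

```python
import numpy as np
from scipy.spatial import cKDTree

def ror_filter(points, radius=0.45, k_min=3):
    """Radius outlier removal sketch: keep a point only if it has at least
    k_min neighbors (excluding itself) within the fixed search radius."""
    tree = cKDTree(points)
    # query_ball_point returns each point's neighbor indices, including itself
    counts = np.array([len(n) - 1 for n in tree.query_ball_point(points, r=radius)])
    return points[counts >= k_min]
```

Applied to a dense line of points plus one isolated "snowflake", the isolated point has no neighbors within the radius and is removed, while the dense structure survives. This is exactly the drawback noted above: with a fixed radius, sparse but legitimate distant structure would be removed just as readily.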
Charron et al. [8] proposed an advanced variant of the conventional filter, referred to as the dynamic radius outlier removal (DROR) filter. The distinctive characteristic of the DROR filter is that it dynamically adjusts the search radius SR_p of each point based on its intrinsic geometric attributes, as shown in (1):

SR_p = β_d × (r_p × α), (1)

where r_p is the range from the sensor to the point p, α is the horizontal angular resolution of the LiDAR, the product (r_p × α) represents the point spacing, and β_d is a multiplication factor. This approach meticulously eliminates central noise points while facilitating the retention of pivotal points situated at greater distances within the point clouds, successfully preserving the pertinent distant structures.
Another noteworthy innovation is the dynamic statistical outlier removal (DSOR) filter introduced by Kurup and Bos [26]. In this approach, each point is examined to ascertain whether the mean distance to its k nearest neighbors exceeds a dynamic threshold. The dynamic threshold T_p is defined as follows:

T_p = (µ + β_s × σ) × r × r_p, (2)

where µ symbolizes the overall mean distance between every point and its corresponding k nearest neighbors, σ represents the global standard deviation of these distances, β_s stands for a predetermined multiplier parameter, r_p is the distance from the sensor to a given point p, and r signifies a scaling factor for the spacing between points. In terms of performance, DSOR proves to be either on par with or superior to DROR in both noise reduction and environmental feature preservation. Remarkably, DSOR also exhibits a significantly higher computational speed than DROR, further underlining its advantages in data preparation tasks. Subsequently, a variety of optimized filters based on adaptive parameters, such as intensity-based [32] and density-based [33] filters, were put forward. While adaptive filters have made strides in de-noising, they are inherently constrained by predefined removal rules. This limitation becomes particularly evident when considering the diverse and infinite ways in which snow can form clusters, a complexity that cannot be fully captured by such rigid algorithms.
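A minimal sketch of the DSOR principle, assuming a sensor at the origin and a range-scaled threshold of the form described above; the parameter values are illustrative placeholders, not the settings of Kurup and Bos:

```python
import numpy as np
from scipy.spatial import cKDTree

def dsor_filter(points, k=5, beta_s=0.2, r_scale=0.1):
    """Dynamic statistical outlier removal sketch. A point p is kept if the
    mean distance to its k nearest neighbors stays below a range-dependent
    threshold T_p = (mu + beta_s * sigma) * r_scale * r_p, where mu and sigma
    are the global mean/std of the k-NN distances and r_p is the range."""
    tree = cKDTree(points)
    # k + 1 because the nearest "neighbor" returned for each point is itself
    dists, _ = tree.query(points, k=k + 1)
    mean_knn = dists[:, 1:].mean(axis=1)
    mu, sigma = mean_knn.mean(), mean_knn.std()
    r_p = np.linalg.norm(points, axis=1)          # range from sensor at origin
    threshold = (mu + beta_s * sigma) * r_scale * r_p
    return points[mean_knn < threshold]
```

Because the threshold grows with range r_p, sparse but distant environmental points are retained while an isolated point near the sensor, whose mean k-NN distance far exceeds its small threshold, is removed.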

Deep-Learning-Based LiDAR Point Cloud Processing
The initial implementations of point cloud de-noising using deep learning were primarily focused on mitigating the effects of rain and fog, which obscure or diffuse LiDAR signals due to the presence of small water droplets. Additionally, external disturbances such as wind or spray can lead to the formation of clustered fog and water mist, thereby causing false obstacles for LiDAR [34]. Shamsudin et al. [35] developed algorithms for removing fog from 3D point clouds after detection, using intensity and geometrical distribution to separate and target clusters of points, which were then removed from the point cloud. The problem is that such algorithms are designed for indoor environments, where fog behaves significantly differently than in outdoor driving scenarios.
To address weather problems under driving conditions, Heinzler et al. [21] proposed a CNN-based approach capable of weather segmentation and de-noising. The authors collected both clear and foggy datasets within CEREMA's climatic chamber [20], enabling controlled precipitation rates and visibility conditions. By utilizing the paired datasets, they facilitated an automated labeling procedure that annotated weather data, enabling the model to acquire knowledge of the distinctive attributes of rainfall and fog reflections in point clouds. Subsequently, the trained model demonstrated the ability to discriminate fog/rain-induced point clusters and eliminate noise while preserving the original objects, such as pedestrians or cyclists. To enhance the model's capability of comprehending real-world elements that are challenging to simulate in fog chambers, the authors also incorporated fog and rain data augmentation on road data in the CNN model.
Besides de-noising techniques, augmentation is another valuable LiDAR point cloud processing method that can offer comparable assistance in managing adverse weather conditions. Hahner et al. [36] developed a method that simulates snow particles in a 2D space corresponding to each LiDAR line and adjusts each LiDAR beam's measurements based on the resulting geometry. In addition, they factored ground wetness, a common occurrence during snowfall, into their LiDAR point clouds as a supplement to the augmentation. The notable enhancement observed in 3D object detection performance after training on semi-synthetic snowy data substantiates the successful simulation of snowfall. It is of particular importance to acknowledge that their snow augmentation approach predominantly focuses on light snowfall conditions below a rate of 2.5 mm/h, wherein the prevalent snow effects in LiDAR point clouds manifest as dispersed noise points rather than snow clusters. Snow clusters pose a greater challenge to LiDAR perception in actual driving conditions, and our primary focus revolves around the study of snow clusters.

GAN-Based LiDAR Translation among Various Adverse Conditions
The utilization of GANs is particularly suitable for weather de-noising tasks due to the inherent challenge of obtaining diverse weather conditions while maintaining a consistent background at the pixel level, given the ever-changing atmospheric conditions. Notable methods in this area include CycleGAN [22], DiscoGAN [37], and DualGAN [38], which introduce a cycle-consistency constraint to establish connections between the images. Leveraging the capability of GANs to generate visually realistic images without requiring paired training data, translations between weather images and clear ones have been widely studied. De-weather frameworks based on GAN structures have proven effective in removing multiple weather conditions [24,39] from a single image.
Consequently, the adaptation of GAN models for the translation of LiDAR point clouds has emerged as a natural progression. Sallab et al. [40,41] were the first to explore the translation from simulated CARLA driving scenes to synthetic KITTI [42] point clouds, employing the CycleGAN framework. Similarly, Lee et al. [43] turned to point cloud translations between sunny, rainy, and foggy weather conditions. They employed the depth and intensity channels of the 2D Polar Grid Map (PGM) for CycleGAN processing. It is worth noting that inter-domain translations between fog chamber scenes and real-world roads present a challenge wherein artificial precipitation from sprinklers can be detected by LiDARs as vertical cylinders rather than genuine rainfall [1]. This discrepancy may impact the interpretability of weather reflection features in point clouds, thereby potentially diminishing the overall translation performance. The inability of chambers to simulate snowy conditions has also resulted in a lack of advancements in processing LiDAR point cloud data in snow environments.

LiDAR 2D Representation
To adapt LiDAR point cloud data to the structure of the GAN-based model, we first apply a dimensionality reduction to the point clouds. This yields a 2D visualization of the point clouds, namely depth images, which represent the orthographic projection of the point clouds. By unrolling the exterior surface of the LiDAR's horizontal field of view (FOV) cylinder and mapping each point from the point cloud onto this frontal-view plane, a rectangular image encompassing all points within the LiDAR's FOV is obtained. We partition the horizontal field evenly into w columns and distribute the vertical field uniformly into h rows. Consequently, post-projection, a depth image with a resolution of w × h can be obtained, where the horizontal resolution is proportional to the sensor's rotation rate, and the vertical resolution is proportional to the number of physical layers [44].
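The projection above can be sketched as follows, assuming the sensor sits at the origin; the resolution, vertical FOV limits, and the closer-is-brighter mapping are illustrative assumptions consistent with the description of the depth images:

```python
import numpy as np

def to_depth_image(points, w=1024, h=32, v_fov=(-30.0, 10.0), max_range=100.0):
    """Project an (N, 3) point cloud onto a w x h frontal-view depth image.
    Each column spans an azimuth slice, each row an elevation slice, and the
    pixel value encodes range (closer points brighter)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    azimuth = np.arctan2(y, x)                              # [-pi, pi)
    elevation = np.degrees(np.arctan2(z, np.hypot(x, y)))   # degrees
    col = ((azimuth + np.pi) / (2 * np.pi) * w).astype(int) % w
    lo, hi = v_fov
    row = ((hi - elevation) / (hi - lo) * (h - 1)).round().astype(int)
    img = np.zeros((h, w), dtype=np.float32)
    valid = (row >= 0) & (row < h)
    # brighter pixels = closer points, as described for Figure 2
    img[row[valid], col[valid]] = 1.0 - np.clip(r[valid] / max_range, 0.0, 1.0)
    return img
```

A point straight ahead at 10 m lands in the central column with a high (bright) pixel value, while more distant points fade towards zero.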
An illustration of a specific frame of the depth images under noticeably snowy conditions, along with the corresponding RGB images featuring identical objects, can be found in Figure 2. A close observation reveals that the relevant objects are well reflected, and the snow noise, bearing resemblance to 'salt-and-pepper' speckles, is prominently displayed in the depth image. The position of each pixel reflects the frontal-view projection of the points within the point clouds, while the grayscale pixel value signifies the distance between the points and the observing vehicle. Smaller pixel values (darker) imply greater distances, whereas higher pixel values (brighter) indicate closer distances. One discrepancy is worth mentioning: due to the limited vertical FOV of the LiDAR used in the sample dataset and its relatively high mounting position on the car's roof, the black car in the immediate right vicinity of the ego vehicle partly falls within the "blind spot" and is not clearly delineated in the depth image. Instead, only a few high-intensity signal points from the window glass are visible.
To obtain depth images under clear conditions, we apply the DSOR filter [26] to the snow datasets. In this context, we invert the typical approach of creating synthetic weather conditions in clear datasets for training, and instead generate an artificial clear dataset. Given that filters cannot guarantee absolute precision and recall in the de-snowing process, the filtered result cannot be considered equivalent to the ground truth. However, the stringent filter still provides a valuable sample for unpaired training. Among the reasons we choose DSOR over alternative filters are its rapid processing speed and its excellent capacity to retain as many environmental elements as possible when the filter parameters are set to an uncompromising level.

LiDAR Depth Images GAN
In this research, an enhanced iteration of CycleGAN [22] serves as the foundational structure for our proposed model, LiDAR depth images GAN (L-DIG). The comprehensive blueprint of our model is illustrated in Figure 3. Due to the features of CycleGAN, both the de-snowing (snow → clear) and the fake snow generation (clear → snow) are completed at the same time for one set of data. Table 1 provides annotations of all the symbols and alphabet designations used in the model architecture. In our model, A and B symbolize the two sets of data flow in the forward and backward cycles, respectively, and Snow A and Clear B are the inputs that we provide in the form of depth images. C subscripts correspond to clear weather conditions, whereas S subscripts indicate snowy conditions. The generators responsible for the snowy-to-clear and clear-to-snowy transitions are represented by G_SC and G_CS, respectively. Meanwhile, the discriminators are denoted D_A and D_B. The two reconstructions are GAN features that keep the translation from overfitting. We provide the pseudo-code of L-DIG in Algorithm 1.
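The forward and backward cycles can be sketched with identity stubs standing in for the trained networks; the stub functions below are purely illustrative of the data flow between G_SC, G_CS, D_A, and D_B, not the convolutional networks themselves:

```python
import numpy as np

# Identity stubs standing in for the trained generators and discriminators.
G_SC = lambda depth_img: depth_img    # snow -> clear generator (stub)
G_CS = lambda depth_img: depth_img    # clear -> snow generator (stub)
D_A = lambda depth_img: float(depth_img.mean() >= 0.0)   # real/fake score (stub)
D_B = lambda depth_img: float(depth_img.mean() >= 0.0)

def forward_cycle(snow_A):
    """Forward cycle: de-snow the input, then reconstruct it."""
    fake_clear = G_SC(snow_A)
    rec_A = G_CS(fake_clear)    # reconstruction anchors the translation
    return fake_clear, rec_A

def backward_cycle(clear_B):
    """Backward cycle: synthesize snow, then reconstruct the clear input."""
    fake_snow = G_CS(clear_B)
    rec_B = G_SC(fake_snow)
    return fake_snow, rec_B
```

With real generators, the reconstructions rec_A and rec_B are pulled towards the original inputs by the cycle consistency loss, which is what keeps the two translations from collapsing.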

Pixel-Attention Discriminators
We designed a new discriminator structure to enable the model to recognize snow noise points more accurately. The discriminators D_A and D_B are each composed of two parts: N-layer discriminators D_nA and D_nB with three convolutional layers, and pixel discriminators D_pA and D_pB. The N-layer discriminators concentrate on relevant objects within the scene, while the pixel discriminators scrutinize each pixel individually to ascertain its authenticity through a 1 × 1 patch [22]. This approach contributes a minor disturbance to the binary discriminator's threshold, making the criteria for an isolated noise point to achieve a 1 (approved) rather than a 0 (rejected) more stringent. This strategy markedly enhances the de-snowing effect, particularly in areas surrounding the ego vehicle, where dispersed snow points are densely packed. The pixel-attention discriminators undergo training for several epochs subsequent to the stabilization of the model training with N-layer discriminators, as shown in Figure 3 and Algorithm 1.

Depth Loss
Scale ambiguity poses a challenge for depth images, necessitating the use of loss functions resilient to rough estimations [45]. Taking a cue from the Fine network [46], we crafted a depth loss, L_depth, that is integrated into the training cycles to uphold consistency in the scale of depth images, as represented in (4):

L_depth = (1/n) Σ_i (log d̂_i − log d_i)² − (λ_depth/n²) (Σ_i (log d̂_i − log d_i))². (4)
In (4), d̂_i and d_i symbolize the reconstructed and initial depth, respectively, while the hyperparameter λ_depth governs the scale invariance. We assign λ_depth = 1 to achieve complete scale invariance. This is due to our objective of preserving the translated point cloud as similar to the original as possible, exclusive of the snow, and safeguarding the relevant objects and environmental elements from distortions in shape and size.
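A minimal sketch of such a scale-invariant depth loss, assuming it follows the log-difference form of the cited Fine network (Eigen et al.); the epsilon guard is an added assumption for numerical safety:

```python
import numpy as np

def depth_loss(d_rec, d_init, lam=1.0, eps=1e-8):
    """Scale-invariant depth loss sketch: g_i = log d_rec_i - log d_init_i,
    L = mean(g^2) - lam * mean(g)^2. With lam = 1 the loss is fully
    scale-invariant: a uniform rescaling of the prediction changes nothing."""
    g = np.log(d_rec + eps) - np.log(d_init + eps)
    return float(np.mean(g ** 2) - lam * np.mean(g) ** 2)
```

With lam = 1, multiplying every reconstructed depth by a constant leaves the loss at zero for an otherwise exact reconstruction, which is exactly the "complete scale invariance" property described above.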

SSIM Loss
As the point cloud translation occurs on the scale of the entire scene, it sometimes involves objects and structures that are partially obscured or incomplete, leading to the model's suboptimal comprehension of these elements. This can cause distortions or alterations of the original forms, particularly in environmental features. To mitigate this, an SSIM (structural similarity index measure) loss, depicted in (5) and (6), has been incorporated into the training cycle to help preserve structural consistency:

SSIM(N, N̂) = ((2µ_N µ_N̂ + c1)(2σ_NN̂ + c2)) / ((µ_N² + µ_N̂² + c1)(σ²_N + σ²_N̂ + c2)), (5)

L_SSIM = 1 − SSIM(N, N̂), (6)
where N is the normalized image tensor of the original real image, N̂ is the normalized image tensor of the reconstructed image, µ_N is the average of N, µ_N̂ is the average of N̂, σ²_N is the variance of N, σ²_N̂ is the variance of N̂, σ_NN̂ is the covariance of N and N̂, and c1 and c2 are two variables to stabilize the division with a weak denominator.
The SSIM loss computation is performed after a subtraction from 1, because SSIM gauges similarity while the training mechanism is geared towards attaining minimum values. Consequently, we lower the difference by minimizing 1 minus the SSIM function. This also elucidates the necessity of prior normalization of the image tensors to [0, 1]. Meanwhile, it is essential to maintain a relatively low weight λ_s on the SSIM loss to prevent the model from becoming overly rigid, thereby obstructing any desired translation.
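A single-window sketch of the 1 − SSIM objective described above; the constants c1 and c2 follow common SSIM conventions, and using global (non-windowed) statistics is a simplification of the usual sliding-window computation:

```python
import numpy as np

def ssim_loss(real, rec, c1=0.01 ** 2, c2=0.03 ** 2):
    """Global SSIM between two images normalized to [0, 1], returned as
    1 - SSIM so that smaller is better, matching the training objective."""
    n = (real - real.min()) / (real.max() - real.min() + 1e-12)
    m = (rec - rec.min()) / (rec.max() - rec.min() + 1e-12)
    mu_n, mu_m = n.mean(), m.mean()
    var_n, var_m = n.var(), m.var()
    cov = ((n - mu_n) * (m - mu_m)).mean()
    ssim = ((2 * mu_n * mu_m + c1) * (2 * cov + c2)) / \
           ((mu_n ** 2 + mu_m ** 2 + c1) * (var_n + var_m + c2))
    return 1.0 - ssim
```

An identical reconstruction yields a loss of zero, while a structurally inverted image yields a large loss, which is the behavior the structural-consistency term relies on.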

Cycle Consistency Loss
We employ the cycle consistency loss (7) derived from CycleGAN, aimed at keeping the generated depth images closely aligned with the original domain:

L_cyc(G_SC, G_CS) = E_s[||G_CS(G_SC(s)) − s||_1] + E_c[||G_SC(G_CS(c)) − c||_1]. (7)

Provided a minimal variation in the background during the translation, the weight λ_c of the cycle consistency loss can be set equal to that of the customized depth loss.

Overall Loss Function
Upon integrating the conventional GAN adversarial losses between the clear and snowy data, denoted by L_GAN(G_SC, D_A, S, C) and L_GAN(G_CS, D_B, C, S), we arrive at the comprehensive objective loss:

L = L_GAN(G_SC, D_A, S, C) + L_GAN(G_CS, D_B, C, S) + λ_c L_cyc + λ_d L_depth + λ_s L_SSIM, (8)

where λ_c, λ_d, and λ_s denote the weight coefficients of the cycle consistency loss, depth loss, and SSIM loss, respectively. The higher the weight, the larger the influence the corresponding loss function has on the model.
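The weighted combination of the loss terms can be sketched as below; the λ values are illustrative placeholders, not the paper's tuned coefficients:

```python
import numpy as np

def l1_cycle_loss(rec, orig):
    """Cycle consistency term: L1 distance between a reconstructed depth
    image and its original (standard CycleGAN form)."""
    return float(np.mean(np.abs(rec - orig)))

def total_loss(gan_a, gan_b, cyc, depth, ssim, lam_c=10.0, lam_d=1.0, lam_s=0.1):
    """Overall objective: two adversarial terms plus the weighted cycle,
    depth, and SSIM terms. Lambda defaults are illustrative placeholders."""
    return gan_a + gan_b + lam_c * cyc + lam_d * depth + lam_s * ssim
```

In training, raising lam_s makes the SSIM term dominate and, as noted above, an overly large value would freeze the translation; keeping it small lets the adversarial terms drive the snow removal and synthesis.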

OPTICS 3D Clustering Algorithm
OPTICS exhibits great adaptability to variable densities, lower parameter sensitivity, hierarchical relationships among clusters, and improved outlier detection capabilities. These features of OPTICS largely match snow point behaviors in point clouds and make it a suitable evaluation method for snow datasets. By applying an adaptive clustering principle, we significantly simplify the interpretation and extraction of clusters without requiring excessive manual parameter tuning.
We produce the following seven metrics based on the OPTICS algorithm:

• Noise Number: Points without any neighbor points within a designated range (solitary points) are considered noise points, mostly snowflakes. A decrease in noise number is one of the most direct indicators of an effective de-snowing performance.
• Cluster Number: A main output of the algorithm, representing groups of data points that are closely related based on their reachability.
• Reachability Distance: The smallest distance required to connect point A to point B via a path of points that satisfy the density criteria. Normally, the average reachability distance rises along with larger cluster numbers.
• Inter-Cluster Distances: The concept here involves identifying the centroid, or average point, of each cluster, and subsequently computing the distance between every possible pair of centroids. An increase in the average of these distances would suggest a reduction in the number of clusters and a more dispersed cluster distribution. In the context of our study, such a pattern could be interpreted as an effect of de-snowing.
• Size of Clusters: This is essentially determined by the average number of points each cluster holds. Under conditions dominated by scattered snow, the snow noise points tend to form numerous small-scale clusters. Their elimination, consequently, leads to an increase in the average size of the clusters.
• Silhouette Score: Measures the cohesion within clusters and the separation between clusters. A silhouette score close to 1 indicates good clustering quality, while a score close to −1 indicates poor clustering. A lower silhouette score is commonly observed in snowy conditions due to greater overlap between clusters.
• Davies-Bouldin Index (DBI): Measures the ratio of within-cluster scatter to between-cluster separation and assesses the quality of the overall cluster separation. A lower Davies-Bouldin index indicates better clustering, with zero being the ideal value. Snow conditions with many noise points or swirl clusters exhibit higher DBI values.
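Several of these metrics can be computed with scikit-learn's OPTICS implementation, sketched below on synthetic data; the parameter choices and the DBSCAN-style label extraction (used here for deterministic labels) are assumptions for illustration, not the paper's evaluation settings:

```python
import numpy as np
from sklearn.cluster import OPTICS
from sklearn.metrics import silhouette_score, davies_bouldin_score

def snow_metrics(points, min_samples=5, eps=0.5):
    """Compute a subset of the OPTICS-based metrics described above:
    noise number, cluster number, mean reachability, silhouette, and DBI."""
    opt = OPTICS(min_samples=min_samples, cluster_method="dbscan", eps=eps)
    labels = opt.fit_predict(points)
    clustered = labels != -1                      # -1 marks noise points
    metrics = {
        "noise_number": int(np.sum(~clustered)),
        "cluster_number": int(len(set(labels[clustered]))),
        "mean_reachability": float(np.mean(
            opt.reachability_[np.isfinite(opt.reachability_)])),
    }
    if metrics["cluster_number"] >= 2:
        metrics["silhouette"] = float(
            silhouette_score(points[clustered], labels[clustered]))
        metrics["dbi"] = float(
            davies_bouldin_score(points[clustered], labels[clustered]))
    return metrics
```

On two well-separated dense blobs plus one isolated point, the isolated point is reported as noise, two clusters are found, and the silhouette score approaches 1, mirroring the behavior described for clear scenes.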
We present three example scenes in Figure 4, from a slight snowfall condition to a normal snowfall condition, as well as a fierce snow swirl condition to illustrate how the OPTICS algorithm evaluates snow conditions, with their metrics summarized in Table 2.
From the statistics, we can tell that with the increase in snow level, the noise number and cluster number of the driving scene gradually rise, along with the average reachability distances, as expected. With more noise points and more snow clusters, the average inter-cluster distances and the average sizes of clusters decrease correspondingly. The tendencies toward deteriorated clustering and increased overlap, as indicated by the DBI and silhouette score, are also consistent with heavier snow conditions. This validation proves the OPTICS algorithm's capability of evaluating changes in snow conditions in a LiDAR point cloud.
To more effectively illustrate the differences between mild and heavy snow conditions in terms of reachability distances, inter-cluster distances, and cluster sizes, we provide violin plots for these three metrics in Figure 5, positioning mild snow on the left and heavy snow on the right in each set. These plots not only show the quartiles but, more crucially, delineate the differences in distribution. It can be observed that as snow conditions get heavier, the distributions of all three metrics exhibit somewhat abrupt curves with sharper edges and sudden shifts, indicating the inherent disarray characteristic of heavy snow. Generally, scenes with less snow presence display a relatively uniform and smooth distribution, as depicted in the violin plots, with a lower skewness value [47], offering another angle from which to assess the changes in snow conditions.
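The skewness comparison can be illustrated on synthetic cluster-size samples; the distributions below are placeholders standing in for mild- and heavy-snow statistics, not CADC data:

```python
import numpy as np
from scipy.stats import skew

# Illustrative cluster-size distributions: roughly symmetric (as in mild
# snow) versus a heavy right tail (as in heavy snow). Synthetic placeholders.
rng = np.random.default_rng(4)
mild_sizes = rng.normal(50.0, 10.0, 2000)      # near-symmetric, low skewness
heavy_sizes = rng.exponential(20.0, 2000)      # long right tail, high skewness
mild_skew = float(skew(mild_sizes))
heavy_skew = float(skew(heavy_sizes))
```

The smoother, more uniform distribution yields a skewness near zero, while the tail-heavy one is strongly right-skewed, matching the interpretation of the violin plots above.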

Experiments
We conducted experiments with the trained models under two different conditions: (1) mild snow conditions: snowfall only, without snow swirls; (2) fierce snow conditions: both snowfall and snow swirls. We first conducted the experiment on snowfall-only conditions to examine the performance of scattered noise point capture. In the meantime, this less-occluded condition provides a better opportunity to check how well the original environmental structures have been maintained, so as to affirm the model's ability to capture snow accurately. Then, the same experiment was conducted on conditions with both snowfall and snow swirls, to comprehensively present the model's ability to handle highly adverse conditions.
To guarantee the realism of the snow effect, we utilized the well-known dataset specializing in snow conditions within an autonomous driving context: the CADC dataset [2]. This dataset encompasses over 7000 frames of LiDAR point cloud data gathered during the winter season in Waterloo, Ontario. The driving scenarios span both urban and suburban settings, incorporate high- and low-speed conditions, and cover a range of snowfall intensities and heavy snow accumulation situations. The LiDAR system implemented in the CADC dataset has a vertical FOV of 40°, ranging from +10° to −30°, and a horizontal FOV of 360°.
Training, testing, and data processing were conducted using the PyTorch framework. We initially examined all possible combinations of ResNet residual blocks (ranging from 4 to 9) in G_SC and G_CS and convolutional layers (ranging from 1 to 4) in D_nA and D_nB, and identified the optimal combination for our model. With other variables kept constant, a combination of four ResNet residual blocks in G_CS and G_SC and two downsampling convolutional layers in D_nA and D_nB produces the best translation results.
Square-shaped samples randomly cropped from the depth images are fed to two NVIDIA RTX 3090 Ti graphics cards with a batch size of eight for training. In the second half of the N-layer discriminator training stage, we follow a linearly declining learning-rate schedule until the process converges.
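The linearly declining schedule can be written as a plain function. The base learning rate and epoch counts below are illustrative assumptions, not the paper's exact values:

```python
# Sketch of a linearly declining learning-rate rule, as commonly used in
# CycleGAN-style training: constant LR for the first half, then a linear
# decline to zero. base_lr, total_epochs, decay_start are assumptions.
def lr_at(epoch, base_lr=2e-4, total_epochs=200, decay_start=100):
    """Return the learning rate for a given epoch."""
    if epoch < decay_start:
        return base_lr
    frac = (epoch - decay_start) / (total_epochs - decay_start)
    return base_lr * (1.0 - frac)

print(lr_at(50))    # pre-decay: constant base rate
print(lr_at(150))   # halfway through decay: half the base rate
print(lr_at(200))   # end of training: zero
```

In PyTorch, the same rule (divided by the base rate) can be passed to `torch.optim.lr_scheduler.LambdaLR` as the `lr_lambda` argument.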
The quantitative analysis is conducted on 500 samples under mild snowfall conditions and another 500 samples under fierce snow swirl conditions from the testing dataset. All metrics reported in the following results are average values.

Mild Snow Conditions
Figure 6 shows the translation results of our model under mild snow conditions, in which the majority of the snow consists of scattered noise points without the snow swirl phenomenon. Sets (a), (b), and (c) show three typical scenarios, with the top row being the original snow scene from CADC, the middle row the de-snowed result, and the bottom row the fake snow result. Each scenario features an overall BEV on the left and, on the right, a magnified third-person view of the point cloud's central region, where the ego vehicle is situated. As indicated by the red arrows and encircled by red boxes, the 'salt-and-pepper' noise points have been largely erased, with key environmental features left unaltered. Essential components, such as vehicles (outlined in green) and pedestrians (highlighted in orange), are not only well preserved but also exhibit a degree of point enhancement, as demonstrated in the de-snow (a) set. Moreover, the road sign enclosed in the red box of (a), which was partially obscured by snow points in the original, appears better defined, a testament to the deep scene comprehension facilitated by our model.
The quantitative analyses are presented in Table 3. The noticeable reduction in the average noise number, cluster count, and overall reachability distances in the de-snowed results strongly suggests the effectiveness of the de-snowing process. As the majority of clusters now comprise object points and environmental features that are more densely and uniformly packed, the average inter-cluster distances and average cluster sizes naturally increase. This shift in cluster characteristics is a byproduct of fewer, but more meaningful, clusters primarily representing substantive elements of the environment rather than scattered snow points. Similarly, the declines in the DBI and silhouette score are in line with our expectations for the de-snowing process. In the violin plots of Figure 7, the colored data on the left represent de-snowed data, while the gray data on the right serve as a comparison from the CADC dataset; this arrangement is consistent across all subsequent violin plots. The greater evenness of the cluster distributions on the left half of each violin plot, compared with the slightly skewed distributions on the right, reveals the improvement brought by the de-snowing process. This observation is further substantiated by the lower skewness of the de-snowed distributions. Our calculations show that for the reachability distances, inter-cluster distances, and cluster sizes, the skewness values of the de-snowed data are 8.11, 0.23, and 16.10, respectively, whereas for the CADC data these values are 9.64, 0.30, and 21.49. Note that the median reachability distance after de-snowing is slightly higher than with snow. This small anomaly originates from a few detached clusters at a remote distance after de-snowing, visible as the very few sample points exceeding the upper limit of the y-axis. For fake snow generation, as seen in the bottom row of Figure 6, the scattered snow features are faithfully reproduced, with an apparent enhancement as the number of noise points increases. This is in line with the noticeable increase in cluster number and DBI, as well as the reduction in cluster sizes, presented in Table 3. The artificially generated snow demonstrates a remarkable replication capacity, as evidenced by the highly similar violin plots (left and right) in Figure 8, including the quartile lines. The skewness values (8.87, 0.33, and 22.43) are remarkably close to the aforementioned CADC snow skewness (9.64, 0.30, and 21.49), further attesting to the model's ability to reproduce snow effects accurately.
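The clustering-based evaluation can be outlined with scikit-learn's OPTICS together with the DBI and silhouette score. The synthetic cloud and the `min_samples` value below are illustrative assumptions, not the paper's settings:

```python
# Sketch: 3D clustering evaluation with OPTICS plus DBI and silhouette
# score. The cloud mixes two dense "object" clusters with sparse "snow"
# noise, standing in for a real LiDAR frame.
import numpy as np
from sklearn.cluster import OPTICS
from sklearn.metrics import davies_bouldin_score, silhouette_score

rng = np.random.default_rng(42)
objects = np.concatenate([
    rng.normal([5, 5, 0], 0.3, size=(200, 3)),    # dense object cluster
    rng.normal([-5, 2, 0], 0.3, size=(200, 3)),   # another dense cluster
])
snow = rng.uniform(-10, 10, size=(100, 3))        # scattered snow-like noise
cloud = np.concatenate([objects, snow])

optics = OPTICS(min_samples=10).fit(cloud)
labels = optics.labels_
clustered = labels >= 0                           # drop OPTICS noise label (-1)

print("clusters:", labels.max() + 1)
print("DBI:", davies_bouldin_score(cloud[clustered], labels[clustered]))
print("silhouette:", silhouette_score(cloud[clustered], labels[clustered]))
```

The per-point reachability distances used in the violin plots are available as `optics.reachability_`; inter-cluster distances and cluster sizes follow from the label assignments.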

Fierce Snow Conditions
Figure 9 demonstrates the translation outcomes of our model under intense snow conditions, characterized by the presence of snow swirls around the ego vehicle. Three distinctive scenarios are presented in the same format as in Figure 6. Under these harsh conditions, where snowfall has increased dramatically, it is easy to observe that the vibrantly colored airborne snowdrifts (highlighted in shades of red, green, yellow, and cyan) have been substantially mitigated, as indicated by the red arrows. In such severe circumstances featuring dense snow swirl clusters, we focus on noise reduction near the ego vehicle, as indicated by the red boxes, rather than on entirely eradicating the snow swirls, as the latter could lead to a loss of important environmental elements. We strive for a balance between significant snow removal and effective preservation of objects such as the vehicles shown in the green boxes. Simultaneously, a certain degree of point cloud restoration can be observed near the central ground rings, which can be credited to the translation model's profound comprehension of the scene.
Table 4 provides a quantitative view of the translation model's performance under extreme snow conditions. Given that the translation affects more clusters spanning the entire scene during heavy snowfall, all metrics shift markedly toward less noise and tidier clustering in the de-snowing task. In Figure 10, the shifts in the quartile lines are less prominent, which can be attributed to the fact that snow swirl clusters are typically similar in size to object clusters. Nevertheless, the efficacy of the de-snowing process is evidenced by the smoother and more consolidated distributions in the violin plots. This is further validated by the slightly improved skewness of the de-snowed data, at 8.87, 0.38, and 28.04, respectively, versus 10.45, 0.42, and 32.76 for the CADC data. As can be observed from the bottom row of Figure 9, our model effectively replicates airborne snowdrifts and the snow swirls surrounding the ego vehicle. However, the model exhibits a slight restraint at the point cloud's central region. This outcome results from our strong focus on comprehending the driving scene, which leads the model to avoid compromising scene integrity with extremely dense snow swirl clusters. The consequence is a somewhat conservative snow imitation, as corroborated by the close statistical outcomes in Table 4, which show no significant jumps. Still, the near-symmetrical violin plots in Figure 11 further substantiate the successful emulation of snow effects. The smoother edges of the reachability distances and the more concentrated distribution of cluster sizes for the imitation snow hint at reduced noise at the center. The skewness values of the artificially generated heavy snow are 9.95, 0.42, and 31.75, respectively, closely mirroring those obtained from actual data (10.45, 0.42, and 32.76). This statistical similarity provides additional evidence of our model's effectiveness in synthetic snow generation.

Ablation Study
To affirm the significance of our model's key components, we conduct an ablation study using the de-snow model under mild snow conditions. The study investigates the impact of removing the pixel-attention discriminator, the SSIM loss, and the depth loss, and compares against the basic CycleGAN. Additionally, we examine a training pair with a considerable domain gap. For this purpose, we select 6000 frames from the LIBRE dataset [1], collected under clear conditions in the urban area of Nagoya, Japan. This choice is representative because of the substantial domain disparity between Canada and Japan in terms of scenario layouts and traffic patterns: the CADC dataset contains a large portion of suburban scenarios with fewer buildings and more vegetation, which hardly appear in the LIBRE dataset. Table 5 presents the results, using our proposed model as the reference. The absence of the pixel-attention discriminator results in an immediate degradation in performance, as evidenced by the increased noise number and reachability distance. The failure to remove a certain amount of solitary noise points substantiates the importance of the pixel-attention discriminator for de-snowing.
More noise points are observed in the scenario without SSIM loss.Apart from the slightly reduced cluster number, other metrics, especially the elevated reachability distance, indicate a breakdown in structural integrity during the translation process.A primary objective of our model is to maintain the crucial objects and environmental elements as effectively as possible, thus affirming the critical role of SSIM loss in our model.
The scenario without depth loss shows a complete failure in de-snowing, as all metrics plummet toward noisy, poorly clustered results. The cause of this failure lies in a unique property of depth images: they are highly sensitive to non-linear scale changes during the conversion back to point clouds. Consequently, the depth loss forms the cornerstone of our depth-image-based translation model.
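To see why depth images are so sensitive to scale drift, the sketch below back-projects a depth image to points (inverse spherical projection using the CADC vertical FOV; the geometry and image size are our assumptions) and applies a 5% multiplicative error to the pixel values:

```python
# Sketch: depth image -> point cloud back-projection, showing how a small
# pixel-value drift becomes metre-level geometric error, which is what
# the depth loss guards against.
import numpy as np

def to_points(depth_img, fov_up=10.0, fov_down=-30.0):
    h, w = depth_img.shape
    v, u = np.indices((h, w))
    yaw = u / w * 2 * np.pi - np.pi                               # azimuth per column
    pitch = np.radians(fov_up) - v / h * np.radians(fov_up - fov_down)  # elevation per row
    d = depth_img
    x = d * np.cos(pitch) * np.cos(yaw)
    y = d * np.cos(pitch) * np.sin(yaw)
    z = d * np.sin(pitch)
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

depth = np.full((32, 1024), 20.0)        # flat depth image: everything at 20 m
pts = to_points(depth)
pts_err = to_points(depth * 1.05)        # 5% drift in the pixel values

# A 5% depth error at 20 m shifts every point by a full metre.
shift = np.abs(np.linalg.norm(pts_err, axis=1) - np.linalg.norm(pts, axis=1))
print(shift.max())
```

Because the same relative pixel error scales with range, distant structures are displaced the most, which is why an unconstrained generator can silently warp the whole scene.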
For the basic CycleGAN model, the mediocre statistics indicate that the point cloud translation is largely ineffective, while the original states are not preserved either. This result underscores the necessity of all the components in our model for achieving successful translation.
Finally, when trained on datasets with a substantial domain gap, the model does not yield satisfactory de-snowing performance, as suggested by the exceedingly high noise number and reachability distances and the low cluster sizes, at least under the same parameter settings as before. The unreasonably high noise and cluster numbers are the result of poor clustering, which is corroborated by the exceedingly high DBI. This result, obtained under extreme conditions, confirms our decision to generate unpaired clear data with filters. It does not necessarily suggest that our model lacks generality; nevertheless, limited robustness against domain gaps remains a major limitation of the current translation model.

Conclusions
In this research, we introduced a GAN-driven model capable of translating LiDAR point cloud data, encompassing both snow removal and artificial snow synthesis. Utilizing depth image priors of point clouds, our model was trained on unpaired datasets and supplemented with depth loss and SSIM loss functions to ensure scale and structure consistency. Furthermore, we crafted a novel pixel-attention discriminator structure; integrated alongside the original convolutional layers, it enhances snow removal capability in the vicinity of the ego vehicle. Experiments on authentic snow conditions from the CADC dataset revealed a profound comprehension of snow characteristics and driving scenes, as well as excellent performance in snow removal and snow reproduction. The 3D clustering results from the OPTICS algorithm and the corresponding violin plots clearly demonstrate successful translation between snow and clear conditions.
LiDAR point cloud processing under snowy conditions has long faced the challenge of a lack of reliable snow-affected data. Given the difficulty of acquiring paired or quasi-paired data under both snowy and clear conditions, our current model must strike a balance between translation strength and model stability, which in turn leads to domain sensitivity. Moreover, the limited scope of the CADC dataset compounds the difficulty of training and testing. To address these limitations, our future goal is to develop the capability to generate high-quality paired data under snowy conditions. The aim is to augment the LiDAR point cloud with snow based on a deep understanding of the driving scene, while preserving the original state of the scene to the greatest extent possible.

Figure 1 .
Figure 1. A frame of the point cloud featuring both dispersed noise points and snow clusters. (a) The original point cloud from the Canadian Adverse Driving Conditions (CADC) dataset [2], with colors representing height: the lighter the color, the greater the distance from the ground. Red boxes annotate scattered snow and snow swirl points. (b) The clustering result of the same point cloud based on the OPTICS (ordering points to identify the clustering structure) algorithm, where varying colors signify different cluster groups. Objects and structures with regularized forms are aptly segmented into large-scale clusters, while scattered snow noise points constitute smaller clusters with minimal points. The prominent purple cluster at the center represents the snow swirl.

Figure 2 .
Figure 2. An illustration of a specific frame of the depth image under heavy snow conditions. The middle row displays the depth image, while the top and bottom rows depict corresponding RGB images derived from the CADC (Canadian Adverse Driving Conditions) dataset [2]. These images are captured from multiple cameras targeting different directions around the ego vehicle. Color-coded boxes indicate objects that are present in both the RGB and depth images. Green: a pole with a road sign and traffic lights. Yellow: three trees. Red: a vehicle at the intersection. Blue: two pick-up trucks waiting in line. Please note that the images provided do not represent the original resolutions; they have been adjusted to enhance their illustrative capacity.

Figure 3 .
Figure 3. Proposed LiDAR translation model architecture. Datasets A and B are in the form of depth images. G SC and G CS are the generators. D nA and D nB are the N-layer discriminators, and D p A and D p B are the pixel-attention discriminators.

Algorithm 1 LiDAR Depth Images GAN (L-DIG)
Require: Training data pairs (A, B): Snow A and Clear B
Ensure: Generator networks G SC and G CS , N-layer discriminators D nA , D nB , and pixel discriminators D p A , D p B
1: Initialize generators G SC , G CS and N-layer discriminators D nA , D nB
2: Define loss functions including GAN, cycle, depth, and SSIM loss
3: Define optimizers for generators and discriminators
4: while epoch ≤ (total_epochs − continued_epochs) do
5: for each data pair (A, B) in data_loader do
9: Update discriminators D nA , D nB and generators G SC , G CS
10: end for
11: end while
12: Initialize pixel discriminators D p A , D p B
13: while (total_epochs − continued_epochs) < epoch ≤ total_epochs do
14: for each data pair (A, B) in data_loader do
15: Use the same generators to produce fake images as in previous training
16: Compute GAN, cycle, depth, and SSIM loss
17: Update discriminators D p A , D p B , D nA , D nB and generators G SC , G CS
18: end for
19: end while

Figure 4 .
Figure 4. The clustering results of three example scenes from the OPTICS algorithm, where varying colors signify different cluster groups. (a) Slight snowfall condition with bits and pieces of snow points. (b) Normal snowfall condition with both scattered snow points and snow clusters around the center. (c) Fierce snow swirl condition with huge snow swirl clusters surrounding the ego vehicle.

Figure 5 .
Figure 5. Violin plots for the comparison between mild snow and heavy snow. Limits on the y-axes are set for a better illustration of the distributions.

Figure 6 .
Figure 6. Qualitative results of point cloud translations in scattered snow conditions (without snow swirls). Colors are encoded by height. The top row shows the snow conditions from the CADC dataset, the middle row our de-snowed results, and the bottom row the fake snow results obtained from our model. Three scenarios are presented, with the left side of each set showing the overall BEV scene and the right side an enlarged third-person view of the center part around the ego vehicle. Red boxes and arrows denote locations where snow's effects are alleviated. Green boxes annotate vehicles. Orange boxes annotate pedestrians.

Figure 7 .
Figure 7. Violin plots for de-snow results under mild snow conditions. Limits on the y-axes are set for a better illustration of the distributions.

Figure 8 .
Figure 8. Violin plots for fake snow results under mild snow conditions. Limits on the y-axes are set for a better illustration of the distributions.

Figure 9 .
Figure 9. Qualitative results of point cloud translations in snow conditions (with snow swirls). Colors are encoded by height. The top row shows the snow conditions from the CADC dataset, the middle row our de-snowed results, and the bottom row the fake snow results obtained from our model. Three scenarios are presented, with the left side of each set showing the overall BEV scene and the right side an enlarged third-person view of the center part around the ego vehicle. Red boxes and arrows denote locations where snow's effects are alleviated. Green boxes annotate vehicles.

Figure 10 .
Figure 10. Violin plots for de-snow results under fierce snow conditions with swirls. Limits on the y-axes are set for a better illustration of the distributions.

Figure 11 .
Figure 11. Violin plots for fake snow results under fierce snow conditions with swirls. Limits on the y-axes are set for a better illustration of the distributions.

Table 1 .
Annotation of symbols in the model architecture.

Table 2 .
The 3D clustering metrics from the OPTICS algorithm for ascending snow levels.

Table 3 .
The 3D clustering metrics (avg.) from the OPTICS algorithm under scattered snow conditions.

Table 4 .
The 3D clustering metrics (avg.) from the OPTICS algorithm under snow swirls conditions.

Table 5 .
Ablation study based on the de-snow model under mild snow conditions.