REHEARSE-3D: A Multi-Modal Emulated Rain Dataset for 3D Point Cloud De-Raining
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper introduces REHEARSE-3D, a large-scale multimodal dataset designed for studying de-raining in 3D point clouds. The dataset combines high-resolution LiDAR-256 and 4D Radar point clouds captured under controlled environmental and lighting conditions (day/night). It is the first dataset with point-wise semantic annotations of raindrops as a dedicated class, enabling supervised benchmarking of de-raining methods. The dataset is publicly released, and the authors present a comprehensive benchmark of statistical and deep learning-based de-raining methods. Overall, the paper is technically sound and well-structured.
REHEARSE-3D is a large dataset and covers multiple rain intensities and lighting conditions with point-wise rain annotations. The dataset could be valuable for sensor modeling and sim-to-real research. The paper presents a thorough comparison of statistical and deep learning methods, considering runtime, which is vital for real-time autonomous driving. The comparison between simulated rain and real emulated rain is interesting and highlights the limitations of purely simulated weather models. Although the manuscript is well-written and easy to follow, a few comments could further enhance the paper.
1) While the generalization beyond static scenes is acknowledged as a limitation, please discuss more explicitly how the conclusions drawn from REHEARSE-3D may generalize to dynamic driving scenarios.
2) Radar labels are obtained through nearest-neighbor transfer from LiDAR annotations, which assumes accurate extrinsic calibration. However, rain may affect Radar and LiDAR differently, and this label transfer may introduce some label noise. It would be interesting to see a brief quantitative or qualitative analysis of label transfer, along with a discussion on its impact on benchmarks.
3) Although the data is multimodal, the benchmark focuses on early fusion of both modalities, and most baselines are originally LiDAR-centric. Please clarify whether the models use the Radar-specific features and why late fusion or modality-specific architectures are not evaluated.
4) Add the paper "Doppler-Aware LiDAR-RADAR Fusion for Weather-Robust 3D Detection" as related work and clarify how REHEARSE-3D enables complementary research.
Author Response
Comments 1: While the generalization beyond static scenes is acknowledged as a limitation, please discuss more explicitly how the conclusions drawn from REHEARSE-3D may generalize to dynamic driving scenarios.
Response 1: We thank the reviewer for his / her comment. As the reviewer mentioned, the generalization beyond static scenes is acknowledged as a limitation. While REHEARSE-3D focuses on static scenes, the conclusions regarding sensor noise modeling and point-level weather degradation remain highly relevant to dynamic scenarios. Regardless of whether the surrounding objects are moving or stationary, the physical interaction between the LiDAR beam and atmospheric particles (rain) will, to a large extent, introduce noise that must be removed. Our experimental findings on how weather impacts point-cloud density and intensity at the local level offer foundational "noise prior" knowledge, which is essential for downstream tasks such as dynamic object detection/segmentation, as it enables algorithms to more effectively decouple environmental noise from actual object motion.
Comments 2: Radar labels are obtained through nearest-neighbor transfer from LiDAR annotations, which assumes accurate extrinsic calibration. However, rain may affect Radar and LiDAR differently, and this label transfer may introduce some label noise. It would be interesting to see a brief quantitative or qualitative analysis of label transfer, along with a discussion on its impact on benchmarks.
Response 2: We thank the reviewer for his / her comment. We agree with this observation. However, in our experimental evaluations, we found that rain points are rarely detected by our RADAR setup due to the sparsity of RADAR point clouds. Consequently, the impact of label-transfer noise is negligible.
Comments 3: Although the data is multimodal, the benchmark focuses on early fusion of both modalities, and most baselines are originally LiDAR-centric. Please clarify whether the models use the Radar-specific features and why late fusion or modality-specific architectures are not evaluated.
Response 3: We thank the reviewer for this insightful comment. As previously noted, the sparsity of RADAR point clouds prevents us from incorporating RADAR-specific features (e.g., velocity) into the early fused point clouds. For the same reason, we did not explore alternative fusion strategies such as mid- or late-fusion. We strongly believe that a comprehensive experimental evaluation of different fusion methods lies beyond the scope of this study, as each modality demands dedicated backbones (e.g., specialized encoders for RADAR’s Doppler and RCS channels) to extract meaningful features prior to fusion.
Comments 4: Add the paper "Doppler-Aware LiDAR-RADAR Fusion for Weather-Robust 3D Detection" as related work and clarify how REHEARSE-3D enables complementary research.
Response 4: We thank the reviewer for pointing us toward this interesting study. While we acknowledge the importance of the suggested work in the context of 3D object detection and LiDAR-RADAR fusion, our research focuses on a fundamentally different task: “point-level semantic labeling for weather-induced noise removal”. Our dataset is specifically curated and annotated for environmental noise analysis rather than object detection; as such, it does not currently include 3D bounding box annotations. Because the suggested paper focuses on a different problem domain (3D detection) and requires ground-truth labels that are not present in our study, we consider it to be outside the scope of our work.
4. Response to Comments on the Quality of English Language
Point 1: The English is fine and does not require any improvement.
Response 1: We thank the reviewer for his / her comment.
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript presents REHEARSE-3D, a multi-modal dataset and benchmarking framework for point-cloud de-raining, featuring high-density LiDAR-256 and 4D radar data. It further provides point-level semantic annotations and comparative experiments under different rain intensities. A key strength of the work is the use of an emulated rainfall setup to generate controllable yet realistic rain-induced degradations, which helps bridge the gap between purely simulated rain artifacts and real sensor degradation. In order to further improve the quality of the paper, I would like to offer the following suggestions for revision.
- Around Line 286, the phrase “to assess the 3D point cloud draining performance” appears to contain a typo. In addition, I recommend adopting a consistent spelling throughout the manuscript (e.g., “raindrop” vs. “rain drop”, and “de-raining” vs. “deraining”) to improve professionalism and readability.
- The manuscript uses the metric mIoU, but the definition provided in the text corresponds to the IoU of a single target class (e.g., the “rain” class) rather than a mean computed across multiple classes. Please check the description.
- In point-level de-raining and robust perception, “rain/splash/noise-like returns” typically constitute a minority (long-tailed) class (such as DOI: 10.1109/TPAMI.2025.3553051), which can introduce class imbalance and cognitive bias during training. It is suggested to expand the discussion on this issue in order to enhance the applicability of the Dataset.
Author Response
Comments 1: The manuscript presents REHEARSE-3D, a multi-modal dataset and benchmarking framework for point-cloud de-raining, featuring high-density LiDAR-256 and 4D radar data. It further provides point-level semantic annotations and comparative experiments under different rain intensities. A key strength of the work is the use of an emulated rainfall setup to generate controllable yet realistic rain-induced degradations, which helps bridge the gap between purely simulated rain artifacts and real sensor degradation. In order to further improve the quality of the paper, I would like to offer the following suggestions for revision.
Response 1: We thank the reviewer for his / her comment.
Comments 2: Around Line 286, the phrase “to assess the 3D point cloud draining performance” appears to contain a typo. In addition, I recommend adopting a consistent spelling throughout the manuscript (e.g., “raindrop” vs. “rain drop”, and “de-raining” vs. “deraining”) to improve professionalism and readability.
Response 2: We thank the reviewer for this helpful comment. We corrected all these typos and ensured spelling consistency throughout the manuscript. All modifications are highlighted in red. Below is a detailed list of the changes made:
· “to assess the 3D point cloud draining performance” >> “to assess the 3D point cloud de-raining performance”
· “Benchmark on 3D Point Cloud Deraining” >> “Benchmark on 3D Point Cloud De-raining”
· “Such point cloud denoising (a.k.a. deraining) is a binary segmentation task where each rain drop is treated as an outlier to be filtered out.” >> “Such point cloud denoising (a.k.a. de-raining) is a binary segmentation task where each raindrop is treated as an outlier to be filtered out.”
· “to evaluate the performance of several statistical and deep learning models on deraining REHEARSE-3D point clouds” >> “to evaluate the performance of several statistical and deep learning models on de-raining REHEARSE-3D point clouds”
· “Rain drop detection results on the REHEARSE-3D validation and test splits.” >> “Raindrop detection results on the REHEARSE-3D validation and test splits.”
· “SalsaNext stands out as the best model for deraining 3D point clouds” >> “SalsaNext stands out as the best model for de-raining 3D point clouds”
Comments 3: The manuscript uses the metric mIoU, but the definition provided in the text corresponds to the IoU of a single target class (e.g., the “rain” class) rather than a mean computed across multiple classes. Please check the description.
Response 3: We thank the reviewer for his / her comment. We agree with the reviewer and changed mIoU to IoU, highlighting the change in red in the manuscript.
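The distinction behind this correction can be illustrated with a minimal sketch. It assumes the usual definition of IoU for a class as TP / (TP + FP + FN) and of mIoU as the average of the per-class values; the function names and toy labels below are hypothetical and are not taken from the manuscript.

```python
def class_iou(pred, target, cls):
    """IoU of a single class: TP / (TP + FP + FN)."""
    tp = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    fp = sum(1 for p, t in zip(pred, target) if p == cls and t != cls)
    fn = sum(1 for p, t in zip(pred, target) if p != cls and t == cls)
    denom = tp + fp + fn
    return tp / denom if denom else float("nan")


def mean_iou(pred, target, classes):
    """mIoU: average of the per-class IoU values, skipping undefined ones."""
    ious = [class_iou(pred, target, c) for c in classes]
    ious = [v for v in ious if v == v]  # NaN is the only value not equal to itself
    return sum(ious) / len(ious)


# Toy point labels: 1 = rain, 0 = non-rain (illustrative values only).
target = [1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 0, 0, 0, 0, 0, 0, 1]

rain_iou = class_iou(pred, target, cls=1)           # single-class IoU
miou = mean_iou(pred, target, classes=[0, 1])       # mean over both classes
print(f"rain IoU = {rain_iou:.3f}, mIoU = {miou:.3f}")
```

On imbalanced data the two numbers can differ substantially, because the majority class tends to pull the mean upwards; reporting the rain-class IoU alone avoids that effect.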
Comments 4: In point-level de-raining and robust perception, “rain/splash/noise-like returns” typically constitute a minority (long-tailed) class (such as DOI: 10.1109/TPAMI.2025.3553051), which can introduce class imbalance and cognitive bias during training. It is suggested to expand the discussion on this issue in order to enhance the applicability of the Dataset.
Response 4: We thank the reviewer for the comment and fully agree with this observation. To clarify this point, we have added the following paragraph to the discussion section:
“Similar to the WADS, SemanticSpray, and WeatherNet datasets, our dataset also exhibits class imbalance, as the number of points generated from raindrops constitutes only a small portion of the overall point cloud. Since this issue is common across all real-world scanned datasets, all baseline models adopt strategies such as class-weighted loss functions to mitigate the imbalance.”
4. Response to Comments on the Quality of English Language
Point 1: The English could be improved to more clearly express the research.
Response 1: We thank the reviewer for the comment. We corrected all typos, improved spelling consistency throughout the manuscript, and performed an additional round of proofreading.
Reviewer 3 Report
Comments and Suggestions for Authors
The authors present REHEARSE-3D as a new multimodal dataset for weather-aware autonomous driving research. This is an interesting and relevant contribution, as it details important precipitation-related features, including rain intensity, droplet size distribution, wind data, and visibility information. However, a major concern is that several sections of the manuscript are not well organized and lack cohesion. For example, the abstract, discussion, and conclusion sections should be clearly aligned and consistently reflect the main contributions and findings of the work. Improving the logical flow and structural consistency across these sections would significantly enhance the clarity and impact of the manuscript. Addressing the following observations and suggestions would substantially improve the overall quality of the paper.
Abstract section>>
The abstract, in its current form, lacks both quantitative and qualitative information regarding the results obtained. Without the inclusion of specific data points, performance metrics, or illustrative qualitative insights, the reader is left without a clear understanding of the study’s empirical contributions.
Introduction section>>
Adding a brief summary at the end of the Introduction would give a better impression to readers. Such a summary is commonly expected in academic writing and helps clearly position the contributions of the work within the existing literature.
Discussion section>>
The Discussion section could be strengthened by including at least two references to relevant and recent works. This would help contextualize the results, highlight similarities or differences with existing approaches, and reinforce the contribution of the study.
Conclusion section>>
In the conclusion section, the authors did not mention the evaluation of the performance of various statistical and deep-learning models. Another issue is that this section is too short. I would suggest separating Section 5 into distinct sections for a better reading flow.
** Although the following words are well known in the field, I would suggest defining them: LiDAR and RADAR.
** Please standardize the format “light (10 mm/h), medium (25 mm/h), and heavy (50 mm/h)”. It appears in two different formats in several sentences (normal text in line 56 and equation text in line 147).
** It is not clear how the deep neural networks SalsaNext [34], LiSnowNet-L1 [35], and 3D-OutDet were selected.
** Give a brief description of the protocol referenced as “training protocol in [3]”.
** Similarly, the spelling of the word “radar” should be standardized: “RADAR” appears in line 223 and “Radar” in line 42.
Comments on the Quality of English Language
no problems detected.
Author Response
Comments 1: The authors present REHEARSE-3D as a new multimodal dataset for weather-aware autonomous driving research. This is an interesting and relevant contribution, as it details important precipitation-related features, including rain intensity, droplet size distribution, wind data, and visibility information. However, a major concern is that several sections of the manuscript are not well organized and lack cohesion. For example, the abstract, discussion, and conclusion sections should be clearly aligned and consistently reflect the main contributions and findings of the work. Improving the logical flow and structural consistency across these sections would significantly enhance the clarity and impact of the manuscript. Addressing the following observations and suggestions would substantially improve the overall quality of the paper.
Response 1: We thank the reviewer for the comment. We have revised the aforementioned sections, as detailed below.
Comments 2: Abstract section>> The abstract, in its current form, lacks both quantitative and qualitative information regarding the results obtained. Without the inclusion of specific data points, performance metrics, or illustrative qualitative insights, the reader is left without a clear understanding of the study’s empirical contributions.
Response 2: We thank the reviewer for the comment. To improve the clarity of our empirical contribution, we added the following information to the abstract and highlighted it in red.
“...First, it is the largest point-wise annotated (9.2 billion annotated points)...Our comprehensive study further evaluates the performance of various statistical and deep-learning models, where SalsaNext and 3D-OutDet achieve above 94% IoU for raindrop detection.”
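Comments 3: Introduction section>> Adding a brief summary at the end of the Introduction would give a better impression to readers. Such a summary is commonly expected in academic writing and helps clearly position the contributions of the work within the existing literature.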
Response 3: We thank the reviewer for his / her comment. We would like to note that we have provided an overall summary of our contribution as a list at the end of the introduction. For the sake of clarity, we provide it here one more time:
In short, our contributions are listed below:
· We introduce a new large-scale multi-modal emulated rain dataset, named REHEARSE-3D, with 9.2 billion point annotations, logged in various rain intensities in daytime and nighttime conditions in a controlled weather environment.
· Leveraging REHEARSE-3D, we benchmark various state-of-the-art denoising algorithms to de-rain the early-fused LiDAR and RADAR point clouds.
· We use the point cloud from clean weather and statistically generate synthetic raindrops to study the emulated-to-simulated domain gap.
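Comments 4: Discussion section>> The Discussion section could be strengthened by including at least two references to relevant and recent works. This would help contextualize the results, highlight similarities or differences with existing approaches, and reinforce the contribution of the study.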
Response 4: We thank the reviewer for his / her comment. We agree with the reviewer. We separated out the Discussion section from the conclusion and added the following text to the discussion with proper citations: “Compared to other similar datasets, e.g., WADS[7], SemanticSpray[8], and WeatherNet[9], our proposed REHEARSE-3D has more annotated points across eight different semantic classes. As the scenes were static, no new classes were introduced after the scenes were set up. We also provide a plethora of information collected from other sensors, which makes REHEARSE-3D unique. Our experimental results show that deep learning-based models excel at raindrop detection, whereas the statistical models struggle with it (see Table 2). We also show that the gap between simulated and emulated data is quite large (see Table 3). Therefore, we argue that simulated data alone is insufficient for studying the nature of weather-generated point cloud noise.”
Comments 5: Conclusion section>> In the conclusion section, the authors did not mention the evaluation of the performance of various statistical and deep-learning models. Another issue is that this section is too short. I would suggest separating Section 5 into distinct sections for a better reading flow.
Response 5: We thank the reviewer for the comment. We agree with the reviewer's suggestion to provide separate Discussion and Conclusion sections. We have also added the following text to the conclusion: “As shown in Table 2, the supervised 3D-OutDet and SalsaNext models deliver robust performance, achieving 94.03% and 94.92% IoU, respectively. Conversely, the unsupervised models struggle significantly; DSOR reaches a peak IoU of only 20.35%. This performance gap highlights the utility of our dataset as a rigorous testbed for developing more effective unsupervised raindrop detection algorithms.”
Comment 6: ** Although the following words are well known in the field, I would suggest defining them: LiDAR and RADAR.
Response 6: We thank the reviewer for his / her comment. In the Introduction, we have included the full definitions for LiDAR (Light Detection and Ranging) and RADAR (Radio Detection and Ranging).
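Comment 7: ** Please standardize the format “light (10 mm/h), medium (25 mm/h), and heavy (50 mm/h)”. It appears in two different formats in several sentences (normal text in line 56 and equation text in line 147).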
Response 7: We thank the reviewer for his / her comment. We switched to the standard equation format.
Comment 8: ** It is not clear how the deep neural networks SalsaNext [34], LiSnowNet-L1 [35], and 3D-OutDet were selected.
Response 8: We thank the reviewer for his / her comment. These models represent different model groups: SalsaNext is a general-purpose supervised segmenter that can be used for denoising; 3D-OutDet is a supervised denoiser; and LiSnowNet-L1 is an unsupervised denoiser. We added the following text to section 4.1: “These models were selected to represent distinct segmentation categories: SalsaNext as a supervised general-purpose segmenter, 3D-OutDet as a supervised point cloud denoiser, and LiSnowNet as an unsupervised denoiser.”
Comment 9: ** Give a brief description of the protocol referenced as “training protocol in [3]”.
Response 9: We thank the reviewer for his / her comment. We added the following text to section 4.1:
“In brief, the training protocol adopts the default hyperparameters for SalsaNext and LiSnowNet as specified in their original publications. In contrast, 3D-OutDet was trained using a learning rate of $1e-2$, a neighborhood size of 9, and a loss function combining weighted cross-entropy and Lovász-Softmax.”
Comment 10: ** Similarly, the spelling of the word “radar” should be standardized: “RADAR” appears in line 223 and “Radar” in line 42.
Response 10: We thank the reviewer for pointing this out. We have standardized the spelling of 'RADAR' throughout the manuscript to ensure consistency.
4. Response to Comments on the Quality of English Language
Point 1: no problems detected.
Response 1: We thank the reviewer for the comment.
Reviewer 4 Report
Comments and Suggestions for Authors
- The manuscript repeatedly frames REHEARSE-3D as “multi-modal”, but point-wise semantic labels are explicitly limited to MEMS LiDAR and 4D Radar. Please rephrase the contribution to “multi-sensor recordings with dual-modal point-wise annotations”, or extend annotations to the remaining modalities.
- Because target objects are placed at fixed positions, models may overfit to scene geometry rather than learning rain artefacts. How did the authors mitigate this risk?
- Radar labels are obtained by transferring from the nearest LiDAR points but this can introduce non-trivial noise due to different sampling density, sensing physics, and calibration or synchronization errors. Please document the exact transforms, NN thresholds, and provide an error estimation.
- Non-uniform rain intensity is attributed to wind, but there is no quantitative analysis of wind regimes or correlation with model errors. Please report wind direction and speed distributions.
- The paper notes strong class imbalance across the 8 semantic classes, yet does not provide mitigation or per-class reporting. Please add per-class metrics and consider class oversampling where relevant.
- Authors define precision/recall/F1 and mIoU, but it is unclear whether mIoU is computed for binary rain/non-rain or for the full 8-class setup.
- Authors report scan-level split sizes, but they do not state whether the split is performed by sequence. With repeated or static layouts, scan-level splitting risks leakage. Please explicitly state “split by sequence” and verify that there are no near-duplicate layouts across splits.
Author Response
Comments 1: The manuscript repeatedly frames REHEARSE-3D as “multi-modal”, but point-wise semantic labels are explicitly limited to MEMS LiDAR and 4D Radar. Please rephrase the contribution to “multi-sensor recordings with dual-modal point-wise annotations”, or extend annotations to the remaining modalities.
Response 1: We thank the reviewer for the comment. However, we respectfully disagree with this perspective. While annotation is a prerequisite for supervised learning, it is not required for unsupervised learning. A multimodal dataset is defined by the inclusion of multiple data modalities—which our dataset provides—rather than by annotating every modality. This approach is standard in the field; for instance, the Canadian Adverse Weather Dataset is a recognized multimodal dataset that provides annotations only for point clouds. Similarly, the KITTI dataset offers multimodal data but provides only LiDAR-based bounding box labels.
Comments 2: Because target objects are placed at fixed positions, models may overfit to scene geometry rather than learning rain artefacts. How did the authors mitigate this risk?
Response 2: We thank the reviewer for this comment. We appreciate the reviewer’s perspective. However, we would like to clarify that benchmarking for target objects was intentionally excluded because the targets remained stationary throughout the data collection. Our primary objective is to evaluate de-raining algorithms, and our results demonstrate that the models successfully learn to identify rain artifacts rather than overfitting to static scene geometry. We note that object detection falls outside the scope of this particular study.
Comments 3: Radar labels are obtained by transferring from the nearest LiDAR points but this can introduce non-trivial noise due to different sampling density, sensing physics, and calibration or synchronization errors. Please document the exact transforms, NN thresholds, and provide an error estimation.
Response 3: We thank the reviewer for this comment. Due to the inherent sparsity of RADAR data relative to LiDAR, the projection of labels from LiDAR to RADAR inevitably introduces a degree of noise; however, we consider this effect negligible given the RADAR point cloud sparsity. The spatial transformation used in this study follows the methodology established in the original REHEARSE dataset publication (Poledna et al.) and is also available in the original downloadable content. Therefore, we did not elaborate on the transformation process. Regarding the nearest neighbor threshold, we utilized a value of 5 and have updated section 3.2 accordingly:
“The corresponding 4D RADAR sparse point clouds are then automatically annotated by transferring labels from the nearest LiDAR points, utilizing a nearest-neighbor threshold of k=5”
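One possible reading of this label-transfer step is sketched below. The response does not spell out how the k=5 neighbours are combined, so the majority vote, the function name `transfer_labels`, and the toy coordinates are all assumptions made for illustration; the sketch also assumes the RADAR points have already been transformed into the LiDAR frame using the extrinsic calibration.

```python
import math
from collections import Counter


def transfer_labels(lidar_points, lidar_labels, radar_points, k=5):
    """Give each RADAR point the majority label of its k nearest LiDAR points.

    Assumption: majority voting over k neighbours; the manuscript may combine
    the neighbours differently.
    """
    radar_labels = []
    for rp in radar_points:
        # Brute-force search for clarity; a KD-tree would be used at scale.
        order = sorted(range(len(lidar_points)),
                       key=lambda i: math.dist(rp, lidar_points[i]))
        votes = Counter(lidar_labels[i] for i in order[:k])
        radar_labels.append(votes.most_common(1)[0][0])
    return radar_labels


# Toy example: two small LiDAR clusters with hypothetical labels.
lidar_points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
                (0.1, 0.1, 0.0), (0.0, 0.0, 0.1),
                (10.0, 0.0, 0.0), (10.1, 0.0, 0.0), (10.0, 0.1, 0.0),
                (10.1, 0.1, 0.0), (10.0, 0.0, 0.1)]
lidar_labels = ["rain"] * 5 + ["car"] * 5
radar_points = [(0.05, 0.05, 0.0), (9.9, 0.0, 0.0)]

radar_labels = transfer_labels(lidar_points, lidar_labels, radar_points, k=5)
print(radar_labels)
```

A sketch like this also suggests a simple way to estimate transfer noise: counting how often the k neighbours disagree with each other gives a rough per-point uncertainty.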
Comments 4: Non-uniform rain intensity is attributed to wind, but there is no quantitative analysis of wind regimes or correlation with model errors. Please report wind direction and speed distributions.
Response 4: We thank the reviewer for his / her comment. To clarify the matter we have added the following text in the manuscript: “At a rain intensity of 10 mm/h, our sensor suite recorded an average wind speed of 4.14 km/h with a standard deviation of 1.33 km/h. The average wind direction is 24.7°, exhibiting a high standard deviation of 27.4°. This significant variance is expected, reflecting the natural fluctuations in wind direction during precipitation events.”
Comments 5: The paper notes strong class imbalance across the 8 semantic classes, yet does not provide mitigation or per-class reporting. Please add per-class metrics and consider class oversampling where relevant.
Response 5: We thank the reviewer for this comment. We would like to clarify that our work focuses specifically on a single-class task—de-raining—rather than multi-class segmentation. Consequently, per-class metrics for all categories do not directly reflect the performance of our raindrop detection and removal objectives. However, we provide annotations for all eight classes to facilitate further research by the community. The common practice for addressing the inherent class imbalance in the dataset is to use weighted loss functions with weights inversely proportional to class frequencies.
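The weighting scheme mentioned in this response can be sketched as follows. This is a minimal illustration of weights inversely proportional to class frequency, normalised here so that the mean weight is 1; the normalisation, the function name, and the toy label counts are assumptions and do not reproduce the baselines' actual settings.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weights proportional to 1 / class frequency, with mean 1."""
    counts = Counter(labels)
    total = len(labels)
    raw = {cls: total / n for cls, n in counts.items()}
    mean_raw = sum(raw.values()) / len(raw)
    return {cls: w / mean_raw for cls, w in raw.items()}


# Toy distribution: rain points are a small minority (illustrative counts only).
labels = ["rain"] * 5 + ["background"] * 95
weights = inverse_frequency_weights(labels)
print(weights)
```

The resulting dictionary can be passed as per-class weights to a weighted cross-entropy loss, so that errors on the rare rain class contribute more to the gradient.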
Comment 6: Authors define precision/recall/F1 and mIoU, but it is unclear whether mIoU is computed for binary rain/non-rain or for the full 8-class setup.
Response 6: We thank the reviewer for the comment and agree with this observation. The metrics were indeed calculated according to the provided equation; the use of 'mIoU' was a typo. We have corrected this to IoU throughout the revised manuscript.
Comment 7: Authors report scan-level split sizes, but they do not state whether the split is performed by sequence. With repeated or static layouts, scan-level splitting risks leakage. Please explicitly state “split by sequence” and verify that there are no near-duplicate layouts across splits.
Response 7: We thank the reviewer for the comment. We would like to clarify that the scene is intended to be static in order to provide a consistent framework for studying raindrops. Since the key focus of our study is raindrop detection or filtering, there is no risk of scan-level data leakage, as raindrop patterns are stochastic and vary across scans due to changes in wind velocity and direction. The analysis of the static layout itself is not the intended purpose of REHEARSE-3D.
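For readers unfamiliar with the distinction raised in Comment 7, a sequence-level split can be sketched as below. It only illustrates the general technique of keeping all scans of one recording sequence in the same split; the identifiers and the function name are hypothetical, and nothing here describes how the REHEARSE-3D splits were actually produced.

```python
def split_by_sequence(scan_to_sequence, val_sequences, test_sequences):
    """Assign each scan to a split according to the sequence it belongs to."""
    splits = {"train": [], "val": [], "test": []}
    for scan_id, seq_id in sorted(scan_to_sequence.items()):
        if seq_id in test_sequences:
            splits["test"].append(scan_id)
        elif seq_id in val_sequences:
            splits["val"].append(scan_id)
        else:
            splits["train"].append(scan_id)
    return splits


# Hypothetical scan and sequence identifiers, for illustration only.
scan_to_sequence = {
    "scan_000": "seq_A", "scan_001": "seq_A",
    "scan_002": "seq_B", "scan_003": "seq_B",
    "scan_004": "seq_C", "scan_005": "seq_C",
}
splits = split_by_sequence(scan_to_sequence,
                           val_sequences={"seq_B"},
                           test_sequences={"seq_C"})
print(splits)
```

With a scan-level split, by contrast, consecutive scans of the same sequence could land in different splits, which is the leakage scenario the reviewer describes.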
4. Response to Comments on the Quality of English Language
Point 1: The English is fine and does not require any improvement.
Response 1: We thank the reviewer for their comment.
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have adequately addressed the reviewer’s comments and recommendations. The revised manuscript shows a clear improvement in clarity and quality. In its current form, the manuscript is considered satisfactory, and no further comments or questions are raised at this stage. The discussion and conclusion sections have been clearly separated, resulting in a more logical structure and an improved narrative flow for the reader.
Comments on the Quality of English Language
no problems detected.
