RSICDNet: A Novel Regional Scribble-Based Interactive Change Detection Network for Remote Sensing Images
Highlights
- We propose RSICDNet, an interactive change detection model driven by regional scribble interactions, which provide rich spatial priors. An Interaction Fusion and Refinement Module (IFRM) fuses these interactions with high-level semantic features for efficient change detection.
- We develop a human–computer interactive change detection application based on RSICDNet, which significantly improves the efficiency of change label annotation.
- Experimental results on three public datasets (WHU-CD, LEVIR-CD, and CLCD) show that RSICDNet outperforms mainstream interactive models in interaction efficiency.
- A significant performance gain in interactive change detection is achieved by integrating the proposed regional scribble interaction as an efficient paradigm and the IFRM as an effective fusion module.
Abstract
1. Introduction
- We propose a regional scribble-based interaction method that effectively captures spatial priors—such as the location, shape, and structure of changed areas—significantly enhancing the interaction efficiency of the model. Furthermore, an automated regional scribble generation approach is developed to simulate regional scribble interactions during model training and evaluation, thereby streamlining the workflow.
- An Interaction Fusion and Refinement Module (IFRM) is proposed, which effectively enhances the feature representation of bi-temporal imagery by fusing interactive features with high-level semantic features, thus significantly improving the change area perception capability of RSICDNet.
- By integrating RSICDNet with a graphical user interface (GUI), we develop an interactive application that simplifies and accelerates change annotation, thereby facilitating the assembly of CD datasets.
2. Materials and Methods
2.1. The RSICDNet Architecture
2.2. Regional Scribble Interaction and Its Automated Simulation
- Obtain positive and negative sampling areas: The masks for the positive and negative sampling areas are obtained from the ground truth and the previous CD result. In the initial sampling round, no previous CD result exists, so the sampling areas are initialized directly from the ground truth: if it contains change areas, they are designated as the positive sampling area; otherwise, the ground truth itself is treated as the negative sampling area. During iterative sampling, the masks of missed detections and false positives are computed from the discrepancies between the ground truth and the previous CD result, and are used as the positive and negative sampling areas, respectively.
- Sampling area preprocessing: To address the annotation ambiguity and high uncertainty near the boundaries of the sampling area masks, morphological erosion is applied to them [38], with the number of iterations drawn at random from 1 to 5. Random erosion serves two purposes: it mitigates the influence of boundary noise, and it increases sampling diversity by yielding regional scribbles of varying size that better simulate real user interactions. Excessive erosion would unduly shrink the effective sampling area, so the number of iterations is capped at 5.
- Determine final sampling area and type: The largest connected components of the positive and negative sampling areas are extracted and their areas are compared. If the positive component is larger, it is designated as the final sampling area, and a positive regional scribble is sampled within it. Otherwise, the negative component is used as the final sampling area for sampling a negative regional scribble.
- Generate three shapes of regional scribbles: Based on the final sampling area, three shapes of regional scribbles are generated: a rectangular scribble, a triangular scribble, and a circular scribble, as shown in Figure 5. Specifically, the rectangular scribble is obtained by extracting the inscribed rectangle of the final sampling area; the triangular scribble is obtained by randomly selecting three vertices and connecting them sequentially; and the circular scribble is obtained by extracting the largest inscribed circle of the final sampling area.
- Output regional scribble: One scribble is randomly selected from the three generated shapes and provided to RSICDNet as either the positive or the negative regional scribble.
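
The first three steps above (sampling areas, erosion, and selection of the final sampling area) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 3×3 erosion neighbourhood, the 4-connectivity, and the treatment of an all-unchanged ground truth as a whole-image negative area are all choices made here for the example. In practice an image-processing library would normally replace the hand-written loops.

```python
import random
from collections import deque
from typing import List, Optional, Tuple

Mask = List[List[int]]   # binary mask, 1 = pixel belongs to the area
Pixel = Tuple[int, int]  # (row, col)


def sampling_areas(gt: Mask, prev: Optional[Mask]) -> Tuple[Mask, Mask]:
    """Step 1: return (positive_area, negative_area)."""
    h, w = len(gt), len(gt[0])
    if prev is None:  # initial round: no previous CD result
        if any(any(row) for row in gt):
            return [row[:] for row in gt], [[0] * w for _ in range(h)]
        # Assumption: with no change in the ground truth, every pixel is a
        # valid location for a negative scribble.
        return [[0] * w for _ in range(h)], [[1] * w for _ in range(h)]
    missed = [[int(g == 1 and p == 0) for g, p in zip(gr, pr)]
              for gr, pr in zip(gt, prev)]
    false_pos = [[int(g == 0 and p == 1) for g, p in zip(gr, pr)]
                 for gr, pr in zip(gt, prev)]
    return missed, false_pos


def erode(mask: Mask, iterations: int) -> Mask:
    """Step 2: binary erosion with a 3x3 neighbourhood (assumed kernel).
    Pixels outside the image count as background, so border pixels vanish."""
    h, w = len(mask), len(mask[0])
    for _ in range(iterations):
        out = [[0] * w for _ in range(h)]
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                out[y][x] = int(all(mask[y + dy][x + dx]
                                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
        mask = out
    return mask


def largest_component(mask: Mask) -> List[Pixel]:
    """Pixels of the largest 4-connected component (empty list if none)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best: List[Pixel] = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue, comp = deque([(sy, sx)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) > len(best):
                best = comp
    return best


def final_sampling_area(gt: Mask, prev: Optional[Mask],
                        rng: random.Random) -> Tuple[List[Pixel], bool]:
    """Steps 1-3: return (pixels of the final area, is_positive)."""
    pos, neg = sampling_areas(gt, prev)
    pos = erode(pos, rng.randint(1, 5))  # 1 to 5 iterations, as in the text
    neg = erode(neg, rng.randint(1, 5))
    pos_cc, neg_cc = largest_component(pos), largest_component(neg)
    is_positive = len(pos_cc) > len(neg_cc)
    return (pos_cc if is_positive else neg_cc), is_positive
```

Passing a seeded `random.Random` instance keeps a simulated interaction sequence reproducible across training and evaluation runs.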
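| Algorithm 1 Automated Regional Scribble Generation Method |
| Input: ground truth; previous CD result (absent in the initial round). |
| Output: positive regional scribble; negative regional scribble. |
| Step 1: Obtain positive and negative sampling areas |
| if the previous CD result exists then |
| positive sampling area ← (previous CD result = 0) AND (ground truth = 1) |
| negative sampling area ← (previous CD result = 1) AND (ground truth = 0) |
| else |
| if the ground truth contains change areas then |
| positive sampling area ← change areas of the ground truth |
| else |
| negative sampling area ← the ground truth itself |
| end if |
| end if |
| Step 2: Sampling area preprocessing |
| positive sampling area ← Erode(positive sampling area, randint(1, 5)) |
| negative sampling area ← Erode(negative sampling area, randint(1, 5)) |
| Step 3: Determine final sampling area and type |
| positive component ← LargestConnectedComponent(positive sampling area) |
| negative component ← LargestConnectedComponent(negative sampling area) |
| if Area(positive component) > Area(negative component) then |
| final sampling area ← positive component; is positive ← true |
| else |
| final sampling area ← negative component; is positive ← false |
| end if |
| Step 4: Generate three shapes of regional scribbles |
| rectangular scribble ← InscribedRectangle(final sampling area) |
| triangular scribble ← Polygon(RandomSample(candidate vertices, 3)) |
| circular scribble ← LargestInscribedCircle(final sampling area) |
| Step 5: Output regional scribble |
| if is positive = true then |
| positive regional scribble ← RandomChoice(rectangular, triangular, circular scribble) |
| negative regional scribble ← None |
| else |
| positive regional scribble ← None |
| negative regional scribble ← RandomChoice(rectangular, triangular, circular scribble) |
| end if |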
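
Steps 4 and 5 of Algorithm 1 can be illustrated with the sketch below, which takes the final sampling area as a list of `(row, col)` pixels. It is a hedged example rather than the published code. Several choices are assumptions made here: the inscribed rectangle is taken to be the largest axis-aligned one, the triangle's vertices are drawn from the pixels of the sampling area, the circle is found by brute-force clearance search, and shapes are returned as geometric parameters instead of rasterized masks. All function names are hypothetical.

```python
import math
import random
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, col)


def inscribed_rectangle(area: List[Pixel]) -> Tuple[int, int, int, int]:
    """Largest axis-aligned rectangle inside a non-empty area, returned as
    inclusive (top, left, bottom, right). Uses the histogram/stack method."""
    inside = set(area)
    y0, y1 = min(y for y, _ in area), max(y for y, _ in area)
    x0, x1 = min(x for _, x in area), max(x for _, x in area)
    heights = [0] * (x1 - x0 + 1)
    best_area, best = 0, (y0, x0, y0, x0)
    for y in range(y0, y1 + 1):
        # heights[i] = number of consecutive area pixels ending at row y
        for i in range(len(heights)):
            heights[i] = heights[i] + 1 if (y, x0 + i) in inside else 0
        stack: List[Tuple[int, int]] = []  # (start column index, height)
        for i, hgt in enumerate(heights + [0]):  # trailing 0 flushes the stack
            start = i
            while stack and stack[-1][1] >= hgt:
                start, top_h = stack.pop()
                if top_h * (i - start) > best_area:
                    best_area = top_h * (i - start)
                    best = (y - top_h + 1, x0 + start, y, x0 + i - 1)
            stack.append((start, hgt))
    return best


def inscribed_circle(area: List[Pixel]) -> Tuple[int, int, float]:
    """(center_row, center_col, radius): the area pixel farthest from any
    non-area pixel. Every pixel closer to the center than radius is inside."""
    inside = set(area)
    ys, xs = [y for y, _ in area], [x for _, x in area]
    # Background candidates: the bounding box padded by one pixel, minus the area.
    outside = [(y, x)
               for y in range(min(ys) - 1, max(ys) + 2)
               for x in range(min(xs) - 1, max(xs) + 2)
               if (y, x) not in inside]

    def clearance(p: Pixel) -> float:
        return min(math.dist(p, q) for q in outside)

    center = max(area, key=clearance)
    return center[0], center[1], clearance(center)


def random_triangle(area: List[Pixel], rng: random.Random) -> List[Pixel]:
    """Three distinct vertices; assumes they are drawn from the area's pixels."""
    return rng.sample(area, 3)


def generate_scribble(area: List[Pixel], is_positive: bool, rng: random.Random):
    """Steps 4-5: build all three shapes, pick one at random, and return
    (positive_scribble, negative_scribble), one of which is None."""
    shapes = {
        "rectangle": inscribed_rectangle(area),
        "triangle": random_triangle(area, rng),
        "circle": inscribed_circle(area),
    }
    name = rng.choice(sorted(shapes))
    scribble = (name, shapes[name])
    return (scribble, None) if is_positive else (None, scribble)
```

A triangle whose vertices lie in a non-convex sampling area can extend beyond it, so a practical version would likely intersect the rasterized shape with the sampling area before use. That step is omitted here.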
2.3. Interactive Feature Fusion and Refinement
2.4. Loss Function
3. Results
3.1. Datasets
3.2. Experimental Environment and Parameter Settings
3.3. Evaluation Metrics
3.4. Comparative Experiments
- RITM [33]: This method employs HRNet-W32 as its backbone network. It processes disk-encoded click interactions through a Conv1S module and performs feature fusion via element-wise addition. Furthermore, it leverages the previous prediction as an additional input to enhance stability across iterative refinements.
- FocalClick [35]: This method employs HRNet-W32 as its backbone network. It utilizes two convolutional layers to adjust the dimensionality of the click maps and performs feature fusion at the shallow layers of the backbone. By predicting and updating masks within local regions, it effectively enhances model efficiency.
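
For readers unfamiliar with the disk encoding of clicks mentioned for RITM, the following sketch shows one common way to turn click coordinates into binary maps, one for positive and one for negative clicks. It is an illustration only: the function name, the `(row, col, is_positive)` click format, and the default radius are assumptions made here, not values taken from the compared methods' code.

```python
from typing import List, Tuple

Click = Tuple[int, int, bool]  # (row, col, is_positive)


def encode_clicks_as_disks(height: int, width: int, clicks: List[Click],
                           radius: int = 5):
    """Return (positive_map, negative_map): binary maps in which every click
    is drawn as a filled disk of the given radius (hypothetical default)."""
    pos = [[0] * width for _ in range(height)]
    neg = [[0] * width for _ in range(height)]
    for cy, cx, is_positive in clicks:
        target = pos if is_positive else neg
        # Only visit the part of the disk's bounding box that lies in the image.
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                    target[y][x] = 1
    return pos, neg
```

The two maps would then be stacked as extra input channels and passed to the interaction branch of the network.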
3.4.1. WHU-CD Dataset
3.4.2. LEVIR-CD Dataset
3.4.3. CLCD Dataset
4. Discussion
4.1. Ablation Study
- Base: The baseline model for RSICDNet. It performs interactive feature fusion only at the shallow layers of the HRNet backbone and employs click-based interaction.
- Base + IFRM: This variant introduces the IFRM to the Base, fusing click interactions with high-level semantic features.
- Base + RSI + CSE: This variant replaces click-based interaction with regional scribble interaction and processes this input with the CSE-equipped IPSNet.
- Base + RSI + IFRM: This variant incorporates both regional scribble interaction and the IFRM, but does not equip the IPSNet with the CSE.
- RSICDNet: The complete model proposed in this paper, integrating regional scribble interaction, the CSE, and the IFRM.
4.2. Comparison with End-to-End Models
- Spatial-Temporal Attention Neural Network (STANet) [17]: This method employs a Siamese convolutional network as the encoder and incorporates a spatial–temporal attention mechanism in the decoder to capture spatial–temporal dependencies.
- ChangeFormer [18]: This method utilizes a Siamese Transformer encoder to extract multi-scale features and employs an MLP in the decoder to generate the change map.
- ChangeViT [20]: This method employs a plain ViT as the feature extractor. It introduces a detail-capture module to address ViT’s limitations in identifying small objects and merges the extracted detailed features with high-level semantic features through a feature injector.
- CD-Lamba [23]: This method employs a Locally Adaptive State-Space Scan strategy to enhance bi-temporal local perception, and achieves pixel-wise cross-fusion through a Cross-Temporal State-Space Scan strategy.
4.3. Model Complexity Analysis
4.4. Human–Computer Interactive Change Detection Application
- Image import and browsing: Users can import pre-change and post-change images into the application via the “Import T1 Image” and “Import T2 Image” buttons. The application then displays the two images in two separate graphics views. Users can zoom and pan the graphics views using the mouse for flexible browsing of the bi-temporal images.
- Interactive change detection: After switching to annotation creation mode, users can click the “Interactive Model” radio button to load the pre-trained weights and deploy RSICDNet to the specified device. The application then allows users to invoke RSICDNet for ICD by drawing regional scribbles with the mouse in the graphics views, where pressing and dragging with the left mouse button draws a positive interaction and doing so with the right mouse button draws a negative one. Furthermore, iterative prediction can be performed by adding interactions to progressively refine the CD results until they are satisfactory. If an erroneous scribble is drawn, users can click the “Undo Interaction” button to undo the last scribble, and the application will automatically restore the CD results to the previous state. Finally, clicking the “Finish Detection” button creates annotation instances based on the current results.
- Annotation instance recording and management: The application records all created annotation instances and displays their attributes in the “Annotation Management” table. In annotation adjustment mode, users can manage selected instances, such as modifying their class, editing notes, or deleting them.
- Change map generation and export: Clicking the “Generate Mask” button produces a change map based on the annotation instances. The application supports three output types: binary, grayscale, and color maps. The generated change map is displayed in the “Mask Display” graphics view for inspection. Finally, users can click the “Export Mask” button to export the change map as an image file to a specified file path.
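
The interaction loop described above (left button for positive scribbles, right button for negative ones, and an undo that restores the previous CD result) can be modelled with a small history stack. The sketch below is a hypothetical illustration, not the application's actual code: `InteractionSession`, `predict`, and the scribble representation are assumed names, and `predict` stands in for a call to the deployed model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[int, int]
Scribble = Tuple[Sequence[Point], bool]  # (drawn points, is_positive)

# Mouse button to scribble polarity, as described in the text.
POLARITY = {"left": True, "right": False}


class InteractionSession:
    """Keeps scribbles and the CD result obtained after each one."""

    def __init__(self, predict: Callable[[List[Scribble]], object]):
        self._predict = predict        # stand-in for the model call
        self._scribbles: List[Scribble] = []
        self._results: List[object] = []

    @property
    def current_result(self) -> Optional[object]:
        return self._results[-1] if self._results else None

    def add_scribble(self, points: Sequence[Point], button: str):
        """Record a scribble and re-run prediction with all scribbles so far."""
        self._scribbles.append((points, POLARITY[button]))
        result = self._predict(list(self._scribbles))
        self._results.append(result)
        return result

    def undo(self) -> Optional[object]:
        """Drop the last scribble and return the restored previous result."""
        if self._scribbles:
            self._scribbles.pop()
            self._results.pop()
        return self.current_result
```

Caching the result after every scribble means an undo does not need another forward pass. The trade-off is memory, since one change map is kept per interaction.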
4.5. Limitations and Future Work
- By employing regional scribble interaction, RSICDNet requires fewer user inputs to handle complex changes. However, click-based interaction remains simpler to operate: drawing a regional scribble takes more effort and time than placing a click, which may reduce practical interaction efficiency. Future work will aim to reduce the interaction burden of regional scribbles and further explore synergistic mechanisms among different interaction forms, thereby enhancing the flexibility and efficiency of ICD systems.
- The automated simulation generates only regular scribbles (rectangles, triangles, and circles), which do not fully emulate real user behavior. Consequently, the model may develop a bias towards these shapes during training, limiting its ability to generalize to diverse, real-world interactions. Future work will focus on two improvements: advancing free-form scribble simulation and refining automated generation, both aimed at emulating real user behavior more accurately.
- The proposed RSICDNet follows a fully supervised learning paradigm, which heavily relies on large amounts of annotated data. Building on rapid advances in visual foundation models, future work will explore their integration with RSICDNet. We aim to capitalize on their zero-shot generalization to develop robust semi- and weakly supervised ICD methods, thereby enhancing performance on unseen scenarios and strengthening cross-domain generalization.
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| CD | Change detection |
| DLCD | Deep learning-based change detection |
| CNN | Convolutional neural network |
| HRNet | High-resolution network |
| OCR | Object-contextual representation |
| MLP | Multilayer perceptron |
| ViT | Vision transformer |
| SSM | State space model |
| SAM | Segment anything model |
| f-BRS | Feature backpropagation refinement scheme |
| ICD | Interactive change detection |
| RSICDNet | Interactive change detection model with regional scribble interaction |
| IFRM | Interaction fusion and refinement module |
| GUI | Graphical user interface |
| IPSNet | Interaction processing sub-network |
| CDSNet | Change detection sub-network |
| CSE | Contour-skeleton extractor |
| ECA | Efficient channel attention |
| NFL | Normalized focal loss |
| NoI | Number of interactions |
| IoU | Intersection over union |
| OA | Overall accuracy |
| F1 | F1-score |
| DMF | Distance maps fusion |
| RSI | Regional scribble interaction |
| STANet | Spatial-temporal attention neural network |
| MACs | Multiply-accumulate operations |
References
1. Tian, S.; Zhong, Y.; Zheng, Z.; Ma, A.; Tan, X.; Zhang, L. Large-scale deep learning based binary and semantic change detection in ultra high resolution remote sensing imagery: From benchmark datasets to urban application. ISPRS J. Photogramm. Remote Sens. 2022, 193, 164–186.
2. Pelletier, F.; Cardille, J.A.; Wulder, M.A.; White, J.C.; Hermosilla, T. Inter- and intra-year forest change detection and monitoring of aboveground biomass dynamics using Sentinel-2 and Landsat. Remote Sens. Environ. 2024, 301, 113931.
3. Zou, Y.; Shen, T.; Chen, Z.; Chen, P.; Yang, X.; Zan, L. A Transformer-Based Neural Network with Improved Pyramid Pooling Module for Change Detection in Ecological Redline Monitoring. Remote Sens. 2023, 15, 588.
4. Wang, X.; Fan, X.; Xu, Q.; Du, P. Change detection-based co-seismic landslide mapping through extended morphological profiles and ensemble strategy. ISPRS J. Photogramm. Remote Sens. 2022, 187, 225–239.
5. Mas, J.-F. Monitoring land-cover changes: A comparison of change detection techniques. Int. J. Remote Sens. 1999, 20, 139–152.
6. Bovolo, F.; Bruzzone, L. A Theoretical Framework for Unsupervised Change Detection Based on Change Vector Analysis in the Polar Domain. IEEE Trans. Geosci. Remote Sens. 2007, 45, 218–236.
7. Wu, C.; Du, B.; Cui, X.; Zhang, L. A post-classification change detection method based on iterative slow feature analysis and Bayesian soft fusion. Remote Sens. Environ. 2017, 199, 241–255.
8. Hussain, M.; Chen, D.; Cheng, A.; Wei, H.; Stanley, D. Change detection from remotely sensed images: From pixel-based to object-based approaches. ISPRS J. Photogramm. Remote Sens. 2013, 80, 91–106.
9. Miller, O.; Pikaz, A.; Averbuch, A. Objects based change detection in a pair of gray-level images. Pattern Recogn. 2005, 38, 1976–1992.
10. Hazel, G.G. Object-level change detection in spectral imagery. IEEE Trans. Geosci. Remote Sens. 2001, 39, 553–561.
11. Volpi, M.; Tuia, D.; Bovolo, F.; Kanevski, M.; Bruzzone, L. Supervised change detection in VHR images using contextual information and support vector machines. Int. J. Appl. Earth Obs. Geoinf. 2013, 20, 77–85.
12. Im, J.; Jensen, J.R. A change detection model based on neighborhood correlation image analysis and decision tree classification. Remote Sens. Environ. 2005, 99, 326–340.
13. Bai, T.; Sun, K.; Deng, S.; Li, D.; Li, W.; Chen, Y. Multi-scale hierarchical sampling change detection using Random Forest for high-resolution satellite imagery. Int. J. Remote Sens. 2018, 39, 7523–7546.
14. Bai, T.; Wang, L.; Yin, D.; Sun, K.; Chen, Y.; Li, W.; Li, D. Deep learning for change detection in remote sensing: A review. Geo-spat. Inf. Sci. 2023, 26, 262–288.
15. Peng, D.; Zhang, Y.; Guan, H. End-to-End Change Detection for High Resolution Satellite Images Using Improved UNet++. Remote Sens. 2019, 11, 1382.
16. Wang, Z.; Liu, D.; Liao, X.; Pu, W.; Wang, Z.; Zhang, Q. SiamHRnet-OCR: A Novel Deforestation Detection Model with High-Resolution Imagery and Deep Learning. Remote Sens. 2023, 15, 463.
17. Chen, H.; Shi, Z. A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection. Remote Sens. 2020, 12, 1662.
18. Bandara, W.G.C.; Patel, V.M. A Transformer-Based Siamese Network for Change Detection. In Proceedings of the IGARSS 2022—2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 207–210.
19. Yan, W.; Cao, L.; Yan, P.; Zhu, C.; Wang, M. Remote sensing image change detection based on swin transformer and cross-attention mechanism. Earth Sci. Inform. 2024, 18, 106.
20. Zhu, D.; Huang, X.; Huang, H.; Shao, Z.; Cheng, Q. ChangeViT: Unleashing Plain Vision Transformers for Change Detection. arXiv 2024, arXiv:2406.12847.
21. Chen, H.; Song, J.; Han, C.; Xia, J.; Yokoya, N. ChangeMamba: Remote Sensing Change Detection with Spatiotemporal State Space Model. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4409720.
22. Zhang, H.; Chen, K.; Liu, C.; Chen, H.; Zou, Z.; Shi, Z. CDMamba: Incorporating Local Clues into Mamba for Remote Sensing Image Binary Change Detection. IEEE Trans. Geosci. Remote Sens. 2025, 63, 4405016.
23. Wu, Z.; Ma, X.; Lian, R.; Zheng, K.; Ma, M.; Zhang, W.; Song, S. CD-Lamba: Boosting Remote Sensing Change Detection via a Cross-Temporal Locally Adaptive State Space Model. arXiv 2025, arXiv:2501.15455.
24. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment Anything. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 3992–4003.
25. Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 8748–8763.
26. Caron, M.; Touvron, H.; Misra, I.; Jegou, H.; Mairal, J.; Bojanowski, P.; Joulin, A. Emerging Properties in Self-Supervised Vision Transformers. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 9630–9640.
27. Tan, X.; Chen, G.; Wang, T.; Wang, J.; Zhang, X. Segment Change Model (SCM) for Unsupervised Change Detection in VHR Remote Sensing Images: A Case Study of Buildings. In Proceedings of the IGARSS 2024—2024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 7–12 July 2024; pp. 8577–8580.
28. Qin, Y.; Chen, J.; Wang, C.; Pan, C. BiSAM-CD: Zero-Shot Remote Sensing Change Detection via Bidirectional Temporal Memory in SAM2. IEEE Trans. Geosci. Remote Sens. 2025, 63, 4417812.
29. Li, K.; Cao, X.; Deng, Y.; Pang, C.; Xin, Z.; Meng, D.; Wang, Z. DynamicEarth: How Far are We from Open-Vocabulary Change Detection? arXiv 2025, arXiv:2501.12931.
30. Xu, N.; Price, B.; Cohen, S.; Yang, J.; Huang, T. Deep Interactive Object Selection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 373–381.
31. Sofiiuk, K.; Petrov, I.; Barinova, O.; Konushin, A. f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 8620–8629.
32. Lin, Z.; Zhang, Z.; Chen, L.-Z.; Cheng, M.-M.; Lu, S.-P. Interactive Image Segmentation with First Click Attention. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 13336–13345.
33. Sofiiuk, K.; Petrov, I.A.; Konushin, A. Reviving Iterative Training with Mask Guidance for Interactive Segmentation. In Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; pp. 3141–3145.
34. Liu, Q.; Zheng, M.; Planche, B.; Karanam, S.; Chen, T.; Niethammer, M.; Wu, Z. PseudoClick: Interactive Image Segmentation with Click Imitation. In Proceedings of the 2022 European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; pp. 728–745.
35. Chen, X.; Zhao, Z.; Zhang, Y.; Duan, M.; Qi, D.; Zhao, H. FocalClick: Towards Practical Interactive Image Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 1290–1299.
36. Liu, Q.; Xu, Z.; Bertasius, G.; Niethammer, M. SimpleClick: Interactive Image Segmentation with Simple Vision Transformers. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 22233–22243.
37. Jiang, Z.; Zhou, X.; Cao, W.; Sun, Z.; Wu, C. ICD: VHR-Oriented Interactive Change-Detection Algorithm. ISPRS Int. J. Geo-Inf. 2022, 11, 503.
38. Wang, Z.; Xu, M.; Wang, Z.; Guo, Q.; Zhang, Q. ScribbleCDNet: Change detection on high-resolution remote sensing imagery with scribble interaction. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103761.
39. Chen, X.; Cheung, Y.S.J.; Lim, S.-N.; Zhao, H. ScribbleSeg: Scribble-based Interactive Image Segmentation. arXiv 2023, arXiv:2303.11320.
40. Xu, N.; Price, B.; Cohen, S.; Yang, J.; Huang, T. Deep GrabCut for Object Selection. arXiv 2017, arXiv:1707.00243.
41. Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y. Deep High-Resolution Representation Learning for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3349–3364.
42. Yuan, Y.; Chen, X.; Wang, J. Object-Contextual Representations for Semantic Segmentation. In Proceedings of the 2020 European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; pp. 173–190.
43. Suzuki, S.; Abe, K. Topological structural analysis of digitized binary images by border following. Comput. Vis. Graph. Image Process. 1985, 30, 32–46.
44. Zhang, T.Y.; Suen, C.Y. A fast parallel algorithm for thinning digital patterns. Commun. ACM 1984, 27, 236–239.
45. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11531–11539.
46. Sofiiuk, K.; Barinova, O.; Konushin, A. AdaptIS: Adaptive Instance Selection Network. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7354–7362.
47. Ji, S.; Wei, S.; Lu, M. Fully Convolutional Networks for Multisource Building Extraction From an Open Aerial and Satellite Imagery Data Set. IEEE Trans. Geosci. Remote Sens. 2019, 57, 574–586.
48. Liu, M.; Chai, Z.; Deng, H.; Liu, R. A CNN-Transformer Network With Multiscale Context Aggregation for Fine-Grained Cropland Change Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 4297–4306.
49. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
50. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.

| Models | WHU-CD | | | LEVIR-CD | | | CLCD | | |
|---|---|---|---|---|---|---|---|---|---|
| f-BRS | 1.54 | 1.87 | 2.52 | 2.33 | 4.09 | 8.19 | 8.63 | 10.57 | 12.62 |
| RITM | 1.30 | 1.44 | 1.76 | 1.79 | 2.53 | 5.88 | 5.17 | 7.22 | 9.89 |
| FocalClick | 1.26 | 1.40 | 1.83 | 1.89 | 2.62 | 6.13 | 5.29 | 7.10 | 9.38 |
| SimpleClick | 1.18 | 1.31 | 1.67 | 1.75 | 2.59 | 6.25 | 4.66 | 6.52 | 8.94 |
| RSICDNet | 1.15 | 1.25 | 1.51 | 1.45 | 1.98 | 4.67 | 3.42 | 5.14 | 7.59 |

| Models | WHU-CD | | | LEVIR-CD | | | CLCD | | |
|---|---|---|---|---|---|---|---|---|---|
| Base | 1.30 | 1.47 | 1.77 | 1.79 | 2.50 | 5.95 | 5.22 | 7.20 | 9.64 |
| Base + IFRM | 1.24 | 1.39 | 1.72 | 1.79 | 2.43 | 5.76 | 4.96 | 7.13 | 9.48 |
| Base + RSI + CSE | 1.21 | 1.31 | 1.52 | 1.50 | 2.02 | 4.70 | 3.62 | 5.24 | 7.97 |
| Base + RSI + IFRM | 1.16 | 1.32 | 1.59 | 1.47 | 2.01 | 4.85 | 3.60 | 5.16 | 7.69 |
| RSICDNet | 1.15 | 1.25 | 1.51 | 1.45 | 1.98 | 4.67 | 3.42 | 5.14 | 7.59 |

| Models | | | |
|---|---|---|---|
| RSICDNet w/IFRM-(3,5,7,9) | 1.18 | 1.27 | 1.55 |
| RSICDNet w/IFRM-(3,7,11,15) | 1.15 | 1.25 | 1.51 |
| RSICDNet w/IFRM-(3,9,15,21) | 1.17 | 1.27 | 1.50 |

| Models | | | |
|---|---|---|---|
| RSICDNet w/CSE-(C + D) | 1.16 | 1.28 | 1.56 |
| RSICDNet w/CSE-(S + D) | 1.17 | 1.25 | 1.53 |
| RSICDNet w/CSE-(C + S) | 1.17 | 1.29 | 1.59 |
| RSICDNet w/CSE-(C + S + D) | 1.15 | 1.25 | 1.51 |

| Models | WHU-CD IoU | WHU-CD OA | WHU-CD F1 | LEVIR-CD IoU | LEVIR-CD OA | LEVIR-CD F1 | CLCD IoU | CLCD OA | CLCD F1 |
|---|---|---|---|---|---|---|---|---|---|
| STANet | 73.61 | 98.73 | 84.80 | 78.70 | 98.61 | 88.08 | 47.52 | 94.58 | 64.43 |
| ChangeFormer | 75.79 | 98.95 | 86.22 | 82.32 | 98.92 | 90.30 | 41.56 | 94.03 | 58.72 |
| ChangeViT | 89.66 | 99.57 | 94.55 | 84.39 | 99.05 | 91.54 | 63.54 | 96.77 | 77.70 |
| CD-Lamba | 86.49 | 99.44 | 92.76 | 81.79 | 98.86 | 89.98 | 62.53 | 96.68 | 76.94 |
| RSICDNet (1 Interaction) | 91.45 | 99.65 | 95.53 | 85.68 | 99.14 | 92.29 | 75.50 | 98.04 | 86.04 |
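
The IoU, OA, and F1 columns above follow the standard confusion-matrix definitions for the change class. As a quick reference, the sketch below computes them for a pair of flattened binary masks; the function name and the list-based mask format are choices made for this example, and the values are returned as fractions rather than the percentages shown in the table.

```python
from typing import Dict, Sequence


def change_metrics(pred: Sequence[int], gt: Sequence[int]) -> Dict[str, float]:
    """IoU, OA, and F1 of the change class for flattened binary masks."""
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    tn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 0)
    total = tp + fp + fn + tn
    return {
        # Intersection over union of predicted and true change pixels.
        "IoU": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        # Overall accuracy: share of all pixels classified correctly.
        "OA": (tp + tn) / total if total else 0.0,
        # F1: harmonic mean of precision and recall.
        "F1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }
```

For a single confusion matrix the two overlap measures are tied by F1 = 2·IoU / (1 + IoU), which gives a simple consistency check on reported pairs such as 73.61 (IoU) and 84.80 (F1).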

| Models | | Parameters (M) | MACs (G) | Inference Time (ms) |
|---|---|---|---|---|
| f-BRS | 2.33 | 58.43 | 62.84 | 81.69 |
| RITM | 1.79 | 30.95 | 16.97 | 65.56 |
| FocalClick | 1.89 | 30.97 | 17.15 | 69.41 |
| SimpleClick | 1.75 | 97.05 | 27.87 | 25.26 |
| RSICDNet | 1.45 | 31.04 | 17.21 | 76.76 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Peng, D.; He, C.; Guan, H. RSICDNet: A Novel Regional Scribble-Based Interactive Change Detection Network for Remote Sensing Images. Remote Sens. 2026, 18, 204. https://doi.org/10.3390/rs18020204

