Robust Image Watermarking via Clustered Visual State-Space Modeling
Abstract
1. Introduction
- To overcome the limitations of existing watermarking models in balancing computational efficiency and interaction depth, we develop CCViM, a fully state-space-based watermarking framework. By exploiting linear-time complexity, CCViM effectively alleviates the performance bottleneck encountered when processing high-resolution images.
- We propose a Watermark Representation Learning Module (WRLM), which replaces naive dimension replication with cascaded visual state-space blocks, transforming watermark signals into structured features with intrinsic resilience and substantially improving robustness against signal-processing attacks.
- We design an Interwoven Fusion Enhancement Module (IFEM) together with a context-clustering-based feature grouping strategy, enabling deep watermark–image fusion while effectively balancing local adaptivity and global consistency.
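The linear-time claim in the first contribution can be illustrated with a toy discrete state-space recurrence. The sketch below is not the CCViM block itself: it uses a scalar hidden state and hypothetical fixed parameters `a`, `b`, and `c`, whereas visual state-space blocks use learned, input-dependent parameters and several scan directions. It only shows that one pass over a flattened token sequence of length L takes L state updates.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Run h_t = a*h_{t-1} + b*x_t and y_t = c*h_t over a 1-D sequence.

    a, b, c are hypothetical fixed scalars chosen for illustration.
    """
    h = 0.0
    y = []
    for x_t in x:  # one state update per token, so the cost grows linearly
        h = a * h + b * x_t
        y.append(c * h)
    return y


# A flattened 2x2 patch sequence with illustrative values only.
tokens = [1.0, 0.0, 0.0, 0.0]
outputs = ssm_scan(tokens, a=0.5, b=1.0, c=2.0)
print(outputs)  # [2.0, 1.0, 0.5, 0.25]
```

The first token's contribution decays geometrically through the hidden state, which is how a recurrence of this kind carries long-range context without comparing every pair of tokens.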
2. Related Work
2.1. Deep Learning-Based Watermarking Architectures
2.2. Visual Backbones
2.3. Visual State-Space Models
3. Methodology
3.1. CCViM Framework
3.2. Watermark Representation Learning Module
3.3. Interwoven Fusion Enhancement Module
4. Experiments
4.1. Datasets
4.2. Experimental Settings and Metrics
4.3. Baselines
- HiDDen [20]: A pioneering encoder–noise layer–decoder (END) framework that performs adversarial training by simulating attacks with differentiable noise layers.
- TSDL [37]: A two-stage decoupled training framework that tackles real, non-differentiable attacks by freezing the encoder and training the decoder separately.
- MBRS [22]: A mini-batch training strategy that mixes real and simulated JPEG compression to specifically improve robustness against JPEG attacks.
- Fang [38]: Extends TSDL to a three-stage training pipeline and introduces mask-guided frequency enhancement to withstand stronger real-world distortions.
- De-END [23]: A novel “decoder-driven” design that reduces redundant feature embedding by tightening encoder–decoder coupling.
- WFormer [13]: A Transformer-based soft-fusion model that exploits self-attention and cross-attention to capture long-range correlations between images and watermarks, thereby improving the fusion process.
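Several of the baselines above train against simulated attacks. As a rough illustration of what such attacks do to pixel values, the sketch below implements three of the distortions evaluated later (Gaussian noise, salt-and-pepper noise, and dropout) on flattened images in plain Python. The parameter meanings (`sigma` as a standard deviation, `keep` as the fraction of watermarked pixels retained) and the random per-call choice of attack are assumptions made for this example. Actual noise layers operate on tensors inside a training framework, and these functions are not differentiable.

```python
import random
from typing import List

Image = List[float]  # flattened pixel intensities in [0, 1]


def clip01(v: float) -> float:
    return min(1.0, max(0.0, v))


def gaussian_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [clip01(p + rng.gauss(0.0, sigma)) for p in img]


def salt_and_pepper(img: Image, prob: float, rng: random.Random) -> Image:
    """With probability prob, set a pixel to 0 or 1 (equally likely)."""
    out = []
    for p in img:
        if rng.random() < prob:
            out.append(float(rng.random() < 0.5))
        else:
            out.append(p)
    return out


def dropout(marked: Image, cover: Image, keep: float, rng: random.Random) -> Image:
    """Keep each watermarked pixel with probability keep, else use the cover pixel."""
    return [m if rng.random() < keep else c for m, c in zip(marked, cover)]


rng = random.Random(0)
cover = [0.5] * 8
marked = [0.52] * 8

# Pick one attack at random, as a noise layer might do per mini-batch.
attack = rng.choice([
    lambda im: gaussian_noise(im, 0.02, rng),
    lambda im: salt_and_pepper(im, 0.05, rng),
    lambda im: dropout(im, cover, 0.7, rng),
])
attacked = attack(marked)
print(len(attacked))  # 8
```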
4.4. Discussion and Analysis of Parameters
4.5. Comparison Under Distortion-Specific Training
4.6. Performance Comparison Under Combined Attacks
4.7. Statistical Significance Analysis
4.8. Ablation Study
- H_CCViM/C_CCViM/T_CCViM: Replace WRLM with simple tiling, convolutional layers, or a Transformer, respectively.
- D_CCViM: Remove IFEM and perform feature fusion via direct concatenation.
- NI_CCViM: Remove the interleaved design and execute the fusion process sequentially.
- T_IFEM_CCViM: Replace the Vmamba module within IFEM with a Transformer module.
- NC_CCViM: Remove the context clustering module from IFEM.
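The "simple tiling" used in H_CCViM is not specified further in this list. One common reading, familiar from HiDDeN-style encoders, is to replicate each watermark bit across the spatial dimensions so the message can be concatenated with image features. The sketch below assumes that reading; `tile_watermark` and its arguments are hypothetical names.

```python
from typing import List


def tile_watermark(bits: List[int], height: int, width: int) -> List[List[List[float]]]:
    """Replicate each bit over an H x W plane; result shape is [len(bits)][height][width]."""
    return [[[float(b)] * width for _ in range(height)] for b in bits]


message = [1, 0, 1, 1]
planes = tile_watermark(message, height=2, width=3)
print(len(planes), len(planes[0]), len(planes[0][0]))  # 4 2 3
print(planes[1])  # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```

Every spatial position receives an identical copy of the message, so the tiled tensor carries no learned structure. That is the property the WRLM variants in this ablation are compared against.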
4.9. Calculation Cost Analysis
4.10. Performance Under Untrained Attacks and Model Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Cao, Y.; Li, S.; Liu, Y.; Yan, Z.; Dai, Y.; Yu, P.; Sun, L. A survey of AI-generated content (AIGC). ACM Comput. Surv. 2025, 57, 125.
2. Trigka, M.; Dritsas, E. The evolution of generative AI: Trends and applications. IEEE Access 2025, 13, 98504–98529.
3. Mirsky, Y.; Lee, W. The creation and detection of deepfakes: A survey. ACM Comput. Surv. 2021, 54, 7.
4. Verdoliva, L. Media forensics and deepfakes: An overview. IEEE J. Sel. Top. Signal Process. 2020, 14, 910–932.
5. Wu, X.; Liao, X.; Ou, B. SepMark: Deep separable watermarking for unified source tracing and deepfake detection. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023; pp. 1190–1201.
6. Zhang, F.; Wang, H.; He, M.; Xia, J. Robust blind symmetry-based watermarking in the frequency domain against social network processing and desynchronization attacks. IEEE Trans. Circuits Syst. Video Technol. 2024.
7. Wang, G.; Ma, Z.; Liu, C.; Yang, X.; Fang, H.; Zhang, W.; Yu, N. MuST: Robust image watermarking for multi-source tracing. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 21–27 February 2024; Volume 36, pp. 5364–5371.
8. Tang, Y.; Wang, C.; Xiang, S.; Cheung, Y.-M. A robust reversible watermarking scheme using attack-simulation-based adaptive normalization and embedding. IEEE Trans. Inf. Forensics Secur. 2024, 19, 4114–4129.
9. Wan, W.; Wang, J.; Zhang, Y.; Li, J.; Yu, H.; Sun, J. A comprehensive survey on robust image watermarking. Neurocomputing 2022, 488, 226–247.
10. Luo, X.; Zhan, R.; Chang, H.; Yang, F.; Milanfar, P. Distortion agnostic deep watermarking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 13548–13557.
11. Geiping, J.; Goldstein, T.; Kirchenbauer, J.; Wen, Y. Tree-rings watermarks: Invisible fingerprints for diffusion images. Adv. Neural Inf. Process. Syst. 2023, 36, 58047–58063.
12. Zhang, X.; Li, R.; Yu, J.; Xu, Y.; Li, W.; Zhang, J. EditGuard: Versatile image watermarking for tamper localization and copyright protection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 19–21 June 2024; pp. 11964–11974.
13. Luo, T.; Wu, J.; He, Z.; Xu, H.; Jiang, G.; Chang, C.-C. WFormer: A transformer-based soft fusion model for robust image watermarking. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 8, 4179–4196.
14. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. In Proceedings of the 2021 International Conference on Learning Representations, Virtual Event, Austria, 3–7 May 2021.
15. Zhu, L.; Liao, B.; Zhang, Q.; Wang, X.; Liu, W.; Wang, X. Vision Mamba: Efficient visual representation learning with bidirectional state space model. In Proceedings of the International Conference on Machine Learning, PMLR, Vienna, Austria, 21–27 July 2024; pp. 62429–62442.
16. Ma, X.; Zhou, Y.; Wang, H.; Qin, C.; Sun, B.; Liu, C.; Fu, Y. Image as set of points. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023.
17. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 740–755.
18. Agustsson, E.; Timofte, R. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1122–1131.
19. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
20. Zhu, J.; Kaplan, R.; Johnson, J.; Li, F.-F. HiDDeN: Hiding data with deep networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 682–697.
21. Ahmadi, M.; Norouzi, A.; Karimi, N.; Samavi, S.; Emami, A. ReDMark: Framework for residual diffusion watermarking based on deep networks. Expert Syst. Appl. 2020, 146, 113157.
22. Jia, Z.; Fang, H.; Zhang, W. MBRS: Enhancing robustness of DNN-based watermarking by mini-batch of real and simulated JPEG compression. In Proceedings of the 29th ACM International Conference on Multimedia, Chengdu, China, 20–24 October 2021; pp. 41–49.
23. Fang, H.; Jia, Z.; Qiu, Y.; Zhang, J.; Zhang, W.; Chang, E.-C. De-END: Decoder-driven watermarking network. IEEE Trans. Multimed. 2022, 25, 7571–7581.
24. Fang, H.; Qiu, Y.; Chen, K.; Zhang, J.; Zhang, W.; Chang, E.-C. Flow-based robust watermarking with invertible noise layer for black-box distortions. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 5054–5061.
25. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1106–1114.
26. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
27. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 87–110.
28. Khan, S.; Naseer, M.; Hayat, M.; Zamir, S.W.; Khan, F.S.; Shah, M. Transformers in vision: A survey. ACM Comput. Surv. 2022, 54, 200.
29. Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. In Proceedings of the First Conference on Language Modeling, Philadelphia, PA, USA, 7–9 October 2024.
30. Somvanshi, S.; Islam, M.M.; Mimi, M.S.; Polock, S.B.B.; Chhetri, G.; Das, S. From S4 to Mamba: A comprehensive survey on structured state space models. arXiv 2025, arXiv:2503.18970.
31. Jiao, J.; Liu, Y.; Liu, Y.; Tian, Y.; Wang, Y.; Xie, L.; Ye, Q.; Yu, H.; Zhao, Y. Vmamba: Visual state space model. Adv. Neural Inf. Process. Syst. 2024, 37, 103031–103063.
32. Shi, Y.; Xia, B.; Jin, X.; Wang, X.; Zhao, T.; Xia, X.; Xiao, X.; Yang, W. Vmambair: Visual state space model for image restoration. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 5560–5574.
33. Liu, L.; Zhang, M.; Yin, J.; Liu, T.; Ji, W.; Piao, Y.; Lu, H. Defmamba: Deformable visual state space model. In Proceedings of the Computer Vision and Pattern Recognition Conference, Nashville, TN, USA, 11–15 June 2025; pp. 8838–8847.
34. Ma, J.; Li, F.; Wang, B. U-Mamba: Enhancing long-range dependency for biomedical image segmentation. arXiv 2024, arXiv:2401.04722.
35. Huang, T.; Pei, X.; You, S.; Wang, F.; Qian, C.; Xu, C. Localmamba: Visual state space model with windowed selective scan. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; Springer Nature: Cham, Switzerland, 2024; pp. 12–22.
36. Guo, H.; Li, J.; Dai, T.; Ouyang, Z.; Ren, X.; Xia, S.-T. MambaIR: A simple baseline for image restoration with state-space model. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; pp. 222–241.
37. Liu, Y.; Guo, M.; Zhang, J.; Zhu, Y.; Xie, X. A novel two-stage separable deep learning framework for practical blind watermarking. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 1509–1517.
38. Fang, H.; Jia, Z.; Zhou, H.; Ma, Z.; Zhang, W. Encoded feature enhancement in watermarking network for distortion in real scenes. IEEE Trans. Multimed. 2022, 25, 2648–2660.

| Parameter setting | PSNR (dB) | SSIM | BA [%] |
|---|---|---|---|
| 10, 10, 0.0001 | 46.72 | 0.9868 | 97.39 |
| 5, 10, 0.0001 | 44.58 | 0.9851 | 98.75 |
| 3, 10, 0.0001 | 44.46 | 0.9805 | 99.47 |
| 1, 10, 0.0001 | 40.91 | 0.9612 | 99.76 |
| 3, 10, 0.0005 | 45.11 | 0.9836 | 98.48 |
| 3, 10, 0.00005 | 43.78 | 0.9759 | 99.51 |
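The PSNR and BA columns reported in these tables are standard quantities. The sketch below shows how they are conventionally computed, assuming BA denotes bit accuracy (the percentage of watermark bits decoded correctly) and that pixel values lie in [0, 255]. The function names and sample values are illustrative, not taken from the paper's code.

```python
import math
from typing import List


def psnr(cover: List[float], marked: List[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((c - m) ** 2 for c, m in zip(cover, marked)) / len(cover)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def bit_accuracy(sent: List[int], decoded: List[int]) -> float:
    """Percentage of watermark bits recovered correctly."""
    correct = sum(s == d for s, d in zip(sent, decoded))
    return 100.0 * correct / len(sent)


cover = [100.0, 120.0, 140.0, 160.0]
marked = [101.0, 119.0, 141.0, 159.0]  # every pixel off by 1, so MSE = 1
print(round(psnr(cover, marked), 2))             # 48.13
print(bit_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 75.0
```

Higher PSNR means the watermarked image is closer to the cover image, while higher BA means the message survives better. The two typically trade off against each other, which is the pattern visible across the parameter settings in the table.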

| Setting | PSNR | BA [%] (QF = 80) | BA [%] (QF = 70) | BA [%] (QF = 60) | BA [%] (QF = 50) | BA [%] (QF = 40) | BA [%] (Average) |
|---|---|---|---|---|---|---|---|
| (4, 4) | 43.98 | 99.98 | 99.86 | 99.65 | 99.11 | 96.55 | 99.03 |
| (6, 4) | 44.25 | 99.99 | 99.91 | 99.78 | 99.28 | 97.02 | 99.20 |
| (8, 4) | 44.46 | 100 | 99.95 | 99.82 | 99.45 | 97.89 | 99.42 |
| (10, 4) | 44.41 | 100 | 99.94 | 99.80 | 99.41 | 97.81 | 99.39 |
| (8, 2) | 43.82 | 99.95 | 99.80 | 99.52 | 98.91 | 96.13 | 98.86 |
| (8, 6) | 44.15 | 100 | 99.92 | 99.75 | 99.33 | 97.54 | 99.31 |

| Epoch | PSNR | SSIM | BA [%] (QF = 60) | BA [%] (QF = 50) | BA [%] (QF = 40) | BA [%] (Average) |
|---|---|---|---|---|---|---|
| 20 | 42.45 | 0.9733 | 98.10 | 96.70 | 94.60 | 96.47 |
| 40 | 43.28 | 0.9731 | 98.74 | 98.50 | 96.20 | 97.81 |
| 60 | 43.39 | 0.9745 | 99.40 | 98.70 | 96.50 | 98.20 |
| 80 | 44.31 | 0.9776 | 99.70 | 99.10 | 97.10 | 98.63 |
| 100 | 44.52 | 0.9811 | 99.85 | 99.70 | 97.28 | 98.94 |
| 120 | 44.50 | 0.9803 | 99.70 | 98.98 | 97.01 | 98.56 |

| Method | Gaussian Noise (%) | | | | | | Salt-and-Pepper Noise (%) | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | Average | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | Average |
| HiDDen | 89.55 | 86.48 | 83.95 | 83.09 | 79.15 | 84.44 | 95.10 | 93.75 | 93.41 | 92.88 | 90.38 | 93.10 |
| TSDL | 92.10 | 91.25 | 88.31 | 87.05 | 82.94 | 88.33 | 97.25 | 95.61 | 93.52 | 92.68 | 91.41 | 94.09 |
| MBRS | 99.90 | 99.38 | 98.05 | 96.05 | 94.10 | 97.50 | 98.08 | 98.71 | 98.30 | 97.55 | 96.65 | 97.86 |
| Fang | 90.48 | - | - | - | - | 90.48 | 97.01 | 97.28 | 97.69 | 97.10 | 96.98 | 97.21 |
| De-END | 99.98 | 99.69 | 98.30 | 96.55 | 95.88 | 98.08 | 99.38 | 99.48 | 99.20 | 99.08 | 98.69 | 99.17 |
| WFormer | 100 | 99.85 | 98.90 | 98.25 | 98.00 | 99.00 | 99.85 | 99.65 | 99.35 | 99.15 | 98.50 | 99.30 |
| Proposed | 100 | 99.92 | 99.45 | 98.73 | 98.50 | 99.32 | 99.95 | 99.88 | 99.75 | 99.55 | 99.12 | 99.65 |

| Method | Cropout (%) | | | | | | Dropout (%) | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 90% | 80% | 70% | 60% | 50% | Average | 80% | 70% | 60% | 50% | 40% | Average |
| HiDDen | 95.58 | 94.70 | 88.72 | 76.85 | 61.65 | 83.50 | 90.26 | 89.51 | 87.08 | 86.78 | 82.74 | 87.27 |
| TSDL | 98.68 | 98.45 | 96.88 | 93.71 | 93.20 | 96.18 | 97.59 | 95.29 | 93.54 | 92.33 | 90.47 | 93.84 |
| MBRS | 99.70 | 99.21 | 97.18 | 90.41 | 83.50 | 94.00 | 96.31 | 96.12 | 94.18 | 92.64 | 90.66 | 93.98 |
| Fang | 98.28 | 97.88 | 97.10 | 95.30 | - | 97.14 | 97.38 | - | - | - | - | 97.38 |
| De-END | 100 | 99.98 | 99.48 | 97.25 | 91.20 | 97.58 | 100 | 100 | 100 | 99.50 | 94.65 | 98.83 |
| WFormer | 99.95 | 99.90 | 99.80 | 98.75 | 97.80 | 99.24 | 99.51 | 99.16 | 98.68 | 97.62 | 95.60 | 98.11 |
| Proposed | 100 | 100 | 99.95 | 99.92 | 98.35 | 99.64 | 99.58 | 99.19 | 98.74 | 97.71 | 95.92 | 98.23 |

| Method | Gaussian Blur (%) | | | | | JPEG Compression (%) | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 0.0001 | 0.5 | 1 | 2 | Average | QF = 40 | QF = 50 | QF = 60 | QF = 70 | QF = 80 | QF = 90 | Average |
| HiDDen | 95.38 | 95.15 | 94.28 | 84.40 | 92.30 | 86.70 | 91.31 | 92.91 | 93.35 | 93.51 | 94.31 | 92.02 |
| TSDL | 99.89 | 99.72 | 98.40 | 93.21 | 97.81 | 91.07 | 91.41 | 93.90 | 94.23 | 94.33 | 94.71 | 93.28 |
| MBRS | 98.58 | 98.20 | 97.61 | 87.75 | 95.54 | 94.82 | 94.98 | 96.65 | 97.73 | 97.64 | 98.80 | 96.77 |
| Fang | - | 90.35 | 92.05 | 91.98 | 91.46 | - | 91.46 | 92.47 | 93.65 | 94.35 | 95.08 | 93.40 |
| De-END | 99.98 | 99.95 | 99.45 | 94.32 | 98.43 | 98.17 | 99.05 | 100 | 100 | 100 | 100 | 99.54 |
| WFormer | 98.85 | 99.10 | 98.55 | 98.10 | 98.65 | 95.61 | 98.01 | 98.72 | 98.94 | 99.82 | 100.00 | 98.52 |
| Proposed | 98.92 | 98.95 | 98.99 | 98.81 | 98.92 | 95.88 | 98.15 | 98.79 | 99.07 | 99.98 | 100.00 | 98.65 |

| Method | Crop (r = 0.035) | Cropout (p = 0.3) | Dropout (p = 0.3) | Gaussian Blur (0.01) | JPEG (QF = 50) | Average |
|---|---|---|---|---|---|---|
| HiDDen | 88.00 | 94.00 | 93.00 | 96.00 | 63.00 | 86.80 |
| TSDL | 89.00 | 97.30 | 97.40 | 98.60 | 76.20 | 91.70 |
| MBRS | 81.15 | 78.57 | 77.13 | 92.80 | 82.83 | 82.50 |
| Fang | 95.85 | 100 | 99.99 | 99.99 | 95.52 | 98.27 |
| De-END | 64.17 | 99.21 | 99.95 | 88.93 | 81.89 | 86.83 |
| WFormer | 97.17 | 100 | 100 | 100 | 97.73 | 98.98 |
| Proposed | 98.55 ± 0.06 | 100 ± 0.00 | 100 ± 0.00 | 100 ± 0.00 | 97.35 ± 0.11 | 99.18 ± 0.04 |

| Attack Scenario | WFormer (Mean ± SD) | Proposed (Mean ± SD) | t-Value | p-Value | Significance (p < 0.05) |
|---|---|---|---|---|---|
| Gaussian Noise (0.04) | 98.25 ± 0.15 | 98.73 ± 0.08 | 6.31 | <0.001 | Yes |
| Gaussian Blur (2) | 98.10 ± 0.18 | 98.81 ± 0.09 | 7.89 | <0.001 | Yes |
| JPEG (QF = 50) | 98.01 ± 0.06 | 98.15 ± 0.11 | 2.50 | 0.037 | Yes |
| Composite Average | 98.98 ± 0.05 | 99.18 ± 0.04 | 6.99 | <0.001 | Yes |
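The t-values in this table can be checked from the reported means and standard deviations. The sketch below applies the equal-sample-size two-sample t statistic, t = (mean_B − mean_A) / sqrt((sd_A² + sd_B²) / n). The number of runs per method is not stated in the table, so `N_RUNS = 5` is an assumption, chosen because it approximately recovers the reported values (for example, about 6.31 for the Gaussian noise row). The p-values are not recomputed here.

```python
import math


def two_sample_t(mean_a: float, sd_a: float, mean_b: float, sd_b: float, n: int) -> float:
    """t statistic for two independent groups of equal size n."""
    standard_error = math.sqrt((sd_a ** 2 + sd_b ** 2) / n)
    return (mean_b - mean_a) / standard_error


N_RUNS = 5  # assumption: the table does not state the number of runs

# (WFormer mean, WFormer SD, Proposed mean, Proposed SD), copied from the table.
rows = {
    "Gaussian Noise": (98.25, 0.15, 98.73, 0.08),
    "Gaussian Blur": (98.10, 0.18, 98.81, 0.09),
    "JPEG (QF = 50)": (98.01, 0.06, 98.15, 0.11),
    "Composite Average": (98.98, 0.05, 99.18, 0.04),
}
t_values = {name: two_sample_t(*stats, n=N_RUNS) for name, stats in rows.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

Small differences in the last digit are expected because the table's means and standard deviations are themselves rounded.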

| Method | PSNR (dB) | Crop (r = 0.035) | Cropout (p = 0.3) | Dropout (p = 0.3) | Gaussian Blur (0.01) | JPEG (QF = 50) | Average |
|---|---|---|---|---|---|---|---|
| H_CCViM | 37.25 | 75.10 | 80.25 | 82.50 | 90.75 | 98.00 | 85.32 |
| C_CCViM | 36.75 | 94.20 | 98.15 | 98.50 | 100 | 92.50 | 96.67 |
| T_CCViM | 36.80 | 97.80 | 100 | 100 | 100 | 96.95 | 98.95 |
| D_CCViM | 36.95 | 85.20 | 90.50 | 91.10 | 96.00 | 97.95 | 92.15 |
| NI_CCViM | 36.70 | 95.80 | 98.50 | 99.00 | 100 | 92.10 | 97.08 |
| T_IFEM_CCViM | 36.81 | 98.10 | 100 | 100 | 100 | 96.25 | 98.87 |
| NC_CCViM | 36.79 | 97.20 | 99.30 | 99.70 | 100 | 95.00 | 98.24 |
| Proposed | 36.83 | 98.55 | 100 | 100 | 100 | 97.35 | 99.18 |

| Method | Params [M] | FLOPs [G] | Speed [im/s] |
|---|---|---|---|
| HiDDen | 0.40 | 3.52 | 62.93 |
| MBRS | 5.80 | 13.36 | 17.67 |
| De-END | 0.41 | 3.91 | 18.38 |
| WFormer | 1.72 | 13.83 | 14.29 |
| Proposed | 1.46 | 8.30 | 23.82 |

| Attack | HiDDen | MBRS | De-END | WFormer | Proposed |
|---|---|---|---|---|---|
| Median Filter (w = 5 × 5) | 72.58 | 99.96 | 91.06 | 98.06 | 97.88 |
| Median Filter (w = 7 × 7) | 64.28 | 88.01 | 73.45 | 95.68 | 95.13 |
| Median Filter (w = 9 × 9) | 59.76 | 58.84 | 54.83 | 91.12 | 90.26 |
| Grid Crop (r = 0.7) | 73.02 | 99.93 | 95.49 | 100 | 100 |
| Grid Crop (r = 0.8) | 65.33 | 99.76 | 89.09 | 99.95 | 99.95 |
| Grid Crop (r = 0.9) | 59.40 | 97.99 | 76.63 | 98.79 | 98.79 |
| Adjust Hue (f = 0.44) | 68.34 | 95.44 | 91.53 | 95.68 | 97.23 |
| Adjust Hue (f = 0.46) | 64.10 | 86.46 | 80.32 | 88.27 | 91.68 |
| Adjust Hue (f = 0.48) | 58.20 | 67.33 | 62.60 | 74.59 | 80.33 |
| Adjust Saturation (f = 5.0) | 79.09 | 99.86 | 97.01 | 99.92 | 99.96 |
| Adjust Saturation (f = 10.0) | 77.59 | 99.77 | 96.43 | 99.89 | 99.89 |
| Adjust Saturation (f = 15.0) | 76.36 | 99.63 | 96.19 | 99.69 | 99.76 |
| Average | 68.17 | 91.08 | 83.72 | 95.14 | 95.91 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Liu, B.; Ren, J. Robust Image Watermarking via Clustered Visual State-Space Modeling. Appl. Sci. 2026, 16, 4166. https://doi.org/10.3390/app16094166
