AI-Enhanced Perceptual Hashing with Blockchain for Secure and Transparent Digital Copyright Management
Abstract
1. Introduction
1. Content-Independent Watermarks: Traditional watermark information (such as author identification, serial numbers, or timestamps) typically consists of metadata unrelated to the image’s intrinsic visual content. Identical watermarks can therefore be illicitly copied and applied to different images, so they fail to provide a unique, content-binding signature. Consequently, such systems cannot definitively prove that a specific work is the original carrier of the watermark.
2. Limited Robust Verification for Derivative Works: Digital images are frequently modified through cropping, filtering, and repurposing. Establishing that a modified image (a derivative work) originates from a specific original and, crucially, verifying the sequence of these modifications (along with their respective copyright claims) is difficult in existing systems. Current approaches lack inherent mechanisms to establish a verifiable creation lineage.
3. Dependence on Centralized Trust Models: Contemporary management systems often rely on trusted third-party authorities (TTPs) for watermark registration, storage, and verification. This centralized architecture introduces vulnerabilities including data tampering risks, single points of failure, bureaucratic inefficiencies, and potential information leakage. Moreover, this model contradicts the fundamentally decentralized nature of the modern internet ecosystem.
1. Novel Integration Framework: This study presents the first comprehensive framework that integrates CNN-based perceptual hashing, QR code watermarking in the Discrete Wavelet Transform (DWT) domain, and blockchain technology for end-to-end digital copyright management, addressing both technical implementation and trust verification challenges.
2. Content-Binding Authentication: In contrast to traditional metadata-based watermarks, our AI-enhanced perceptual hash functions as an unforgeable, content-intrinsic signature that cannot be transferred between disparate images.
3. Derivative Work Lineage Tracking: We introduce a blockchain-based mechanism for verifiably tracking the complete creation and modification history of derivative works through chained perceptual hashes.
4. Decentralized Trust Architecture: The proposed system eliminates reliance on centralized authorities by leveraging blockchain’s immutability and IPFS’s distributed storage capabilities, providing a transparent and tamper-proof solution.
5. Comprehensive Experimental Evaluation: Extensive experiments on both natural images and digital artworks demonstrate superior performance compared to traditional methods and recent deep learning approaches.
2. Related Work
2.1. Digital Watermarking
2.2. Perceptual Hashing
2.3. Blockchain in Copyright Management
2.4. AI in Cryptography and Security
2.5. Systematic Comparison with Previous Approaches
3. Proposed System Architecture
3.1. AI-Based Perceptual Hashing Module
1. Feature Extraction: The input image is preprocessed and then passed through the pre-trained CNN.
2. Hash Generation: Activations from a specific intermediate layer (e.g., the block5_pool layer in VGG16, which yields a 7 × 7 × 512 feature map) are extracted and flattened into a high-dimensional vector. Principal Component Analysis (PCA) then reduces this vector to 256 principal components, concentrating the most salient features. The choice of 256 dimensions was determined empirically through ablation studies comparing hash performance at 128, 256, and 512 dimensions. The 256-dimensional representation provided the best balance between discriminability (retaining sufficient information) and robustness (suppressing noise), while matching standard hash lengths for practical storage and comparison. Finally, the 256-dimensional vector is binarized by setting values above its median to 1 and all others to 0, producing a fixed-length 256-bit perceptual hash. This hash is specific to the image’s visual content while remaining robust to minor modifications.
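The final binarization step can be illustrated with a minimal sketch. It assumes the 256 PCA components are already available; the `components` list below is a synthetic stand-in, not real VGG16 output, and the function names are hypothetical.

```python
from statistics import median

def binarize_to_hash(components):
    """Map a reduced feature vector to bits: 1 if above the median, else 0."""
    m = median(components)
    return [1 if v > m else 0 for v in components]

def bits_to_hex(bits):
    """Pack a bit list (length divisible by 4) into a hexadecimal string."""
    value = int("".join(map(str, bits)), 2)
    return format(value, "0{}x".format(len(bits) // 4))

# Synthetic stand-in for the 256 PCA components of one image (assumption).
components = [((i * 37) % 256) / 255.0 for i in range(256)]
bits = binarize_to_hash(components)
hex_hash = bits_to_hex(bits)  # 64 hexadecimal characters for 256 bits
```

Thresholding at the median means that, when the components are distinct, half of the bits are 1 and half are 0, which keeps the hash balanced.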
3.2. Watermark Generation and Embedding Module
1. QR Code Watermark Image: To enhance payload capacity and robustness, the 256-bit perceptual hash (converted to a 64-character hexadecimal string) and minimal copyright identifiers are encoded into a QR code image. QR Code Version 5 (37 × 37 modules) was selected for its capacity of 86 alphanumeric characters, which is sufficient for the 64-character hexadecimal hash string alongside short copyright identifiers, with remaining capacity reserved for error correction, thereby enhancing recovery robustness. This QR code serves as the visual watermark.
2. Frequency-Domain Embedding: The QR code watermark is embedded into the original image in the frequency domain using a two-level Discrete Wavelet Transform (DWT) with Haar wavelets. The original image is decomposed into LL, LH, HL, and HH sub-bands, and the LL sub-band is decomposed again to obtain the second level. The QR code image is converted to a binary sequence, which is adaptively embedded into mid-frequency coefficients of the second-level HL and LH sub-bands using a quantization-based technique with a fixed quantization step size. The step size was selected empirically to trade off imperceptibility against robustness. Mid-frequency components suit this purpose because they are less sensitive to noise and compression than high-frequency components, while changes to them are less visible than changes to low-frequency components.
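One common quantization-based rule is parity quantization (a simple form of quantization index modulation). The sketch below assumes that rule; the text does not specify the exact scheme, and the step size `8.0` and coefficient values are hypothetical placeholders rather than the paper's settings.

```python
def embed_bit(coeff, bit, step):
    """Quantize coeff to the nearest multiple of step whose index parity equals bit."""
    q = round(coeff / step)
    if q % 2 != bit:
        # Move to the neighbouring level on the side of the original value.
        q += 1 if coeff / step >= q else -1
    return q * step

def extract_bit(coeff, step):
    """Recover the bit as the parity of the nearest quantization index."""
    return round(coeff / step) % 2

def embed_bits(coeffs, bits, step):
    """Embed one bit per coefficient; coefficients beyond len(bits) are untouched."""
    marked = [embed_bit(c, b, step) for c, b in zip(coeffs, bits)]
    return marked + list(coeffs[len(bits):])

# Hypothetical mid-frequency coefficients and step size (assumptions).
step = 8.0
coeffs = [-13.7, 0.0, 5.2, 41.9]
payload = [1, 0, 1, 1]
marked = embed_bits(coeffs, payload, step)
recovered = [extract_bit(c, step) for c in marked]
```

Under this rule each coefficient moves by at most one step, and a bit survives any later perturbation smaller than half a step, which is the imperceptibility versus robustness trade-off that the step size controls.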
3.3. Blockchain Manager Module
1. Transaction Creation: A transaction is created containing the following data elements:
   (a) The perceptual hash value of the original image;
   (b) A cryptographic hash (e.g., SHA-256) of the watermarked image;
   (c) The author’s public key and digital signature;
   (d) A reference to the stored watermarked image on IPFS.
2. Timestamping and Storage: The transaction is broadcast to the blockchain network, packaged into a block, and immutably recorded with a trusted timestamp.
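The four data elements can be assembled into a record as in the following sketch. The field names, the JSON serialization, and the placeholder key, signature, and IPFS reference are assumptions for illustration; signing and broadcasting are outside its scope.

```python
import hashlib
import json

def build_transaction(perceptual_hash_hex, watermarked_bytes,
                      author_public_key, signature, ipfs_reference):
    """Assemble the four data elements and derive a deterministic record digest."""
    record = {
        "perceptual_hash": perceptual_hash_hex,                                # (a)
        "watermarked_sha256": hashlib.sha256(watermarked_bytes).hexdigest(),  # (b)
        "author_public_key": author_public_key,                               # (c)
        "signature": signature,                                               # (c)
        "ipfs_reference": ipfs_reference,                                     # (d)
    }
    # Canonical serialization so the same record always yields the same digest.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return record, digest

# All values below are hypothetical placeholders.
record, digest = build_transaction(
    "ab" * 32, b"watermarked image bytes",
    "PUBLIC_KEY_PLACEHOLDER", "SIGNATURE_PLACEHOLDER", "CID_PLACEHOLDER",
)
```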
3.4. IPFS Storage Module
4. AI-Based Perceptual Hashing Scheme
| Algorithm 1 AI-Enhanced Perceptual Hash Generation. |
|---|
| Require: Input image I, pre-trained CNN model M, PCA transformation |
| Ensure: 256-bit perceptual hash |
5. Blockchain Integration for Copyright Verification
5.1. First-Time Registration
| Algorithm 2 Watermark Embedding Process. |
|---|
| Require: Original image I, perceptual hash, quantization step |
| Ensure: Watermarked image |
5.2. Handling Multiple Watermarks (Derivative Works)
1. Generate a new perceptual hash from the edited image;
2. Embed a new watermark containing the new perceptual hash;
3. Register a new blockchain transaction containing the new perceptual hash and references to the previous block transactions.
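The chaining in step 3 can be sketched with an in-memory dictionary standing in for the blockchain. The record layout and the use of a SHA-256 digest as the transaction identifier are assumptions for illustration only.

```python
import hashlib
import json

def register(ledger, perceptual_hash_hex, parent_tx_id=None):
    """Store a record that points back to the transaction of the work it derives from."""
    record = {"perceptual_hash": perceptual_hash_hex, "parent": parent_tx_id}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    tx_id = hashlib.sha256(payload).hexdigest()
    ledger[tx_id] = record
    return tx_id

def lineage(ledger, tx_id):
    """Walk parent references back to the first registration, oldest hash first."""
    chain = []
    while tx_id is not None:
        record = ledger[tx_id]
        chain.append(record["perceptual_hash"])
        tx_id = record["parent"]
    return list(reversed(chain))

# Hypothetical hashes for an original work and two successive edits.
ledger = {}
tx_original = register(ledger, "00" * 32)
tx_edit_1 = register(ledger, "11" * 32, parent_tx_id=tx_original)
tx_edit_2 = register(ledger, "22" * 32, parent_tx_id=tx_edit_1)
history = lineage(ledger, tx_edit_2)
```

Because each record embeds its parent's identifier, altering any earlier record would change that identifier and break every later link.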
6. Experimental Setup and Results
6.1. Experimental Setup
6.1.1. Datasets
1. The UCID v2 dataset: A standard benchmark comprising 1338 uncompressed color images of natural scenes, providing baseline performance metrics for conventional photographs.
2. An Original Digital Art dataset: To test system generalizability, we curated a dataset of 200 diverse digital artworks, including stylized illustrations, digital paintings, and synthetic media (e.g., AI-generated art). This represents modern digital creations for which copyright protection is highly relevant. The dataset is available from the corresponding authors upon reasonable request for non-commercial research purposes.
6.1.2. AI Perceptual Hashing Model
1. Base Model: VGG16, pre-trained on ImageNet. VGG16 was selected over newer architectures (e.g., ResNet or MobileNet) for its stable, well-understood feature representations and proven competitive robustness in perceptual tasks [12,16]. While newer models exist, VGG16’s architectural simplicity facilitates reproducibility, and its performance remains state-of-the-art for perceptual hashing applications.
2. Feature Extraction Layer: The block5_pool layer (output shape 7 × 7 × 512) was selected for its high-level, semantically rich features, which are robust to minor pixel-level changes.
3. Hash Generation: The 7 × 7 × 512 feature map was flattened into a 25,088-dimensional vector. Principal Component Analysis (PCA) was applied to reduce it to 256 dimensions, concentrating the most salient information. The 256-dimensional vector was then binarized by setting values above its median to 1 and all others to 0, producing a fixed-length 256-bit perceptual hash.
6.1.3. Digital Watermarking Setup
1. Watermark Payload: The 256-bit perceptual hash (converted to a 64-character hexadecimal string) and a short copyright identifier (“(C)TSNU”).
2. QR Code: QR Code Version 5 (37 × 37 modules) was employed, offering a capacity of 86 alphanumeric characters, sufficient for our payload and error correction requirements.
3. Embedding Domain: A two-level Discrete Wavelet Transform (DWT) with Haar wavelets was employed. The original image was decomposed into LL, LH, HL, and HH sub-bands, and the LL sub-band was decomposed again. The binary sequence of the QR code was embedded into mid-frequency coefficients of the second-level HL and LH sub-bands using a quantization-based technique with a fixed step size.
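The payload size follows from simple arithmetic, sketched below. It assumes the hexadecimal string and the identifier are concatenated without a separator, which the text does not state; the capacity figure is the one quoted above.

```python
HASH_BITS = 256
BITS_PER_HEX_CHAR = 4
STATED_CAPACITY = 86  # alphanumeric characters, as quoted in the setup above

hex_chars = HASH_BITS // BITS_PER_HEX_CHAR   # 64 characters
identifier = "(C)TSNU"                       # 7 characters
payload_len = hex_chars + len(identifier)    # 71 characters, assuming no separator
spare_chars = STATED_CAPACITY - payload_len  # 15 characters of headroom
```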
6.1.4. Comparison Baselines
1. DCT-Based Hash [7]: A classical method in which the image is converted to grayscale, resized to 32 × 32, and transformed with the DCT. The top-left 8 × 8 low-frequency AC coefficients are selected, compared with their median, and binarized to form a 64-bit hash.
2. Radial-Variance Hash (Radial) [8]: A method that projects image luminance along radial lines and computes the variance of these projections to generate a hash. We used the authors’ implementation to produce 256-bit hashes.
3. DeepHash (Li et al. 2024) [7]: A state-of-the-art deep hashing method that uses high-resolution features for image retrieval, re-implemented with the same hash length (256 bits) for fair comparison.
4. HashShield (Yang et al. 2025) [9]: A recent DeepFake forensic framework with separable perceptual hashing, adapted for our copyright protection task.
5. Implementation Details for Deep Baselines: For fair comparison, both DeepHash [7] and HashShield [9] were re-implemented from the official code repositories and trained on the same UCID v2 dataset. All methods used the same hash length (256 bits) and were evaluated under identical experimental conditions. Training parameters followed the original publications: DeepHash used an HRNet-W48 backbone with contrastive learning, while HashShield employed ResNet-50 with separable hash learning.
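The DCT-based baseline can be sketched in plain Python as follows. This is an illustrative reading of the description, not the benchmarked implementation: it assumes an unnormalized DCT-II, keeps the whole top-left 8 × 8 block (including the DC term), and takes a 32 × 32 grayscale image that has already been resized. The test image is synthetic.

```python
import math
from statistics import median

def dct_low_freq(gray, size=32, keep=8):
    """Top-left keep x keep block of the unnormalized 2D DCT-II of a size x size image."""
    basis = [[math.cos(math.pi * (2 * n + 1) * k / (2 * size)) for n in range(size)]
             for k in range(keep)]
    block = []
    for u in range(keep):
        row = []
        for v in range(keep):
            total = 0.0
            for x in range(size):
                bx = basis[u][x]
                for y in range(size):
                    total += gray[x][y] * bx * basis[v][y]
            row.append(total)
        block.append(row)
    return block

def dct_hash(gray):
    """64-bit hash: 1 where a low-frequency coefficient exceeds the block median."""
    coeffs = [c for row in dct_low_freq(gray) for c in row]
    m = median(coeffs)
    return [1 if c > m else 0 for c in coeffs]

# Synthetic 32 x 32 grayscale image (assumption, for illustration only).
image = [[(7 * x * x + 13 * y + 3 * x * y) % 256 for y in range(32)] for x in range(32)]
hash_bits = dct_hash(image)
```

Because the DCT is linear and the threshold is the median, multiplying every pixel by a constant leaves the hash unchanged, which is why this family of hashes tolerates global intensity scaling.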
6.1.5. Attack Types for Robustness Evaluation
1. JPEG Compression: Quality factors (QF) of 10, 30, 50, 70, and 90;
2. Gaussian Noise: Additive noise with standard deviations of 0.5%, 1.0%, 1.5%, and 2.0% of maximum pixel intensity;
3. Scaling: Scaling ratios of 50%, 75%, 125%, and 150%;
4. Brightness Adjustment: ±10% and ±20% adjustments;
5. Contrast Adjustment: ±10% and ±20% adjustments;
6. Cropping: Center cropping of 5%, 10%, and 15%.
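Two of these attacks can be sketched on a flat list of 8-bit pixel values. The sketch assumes that the noise standard deviation is a percentage of the maximum intensity (255) and that a brightness adjustment is a multiplicative scaling; the latter is one plausible reading, not something the list specifies.

```python
import random

def add_gaussian_noise(pixels, sigma_percent, max_intensity=255, seed=0):
    """Add zero-mean Gaussian noise; sigma is a percentage of max_intensity."""
    rng = random.Random(seed)  # fixed seed so the attack is reproducible
    sigma = sigma_percent / 100.0 * max_intensity
    noisy = (round(p + rng.gauss(0.0, sigma)) for p in pixels)
    return [min(max_intensity, max(0, v)) for v in noisy]

def adjust_brightness(pixels, percent, max_intensity=255):
    """Scale intensities by (1 + percent/100), clipping to the valid range."""
    factor = 1.0 + percent / 100.0
    return [min(max_intensity, max(0, round(p * factor))) for p in pixels]

# Hypothetical pixel row.
row = [0, 64, 100, 128, 250]
noisy_row = add_gaussian_noise(row, sigma_percent=2.0)
brighter_row = adjust_brightness(row, percent=20)
```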
6.2. Evaluation Metrics
1. Robustness: Bit Error Rate (BER), calculated as the proportion of differing bits between the perceptual hashes of the original and processed images. Lower BER indicates higher robustness.
2. Discriminability: The average Hamming distance between the hashes of 10,000 randomly selected, perceptually different image pairs from the UCID dataset. The ideal value is close to 50%. We additionally report BER for visually similar images (different exposures of identical scenes).
3. Watermark Imperceptibility: Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) between original and watermarked images. PSNR > 35 dB and SSIM > 0.95 indicate high imperceptibility.
4. Watermark Recovery Success Rate: The percentage of images from which the QR code watermark could be extracted and decoded correctly following each attack.
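BER and PSNR can be computed as in the following minimal sketch. It assumes hashes are given as hexadecimal strings and images as flat lists of 8-bit intensities; SSIM is omitted because it needs windowed statistics.

```python
import math

def bit_error_rate(hex_a, hex_b, n_bits=256):
    """Fraction of differing bits (normalized Hamming distance) between two hashes."""
    differing = bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")
    return differing / n_bits

def psnr(original, distorted, max_intensity=255):
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_intensity ** 2 / mse)

# Two hypothetical hashes that differ only in their first hexadecimal digit.
ber = bit_error_rate("0" * 64, "f" + "0" * 63)  # 4 of 256 bits differ
```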
6.3. Experimental Results and Analysis
6.3.1. Perceptual Hash Robustness
6.3.2. Perceptual Hash Discriminability
6.3.3. Watermark Imperceptibility and Robustness
6.3.4. Computational and Blockchain Performance
6.4. Security Analysis and Adversarial Robustness
1. Semantic Feature Robustness: CNN features extracted from VGG16’s block5_pool layer capture high-level semantic content that is difficult for pixel-level GAN manipulations to replicate exactly.
2. Blockchain Immutability: Even if adversaries create perceptually similar forgeries, they cannot backdate blockchain registrations, so such attacks are detectable through timestamp verification.
3. Watermark Consistency: Embedded QR code watermarks containing perceptual hashes provide an additional verification layer that must remain consistent with the visual content.
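The timestamp argument can be sketched as a lookup for the earliest registration whose hash is close to that of a queried image. The 10% BER threshold, the record fields, and the integer timestamps are hypothetical choices for illustration, not values from the evaluation.

```python
def bit_error_rate(hex_a, hex_b, n_bits=256):
    """Fraction of differing bits between two hexadecimal hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1") / n_bits

def earliest_matching_registration(query_hex, registrations, max_ber=0.10):
    """Return the oldest registration within max_ber of the query, or None."""
    matches = [r for r in registrations
               if bit_error_rate(query_hex, r["perceptual_hash"]) <= max_ber]
    return min(matches, key=lambda r: r["timestamp"]) if matches else None

# Hypothetical ledger entries: an unrelated work, the original, and a later forgery.
registrations = [
    {"timestamp": 200, "perceptual_hash": "aa" + "ab" * 31},  # near-duplicate, later
    {"timestamp": 100, "perceptual_hash": "ab" * 32},         # original
    {"timestamp": 50, "perceptual_hash": "54" * 32},          # unrelated image
]
owner = earliest_matching_registration("ab" * 32, registrations)
```

A forger can register a perceptually similar image, but only with a later timestamp, so the earlier record prevails.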
7. Discussion
7.1. Performance Summary
7.2. Security Considerations
1. Partial Watermark Removal: DWT-based embedding in mid-frequency coefficients resists partial removal attacks because the watermark is distributed across perceptually important components. Even with 15% cropping, recovery rates remain above 95% due to QR code error correction.
2. Blockchain Replay Attacks: Each transaction includes a timestamp and references previous blocks, making replay attacks detectable. System integrity relies on blockchain consensus rather than transaction secrecy.
3. Model Selection Justification: While newer architectures such as ResNet and MobileNet exist, VGG16 was selected for its stable, well-understood feature representations and competitive performance in perceptual tasks [12,16]. Its architectural simplicity additionally facilitates reproducibility. As demonstrated in Table 2, the chosen configuration achieves state-of-the-art robustness.
4. Scalability and Cost: The economic analysis highlights cost challenges for public mainnet deployment. Future work could explore layer-2 solutions or alternative blockchains, informed by recent intelligent frameworks for scalable and secure networks [5].
7.3. Limitations
1. Model Generalizability: The pre-trained VGG16 model, primarily trained on natural images, shows slightly reduced robustness for highly abstract or stylized digital artworks, with average BER increasing by 8–12% compared to natural images.
2. Computational Requirements: VGG16 feature extraction, while performed once per image, requires GPU acceleration for practical deployment in high-throughput scenarios.
3. Blockchain Scalability: As quantified in our economic analysis, public mainnet deployment faces cost challenges that may limit accessibility for individual creators.
4. Adversarial Robustness: While the system resists conventional attacks, its vulnerability to sophisticated adversarial examples requires further investigation.
8. Future Work
1. Specialized Model Development: We will develop perceptual hashing-specific CNN architectures, potentially based on MobileNetV3, trained on diverse datasets encompassing natural images, digital art, and synthetic media to enhance cross-domain robustness.
2. Lightweight Blockchain Integration: Implementation on high-throughput blockchain solutions such as Polygon, or adoption of Layer-2 scaling (zk-Rollups), targeting sub-$0.01 transaction costs and >1000 TPS throughput.
3. Post-Quantum Cryptography: Integration of NIST-standardized post-quantum signature schemes (CRYSTALS-Dilithium) to future-proof blockchain components against quantum computing threats.
4. Real-world Deployment: Large-scale field testing with digital art platforms and content creators to validate practical usability and performance under real-world conditions.
9. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| AI | Artificial Intelligence |
| BER | Bit Error Rate |
| CID | Content Identifier (IPFS) |
| CNN | Convolutional Neural Network |
| DCT | Discrete Cosine Transform |
| DRM | Digital Rights Management |
| DWT | Discrete Wavelet Transform |
| ECDSA | Elliptic Curve Digital Signature Algorithm |
| IPFS | InterPlanetary File System |
| PCA | Principal Component Analysis |
| PoA | Proof-of-Authority |
| PQC | Post-Quantum Cryptography |
| PSNR | Peak Signal-to-Noise Ratio |
| QR Code | Quick Response Code |
| SSIM | Structural Similarity Index Measure |
| SVD | Singular Value Decomposition |
| TPS | Transactions Per Second |
| TTP | Trusted Third Party |
References
1. Cox, I.J.; Miller, M.L.; Bloom, J.A.; Fridrich, J.; Kalker, T. Digital Watermarking and Steganography; Morgan Kaufmann Publishers: Burlington, MA, USA, 2007.
2. Langelaar, G.C.; Setyawan, I.; Lagendijk, R.L. Watermarking Digital Image and Video Data: A State-of-the-Art Overview. IEEE Signal Process. Mag. 2000, 17, 20–46.
3. Madushanka, T.; Kumara, D.S.; Rathnaveera, A.A. SecureRights: A Blockchain-Powered Trusted DRM Framework for Robust Protection and Asserting Digital Rights. arXiv 2024, arXiv:2403.06094.
4. Zhang, Q.; Wu, G.; Yang, R.; Chen, J. Digital Image Copyright Protection Method Based on Blockchain and Zero Trust Mechanism. Multimed. Tools Appl. 2024, 83, 12345–12367.
5. Mozumder, A.H.; Basha, M.J. SmartSecChain-SDN: A Blockchain-Integrated Intelligent Framework for Secure and Efficient Software-Defined Networks. arXiv 2025, arXiv:2511.05156.
6. Bonnacini, L.; Buzzanca, M. Blockchain and Cultural Heritage: New Perspectives for the Museum of the Future. In Blockchain in Cultural Heritage; Springer: Cham, Switzerland, 2020; pp. 1–21.
7. Li, Y.; Wang, Z.; Chen, J.; Zhang, W. Leveraging High-Resolution Features for Improved Deep Hashing-based Image Retrieval. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 12456–12465.
8. Zauner, C. Implementation and Benchmarking of Perceptual Image Hash Functions. Master’s Thesis, University of Applied Sciences Hagenberg, Hagenberg, Austria, 2010.
9. Yang, M.; Qi, B.; Ma, R.; Xian, Y.; Ma, B. HashShield: A Robust DeepFake Forensic Framework With Separable Perceptual Hashing. IEEE Signal Process. Lett. 2025, 32, 1186–1190.
10. Meng, Z.; Morizumi, T.; Miyata, S.; Kinoshita, H. Design Scheme of Copyright Management System Based on Digital Watermarking and Blockchain. In Proceedings of the 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), Tokyo, Japan, 23–27 July 2018; Volume 2, pp. 359–364.
11. Litvinenko, V.; Li, J.; Zhang, Y. Hyperspectral Image Analysis with Subspace Learning-Based Perceptual Hashing for Authentication. IEEE Access 2023, 11, 45678–45689.
12. Berriche, A.; Adjal, M.Z.; Baghdadi, R. Leveraging high-resolution features for improved deep hashing-based image retrieval. In Proceedings of the European Conference on Information Retrieval, Cham, Switzerland, 6–10 April 2025; pp. 440–453.
13. Parlak, I.E. Blockchain-Assisted Explainable Decision Traces (BAXDT): An Approach for Transparency and Accountability in Artificial Intelligence Systems. Knowl.-Based Syst. 2025, 329, 114402.
14. Zhang, Q.Y.; Wu, G.R. Digital image copyright protection method based on blockchain and perceptual hashing. Int. J. Netw. Secur. 2023, 25, 10–24.
15. Muwafaq, A.; Alsaad, S.N. Design scheme for copyright management system using blockchain and IPFS. Int. J. Comput. Digit. Syst. 2021, 10, 613–618.
16. Gao, G.; Qin, C.; Fang, Y.; Zhou, Y. Perceptual authentication hashing for digital images with contrastive unsupervised learning. IEEE Multimed. 2023, 30, 129–140.
17. Meng, Z.; Morizumi, T.; Miyata, S.; Kinoshita, H. An Improved Design Scheme for Perceptual Hashing based on CNN for Digital Watermarking. In Proceedings of the 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), Madrid, Spain, 13–17 July 2020; pp. 1789–1794.
18. Rhayma, H.; Ejbali, R.; Hamam, H. Auto-authentication watermarking scheme based on CNN and perceptual hash function in the wavelet domain. Multimed. Tools Appl. 2024, 83, 60079–60100.
19. Xu, D.; Ren, N.; Zhu, C. Integrity authentication based on blockchain and perceptual hash for remote-sensing imagery. Remote Sens. 2023, 15, 4860.
20. Zhao, Y.; Qu, Y.; Xiang, Y.; Chen, F.; Gao, L. Context-Aware Consensus Algorithm for Blockchain-Empowered Federated Learning. IEEE Trans. Cloud Comput. 2024, 12, 491–503.

| Scheme | Hash Type | Image Types | Derivative Works Support | Blockchain Usage | Watermarking Domain |
|---|---|---|---|---|---|
| COMPSAC 2018 [10] | Hand-crafted (DCT) | Natural images | Limited | Public, transaction hashes | Spatial domain |
| Zhang et al. 2023 [14] | Hand-crafted | Natural images | No | Public, metadata storage | DCT domain |
| Rhayma et al. 2024 [18] | CNN-based | Natural images | No | Private, hash storage | DWT domain |
| Xu et al. 2023 [19] | Hand-crafted | Remote sensing | No | Consortium, hashes | Frequency domain |
| Proposed Scheme | CNN-based (VGG16) | Natural + Digital art | Yes (chained hashes) | Public/Private, IPFS integration | DWT + QR code |

| Attack Type | Parameters | Proposed-CNN | DeepHash [7] | HashShield [9] | DCT-Based (64-Bit) | Radial (256-Bit) |
|---|---|---|---|---|---|---|
| JPEG Compression | QF = 10 | | | | | |
| JPEG Compression | QF = 30 | | | | | |
| JPEG Compression | QF = 70 | | | | | |
| Gaussian Noise | | | | | | |
| Scaling | 50% | | | | | |
| Scaling | 75% | | | | | |
| Scaling | 125% | | | | | |
| Brightness | +20% | | | | | |
| Contrast | +20% | | | | | |
| Cropping | 10% | | | | | |

| Method | Avg. Hamming Distance (Different Images) | Avg. BER (Visually Similar Images) | Hash Length (Bits) |
|---|---|---|---|
| Proposed-CNN | 49.8% | 15.2% | 256 |
| DeepHash [7] | 48.5% | 22.8% | 256 |
| HashShield [9] | 50.1% | 18.3% | 256 |
| DCT-based | 50.1% | 28.5% | 64 |
| Radial | 49.9% | 32.1% | 256 |

| Attack Type | PSNR (dB) | SSIM | Watermark Recovery Rate |
|---|---|---|---|
| No Attack | 42.5 | 0.98 | 100% |
| JPEG Compression (QF = 30) | 38.2 | 0.96 | 99.5% |
| Gaussian Noise () | 36.8 | 0.94 | 98.7% |
| Scaling (75%) | 39.1 | 0.97 | 99.8% |
| Cropping (10%) | 34.5 * | 0.91 * | 95.2% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Meng, Z.; Zhang, R.; Cao, B.; Zhang, M.; Li, Y.; Xue, H.; Yang, M. AI-Enhanced Perceptual Hashing with Blockchain for Secure and Transparent Digital Copyright Management. Cryptography 2026, 10, 2. https://doi.org/10.3390/cryptography10010002

