Adaptive Bit Selection via Deep Reinforcement Learning for Large-Scale Image Hashing
Abstract
1. Introduction
2. Related Work
2.1. Learning to Hash for Large-Scale Retrieval
2.2. Deep Supervised Hashing
2.3. Recent Advances: Transformers, Hash Centers, and Graph/Contrastive Objectives
2.4. Reinforcement Learning for Discrete Representation Optimization
3. Proposed Methodology
3.1. Deep Reinforcement Learning to Hash
3.2. Markov Decision Process Formulation
3.3. Proposed Approach Components
3.3.1. CNN-Based Binary Hash Code Extraction
Training Data Representation
Semantic Similarity Matrix
Hash Function Definition
Hamming Distance and Inner Product Relation
Global Similarity Reconstruction Objective
Continuous Relaxation
Quantization Regularization
Quantization Error Reduction
Substituting with
Block-Wise Optimization Strategy
Algorithm 1: Hash Code Extraction using Block-wise Similarity Calculation
CNN Mapping via Multi-Binary Classification
Algorithm 2: Hash Code Mapping via Multi-Binary Classification
3.3.2. Regeneration of Binary Hash Codes by Retaining Valuable Bits
Advantages of Bit Regeneration
CNN Freezing Strategy
4. Experimental Results
4.1. Setup and Datasets
4.2. Hyperparameter Selection and Justification
4.3. Compared Methods
4.4. Evaluation Metrics
4.5. Implementation Details
4.6. Main Quantitative Results
4.7. Precision–Recall Curves
4.8. Precision Within Hamming Radius 2
4.9. Top-N Retrieval Performance
4.10. Comparison with DRLIH
4.11. Qualitative Results and Discussion
4.12. Stability of the Reinforcement Learning Component
4.13. Backbone Selection and Generalization
4.14. Computational Complexity and Efficiency
5. Conclusions and Future Works
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Datta, R.; Joshi, D.; Li, J.; Wang, J.Z. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv. 2008, 40, 5.
- Jégou, H.; Douze, M.; Schmid, C. Product Quantization for Nearest Neighbor Search. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 117–128.
- Wang, J.; Zhang, T.; Song, J.; Sebe, N.; Shen, H.T. A Survey on Learning to Hash. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 769–790.
- Luo, X.; Wang, H.; Wu, D.; Chen, C.; Deng, M.; Huang, J.; Hua, X.-S. A survey on deep hashing methods. ACM Trans. Knowl. Discov. Data 2023, 17, 15.
- Liu, W.; Wang, J.; Ji, R.; Jiang, Y.-G.; Chang, S.-F. Supervised hashing with kernels. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012; pp. 2074–2081.
- Gionis, A.; Indyk, P.; Motwani, R. Similarity search in high dimensions via hashing. In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB), Edinburgh, Scotland, UK, 7–10 September 1999; pp. 518–529.
- Kulis, B.; Darrell, T. Learning to hash with binary reconstructive embeddings. In Advances in Neural Information Processing Systems 22 (NeurIPS 2009), Vancouver, BC, Canada, 7–12 December 2009; Bengio, Y., Schuurmans, D., Lafferty, J., Williams, C., Culotta, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2009.
- Norouzi, M.; Fleet, D.J. Minimal loss hashing for compact binary codes. In Proceedings of the 28th International Conference on Machine Learning (ICML 2011), Bellevue, WA, USA, 28 June–2 July 2011; pp. 353–360.
- Cao, Z.; Long, M.; Wang, J.; Yu, P.S. HashNet: Deep learning to hash by continuation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV 2017), Venice, Italy, 22–29 October 2017; pp. 5609–5618.
- Li, Q.; Sun, Z.; He, R.; Tan, T. Deep supervised discrete hashing. In Advances in Neural Information Processing Systems 30 (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Guyon, I., von Luxburg, U., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017.
- Li, T.; Zhang, Z.; Pei, L.; Gan, Y. HashFormer: Vision transformer based deep hashing for image retrieval. IEEE Signal Process. Lett. 2022, 29, 827–831.
- Wang, L.; Pan, Y.; Liu, C.; Lai, H.; Yin, J.; Liu, Y. Deep hashing with minimal-distance-separated hash centers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, BC, Canada, 18–22 June 2023; pp. 23455–23464.
- Chen, Z.; Yuan, X.; Lu, J.; Tian, Q.; Zhou, J. Deep hashing via discrepancy minimization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–23 June 2018; pp. 6838–6847.
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017, arXiv:1707.06347.
- Zhang, Z.; Wang, J.; Zhu, L.; Luo, Y.; Lu, G. Deep collaborative graph hashing for discriminative image retrieval. Pattern Recognit. 2023, 139, 109462.
- Shen, X.; Cai, H.; Gong, X.; Zheng, Y. Contrastive transformer masked image hashing for degraded image retrieval. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI 2024), Jeju, Republic of Korea, 3–9 August 2024; pp. 1218–1226.
- Chen, Y.; Zhang, S.; Liu, F.; Chang, Z.; Ye, M.; Qi, Z. TransHash: Transformer-based Hamming Hashing for Efficient Image Retrieval. In Proceedings of the 2022 International Conference on Multimedia Retrieval (ICMR); Association for Computing Machinery: New York, NY, USA, 2022; pp. 127–136.
- Dubey, A.; Dubey, S.R.; Singh, S.K.; Chu, W.-T. Transformer-based Clipped Contrastive Quantization Learning for Unsupervised Image Retrieval. arXiv 2024, arXiv:2401.15362.
- Chen, Y.; Lu, Z.; Zheng, Y.; Li, P.; Luo, W.; Kang, S. Deep hashing with mutual information: A comprehensive strategy for image retrieval. Expert Syst. Appl. 2025, 264, 125880.
- Yao, D.; Li, Z.; Li, B.; Zhang, C.; Ma, H. Similarity Graph-correlation Reconstruction Network for unsupervised cross-modal hashing. Expert Syst. Appl. 2024, 237, 121516.
- Xu, Y.; Yang, Z.; Ting, K.M. Contrastive Multi-View Graph Hashing. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management; Association for Computing Machinery: New York, NY, USA, 2025; pp. 3666–3676.
- Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images; Technical Report; University of Toronto: Toronto, ON, Canada, 2009.
- Chua, T.-S.; Tang, J.; Hong, R.; Li, H.; Luo, Z.; Zheng, Y. NUS-WIDE: A real-world web image database from National University of Singapore. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR 2009), Santorini, Greece, 8–10 July 2009; Association for Computing Machinery: New York, NY, USA, 2009; p. 48.
- Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Computer Vision—ECCV 2014; Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; Volume 8693, pp. 740–755.
- Xia, R.; Pan, Y.; Lai, H.; Liu, C.; Yan, S. Supervised hashing for image retrieval via image representation learning. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence (AAAI 2014), Québec City, QC, Canada, 27–31 July 2014; AAAI Press: Palo Alto, CA, USA, 2014; pp. 2156–2162.
- Lai, H.; Pan, Y.; Liu, Y.; Yan, S. Simultaneous feature learning and hash coding with deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), Boston, MA, USA, 7–12 June 2015; pp. 3270–3278.
- Cao, Y.; Liu, B.; Long, M.; Wang, J. HashGAN: Deep learning to hash with pair conditional Wasserstein GAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–23 June 2018; pp. 1287–1296.
- Peng, Y.; Zhang, J.; Ye, Z. Deep reinforcement learning for image hashing. IEEE Trans. Multimed. 2020, 22, 2061–2073.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25 (NeurIPS 2012), Lake Tahoe, NV, USA, 3–6 December 2012; Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2012; pp. 1097–1105.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015.
- Shen, F.; Shen, C.; Liu, W.; Shen, H.T. Supervised discrete hashing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), Boston, MA, USA, 7–12 June 2015; pp. 37–45.




| Component | Hyperparameter | Value |
|---|---|---|
| RL Core | Discount factor | |
| RL Core | Clipping parameter | |
| RL Core | Learning rate | |
| Training Setup | Batch size | k (hash length) |
| Training Setup | Episode structure | Full dataset per episode |
| Training Setup | Optimization epochs | Multiple (PPO standard) |
| Policy Update | Optimizer | Adam |
| Policy Update | Gradient update | Mini-batch based |
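To make the roles of the discount factor and the clipping parameter in the table above concrete, the following is a minimal sketch of the two places where they enter a standard PPO update [15]: the discounted return and the clipped surrogate objective. It is not the authors' implementation. The function names are hypothetical, and the values `gamma = 0.99` and `clip_eps = 0.2` are illustrative placeholders only, since the settings used in the paper are not recoverable from this copy.

```python
import math

def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]

def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps):
    """Mean PPO clipped surrogate over a mini-batch (a quantity to maximize)."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped * adv)  # pessimistic bound
    return total / len(advantages)

# Placeholder values (assumptions, not taken from the paper).
gamma, clip_eps = 0.99, 0.2
returns = discounted_returns([0.0, 0.0, 1.0], gamma)
objective = ppo_clipped_objective([math.log(1.5)], [0.0], [1.0], clip_eps)
```

In this sketch a probability ratio of 1.5 with a positive advantage is capped at `1 + clip_eps`, which is the mechanism that keeps each policy update close to the previous policy.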
| Method | CIFAR-10, 16 bits | CIFAR-10, 32 bits | CIFAR-10, 48 bits | CIFAR-10, 64 bits | NUS-WIDE, 16 bits | NUS-WIDE, 32 bits | NUS-WIDE, 48 bits | NUS-WIDE, 64 bits | MS-COCO, 16 bits | MS-COCO, 32 bits | MS-COCO, 48 bits | MS-COCO, 64 bits |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SDH [32] | 0.461 | 0.520 | 0.553 | 0.568 | 0.588 | 0.611 | 0.638 | 0.667 | 0.555 | 0.564 | 0.572 | 0.580 |
| CNNH [26] | 0.476 | 0.472 | 0.489 | 0.501 | 0.570 | 0.583 | 0.593 | 0.600 | 0.564 | 0.574 | 0.571 | 0.567 |
| DNNH [27] | 0.559 | 0.558 | 0.581 | 0.583 | 0.598 | 0.616 | 0.635 | 0.639 | 0.593 | 0.603 | 0.605 | 0.610 |
| HashNet [9] | 0.643 | 0.667 | 0.675 | 0.687 | 0.662 | 0.699 | 0.711 | 0.716 | 0.687 | 0.718 | 0.730 | 0.736 |
| HashGAN [28] | 0.668 | 0.731 | 0.735 | 0.749 | 0.715 | 0.737 | 0.744 | 0.748 | 0.697 | 0.725 | 0.741 | 0.744 |
| HashFormer [11] | 0.9121 | 0.9167 | - | 0.9236 | 0.7317 | 0.7418 | - | 0.7597 | - | - | - | - |
| DCDH [16] | - | 0.9192 | 0.9142 | - | - | 0.8870 | 0.8922 | - | - | - | - | - |
| CTMIH [17] | - | - | - | - | 0.795 | 0.816 | - | 0.826 | 0.809 | 0.834 | - | 0.846 |
| Proposed method | 0.727 | 0.790 | 0.788 | 0.780 | 0.790 | 0.798 | 0.805 | 0.807 | 0.748 | 0.767 | 0.790 | 0.776 |
| Method | CIFAR-10, 12 bits | CIFAR-10, 24 bits | CIFAR-10, 32 bits | CIFAR-10, 48 bits | NUS-WIDE, 12 bits | NUS-WIDE, 24 bits | NUS-WIDE, 32 bits | NUS-WIDE, 48 bits |
|---|---|---|---|---|---|---|---|---|
| DRLIH | 0.816 | 0.843 | 0.855 | 0.853 | 0.823 | 0.846 | 0.845 | 0.853 |
| Proposed method | 0.857 | 0.876 | 0.881 | 0.883 | 0.839 | 0.862 | 0.868 | 0.871 |
| Stage | Operation | Complexity | Frequency |
|---|---|---|---|
| CNN Training | Feature learning + hash prediction | | Offline (once) |
| RL Training (PPO) | Bit selection policy optimization | | Offline (once) |
| Inference | Hash encoding + Hamming distance | | Online (per query) |
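The online stage in the table above consists of hash encoding followed by Hamming-distance comparison. As a minimal sketch of the second step, the code below packs ±1 codes into integers, computes Hamming distance with XOR plus a bit count, and ranks a small database by distance. It also checks the standard identity relating Hamming distance to the inner product of ±1 codes of length k, d_H(u, v) = (k − ⟨u, v⟩) / 2. All function names are hypothetical, and the exhaustive linear scan is an assumption of this sketch rather than a description of the authors' retrieval pipeline.

```python
def pack_bits(code):
    """Pack a sequence of +1/-1 hash bits into one Python int (+1 -> 1, -1 -> 0)."""
    value = 0
    for bit in code:
        value = (value << 1) | (1 if bit > 0 else 0)
    return value

def hamming(a, b):
    """Hamming distance between two packed codes: XOR, then count set bits."""
    return bin(a ^ b).count("1")

def hamming_from_inner_product(u, v):
    """Same distance from the +1/-1 inner product: (k - <u, v>) / 2."""
    dot = sum(x * y for x, y in zip(u, v))
    return (len(u) - dot) // 2

def rank_by_hamming(query, database):
    """Return (distance, index) pairs sorted by increasing Hamming distance."""
    q = pack_bits(query)
    return sorted((hamming(q, pack_bits(c)), i) for i, c in enumerate(database))

query = [1, -1, 1, 1]
database = [[1, -1, 1, 1], [-1, 1, -1, -1], [1, 1, 1, 1]]
ranking = rank_by_hamming(query, database)
```

Packing the bits means each comparison costs one XOR and one bit count instead of k element-wise comparisons, which is the usual reason binary codes are attractive for per-query retrieval.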
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Rezaei, M.; Alaoui Mhamdi, M.A.; Allili, M. Adaptive Bit Selection via Deep Reinforcement Learning for Large-Scale Image Hashing. Electronics 2026, 15, 1735. https://doi.org/10.3390/electronics15081735

