Proxy-Based Semi-Supervised Cross-Modal Hashing
Abstract
1. Introduction
- Category Proxy Network: We design a CPNet that generates feature proxies and hash proxies and works alongside the two modality-specific hashing networks during training, enabling the model to account for both the relationships among data points and the relationships between data and categories.
- Adaptive Dual-Label Loss: We propose an Adaptive Dual-Label Loss to capture the structural relationships between data and categories encoded in the continuous pseudo-labels (a minimal code sketch follows this list).
- Experimental Validation: Extensive experiments on three public datasets demonstrate the superiority of our PSSCH method.
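To make the first two contributions more concrete, the sketch below shows one way a category proxy module and a dual-label-style objective could be wired together: one learnable feature proxy and one hash proxy per category, with data-to-proxy similarities supervised by continuous pseudo-labels. The class and function names, the cosine-similarity choice, and the MSE weighting are illustrative assumptions, not the exact CPNet or Adaptive Dual-Label Loss defined in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CategoryProxyNet(nn.Module):
    """Illustrative category proxy module (an assumption, not the paper's exact CPNet).

    Holds one learnable feature proxy and one hash proxy per category; the hash
    proxies are squashed with tanh so they stay close to K-bit binary codes.
    """

    def __init__(self, num_categories: int, feat_dim: int, code_len: int):
        super().__init__()
        self.feature_proxies = nn.Parameter(torch.randn(num_categories, feat_dim))
        self.hash_proxies = nn.Parameter(torch.randn(num_categories, code_len))

    def forward(self):
        # Unit-normalize feature proxies for cosine similarity; relax hash proxies into (-1, 1).
        return F.normalize(self.feature_proxies, dim=1), torch.tanh(self.hash_proxies)


def dual_label_loss(features, codes, pseudo_labels, proxy_net):
    """Toy dual-label-style objective (assumption): data-to-proxy similarities in both
    the feature space and the hash space are regressed toward the continuous pseudo-labels."""
    feat_proxies, hash_proxies = proxy_net()
    sim_feat = F.normalize(features, dim=1) @ feat_proxies.t()        # (B, C) cosine similarities
    sim_hash = torch.tanh(codes) @ hash_proxies.t() / codes.shape[1]  # (B, C), scaled to [-1, 1]
    return F.mse_loss(sim_feat, pseudo_labels) + F.mse_loss(sim_hash, 2 * pseudo_labels - 1)


# Minimal usage with random tensors: batch of 8, 512-d features, 24 categories, 32-bit codes.
proxy_net = CategoryProxyNet(num_categories=24, feat_dim=512, code_len=32)
features = torch.randn(8, 512)
codes = torch.randn(8, 32)
pseudo_labels = torch.rand(8, 24)
print(dual_label_loss(features, codes, pseudo_labels, proxy_net).item())
```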
2. Related Works
2.1. Supervised Hashing Methods
2.2. Unsupervised Hashing Methods
2.3. Semi-Supervised Hashing Methods
3. Methodology
3.1. Problem Formulation
3.2. Overview of PSSCH Framework
3.3. Model Learning
3.3.1. Pseudo-Label Generation
3.3.2. Feature Proxy Learning
3.3.3. Adaptive Dual-Label Loss
3.4. Optimization
Algorithm 1: The Pseudo-Code of the PSSCH Method
4. Experiments
4.1. Experimental Settings
- MIRFLICKR-25K. This small-scale cross-modal dataset consists of 24,581 image–text pairs, spanning 24 categories, with each sample belonging to at least one category.
- NUS-WIDE. Comprising 269,648 image–text pairs, this dataset includes 81 categories. We filtered out the sparsely annotated categories and kept the 21 most common ones, resulting in 195,834 image–text pairs (a brief filtering sketch follows this list).
- MS COCO. A large-scale dataset commonly used in computer vision, containing 82,785 training images and 40,504 validation images. Each image is associated with textual descriptions and labels across 80 categories. For our experiments, we combined the training and validation sets, with each sample belonging to at least one of these categories.
- LEMON embeds label information into the hash learning process to fully exploit the semantic information of labels when guiding the learning of hash functions.
- EDMH proposes a discrete optimization algorithm that seamlessly integrates three useful discrete constraints into a joint hashing learning model.
- HCCH proposes a coarse-to-fine hierarchical hashing scheme that refines useful discriminative information step by step with a two-layer hashing function, mitigating the loss of important discriminative information.
- HMAH builds a hierarchical message aggregation network within a teacher–student framework, improving the alignment of heterogeneous modalities and modeling fine-grained cross-modal correlations.
- SSCH obtains enhanced semantic information through a pseudo-labeling process that does not require alignment and learns the hash representations of the data via a label enhancement strategy.
- MGCH employs a multi-view graph to connect the data, using anchor points as a unified semantic hub to achieve semi-supervised cross-modal hashing.
- TS3H learns modality-specific classifiers from the supervised information to predict labels for the unlabeled data, and then learns hash codes by combining the predicted and original labels.
- GCSCH designs a fusion network to integrate the two modalities and uses a graph convolutional network to capture semantic information from both real-labeled and pseudo-labeled multi-modal data.
- DGCPN uses graph models to explore graph-neighbor consistency, which helps address inaccurate similarity estimation in unsupervised cross-modal hashing.
- UCCH proposes a novel momentum optimizer for learnable hashing in contrastive learning and designs a cross-modal ranking learning loss.
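As referenced in the NUS-WIDE item above, the snippet below is a minimal sketch of how one might keep only the most frequent categories of a multi-hot label matrix and drop samples left without any label. The function name and the random example data are hypothetical; the actual preprocessing pipeline may differ.

```python
import numpy as np


def filter_to_common_categories(labels: np.ndarray, num_keep: int = 21):
    """Keep the `num_keep` most frequent categories of an (N, C) multi-hot label
    matrix and drop samples left without any label (a hypothetical helper)."""
    category_counts = labels.sum(axis=0)
    keep = np.argsort(category_counts)[::-1][:num_keep]  # indices of the most frequent categories
    reduced = labels[:, keep]
    sample_mask = reduced.sum(axis=1) > 0                 # samples that still carry at least one label
    return reduced[sample_mask], sample_mask, keep


# Example with random multi-hot labels: 1000 samples over 81 categories.
rng = np.random.default_rng(0)
labels = (rng.random((1000, 81)) < 0.05).astype(np.int64)
reduced, mask, kept = filter_to_common_categories(labels)
print(reduced.shape, int(mask.sum()))
```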
4.2. Performance Comparison
4.3. Ablation Studies
4.4. Sensitivity to Hyper-Parameters
4.5. Training and Encoding Time
4.6. Visualization
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Liu, S.; Qian, S.; Guan, Y.; Zhan, J.; Ying, L. Joint-modal distribution-based similarity hashing for large-scale unsupervised deep cross-modal retrieval. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 25–30 July 2020; pp. 1379–1388. [Google Scholar]
- Shi, Y.; You, X.; Zheng, F.; Wang, S.; Peng, Q. Equally-Guided Discriminative Hashing for Cross-modal Retrieval. In Proceedings of the IJCAI, Macao, China, 10–16 August 2019; pp. 4767–4773. [Google Scholar]
- Qin, Q.; Huo, Y.; Huang, L.; Dai, J.; Zhang, H.; Zhang, W. Deep Neighborhood-preserving Hashing with Quadratic Spherical Mutual Information for Cross-modal Retrieval. IEEE Trans. Multimed. 2024, 26, 6361–6374. [Google Scholar] [CrossRef]
- Wu, Q.; Zhang, Z.; Liu, Y.; Zhang, J.; Nie, L. Contrastive Multi-Bit Collaborative Learning for Deep Cross-Modal Hashing. IEEE Trans. Knowl. Data Eng. 2024, 36, 5835–5848. [Google Scholar] [CrossRef]
- Song, G.; Huang, K.; Su, H.; Song, F.; Yang, M. Deep Ranking Distribution Preserving Hashing for Robust Multi-Label Cross-modal Retrieval. IEEE Trans. Multimed. 2024, 26, 7027–7042. [Google Scholar] [CrossRef]
- Wang, X.; Zou, X.; Bakker, E.M.; Wu, S. Self-constraining and attention-based hashing network for bit-scalable cross-modal retrieval. Neurocomputing 2020, 400, 255–271. [Google Scholar] [CrossRef]
- Liu, Y.; Wu, Q.; Zhang, Z.; Zhang, J.; Lu, G. Multi-Granularity Interactive Transformer Hashing for Cross-modal Retrieval. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023; pp. 893–902. [Google Scholar]
- Song, G.; Su, H.; Huang, K.; Song, F.; Yang, M. Deep self-enhancement hashing for robust multi-label cross-modal retrieval. Pattern Recognit. 2024, 147, 110079. [Google Scholar] [CrossRef]
- Gao, Z.; Wang, J.; Yu, G.; Yan, Z.; Domeniconi, C.; Zhang, J. Long-tail cross modal hashing. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 7642–7650. [Google Scholar]
- Zhu, L.; Cai, L.; Song, J.; Zhu, X.; Zhang, C.; Zhang, S. MSSPQ: Multiple Semantic Structure-Preserving Quantization for Cross-Modal Retrieval. In Proceedings of the ICMR ’22: International Conference on Multimedia Retrieval, Newark, NJ, USA, 27–30 June 2022; Oria, V., Sapino, M.L., Satoh, S., Kerhervé, B., Cheng, W., Ide, I., Singh, V.K., Eds.; ACM: New York, NY, USA, 2022; pp. 631–638. [Google Scholar]
- Li, F.; Wang, B.; Zhu, L.; Li, J.; Zhang, Z.; Chang, X. Cross-Domain Transfer Hashing for Efficient Cross-Modal Retrieval. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 9664–9677. [Google Scholar] [CrossRef]
- Wang, Y.; Dong, F.; Wang, K.; Nie, X.; Chen, Z. Weighted cross-modal hashing with label enhancement. Knowl. Based Syst. 2024, 293, 111657. [Google Scholar] [CrossRef]
- Zhang, C.; Song, J.; Zhu, X.; Zhu, L.; Zhang, S. HCMSL: Hybrid Cross-modal Similarity Learning for Cross-modal Retrieval. ACM Trans. Multim. Comput. Commun. Appl. 2021, 17, 2:1–2:22. [Google Scholar] [CrossRef]
- Fan, W.; Zhang, C.; Li, H.; Jia, X.; Wang, G. Three-stage semisupervised cross-modal hashing with pairwise relations exploitation. IEEE Trans. Neural Netw. Learn. Syst. 2023. [Google Scholar] [CrossRef]
- Wang, J.; Li, G.; Pan, P.; Zhao, X. Semi-supervised semantic factorization hashing for fast cross-modal retrieval. Multimed. Tools Appl. 2017, 76, 20197–20215. [Google Scholar] [CrossRef]
- Wang, X.; Liu, X.; Peng, S.J.; Zhong, B.; Chen, Y.; Du, J.X. Semi-supervised discrete hashing for efficient cross-modal retrieval. Multimed. Tools Appl. 2020, 79, 25335–25356. [Google Scholar] [CrossRef]
- Liu, X.; Yu, G.; Domeniconi, C.; Wang, J.; Xiao, G.; Guo, M. Weakly supervised cross-modal hashing. IEEE Trans. Big Data 2019, 8, 552–563. [Google Scholar] [CrossRef]
- Yang, L.; Zhang, K.; Li, Y.; Chen, Y.; Long, J.; Yang, Z. S3ACH: Semi-Supervised Semantic Adaptive Cross-Modal Hashing. In Proceedings of the International Conference on Neural Information Processing, Changsha, China, 20–23 November 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 252–269. [Google Scholar]
- Shen, X.; Zhang, H.; Li, L.; Yang, W.; Liu, L. Semi-supervised cross-modal hashing with multi-view graph representation. Inf. Sci. 2022, 604, 45–60. [Google Scholar] [CrossRef]
- Shen, X.; Yu, G.; Chen, Y.; Yang, X.; Zheng, Y. Graph Convolutional Semi-Supervised Cross-Modal Hashing. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, VIC, Australia, 28 October 2024–1 November 2024; pp. 5930–5938. [Google Scholar]
- Su, M.; Gu, G.; Ren, X.; Fu, H.; Zhao, Y. Semi-supervised knowledge distillation for cross-modal hashing. IEEE Trans. Multimed. 2021, 25, 662–675. [Google Scholar] [CrossRef]
- Zhang, X.; Liu, X.; Nie, X.; Kang, X.; Yin, Y. Semi-supervised semi-paired cross-modal hashing. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 6517–6529. [Google Scholar] [CrossRef]
- Jiang, Q.Y.; Li, W.J. Deep cross-modal hashing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3232–3240. [Google Scholar]
- Zou, X.; Wu, S.; Bakker, E.M.; Wang, X. Multi-label enhancement based self-supervised deep cross-modal hashing. Neurocomputing 2022, 467, 138–162. [Google Scholar] [CrossRef]
- Xu, C.; Chai, Z.; Xu, Z.; Li, H.; Zuo, Q.; Yang, L.; Yuan, C. HHF: Hashing-guided hinge function for deep hashing retrieval. IEEE Trans. Multimed. 2022, 25, 7428–7440. [Google Scholar] [CrossRef]
- Shu, Z.; Bai, Y.; Zhang, D.; Yu, J.; Yu, Z.; Wu, X.J. Specific class center guided deep hashing for cross-modal retrieval. Inf. Sci. 2022, 609, 304–318. [Google Scholar] [CrossRef]
- Tu, R.C.; Mao, X.L.; Tu, R.X.; Bian, B.; Cai, C.; Wang, H.; Wei, W.; Huang, H. Deep cross-modal proxy hashing. IEEE Trans. Knowl. Data Eng. 2022, 35, 6798–6810. [Google Scholar] [CrossRef]
- Tu, R.C.; Mao, X.L.; Ji, W.; Wei, W.; Huang, H. Data-aware proxy hashing for cross-modal retrieval. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 686–696. [Google Scholar]
- Huo, Y.; Qin, Q.; Dai, J.; Wang, L.; Zhang, W.; Huang, L.; Wang, C. Deep semantic-aware proxy hashing for multi-label cross-modal retrieval. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 576–589. [Google Scholar] [CrossRef]
- Hu, Z.; Cheung, Y.m.; Li, M.; Lan, W. Cross-Modal Hashing Method with Properties of Hamming Space: A New Perspective. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 7636–7650. [Google Scholar] [CrossRef] [PubMed]
- Chen, H.; Zhu, L.; Zhu, X. Deep Class-guided Hashing for Multi-label Cross-modal Retrieval. arXiv 2024, arXiv:2410.15387. [Google Scholar]
- Li, X.; Hu, D.; Nie, F. Deep binary reconstruction for cross-modal hashing. In Proceedings of the 25th ACM International Conference on Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 1398–1406. [Google Scholar]
- Wu, G.; Lin, Z.; Han, J.; Liu, L.; Ding, G.; Zhang, B.; Shen, J. Unsupervised Deep Hashing via Binary Latent Factor Models for Large-scale Cross-modal Retrieval. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; The AAAI Press: Cambridge, MA, USA, 2018; pp. 2854–2860. [Google Scholar]
- Su, S.; Zhong, Z.; Zhang, C. Deep joint-semantics reconstructing hashing for large-scale unsupervised cross-modal retrieval. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27–28 October 2019; pp. 3027–3035. [Google Scholar]
- Zhang, P.F.; Li, Y.; Huang, Z.; Xu, X.S. Aggregation-based graph convolutional hashing for unsupervised cross-modal retrieval. IEEE Trans. Multimed. 2021, 24, 466–479. [Google Scholar] [CrossRef]
- Zhu, L.; Wu, X.; Li, J.; Zhang, Z.; Guan, W.; Shen, H.T. Work together: Correlation-identity reconstruction hashing for unsupervised cross-modal retrieval. IEEE Trans. Knowl. Data Eng. 2022, 35, 8838–8851. [Google Scholar] [CrossRef]
- Wu, F.; Li, S.; Gao, G.; Ji, Y.; Jing, X.Y.; Wan, Z. Semi-supervised cross-modal hashing via modality-specific and cross-modal graph convolutional networks. Pattern Recognit. 2023, 136, 109211. [Google Scholar] [CrossRef]
- Huang, Y.; Hu, B.; Zhang, Y.; Gao, C.; Wang, Q. A semi-supervised cross-modal memory bank for cross-modal retrieval. Neurocomputing 2024, 579, 127430. [Google Scholar] [CrossRef]
- Wang, J.; Gong, T.; Yan, Y. Semi-supervised Prototype Semantic Association Learning for Robust Cross-modal Retrieval. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington, DC, USA, 14–18 July 2024; pp. 872–881. [Google Scholar]
- Tu, J.; Liu, X.; Lin, Z.; Hong, R.; Wang, M. Differentiable cross-modal hashing via multimodal transformers. In Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, Portugal, 10–14 October 2022; pp. 453–461. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 4 May 2021; pp. 1–22. [Google Scholar]
- Rawat, A.; Dua, I.; Gupta, S.; Tallamraju, R. Semi-supervised Domain Adaptation by Similarity Based Pseudo-Label Injection. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 150–166. [Google Scholar]
- Abdelfattah, R.; Guo, Q.; Li, X.; Wang, X.; Wang, S. Cdul: Clip-driven unsupervised learning for multi-label image classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 1348–1357. [Google Scholar]
- Huiskes, M.J.; Lew, M.S. The mir flickr retrieval evaluation. In Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval, Vancouver, BC, Canada, 30–31 October 2008; pp. 39–43. [Google Scholar]
- Chua, T.S.; Tang, J.; Hong, R.; Li, H.; Luo, Z.; Zheng, Y. Nus-wide: A real-world web image database from national university of singapore. In Proceedings of the ACM International Conference on Image and Video Retrieval, Santorini Island, Greece, 8–10 July 2009; pp. 1–9. [Google Scholar]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in context. In Proceedings of the Computer Vision—ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part V 13. Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755. [Google Scholar]
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9. [Google Scholar]
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning, Virtually, 13–15 December 2021; pp. 8748–8763. [Google Scholar]
- Wang, Y.; Luo, X.; Xu, X.S. Label embedding online hashing for cross-modal retrieval. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 871–879. [Google Scholar]
- Chen, Y.; Zhang, H.; Tian, Z.; Wang, J.; Zhang, D.; Li, X. Enhanced discrete multi-modal hashing: More constraints yet less time to learn. IEEE Trans. Knowl. Data Eng. 2020, 34, 1177–1190. [Google Scholar] [CrossRef]
- Sun, Y.; Ren, Z.; Hu, P.; Peng, D.; Wang, X. Hierarchical consensus hashing for cross-modal retrieval. IEEE Trans. Multimed. 2023, 26, 824–836. [Google Scholar] [CrossRef]
- Tan, W.; Zhu, L.; Li, J.; Zhang, H.; Han, J. Teacher-student learning: Efficient hierarchical message aggregation hashing for cross-modal retrieval. IEEE Trans. Multimed. 2022, 25, 4520–4532. [Google Scholar] [CrossRef]
- Yu, J.; Zhou, H.; Zhan, Y.; Tao, D. Deep graph-neighbor coherence preserving network for unsupervised cross-modal hashing. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 4626–4634. [Google Scholar]
- Hu, P.; Zhu, H.; Lin, J.; Peng, D.; Zhao, Y.P.; Peng, X. Unsupervised contrastive cross-modal hashing. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 3877–3889. [Google Scholar] [CrossRef] [PubMed]
- Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
| Notation | Definition |
| --- | --- |
|  | Training dataset |
| N | Number of samples |
|  | Hash functions |
|  | Feature extractor |
| I | Indicator function |
|  | The i-th image data |
|  | The i-th text data |
| C | Number of categories |
| K | Number of bits in the hash code |
|  | Features of the data |
|  | Feature proxies |
|  | Hash proxies |
|  | Hash codes |
|  | Similarity matrix |
| epoch num | Total number of training iterations |
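The notation table refers to a similarity matrix and to hash codes; the helpers below sketch the conventions commonly used in cross-modal hashing work (two samples count as similar if they share at least one category, and Hamming distance between ±1 codes follows from their inner product). The function names are ours, and the paper's exact definitions may differ.

```python
import numpy as np


def label_similarity(labels_a: np.ndarray, labels_b: np.ndarray) -> np.ndarray:
    """Similarity matrix S under the usual multi-label convention (an assumption here):
    S[i, j] = 1 if samples i and j share at least one category, else 0."""
    return (labels_a @ labels_b.T > 0).astype(np.float32)


def hamming_distance(codes_a: np.ndarray, codes_b: np.ndarray) -> np.ndarray:
    """Pairwise Hamming distance between {-1, +1} hash codes of length K,
    using d_H = (K - <b_i, b_j>) / 2."""
    K = codes_a.shape[1]
    return 0.5 * (K - codes_a @ codes_b.T)


# 5 query codes against 7 database codes with K = 16 bits.
rng = np.random.default_rng(1)
queries = np.sign(rng.standard_normal((5, 16)))
database = np.sign(rng.standard_normal((7, 16)))
print(hamming_distance(queries, database).shape)  # (5, 7)
```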
| Task | Methods | MIRFLICKR-25K 16 Bits | MIRFLICKR-25K 32 Bits | MIRFLICKR-25K 64 Bits | NUS-WIDE 16 Bits | NUS-WIDE 32 Bits | NUS-WIDE 64 Bits | MS COCO 16 Bits | MS COCO 32 Bits | MS COCO 64 Bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| I→T | DGCPN (AAAI 21) | 0.703 | 0.713 | 0.720 | 0.566 | 0.589 | 0.601 | 0.575 | 0.613 | 0.630 |
|  | UCCH (TPAMI 23) | 0.734 | 0.741 | 0.739 | 0.590 | 0.610 | 0.618 | 0.562 | 0.569 | 0.590 |
|  | LEMON (MM 20) | 0.651 | 0.670 | 0.682 | 0.460 | 0.491 | 0.507 | 0.492 | 0.438 | 0.527 |
|  | EDMH (TKDE 22) | 0.651 | 0.657 | 0.646 | 0.460 | 0.477 | 0.461 | 0.502 | 0.497 | 0.427 |
|  | HMAH (TMM 22) | 0.755 | 0.743 | 0.753 | 0.606 | 0.636 | 0.639 | 0.558 | 0.569 | 0.594 |
|  | HCCH (TMM 23) | 0.719 | 0.730 | 0.736 | 0.625 | 0.638 | 0.649 | 0.560 | 0.606 | 0.634 |
|  | MGCH (IS 22) | 0.689 | 0.705 | 0.729 | 0.525 | 0.514 | 0.595 | 0.615 | 0.562 | 0.607 |
|  | SSCH (TCSVT 23) | 0.622 | 0.670 | 0.685 | 0.479 | 0.524 | 0.539 | 0.435 | 0.441 | 0.479 |
|  | TS3H (TNNLS 23) | 0.717 | 0.741 | 0.742 | 0.613 | 0.642 | 0.671 | 0.618 | 0.624 | 0.690 |
|  | GCSCH (MM 24) | 0.772 | 0.776 | 0.785 | 0.658 | 0.677 | 0.673 | 0.619 | 0.675 | 0.701 |
|  | PSSCH (Ours) | 0.794 | 0.797 | 0.818 | 0.666 | 0.678 | 0.684 | 0.644 | 0.689 | 0.723 |
| T→I | DGCPN (AAAI 21) | 0.692 | 0.701 | 0.710 | 0.578 | 0.596 | 0.601 | 0.572 | 0.609 | 0.625 |
|  | UCCH (TPAMI 23) | 0.722 | 0.726 | 0.725 | 0.600 | 0.616 | 0.626 | 0.553 | 0.560 | 0.586 |
|  | LEMON (MM 20) | 0.666 | 0.695 | 0.708 | 0.472 | 0.508 | 0.517 | 0.487 | 0.475 | 0.535 |
|  | EDMH (TKDE 22) | 0.668 | 0.677 | 0.667 | 0.475 | 0.487 | 0.477 | 0.501 | 0.494 | 0.427 |
|  | HMAH (TMM 22) | 0.721 | 0.703 | 0.705 | 0.546 | 0.578 | 0.559 | 0.549 | 0.558 | 0.578 |
|  | HCCH (TMM 23) | 0.721 | 0.740 | 0.742 | 0.631 | 0.632 | 0.649 | 0.556 | 0.588 | 0.647 |
|  | MGCH (IS 22) | 0.675 | 0.695 | 0.719 | 0.541 | 0.515 | 0.607 | 0.601 | 0.553 | 0.586 |
|  | SSCH (TCSVT 23) | 0.623 | 0.664 | 0.688 | 0.482 | 0.526 | 0.557 | 0.440 | 0.443 | 0.474 |
|  | TS3H (TNNLS 23) | 0.727 | 0.753 | 0.748 | 0.622 | 0.653 | 0.674 | 0.614 | 0.618 | 0.687 |
|  | GCSCH (MM 24) | 0.780 | 0.791 | 0.791 | 0.661 | 0.673 | 0.684 | 0.620 | 0.661 | 0.688 |
|  | PSSCH (Ours) | 0.774 | 0.787 | 0.803 | 0.671 | 0.683 | 0.692 | 0.657 | 0.702 | 0.728 |
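Results such as those in the table above are typically reported as mean average precision (mAP) over a Hamming ranking of the database. The function below is a minimal, generic mAP implementation under that assumption; it is not the authors' evaluation code, and details such as the retrieval cutoff may differ.

```python
import numpy as np


def mean_average_precision(dist, relevance, top_k=None):
    """Generic mAP for distance-ranked retrieval: `dist` is a (Q, D) distance matrix,
    `relevance` a (Q, D) 0/1 ground-truth matrix; queries with no relevant item are skipped."""
    average_precisions = []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])          # ascending distance: best match first
        rel = relevance[i, order]
        if top_k is not None:
            rel = rel[:top_k]
        hit_positions = np.where(rel > 0)[0]
        if hit_positions.size == 0:
            continue
        precision_at_hits = (np.arange(hit_positions.size) + 1) / (hit_positions + 1)
        average_precisions.append(precision_at_hits.mean())
    return float(np.mean(average_precisions)) if average_precisions else 0.0


# Example with random distances and sparse relevance for 4 queries over 100 database items.
rng = np.random.default_rng(0)
dist = rng.random((4, 100))
relevance = (rng.random((4, 100)) < 0.1).astype(np.float64)
print(mean_average_precision(dist, relevance, top_k=50))
```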
| Task | Dataset | t-Statistic | p-Value | Conclusion |
| --- | --- | --- | --- | --- |
| I→T | MIRFLICKR-25K | 19.42 | p < 0.001 | PSSCH is significantly better than GCSCH |
|  | NUS-WIDE | 5.74 | p < 0.001 |  |
|  | MS COCO | 15.25 | p < 0.001 |  |
| T→I | MIRFLICKR-25K | 11.00 | p < 0.001 | PSSCH is significantly better than GCSCH |
|  | NUS-WIDE | 9.08 | p < 0.001 |  |
|  | MS COCO | 29.03 | p < 0.001 |  |
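The significance table above reports t-statistics and p-values for PSSCH versus GCSCH. One common way to obtain such numbers is a paired t-test over repeated runs; the snippet below sketches that procedure with hypothetical per-run mAP scores, since the exact test protocol and per-run values are not given in this excerpt.

```python
import numpy as np
from scipy import stats

# Hypothetical mAP scores from five repeated runs of each method on one dataset;
# the actual per-run numbers behind the table are not reproduced here.
pssch_runs = np.array([0.794, 0.791, 0.796, 0.793, 0.795])
gcsch_runs = np.array([0.772, 0.770, 0.774, 0.771, 0.773])

t_stat, p_value = stats.ttest_rel(pssch_runs, gcsch_runs)  # paired, two-sided t-test
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```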
| Task | Methods | MIRFLICKR-25K 16 Bits | MIRFLICKR-25K 32 Bits | MIRFLICKR-25K 64 Bits | NUS-WIDE 16 Bits | NUS-WIDE 32 Bits | NUS-WIDE 64 Bits | MS COCO 16 Bits | MS COCO 32 Bits | MS COCO 64 Bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Img2Txt | PSSCH-1 | 0.783 | 0.789 | 0.801 | 0.661 | 0.669 | 0.674 | 0.641 | 0.672 | 0.711 |
|  | PSSCH-2 | 0.711 | 0.723 | 0.732 | 0.604 | 0.615 | 0.624 | 0.610 | 0.623 | 0.644 |
|  | PSSCH-3 | 0.764 | 0.771 | 0.784 | 0.643 | 0.652 | 0.659 | 0.638 | 0.664 | 0.697 |
|  | PSSCH (Ours) | 0.794 | 0.797 | 0.818 | 0.666 | 0.678 | 0.684 | 0.644 | 0.689 | 0.723 |
| Txt2Img | PSSCH-1 | 0.771 | 0.776 | 0.786 | 0.665 | 0.673 | 0.682 | 0.645 | 0.684 | 0.709 |
|  | PSSCH-2 | 0.703 | 0.709 | 0.713 | 0.594 | 0.603 | 0.609 | 0.604 | 0.618 | 0.625 |
|  | PSSCH-3 | 0.752 | 0.758 | 0.767 | 0.641 | 0.647 | 0.658 | 0.630 | 0.662 | 0.689 |
|  | PSSCH (Ours) | 0.774 | 0.787 | 0.803 | 0.671 | 0.683 | 0.692 | 0.657 | 0.702 | 0.728 |