A Hybrid Deep Learning and Optimization Model for Enterprise Archive Semantic Retrieval

Shi, Xiaonan; Chen, Junhe; Wang, Yumo; Fu, Limei

doi:10.3390/app152312381

This is an early access version, the complete PDF, HTML, and XML versions will be available soon.

Open AccessArticle

A Hybrid Deep Learning and Optimization Model for Enterprise Archive Semantic Retrieval

¹

School of Artificial Intelligence and Computing, Xi’an University of Science and Technology, Xi’an 710054, China

²

School of Electronics and Signals, Northwestern Polytechnical University, Xi’an 710072, China

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(23), 12381; https://doi.org/10.3390/app152312381

Submission received: 13 September 2025 / Revised: 7 November 2025 / Accepted: 14 November 2025 / Published: 21 November 2025

Download Versions Notes

Abstract

By searching for and summarizing the relevant information of the enterprise, we can build relevant knowledge maps, supplement and enrich the existing knowledge base, and support existing experiments and subsequent algorithm improvements. The extracted input text of enterprise archives is described via relation extraction and semantic analysis to improve the efficiency of archive retrieval and reduce the cost of communication. On the basis of the analysis of previous research, an enterprise archive semantic retrieval algorithm based on deep learning technology is constructed, that is, the BERT + BiGRU + CRF + HHO_improved model, to extract the relevant information of the enterprise. In the model, the Bidirectional Encoder Representations from Transformers (BERT) model is used to preprocess the Chinese word embedding, and the question-and-answer data are generated from the actual enterprise file database. Next, a Bidirectional Gated Recursive Unit (BiGRU) is used with the attention mechanism to capture the contextual features of the sequence. The Conditional Random Field (CRF) classifier is subsequently used to classify the text related to the enterprise archives, and the obtained data are labeled in sequence. Moreover, the swarm intelligence algorithm is introduced to dynamically optimize the model parameters and data processing strategies further to improve the generalization ability and adaptability of the model. The Harris Hawks Optimizer Improved (HHO_improved) algorithm is used to optimize the parameters of the CRF module to increase the performance and efficiency of named entity recognition. On the independently constructed dataset, the advantages of our algorithm are verified via comparative experiments with a variety of semantic matching algorithms and ablation experiments on the CRF and HHO_improved. The CRF and HHO_improved play essential roles in improving model performance. The obtained knowledge extraction results are expected to supplement and enhance the existing knowledge base, simplify the workflow, assist the enterprise’s dynamic production task management, and improve the efficiency of enterprise operations. The proposed algorithm achieves an accuracy improvement of 36.33%, 43.88%, 15.24%, and 12.41% over the BERT, BiGRU, BERT + BiGRU, and BERT + BiGRU + CRF models, respectively.

Keywords: semantic retrieval; enterprise archives; deep learning; dynamic optimization; Harris Hawks Optimizer; Named Entity Recognition

Share and Cite

MDPI and ACS Style

Shi, X.; Chen, J.; Wang, Y.; Fu, L. A Hybrid Deep Learning and Optimization Model for Enterprise Archive Semantic Retrieval. Appl. Sci. 2025, 15, 12381. https://doi.org/10.3390/app152312381

AMA Style

Shi X, Chen J, Wang Y, Fu L. A Hybrid Deep Learning and Optimization Model for Enterprise Archive Semantic Retrieval. Applied Sciences. 2025; 15(23):12381. https://doi.org/10.3390/app152312381

Chicago/Turabian Style

Shi, Xiaonan, Junhe Chen, Yumo Wang, and Limei Fu. 2025. "A Hybrid Deep Learning and Optimization Model for Enterprise Archive Semantic Retrieval" Applied Sciences 15, no. 23: 12381. https://doi.org/10.3390/app152312381

APA Style

Shi, X., Chen, J., Wang, Y., & Fu, L. (2025). A Hybrid Deep Learning and Optimization Model for Enterprise Archive Semantic Retrieval. Applied Sciences, 15(23), 12381. https://doi.org/10.3390/app152312381

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Article metric data becomes available approximately 24 hours after publication online.

Article Menu

A Hybrid Deep Learning and Optimization Model for Enterprise Archive Semantic Retrieval

Abstract

Share and Cite

Article Metrics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI