Article

Talent Supply and Demand Matching Based on Prompt Learning and the Pre-Trained Language Model

1 Laboratory of Digital Manufacturing, School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China
2 Hebei Key Laboratory of Intelligent Assembly and Detection Technology, Tangshan Research Institute, Beijing Institute of Technology, Tangshan 063000, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(5), 2536; https://doi.org/10.3390/app15052536
Submission received: 29 January 2025 / Revised: 20 February 2025 / Accepted: 25 February 2025 / Published: 26 February 2025

Abstract
In the context of the accelerating new technological revolution and industrial transformation, the issue of talent supply and demand matching has become increasingly urgent. Precisely matching talent supply with demand is a critical factor in expediting the implementation of technological innovations. However, traditional methods that rely on interpersonal networks for collecting talent abilities, transmitting demands, and matching the two are inefficient and often influenced by the subjective intentions of intermediaries, posing significant limitations. To address this challenge, we propose a novel approach, named TSDM, for talent supply and demand matching. TSDM leverages prompt learning with pre-trained large language models to extract detailed expressions of talent ability and demand from unstructured documents, while utilizing the powerful text comprehension capabilities of pre-trained models for feature embedding. Furthermore, TSDM employs talent-specific and demand-specific encoding networks to learn deep representations of talent and demand features, capturing their comprehensive characteristics. In a series of comparative experiments, we validated the effectiveness of the proposed model. The results demonstrate that TSDM significantly enhances the accuracy of talent supply and demand matching, offering a promising approach to optimizing human resource allocation.

1. Introduction

In the National Medium- and Long-Term Talent Development Plan, talent is defined as individuals who possess specialized knowledge or skills, engage in creative labor, and contribute to society, representing the higher-capability and higher-quality segment of the workforce. Talent serves as the engine of economic development, and accurately matching talent supply with demand can accelerate the pace of technological advancement. However, the primary obstacle to matching talent supply and demand is information asymmetry [1,2]. Enterprises are in urgent need of highly skilled and qualified talent to drive technological innovation and industrial upgrading [3]. Yet, many talents face a disconnect between their skills and available opportunities, often struggling to find platforms where their innovations can be effectively translated into practical applications [4]. Traditional methods for talent supply and demand matching, typically driven by interpersonal networks, are limited in scope and have a low success rate [5]. These processes are time-consuming and labor-intensive, and their applicability in broader contexts is constrained. Therefore, there is a pressing need for a computer-based system that can efficiently match talent supply and demand.
Over the past several years, various computational methods have been proposed to address the challenges associated with talent supply and demand matching. These approaches can be broadly classified into content-based recommendation [6,7,8], collaborative filtering [9,10], and hybrid recommendation [11]. Content-based recommendation primarily focuses on the relevance and similarity of textual features, disregarding user behavior. It relies on analyzing item attributes or content descriptors to suggest items that are most similar to those with which a user has interacted or shown interest. For example, Zhang et al. [12] proposed a knowledge distillation-based model for vectorizing processed text data in a job corpus, incorporating convolutional neural networks (CNN) [13] and bidirectional long short-term memory (Bi-LSTM) [14] networks to classify job categories in the vectorized data. Similarly, Liu et al. [15] introduced a deep-learning-based personalized recommendation algorithm, specifically addressing the talent screening challenges faced by large domestic enterprises in the recruitment process.
In contrast, collaborative filtering leverages user behavior data to identify patterns and similarities between users, recommending items based on the preferences or actions of others with comparable interests or behaviors. Collaborative filtering can be further categorized into user-based [16] or item-based [17] methods, depending on whether the focus is on identifying similar users or similar items. For instance, Yao et al. [18] proposed an optimal talent allocation mechanism, which introduces the concepts of talent–job matching and talent utilization rates as key evaluation criteria for optimal resource configuration. Shan et al. [19] introduced a nonlinear attention similarity model (NASM) for project-based collaborative filtering, which incorporates local attention embeddings and integrates novel nonlinear attention mechanisms to capture both local and global project information.
Hybrid recommendation methods combine elements from both content-based and collaborative filtering approaches, aiming to capitalize on the strengths of each method. For instance, Girase et al. [20] combined the advantages of collaborative filtering and content-based filtering algorithms to enhance recommendation effectiveness by leveraging an interactive user persona. Chou et al. [21] developed a system to recommend job vacancies by analyzing trends in online discussions. Biswas et al. [22] proposed a hybrid recommender system that combines alternating least squares (ALS)-based collaborative filtering with deep learning to improve recommendation performance.
These computational methods have shown promising results in specific contexts. However, significant challenges remain in refining these models to effectively address the complexity and dynamic nature of talent supply and demand matching in real-world scenarios. For instance, many demands and talent abilities are described in natural language [23,24,25], and conventional processing methods (e.g., rule-based parsing [26], keyword extraction [27], and text categorization [28]) are insufficient for feature extraction from such unstructured data. Furthermore, existing methods commonly use talent evaluation indices (TEI) [29] to assess talent skills. However, TEI often relies on human expertise, which hinders the model’s ability to uncover the true underlying patterns in raw data. Additionally, missing data in TEI can lead to biased learning and inaccurate model representations. To address the aforementioned challenges, we propose a novel approach, TSDM, for matching talent supply and demand. TSDM employs prompt learning [30] to extract talent ability and the corresponding demand information from unstructured documents. Subsequently, a BERT-based pre-trained language model [31] is utilized to encode the talent ability and demand data, generating feature embeddings. Talent-ability-specific and demand-specific encoders are designed to capture the contextual relationships within these embeddings, and a decoding network learns the similarity between talent ability and demand. In a series of comparative experiments, TSDM demonstrated superior performance, highlighting the effectiveness of prompt learning and large language models in predicting talent supply and demand matching. In summary, TSDM makes three main contributions:
  • Proposing a method that leverages prompt learning combined with large language models to extract highly discriminative descriptions of talent ability and demand information from unstructured data.
  • Utilizing a pre-trained large language model (BERT) to generate feature embeddings that effectively capture the contextual relationships within the text, reflecting the nuanced dependencies and long-range interactions inherent in the data.
  • Leveraging talent-ability-specific and demand-specific encoding networks, which consist of a 1D-CNN and a Bi-LSTM, to capture both local and global representations of talent ability and demand, thus providing a comprehensive expression of these features.

2. Materials and Methods

2.1. Benchmark Datasets

To evaluate the performance of the proposed model, we collected a new dataset from the National Natural Science Foundation of China (NSFC) database, encompassing projects in popular research areas (hot topics are provided in Appendix A, Table A1) completed over the past five years (2020–2024). The dataset covers eight disciplinary categories and comprises a total of 18,090 records, as summarized in Table 1. The NSFC database archives critical information on the technical capabilities and research directions of high-tech talent, offering valuable insights into future research trends. The constructed dataset is used to validate the model’s capabilities in two key areas: recommendation and prediction (the cold start problem). The recommendation task matches existing talent with new demands, while the prediction task identifies whether new talent matches particular demands. This dual setup evaluates the model’s generalization ability in semantic understanding and the cold start problem [32]. Specifically, projects initiated before 2018 were assigned to the training set; projects initiated after 2018 whose researchers appear in the training set were placed in the recommendation test set, and the remaining projects were allocated to the matching test set.
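For illustration only, the following is a minimal sketch of this year-based partitioning, assuming the records are loaded into a pandas DataFrame with hypothetical start_year and researcher_id columns (the actual field names and the handling of projects started exactly in 2018 may differ in the NSFC export):

    import pandas as pd

    projects = pd.read_csv("nsfc_projects.csv")            # hypothetical file name

    train = projects[projects["start_year"] < 2018]        # projects initiated before 2018
    later = projects[projects["start_year"] >= 2018]       # later projects form the two test sets

    # Researchers already seen during training go to the recommendation test set;
    # projects by unseen researchers form the matching (cold start) test set.
    seen = later["researcher_id"].isin(train["researcher_id"])
    recommendation_test = later[seen]
    matching_test = later[~seen]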

2.2. Feature Representation

The dataset collected from the NSFC database contains multiple fields, incorporating both structured data (e.g., project identifiers, user identifiers) and unstructured data (e.g., project title, project abstracts, and conclusions). We use the ideas presented in the project title, project abstracts, and conclusions as a source for evaluating talent ability. Additionally, the project abstracts and conclusions also contain valuable information related to project demands. This focus is motivated by two main reasons: first, the current evaluation system emphasizes the practical problem-solving abilities of talent, moving beyond a narrow focus on academic qualifications and publications. Second, abstracts and conclusions undergo expert review, meaning they reflect not only the practices endorsed by peer reviewers but also the key challenges and research issues of the time.

2.3. The Architecture of TSDM

In this study, we propose a talent supply and demand matching method based on prompt learning and the large language model. The proposed method consists of four key components: a prompt-learning-based unstructured data preprocessing module, a BERT-based feature embedding module, a talent-ability-specific and demand-specific encoding network, and a classification prediction network. Since both demand and talent ability are embedded within project abstracts and final reports, we leveraged the capabilities of Qwen [33] and employed prompt learning to extract relevant information regarding demand and talent ability from these documents. To further capture the contextual relationships within the extracted text, we utilized a pre-trained BERT model to encode the text and generate feature embeddings. The subsequent encoders capture the contextual relationships within the demand or talent ability descriptions to form a comprehensive semantic representation of each sentence. Finally, a multi-layer perceptron (MLP)-based decoder network integrates the results from the ability and demand encoders, producing the final prediction. The workflow of TSDM is illustrated in Figure 1.

2.4. Feature Extracting with Prompt Learning

Unstructured data often contain a wealth of valuable information [34], yet extracting key insights from such data presents a significant challenge [35]. In recent years, generative artificial intelligence technologies, such as ChatGPT [36] and Qwen, have found widespread application in text processing [37,38]. By leveraging these technologies, we employed prompt learning to extract relevant information. Prompt learning is a technique in natural language processing (NLP) in which a pre-trained language model is given a carefully designed “prompt” (or input) to elicit a desired output or response. Specifically, we provided the Qwen model with project abstracts and final reports and issued targeted prompts, such as “[Extract the research questions in their original form]” and “[Extract the proposed methods in their original form]”, to extract the relevant information. The results returned by the model were then analyzed for further use. It is important to note that these results are still unstructured.
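As an illustration, the snippet below sketches how such a verbatim-extraction prompt could be issued to an instruction-tuned Qwen checkpoint through the Hugging Face transformers chat interface. The checkpoint name, the system instruction, and the abstract_text placeholder are assumptions made for illustration rather than the exact serving configuration used in our experiments:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen2.5-7B-Instruct"                    # illustrative checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

    abstract_text = "..."                                     # a project abstract or final report
    messages = [
        {"role": "system", "content": "Copy the requested sentences verbatim; do not paraphrase."},
        {"role": "user", "content": "[Extract the research questions in their original form]\n\n" + abstract_text},
    ]

    # Build the chat prompt, generate deterministically, and keep only the newly generated tokens.
    input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=False)
    extracted = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)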

2.5. Feature Embedding with the Pre-Trained BERT Model

Recent advancements in large language models have led to significant progress in the field of NLP [39,40]. Through semi-supervised training on vast amounts of data, large language models are capable of deeply learning contextual relationships between words and sentences. Building on this foundation, we employed the pre-trained BERT model (BERT-Base-Chinese) to process talent ability and demand information. BERT-Base-Chinese is a Chinese-specific pre-trained model based on the BERT architecture, comprising 12 layers of transformer encoders and trained to capture the linguistic characteristics of the Chinese language. In our approach, we utilized the token-level hidden states from the final encoding layer of the BERT-Base-Chinese model to represent each sentence. This results in an L × 768 feature embedding, where L denotes the length of the sentence.
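For reference, a minimal sketch of this embedding step with the Hugging Face transformers library is given below; truncating inputs to 512 tokens is an assumption about how over-length documents would be handled, since BERT-Base-Chinese accepts at most 512 tokens:

    import torch
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    bert = BertModel.from_pretrained("bert-base-chinese")

    text = "..."                                              # an extracted ability or demand sentence
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = bert(**inputs)

    token_embeddings = outputs.last_hidden_state              # shape (1, L, 768): one 768-d vector per token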

2.6. Demand-Specific and Talent-Ability-Specific Encoding Network

The description of talent ability and demand often depends on the contextual relationships between words. To extract meaningful information from paragraphs containing talent ability or demand-related content, we employed a hybrid model combining CNN and Bi-LSTM networks, both of which are widely used in NLP [41,42]. CNNs have achieved remarkable success in fields such as image processing [43], object detection [44], and semantic segmentation [45], capturing word-level features and structural patterns of phrases through local convolutional operations. In contrast, Bi-LSTM captures bidirectional dependencies within sequences by utilizing both forward and backward network layers, enabling the model to learn contextual information and understand long-range dependencies. Combining CNN and Bi-LSTM therefore leverages the local feature extraction capability of the CNN and the sequential modeling strength of the Bi-LSTM. Specifically, the demand-specific and talent-ability-specific encoding networks share identical architectures, each consisting of three convolutional layers followed by average pooling and a rectified linear unit (ReLU) activation function. The hidden state size of the Bi-LSTM was set to 64, with a dropout rate of 0.5. Finally, a 1 × 128 embedding vector was generated for each sentence containing either demand or talent ability information.
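A PyTorch sketch of one such encoder is shown below. The layer ordering and the stated hyperparameters follow the description above (three 1D convolutions, average pooling, ReLU, a Bi-LSTM with hidden size 64, dropout 0.5, 128-d output); the intermediate channel width, kernel size, and placement of the intermediate activations are assumptions made for illustration:

    import torch
    import torch.nn as nn

    class AbilityDemandEncoder(nn.Module):
        """Encodes a (B, L, 768) BERT token embedding into a (B, 128) sentence vector."""
        def __init__(self, in_dim=768, conv_dim=256, hidden=64, dropout=0.5):
            super().__init__()
            # Three 1D convolutions over the token axis capture local phrase-level patterns.
            self.convs = nn.Sequential(
                nn.Conv1d(in_dim, conv_dim, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv1d(conv_dim, conv_dim, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv1d(conv_dim, conv_dim, kernel_size=3, padding=1),
                nn.AvgPool1d(kernel_size=2),
                nn.ReLU(),
            )
            # Bi-LSTM with hidden size 64; forward + backward final states give a 128-d summary.
            self.bilstm = nn.LSTM(conv_dim, hidden, batch_first=True, bidirectional=True)
            self.dropout = nn.Dropout(dropout)

        def forward(self, x):                                 # x: (B, L, 768)
            h = self.convs(x.transpose(1, 2))                 # (B, conv_dim, L/2)
            h = h.transpose(1, 2)                             # (B, L/2, conv_dim)
            _, (h_n, _) = self.bilstm(h)                      # h_n: (2, B, 64)
            out = torch.cat([h_n[0], h_n[1]], dim=-1)         # (B, 128)
            return self.dropout(out)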

2.7. The Classification Prediction Network

A multi-layer perceptron (MLP) was introduced to decode the similarity from the concatenated talent ability and demand feature embeddings. More specifically, the MLP consisted of three layers. To mitigate overfitting, a dropout rate of 0.5 was applied after each MLP layer except the final one. A softmax layer was then added on top of the final MLP layer to generate the ultimate prediction.
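A sketch of this prediction head is given below, assuming a binary match/no-match output and illustrative hidden widths (only the three-layer structure, the 0.5 dropout, and the softmax output are taken from the description above):

    import torch
    import torch.nn as nn

    class MatchHead(nn.Module):
        """Three-layer MLP over the concatenated 128-d ability and 128-d demand vectors."""
        def __init__(self, in_dim=256, hidden=128, n_classes=2, dropout=0.5):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(dropout),
                nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(dropout),
                nn.Linear(hidden, n_classes),                 # no dropout after the final layer
            )

        def forward(self, ability_vec, demand_vec):
            logits = self.net(torch.cat([ability_vec, demand_vec], dim=-1))
            return torch.softmax(logits, dim=-1)              # match probability per class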

2.8. Loss Function

The NSFC dataset exhibits an imbalance across different disciplinary categories. Consequently, deep learning models may disproportionately prioritize the majority class within the loss function, thereby reducing their focus on the minority class. To address this issue, we employed the focal loss [46] function, a strategic approach that mitigates the impact of class imbalance. This is achieved by assigning lower weights to easier examples, while placing greater emphasis on more challenging instances. The formulation of the focal loss function is shown in Equation (1):
L(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)   (1)
where $p_t$ represents the predicted probability of the true class; $\alpha_t$ denotes a balancing factor that assigns different weights to different class samples; and $\gamma$ is the focusing parameter, which determines the degree of emphasis placed on difficult examples.
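A minimal PyTorch implementation of Equation (1) is sketched below; the alpha and gamma values shown are common defaults used purely for illustration, not the settings tuned in our experiments:

    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
        """Focal loss: -alpha * (1 - p_t)^gamma * log(p_t), averaged over the batch."""
        log_p = F.log_softmax(logits, dim=-1)                     # (B, C) log-probabilities
        log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1) # log p_t of the true class
        pt = log_pt.exp()
        loss = -alpha * (1.0 - pt) ** gamma * log_pt              # down-weights easy examples
        return loss.mean()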

2.9. Performance Evaluation Metrics

To assess and compare the proposed method with existing approaches, we evaluated performance using several metrics, including Macro-Precision (M-Pre), Macro-Recall (M-Rec), Macro-F1-score (M-F1), and Accuracy (Acc). The formulas for calculating these evaluation metrics are shown in Equations (2)–(8):
\text{M-Pre} = \frac{1}{k} \sum_{i=1}^{k} \text{Precision}_i   (2)
\text{M-Rec} = \frac{1}{k} \sum_{i=1}^{k} \text{Recall}_i   (3)
\text{M-F1} = \frac{1}{k} \sum_{i=1}^{k} \text{F1-score}_i   (4)
\text{Acc} = \frac{TP + TN}{TP + FN + TN + FP}   (5)
\text{Precision} = \frac{TP}{TP + FP}   (6)
\text{Recall} = \frac{TP}{TP + FN}   (7)
\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}   (8)
where TP, FN, TN, and FP denote the number of true positive, false negative, true negative, and false positive instances, respectively, and k denotes the number of predicted categories.
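These macro-averaged metrics can be computed directly with scikit-learn; the helper below is a small sketch assuming y_true and y_pred are integer class labels:

    from sklearn.metrics import accuracy_score, precision_recall_fscore_support

    def evaluate(y_true, y_pred):
        """Return Acc, M-Pre, M-Rec, and M-F1 as defined in Equations (2)-(5)."""
        acc = accuracy_score(y_true, y_pred)
        m_pre, m_rec, m_f1, _ = precision_recall_fscore_support(
            y_true, y_pred, average="macro", zero_division=0)
        return {"Acc": acc, "M-Pre": m_pre, "M-Rec": m_rec, "M-F1": m_f1}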

3. Results and Discussion

3.1. Feature Embeddings Extracted by the BERT Language Model Were Effective for Talent Supply and Demand Matching

To evaluate the effectiveness of text feature embeddings derived from the BERT language model, we conducted a series of feature ablation studies. These studies compare BERT embeddings with those generated using Word2Vec [47] and GloVe [48]. Specifically, we utilized a pre-trained Word2Vec model trained on Chinese Wikipedia [49] data and a pre-trained GloVe model trained on a subset of Wikipedia to generate the respective feature representations. The results of the comparative experiments are presented in Table 2.
As shown in Table 2, for the recommendation task, the feature embeddings based on the BERT model yielded the best results, achieving an accuracy of 82.46%, an M-Pre of 80.40%, an M-Rec of 77.15%, and an M-F1 of 77.48%. In terms of the M-F1 metric, the BERT-based features outperformed those of Word2Vec and GloVe by 2.42% and 1.59%, respectively. For the prediction task, the model using BERT-derived features achieved an accuracy of 78.39%, M-Pre of 77.47%, M-Rec of 72.29%, and M-F1 of 73.21%, surpassing Word2Vec and GloVe by at least 2.06%, 2.14%, 0.79%, and 1.52%, respectively.
Although all models utilize pre-trained embeddings, BERT is built on the transformer architecture, which is particularly effective at capturing global contextual relationships within and between sentences. This capability significantly enhances prediction performance in downstream tasks.
Additionally, as shown in Figure 2, we present the detailed prediction results of the TSDM model across the disciplinary categories. The results reveal that, in the recommendation task, the highest talent supply and demand matching performance was observed in the Life and Medical disciplines, with F1-scores of 92.00% and 91.04%, respectively. In contrast, the Mathematical discipline achieved the lowest F1-score, at only 56.23%. In the prediction task, the Medical discipline also exhibited the highest performance, with an F1-score of 90.03%, while the Mathematical discipline achieved the lowest F1-score at 52.12%. The superior performance in the Life and Medical disciplines can be attributed to the larger training set sizes, which allow the model to better learn the relationships between talent skills and demand within these fields. The low prediction performance of the Mathematical discipline may be due to the more fragmented nature of knowledge within the mathematical field, which makes it more difficult for the model to capture common patterns.

3.2. Comparison of Prompt Learning Effectiveness Across Different Large Language Models

In this study, we leveraged large language models and prompt learning to extract talent ability and demand from the documents. The selection of an appropriate large language model and the design of effective prompts are critical to the success of this approach. Given that generative artificial intelligence models often suffer from hallucination [50], we specifically designed our prompts to ensure that the model extracts the relevant original content from the input text without making any modifications. Specifically, we evaluated several large language models, including ChatGPT [36] (developed by OpenAI, version GPT-4o mini), Qwen [33] (developed by Alibaba Cloud, version 2.5), Doubao [51] (developed by ByteDance, version 1.5), and ERNIE [52] (developed by Baidu, version 4.0). To assess model performance, we randomly selected 100 unstructured text samples from datasets across various disciplinary categories. The results of the performance comparison are presented in Figure 3.
Figure 3 indicates that, under prompt learning, the different large language models are able to accurately extract relevant information from text to generate responses, with accuracy approaching that of manually annotated results. We also observed notable performance differences across models during the prompt learning process. Specifically, ChatGPT demonstrated superior performance on English sentences compared to Qwen, Doubao, and ERNIE, which are Chinese-language-based models, whereas Qwen outperformed the other models on Chinese sentences. The training corpora used for these models influence their performance on downstream tasks. Given that our primary task involves processing large volumes of Chinese text, the Qwen model was selected for prompt learning.

3.3. The Impact of Sentence-Level and Token-Level Feature Embedding from the Pre-Trained BERT Model

The output of BERT includes both sentence-level and token-level embeddings. To compare how these two types of output perform on talent supply and demand matching, we used sentence-level and token-level features in separate models. As sentence-level features already summarize the entire sentence, an MLP network was utilized for feature learning. In contrast, token-level features represent word-level information; therefore, a CNN+Bi-LSTM network was used to learn the representations of these tokens. The results of the comparative experiments are presented in Table 3.
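Concretely, both representations come from the same BERT forward pass; the short sketch below (continuing the embedding example from Section 2.5) shows which output feeds which downstream network, with the pooled [CLS] vector standing in for the sentence-level embedding as an assumption about the pooling choice:

    outputs = bert(**inputs)
    token_level = outputs.last_hidden_state    # (1, L, 768): per-token vectors, fed to the CNN+Bi-LSTM encoder
    sentence_level = outputs.pooler_output     # (1, 768): pooled [CLS] vector, fed directly to the MLP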
As shown in Table 3, both recommendation and prediction tasks demonstrated superior performance when using token-level feature embedding compared to sentence-level feature embedding. Specifically, for the recommendation task, token-level embedding achieved Acc, M-Pre, M-Rec, and M-F1 of 82.46%, 80.40%, 77.15%, and 77.48%, respectively, outperforming sentence-level embedding by 6.41%, 9.98%, 9.96%, and 11.10%. In the prediction task, token-level embedding achieved values of 78.39%, 77.47%, 72.29%, and 73.21% for these metrics, respectively, surpassing sentence-level embedding by 2.29%, 3.26%, 5.1%, and 3.8%, respectively.
Furthermore, Figure 4 presents a comparative performance analysis of token-level and sentence-level embedding across different disciplinary categories. The results indicate that token-level embedding consistently outperforms sentence-level embedding in all categories. While sentence-level embedding captures the overall meaning of a sentence, it is more challenging to fine-tune for specific downstream tasks using diverse neural network architectures. In contrast, token-level embedding is more effectively fine-tuned for task-specific applications, enabling superior performance across various downstream deep learning models. Thus, token-level embedding offers a greater advantage over sentence-level embedding in this context.

3.4. Ablation Study

To evaluate the impact of different components within the TSDM network on the prediction results, we conducted a series of ablation experiments. Specifically, we sequentially employed MLP, Bi-LSTM, and CNN+Bi-LSTM networks within the TSDM framework to encode talent ability and demand features, assessing their individual contributions to model performance. Notably, in the MLP network, token-level features were averaged column-wise before being input into the MLP. The comparative results are presented in Figure 5.
Figure 5 indicates that for both recommendation and prediction tasks, the CNN+Bi-LSTM feature encoding network outperformed the Bi-LSTM and MLP feature encoding networks. Specifically, for the recommendation task, the CNN+Bi-LSTM achieved an M-F1 of 77.48%, which is an improvement of 1.47% and 2.27% over Bi-LSTM and MLP, respectively. In the prediction task, the CNN+Bi-LSTM achieved an M-F1 of 73.21%, surpassing Bi-LSTM and MLP by 0.78% and 3.19%, respectively. These findings suggest that averaging token-level information fails to adequately capture the complete sentence context, leading to the loss of valuable data and a subsequent decrease in model performance. In contrast, directly employing a Bi-LSTM network to extract sentence-level information from token-level embedding proves to be effective. However, it is noteworthy that augmenting the Bi-LSTM network with a convolutional layer further enhances model performance. The convolutional network enables a better capture of neighboring token context, thereby improving the Bi-LSTM’s ability to extract richer, more accurate sentence-level representations.

4. Conclusions

In this study, we developed a novel deep learning-based method, TSDM, for talent supply and demand matching prediction. Specifically, TSDM employs prompt learning and large language models to distill talent ability and demand information from unstructured data. The architecture harnesses a pre-trained language model (BERT) to encode the extracted text, generating feature embeddings. Ability-specific and demand-specific encoders are then applied to learn the overall representation of these statements from the feature embeddings. The experimental results demonstrate that TSDM achieved superior performance. We also conducted several experiments to validate the model’s advantages in feature extraction and component selection, including comparisons of the performance of different large language models in prompt learning, the encoding capability of various pre-trained models for text feature embeddings, the impact of sentence-level and token-level feature embeddings on prediction performance, and the influence of different encoding components within TSDM on overall performance. In summary, our findings suggest that TSDM’s strong performance is largely attributable to the combination of prompt learning, large language models, and feature encoders.
Although TSDM demonstrates impressive results, there remain opportunities for improvement. Firstly, our current implementation relies on a limited set of records and attributes from the National Natural Science Foundation database. Notably, additional data sources, such as talent social network information, publications, and patents, have not been incorporated. Leveraging these rich datasets could provide a more comprehensive description of talents, thereby potentially improving the predictive capabilities of TSDM. Secondly, in terms of multimodal feature fusion, TSDM currently employs a simple feature concatenation approach prior to the multilayer perceptron network. Exploring more sophisticated fusion strategies, such as integrating features during the encoding stage or designing self-attention-based fusion mechanisms, could uncover deeper latent information within the features, leading to further performance gains.
Furthermore, several promising directions for future research include (1) integrating interpersonal knowledge graphs to embed team-level knowledge into individual capability representations. This approach would not only refine talent supply-demand matching but also extend the framework to support team-level matching, addressing a critical gap in current methodologies. (2) Employing graph convolutional networks to effectively integrate talent capabilities, social interactions, and academic features. Such a strategy would enhance the interpretability of prediction results by capturing complex relational dynamics and providing a more holistic representation of talent profiles. These advancements hold significant potential for advancing the field and addressing existing limitations in talent matching systems.

Author Contributions

Conceptualization, K.L.; methodology, C.Z.; investigation, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. The hot topics in different academic fields.
Academic | Hot Topics
Mathematical | Ab Initio Calculations, Machine Learning, Numerical Simulation, Deep Learning, Two-Dimensional Materials
Chemical | Structure-Activity Relationship, Photocatalysis, Reaction Mechanism, Ionic Liquids, Electrocatalysis
Life | Molecular Mechanism, Gene Function, Transcription Factors, Rice, Regulatory Network
Earth | Climate Change, Numerical Simulation, Tibetan Plateau, Deep Learning, Model Simulation
Engineering and Materials | Mechanical Properties, Numerical Simulation, Nanocomposites, Composites, Multi-Field Coupling
Information | Deep Learning, Machine Learning, Artificial Intelligence, Privacy Protection, Edge Computing
Management | Machine Learning, Health Management, Big Data, Artificial Intelligence, Sustainable Development
Medical | Exosomes, Macrophages, Autophagy, Ferroptosis, Long Non-Coding RNA (lncRNA)

References

  1. Xie, Y. Study on Supply and Demand Matching of High-Skilled Talents in Strategic Emerging Industries in Shanghai. Oper. Res. Fuzziol. 2023, 13, 1601–1609. [Google Scholar] [CrossRef]
  2. Brunello, G.; Wruuck, P. Skill shortages and skill mismatch: A review of the literature. J. Econ. Surv. 2021, 35, 1145–1167. [Google Scholar] [CrossRef]
  3. Liu, Y. Talent scouting to accelerate technological innovation during the COVID-19 global health crisis: The role of organizational innovation in China. Eur. Manag. J. 2024; in press. [Google Scholar] [CrossRef]
  4. Liu, X. Research on the Impact of Technological Talent Loss in High tech Enterprises on the Construction of Enterprise Talent Teams. Front. Bus. Econ. Manag. 2023, 9, 99–102. [Google Scholar] [CrossRef]
  5. Makarius, E.E.; Srinivasan, M. Addressing skills mismatch: Utilizing talent supply chain management to enhance collaboration between companies and talent suppliers. Bus. Horiz. 2017, 60, 495–505. [Google Scholar] [CrossRef]
  6. Jooss, S.; Collings, D.G.; McMackin, J.; Dickmann, M. A skills-matching perspective on talent management: Developing strategic agility. Hum. Resour. Manag. 2024, 63, 141–157. [Google Scholar] [CrossRef]
  7. Moheb-Alizadeh, H.; Handfield, R.B. Developing Talent from a Supply–Demand Perspective: An Optimization Model for Managers. Logistics 2017, 1, 5. [Google Scholar] [CrossRef]
  8. Nath, R.K.; Ahmad, T. Content Based Recommender System: Methodologies, Performance, Evaluation and Application. In Proceedings of the 2022 4th International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), Greater Noida, India, 16–17 December 2022; pp. 423–428. [Google Scholar]
  9. Mustafa, N.; Ibrahim, A.O.; Ahmed, A.; Abdullah, A. Collaborative filtering: Techniques and applications. In Proceedings of the 2017 International Conference on Communication, Control, Computing and Electronics Engineering (ICCCCEE), Khartoum, Sudan, 16–18 January 2017; pp. 1–6. [Google Scholar]
  10. Koren, Y.; Rendle, S.; Bell, R. Advances in Collaborative Filtering. In Recommender Systems Handbook; Ricci, F., Rokach, L., Shapira, B., Eds.; Springer: New York, NY, USA, 2022; pp. 91–142. [Google Scholar]
  11. Chaudhari, A.; Seddig, A.A.H.; Sarlan, A.; Raut, R. A Hybrid Recommendation System: A Review. IEEE Access 2024, 12, 157107–157126. [Google Scholar] [CrossRef]
  12. Zhang, L.; Wei, L. College Student Employment Recommendation Algorithm Based on Convolutional Neural Network. In Proceedings of the 2022 International Conference on Education, Network and Information Technology (ICENIT), Liverpool, UK, 2–3 September 2022; pp. 71–74. [Google Scholar]
  13. Derry, A.; Krzywinski, M.; Altman, N. Convolutional neural networks. Nat. Methods 2023, 20, 1269–1270. [Google Scholar] [CrossRef]
  14. Nguyen, N.K.; Le, A.-C.; Pham, H.T. Deep Bi-Directional Long Short-Term Memory Neural Networks for Sentiment Analysis of Social Data; Springer: Cham, Switzerland, 2016; pp. 255–268. [Google Scholar]
  15. Youping, L.; Jiangang, S. Talent Recruitment Platform for Large-Scale Group Enterprises Based on Deep Learning. In Proceedings of the 2022 7th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), Chengdu, China, 22–24 April 2022; pp. 172–176. [Google Scholar]
  16. Zhao, Z.D.; Shang, M.S. User-Based Collaborative-Filtering Recommendation Algorithms on Hadoop. In Proceedings of the 2010 Third International Conference on Knowledge Discovery and Data Mining, Phuket, Thailand, 9–10 January 2010; pp. 478–481. [Google Scholar]
  17. Sarwar, B.; Karypis, G.; Konstan, J.; Riedl, J. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web, Hong Kong, Hong Kong, 1–5 May 2001; pp. 285–295. [Google Scholar]
  18. Yao, S.; Yi, Z.; Zhang, L. Researches on the best-fitted talents recommendation algorithm. In Proceedings of the 27th Chinese Control and Decision Conference (2015 CCDC), Qingdao, China, 23–25 May 2015; pp. 4247–4252. [Google Scholar]
  19. Shan, Z.P.; Lei, Y.Q.; Zhang, D.F.; Zhou, J. NASM: Nonlinearly Attentive Similarity Model for Recommendation System via Locally Attentive Embedding. IEEE Access 2019, 7, 70689–70700. [Google Scholar] [CrossRef]
  20. Girase, S.; Powar, V.; Mukhopadhyay, D. A user-friendly college recommending system using user-profiling and matrix factorization technique. In Proceedings of the 2017 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 5–6 May 2017; pp. 1–5. [Google Scholar]
  21. Chou, Y.C.; Yu, H.Y. Based on the application of AI technology in resume analysis and job recommendation. In Proceedings of the 2020 IEEE International Conference on Computational Electromagnetics (ICCEM), Singapore, 24–26 August 2020; pp. 291–296. [Google Scholar]
  22. Biswas, P.K.; Liu, S. A hybrid recommender system for recommending smartphones to prospective customers. Expert Syst. Appl. 2022, 208, 118058. [Google Scholar] [CrossRef]
  23. Ponnaboyina, R.; Makala, R.; Venkateswara Reddy, E. Smart Recruitment System Using Deep Learning with Natural Language Processing; Springer: Singapore, 2022; pp. 647–655. [Google Scholar]
  24. Aničin, L.; Stojmenović, M. Understanding Job Requirements Using Natural Language Processing. In International Scientific Conference—Sinteza 2022; Singidunum University: Belgrade, Serbia, 2022. [Google Scholar]
  25. Abisha, D.; Keerthana, S.; Kavitha, K.; Ramya, R. Resspar: AI-Driven Resume Parsing and Recruitment System using NLP and Generative AI. In Proceedings of the 2024 Second International Conference on Intelligent Cyber Physical Systems and Internet of Things (ICoICI), Coimbatore, India, 28–30 August 2024; pp. 1–6. [Google Scholar]
  26. Strübbe, S.M.; Grünwald, A.T.D.; Sidorenko, I.; Lampe, R. A Rule-Based Parser in Comparison with Statistical Neuronal Approaches in Terms of Grammar Competence. Appl. Sci. 2024, 15, 87. [Google Scholar] [CrossRef]
  27. Zhang, Y.; Tuo, M.; Yin, Q.; Qi, L.; Wang, X.; Liu, T. Keywords extraction with deep neural network model. Neurocomputing 2020, 383, 113–121. [Google Scholar] [CrossRef]
  28. Quazi, S.; Musa, S.M. Performing Text Classification and Categorization through Unsupervised Learning. In Proceedings of the 2023 1st International Conference on Advanced Engineering and Technologies (ICONNIC), Kediri, Indonesia, 14 October 2023; pp. 1–6. [Google Scholar]
  29. Qi, S. Evaluation index system of science and technology innovation think tank talents based on competency model. J. Comp. Methods Sci. Eng. 2024, 24, 1101–1117. [Google Scholar] [CrossRef]
  30. Zhu, Y.; Wang, Y.; Qiang, J.; Wu, X. Prompt-Learning for Short Text Classification. IEEE Trans. Knowl. Data Eng. 2024, 36, 5328–5339. [Google Scholar] [CrossRef]
  31. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar]
  32. Yuan, H.; Hernandez, A.A. User Cold Start Problem in Recommendation Systems: A Systematic Review. IEEE Access 2023, 11, 136958–136977. [Google Scholar] [CrossRef]
  33. Bai, J.; Bai, S.; Chu, Y.; Cui, Z.; Dang, K.; Deng, X.; Fan, Y.; Ge, W.; Han, Y.; Huang, F.; et al. Qwen Technical Report. arXiv 2023, arXiv:2309.16609. [Google Scholar]
  34. Tanwar, M.; Duggal, R.; Khatri, S.K. Unravelling unstructured data: A wealth of information in big data. In Proceedings of the 2015 4th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions), Noida, India, 2–4 September 2015; pp. 1–6. [Google Scholar]
  35. Adnan, K.; Akbar, R. An analytical study of information extraction from unstructured and multidimensional big data. J. Big Data 2019, 6, 91. [Google Scholar] [CrossRef]
  36. Ray, P.P. ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet Things Cyber-Phys. Syst. 2023, 3, 121–154. [Google Scholar] [CrossRef]
  37. Singh, S.; Singh, S.; Kraus, S.; Sharma, A.; Dhir, S. Characterizing generative artificial intelligence applications: Text-mining-enabled technology roadmapping. J. Innov. Knowl. 2024, 9, 100531. [Google Scholar] [CrossRef]
  38. Jovanović, M.; Campbell, M. Generative Artificial Intelligence: Trends and Prospects. Computer 2022, 55, 107–112. [Google Scholar] [CrossRef]
  39. Zubiaga, A. Natural language processing in the era of large language models. Front. Artif. Intell. 2023, 6, 1350306. [Google Scholar] [CrossRef] [PubMed]
  40. Mishra, T.; Sutanto, E.; Rossanti, R.; Pant, N.; Ashraf, A.; Raut, A.; Uwabareze, G.; Oluwatomiwa, A.; Zeeshan, B. Use of large language models as artificial intelligence tools in academic research and publishing among global clinical researchers. Sci. Rep. 2024, 14, 31672. [Google Scholar] [CrossRef] [PubMed]
  41. Younesi, R.T.; Tanha, J.; Namvar, S.; Mostafaei, S.H. A CNN-BiLSTM based deep learning model to sentiment analysis. In Proceedings of the 2024 20th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP), Babol, Iran, 21–22 February 2024; pp. 1–6. [Google Scholar]
  42. Zhang, L.; Xiang, F. Relation Classification via BiLSTM-CNN; Springer: Cham, Switzerland, 2018; pp. 373–382. [Google Scholar]
  43. Sharma, N.; Jain, V.; Mishra, A. An Analysis of Convolutional Neural Networks for Image Classification. Procedia Comput. Sci. 2018, 132, 377–384. [Google Scholar] [CrossRef]
  44. Amjoud, A.B.; Amrouch, M. Object Detection Using Deep Learning, CNNs and Vision Transformers: A Review. IEEE Access 2023, 11, 35479–35516. [Google Scholar] [CrossRef]
  45. Zou, N.; Xiang, Z.; Chen, Y.; Chen, S.; Qiao, C. Boundary-Aware CNN for Semantic Segmentation. IEEE Access 2019, 7, 114520–114528. [Google Scholar] [CrossRef]
  46. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327. [Google Scholar] [CrossRef]
  47. Jatnika, D.; Bijaksana, M.A.; Suryani, A.A. Word2Vec Model Analysis for Semantic Similarities in English Words. Procedia Comput. Sci. 2019, 157, 160–167. [Google Scholar] [CrossRef]
  48. Pennington, J.; Socher, R.; Manning, C. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
  49. Mostafa, M.M. Twenty years of Wikipedia in scholarly publications: A bibliometric network analysis of the thematic and citation landscape. Qual. Quant. 2023, 57, 5623–5653. [Google Scholar] [CrossRef]
  50. Farquhar, S.; Kossen, J.; Kuhn, L.; Gal, Y. Detecting hallucinations in large language models using semantic entropy. Nature 2024, 630, 625–630. [Google Scholar] [CrossRef]
  51. Zhu, G.; Zhao, B.; Tang, J. A Study of the AIGC-Enabled BOPPPS Smart Teaching Model. In Proceedings of the 2024 International Symposium on Artificial Intelligence for Education, Xi’an, China, 6–8 September 2024; pp. 166–170. [Google Scholar]
  52. Zhang, Z.; Han, X.; Liu, Z.; Jiang, X.; Sun, M.; Liu, Q. ERNIE: Enhanced Language Representation with Informative Entities. arXiv 2019, arXiv:1905.07129. [Google Scholar]
Figure 1. Workflow of TSDM. (A) The architecture of the TSDM model. The descriptive texts for demand and ability were first converted into feature embeddings using the BertTokenizer through the BERT model. Subsequently, the ability-specific and demand-specific encoding networks encoded these feature embeddings separately. The encoded features were then combined and fed into a multi-layer perceptron-based feature fusion and prediction classification module. (B) The architecture of prompt learning with the large language model. (C) The architecture of the encoding network for ability-specific and demand-specific feature learning. The encoding network consists of a 1D-CNN and a Bi-LSTM network, which extract the overall meaning of the paragraph by capturing both local and global features of the sentences. (D) The schematic diagram of the TSDM system.
Figure 2. The prediction performance of the TSDM model on recommendation and prediction tasks within different disciplinary categories. (ac) Value of precision, recall, and F1-score metrics in the recommendation task, respectively. (df) Value of precision, recall, and F1-score metrics in the prediction task, respectively.
Figure 3. Performance comparison of prompt learning across various LLMs in multilingual text contexts.
Figure 4. Performance comparison between token-level and sequence-level feature embedding on different disciplinary categories. (ac) Value of precision, recall, and F1-score metrics in the recommendation task, respectively. (df) Value of precision, recall, and F1-score metrics in the prediction task, respectively.
Figure 5. The performance comparison between different components (MLP, Bi-LSTM, and CNN+Bi-LSTM) in the TSDM network. (a) The performance comparison under the recommendation task. (b) The performance comparison under the prediction task.
Table 1. The statistics of the dataset.
Academic | Training | Matching Test Set | Recommendation Test Set
Mathematical | 884 | 211 | 25
Chemical | 1922 | 324 | 66
Life | 2667 | 605 | 104
Earth | 1570 | 223 | 38
Engineering and Materials | 2223 | 412 | 55
Information | 1673 | 286 | 24
Management | 360 | 72 | 12
Medical | 3337 | 952 | 45
Table 2. Performance comparison under different feature embedding models.
Mission Type | Model | Acc (%) | M-Pre (%) | M-Rec (%) | M-F1 (%)
Recommendation task | Word2Vec | 77.15 | 76.02 | 74.89 | 75.06
Recommendation task | GloVe | 79.40 | 77.34 | 75.45 | 75.89
Recommendation task | BERT | 82.46 | 80.40 | 77.15 | 77.48
Prediction task | Word2Vec | 72.32 | 70.01 | 70.03 | 70.96
Prediction task | GloVe | 76.33 | 75.33 | 71.50 | 71.69
Prediction task | BERT | 78.39 | 77.47 | 72.29 | 73.21
Table 3. Performance comparison between sentence-level and token-level feature embedding.
Mission Type | Feature Embedding | Acc (%) | M-Pre (%) | M-Rec (%) | M-F1 (%)
Recommendation task | sentence-level | 76.05 | 70.42 | 67.19 | 66.38
Recommendation task | token-level | 82.46 | 80.40 | 77.15 | 77.48
Prediction task | sentence-level | 76.10 | 74.21 | 67.19 | 69.41
Prediction task | token-level | 78.39 | 77.47 | 72.29 | 73.21
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
