BERT-Based Approaches for Web Service Selection and Recommendation: A Systematic Review with a Focus on QoS Prediction
Abstract
1. Introduction
1.1. Importance of QoS-Based Selection
1.2. Challenges in Web Service Selection and Recommendation
1.3. Why BERT for Web Service Selection?
1.4. Motivation and Objective
- Identifying gaps in the literature, such as how BERT can address the unique challenges of web service selection, recommendation, and QoS prediction, and how it compares against traditional models.
- Assessing the performance of BERT models by examining their strengths and weaknesses, and clarifying when and how they should be applied to web service tasks involving QoS attributes, distinguishing where BERT provides significant gains and where it does not.
- Identifying five high-impact research gaps, namely QoS attribute neglect, dataset reproducibility, model interpretability, scalability, and cost awareness, and translating them into a structured future research agenda.
2. Background
2.1. Web Service Selection and Recommendation
2.2. Existing Techniques for Service Selection and Recommendation
3. Systematic Review Methodology
3.1. Research Questions
3.2. Search Strategy and Selection
3.3. Data Extraction
3.4. Quality Assessment
- (1) Whether the objective is clearly defined;
- (2) Whether the methods or techniques are elaborated;
- (3) Whether the authors report findings and results based on proper data analysis.
4. Synthesis of the Literature
- (1) WS-DREAM (Liu et al.)—5825 services, 339 users
- (2) ProgrammableWeb (Meghazi et al.)—8400+ services
- (3) Stack Overflow (Alsayed et al.)—API documentation corpus
- (4) FullTextPeerRead (Jeong)—citation dataset
- (5) WS-DREAM (Liu et al.)—QoS prediction dataset
- (1) RMSE (Root Mean Square Error): 47.1% of evaluations (8/17)
- (2) MAE (Mean Absolute Error): 23.5% of evaluations (4/17)
- (3) Precision: 23.5% of evaluations (4/17)
- (4) NDCG (Normalized Discounted Cumulative Gain): 11.8% of evaluations (2/17)
- (5) Accuracy: 52.9% of evaluations (9/17)
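For readers unfamiliar with these evaluation metrics, a minimal illustrative sketch of RMSE, MAE, and NDCG@k in plain Python follows; the function names and example values are our own and do not come from any surveyed study.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error over paired QoS observations."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean Absolute Error over paired QoS observations."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def ndcg_at_k(relevances, k):
    """NDCG@k for a ranked list of graded relevance scores."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0

# Example: predicted vs. observed response times (seconds)
print(rmse([0.20, 0.35], [0.25, 0.30]))  # ≈ 0.05
```

RMSE and MAE measure QoS regression error (lower is better), while NDCG and Precision@k assess ranking quality of recommendation lists (higher is better), which is why studies in this review report different subsets of them depending on task.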
- (1) Best classification performance: Meghazi et al.—DeepLAB-WSC
- (2) Best QoS prediction performance: Liu et al.—llmQoS
- (3) Best semantic matching: Alam et al.—BERT variants
- (4) Best citation performance: Jeong—BERT+GCN
- (5) Best efficiency: Zeng et al.—Lightweight BERT
4.1. Application of BERT on Web Service Selection and Recommendation in the Context of QoS
4.2. Preprocessing Pipelines from Structured Service Descriptions to BERT Inputs
4.3. Advantages and Limitations of Using BERT Models for Web Service Selection and Recommendation
4.4. Future Directions of Using BERT in the Context of Web Service Selection and Recommendation as Well as QoS Predictions
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Hasnain, M.; Pasha, M.F.; Ghani, I.; Mehboob, B.; Imran, M.; Ali, A. Benchmark dataset selection of web services technologies: A factor analysis. IEEE Access 2020, 8, 53649–53665. [Google Scholar] [CrossRef]
- Sun, X.; Wang, S.; Xia, Y.; Zheng, W. Predictive-trend-aware composition of web services with time-varying quality-of-service. IEEE Access 2020, 8, 1910–1921. [Google Scholar] [CrossRef]
- Yuan, Y.; Guo, Y.; Ma, W. Dynamic service composition method based on zero-sum game integrated inverse reinforcement learning. IEEE Access 2023, 11, 111897–111908. [Google Scholar] [CrossRef]
- Rajendran, V.; Ramasamy, R.K.; Mohd-Isa, W.N. Improved eagle strategy algorithm for dynamic web service composition in the IoT: A conceptual approach. Future Internet 2022, 14, 56. [Google Scholar] [CrossRef]
- Liu, Q.; Wang, L.; Du, S.; Wyk, B.J.V. A method to enhance web service clustering by integrating label-enhanced functional semantics and service collaboration. IEEE Access 2024, 12, 61301–61311. [Google Scholar] [CrossRef]
- Bonab, M.N.; Tanha, J.; Masdari, M. A semi-supervised learning approach to quality-based web service classification. IEEE Access 2024, 12, 50489–50503. [Google Scholar] [CrossRef]
- Kowsher, M.; Sami, A.A.; Prottasha, N.J.; Arefin, M.S.; Dhar, P.K.; Koshiba, T. Bangla-bert: Transformer-based efficient model for transfer learning and language understanding. IEEE Access 2022, 10, 91855–91870. [Google Scholar] [CrossRef]
- Kim, M.; Lee, S.; Oh, Y.; Choi, H.; Kim, W. A near-real-time answer discovery for open-domain with unanswerable questions from the web. IEEE Access 2020, 8, 158346–158355. [Google Scholar] [CrossRef]
- Zhang, C.; Qin, S.; Wu, H.; Zhang, L. Cooperative mashup embedding leveraging knowledge graph for web api recommendation. IEEE Access 2024, 12, 49708–49719. [Google Scholar] [CrossRef]
- Ramasamy, R.K.; Chua, F.F.; Haw, S.C.; Ho, C.K. WSFeIn: A novel, dynamic web service composition adapter for cloud-based mobile application. Sustainability 2022, 14, 13946. [Google Scholar] [CrossRef]
- Roy, D.; Dutta, M. A systematic review and research perspective on recommender systems. J. Big Data 2022, 9, 59. [Google Scholar] [CrossRef]
- Ghafouri, S.H.; Hashemi, S.M.; Hung, P.C. A survey on web service QoS prediction methods. IEEE Trans. Serv. Comput. 2020, 15, 2439–2454. [Google Scholar] [CrossRef]
- Kumar, S.; Chattopadhyay, S.; Adak, C. TPMCF: Temporal QoS Prediction Using Multi-Source Collaborative Features. IEEE Trans. Netw. Serv. Manag. 2024, 21, 3945–3955. [Google Scholar] [CrossRef]
- Liu, H.; Zhang, Z.; Li, H.; Wu, Q.; Zhang, Y. Large Language Model Aided QoS Prediction for Service Recommendation. arXiv 2024. [Google Scholar] [CrossRef]
- Atzeni, D.; Bacciu, D.; Mazzei, D.; Prencipe, G. A systematic review of Wi-Fi and machine learning integration with topic modeling techniques. Sensors 2022, 22, 4925. [Google Scholar] [CrossRef] [PubMed]
- Xu, Z.; Gu, Y.; Yao, D. WARBERT: A Hierarchical BERT-based Model for Web API Recommendation. arXiv 2025. [Google Scholar] [CrossRef]
- Li, M.; Xu, H.; Tu, Z.; Su, T.; Xu, X.; Wang, Z. A deep learning based personalized QoE/QoS correlation model for composite services. In Proceedings of the 2022 IEEE International Conference on Web Services (ICWS), Barcelona, Spain, 10–16 July 2022; IEEE: New York, NY, USA, 2022; pp. 312–321. [Google Scholar]
- Long, S.; Tan, J.; Mao, B.; Tang, F.; Li, Y.; Zhao, M.; Kato, N. A Survey on Intelligent Network Operations and Performance Optimization Based on Large Language Models. IEEE Commun. Surv. Tutor. 2025. [Google Scholar] [CrossRef]
- Koudouridis, G.P.; Shalmashi, S.; Moosavi, R. An evaluation survey of knowledge-based approaches in telecommunication applications. Telecom 2024, 5, 98–121. [Google Scholar] [CrossRef]
- Alsayed, A.S.; Dam, H.K.; Nguyen, C. MicroRec: Leveraging Large Language Models for Microservice Recommendation. In MSR ‘24: Proceedings of the 21st International Conference on Mining Software Repositories, Lisbon, Portugal, 15–16 April 2024; Association for Computing Machinery: New York, NY, USA, 2024; pp. 419–430. [Google Scholar] [CrossRef]
- Liu, H.; Zhang, W.; Zhang, X.; Cao, Z.; Tian, R. Context-aware and QoS prediction-based cross-domain microservice instance discovery. In Proceedings of the 2022 IEEE 13th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 21–23 October 2022; IEEE: New York, NY, USA, 2022; pp. 30–34. [Google Scholar]
- Meghazi, H.M.; Mostefaoui, S.A.; Maaskri, M.; Aklouf, Y. Deep Learning-Based Text Classification to Improve Web Service Discovery. Comput. Y Sist. 2024, 28, 529–542. [Google Scholar] [CrossRef]
- Zeng, K.; Paik, I. Dynamic service recommendation using lightweight BERT-based service embedding in edge computing. In Proceedings of the 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), Singapore, 20–23 December 2021; IEEE: New York, NY, USA, 2021; pp. 182–189. [Google Scholar]
- Zhang, P.; Ren, J.; Huang, W.; Chen, Y.; Zhao, Q.; Zhu, H. A deep-learning model for service QoS prediction based on feature mapping and inference. IEEE Trans. Serv. Comput. 2023, 17, 1311–1325. [Google Scholar] [CrossRef]
- Alam, K.A.; Haroon, M. Evaluating Fine-tuned BERT-based Language Models for Web API Recommendation. In Proceedings of the 2024 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), Abu Dhabi, United Arab Emirates, 9–11 December 2024; IEEE: New York, NY, USA, 2024; pp. 135–142. [Google Scholar]
- Karapantelakis, A.; Alizadeh, P.; Alabassi, A.; Dey, K.; Nikou, A. Generative AI in mobile networks: A survey. Ann. Telecommun. 2024, 79, 15–33. [Google Scholar] [CrossRef]
- Bhanage, D.A.; Pawar, A.V.; Kotecha, K. IT infrastructure anomaly detection and failure handling: A systematic literature review focusing on datasets, log preprocessing, machine & deep learning approaches and automated tool. IEEE Access 2021, 9, 156392–156421. [Google Scholar] [CrossRef]
- Qu, G.; Chen, Q.; Wei, W.; Lin, Z.; Chen, X.; Huang, K. Mobile edge intelligence for large language models: A contemporary survey. IEEE Commun. Surv. Tutor. 2025. [Google Scholar] [CrossRef]
- Hameed, A.; Violos, J.; Santi, N.; Leivadeas, A.; Mitton, N. FeD-TST: Federated Temporal Sparse Transformers for QoS prediction in Dynamic IoT Networks. IEEE Trans. Netw. Serv. Manag. 2024, 22, 1055–1069. [Google Scholar] [CrossRef]
- Huang, W.; Zhang, P.; Chen, Y.; Zhou, M.; Al-Turki, Y.; Abusorrah, A. QoS Prediction Model of Cloud Services Based on Deep Learning. IEEE/CAA J. Autom. Sin. 2022, 9, 564–566. [Google Scholar] [CrossRef]
- Le, F.; Srivatsa, M.; Ganti, R.; Sekar, V. Rethinking data-driven networking with foundation models: Challenges and opportunities. In Proceedings of the 21st ACM Workshop on Hot Topics in Networks, Austin, TX, USA, 14–15 November 2022; pp. 188–197. [Google Scholar]
- Jeong, C.; Jang, S.; Shin, H.; Park, E.; Choi, S. A Context-Aware Citation Recommendation Model with BERT and Graph Convolutional Networks. arXiv 2019. [Google Scholar] [CrossRef]
- Liu, M.; Xu, H.; Sheng, Q.Z.; Wang, Z. QoSGNN: Boosting QoS Prediction Performance with Graph Neural Networks. IEEE Trans. Serv. Comput. 2023, 17, 645–658. [Google Scholar] [CrossRef]
- Lian, H.; Li, J.; Wu, H.; Zhao, Y.; Zhang, L.; Wang, X. Toward Effective Personalized Service QoS Prediction From the Perspective of Multi-Task Learning. IEEE Trans. Netw. Serv. Manag. 2023, 20, 2587–2597. [Google Scholar] [CrossRef]
- Jirsik, T.; Trčka, Š.; Celeda, P. Quality of service forecasting with LSTM neural network. In Proceedings of the 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), Washington, DC, USA, 8–12 April 2019; IEEE: New York, NY, USA, 2019; pp. 251–260. [Google Scholar]
- Guo, C.; Zhang, W.; Dong, N.; Liu, Z.; Xiang, Y. QoS-aware diversified service selection. IEEE Trans. Serv. Comput. 2022, 16, 2085–2099. [Google Scholar] [CrossRef]
- Boulakbech, M.; Messai, N.; Sam, Y.; Devogele, T. Deep learning model for personalized web service recommendations using attention mechanism. In Proceedings of the International Conference on Service-Oriented Computing, Rome, Italy, 28 November–1 December 2023; Springer Nature: Cham, Switzerland, 2023; pp. 19–33. [Google Scholar]
- Xue, L.; Zhang, F. Lcpcwsc: A web service classification approach based on label confusion and priori correction. Int. J. Web Inf. Syst. 2024, 20, 213–228. [Google Scholar] [CrossRef]
- Huang, Y.; Cao, Z.; Chen, S.; Zhang, X.; Wang, P.; Cao, Q. Interpretable web service recommendation based on disentangled representation learning. J. Intell. Fuzzy Syst. 2023, 45, 133–145. [Google Scholar] [CrossRef]
- Wang, X.; Zhou, P.; Wang, Y.; Liu, X.; Liu, J.; Wu, H. Servicebert: A pre-trained model for web service tagging and recommendation. In Proceedings of the International Conference on Service-Oriented Computing, Online, 22–25 November 2021; Springer International Publishing: Cham, Switzerland, 2021; pp. 464–478. [Google Scholar]
- Yang, Y.; Qamar, N.; Liu, P.; Grolinger, K.; Wang, W.; Li, Z.; Liao, Z. Servenet: A deep neural network for web services classification. In Proceedings of the 2020 IEEE International Conference on Web Services (ICWS), Beijing, China, 19–23 October 2020; pp. 168–175. [Google Scholar]
- Wang, Z.; Zhang, X.; Li, Z.S.; Yan, M. QoSBERT: An Uncertainty-Aware Approach Based on Pre-trained Language Models for Service Quality Prediction. IEEE Trans. Serv. Comput. 2025, 1–13. [Google Scholar] [CrossRef]
- Liu, P.; Zhang, L.; Gulla, J.A. Pre-train, prompt, and recommendation: A comprehensive survey of language modeling paradigm adaptations in recommender systems. Trans. Assoc. Comput. Linguist. 2023, 11, 1553–1571. [Google Scholar] [CrossRef]
- Van, M.M.; Tran, T.T. BeLightRec: A Lightweight Recommender System Enhanced with BERT. In Proceedings of the International Conference on Intelligent Systems and Data Science, Nha Trang, Vietnam, 9–10 November 2024; Springer Nature: Singapore, 2024; pp. 30–43. [Google Scholar]
- Kharidia, V.; Paprunia, D.; Kanikar, P. LightFusionRec: Lightweight Transformers-Based Cross-Domain Recommendation Model. In Proceedings of the 2024 First International Conference on Software, Systems and Information Technology (SSITCON), Tumkur, India, 18–19 October 2024; IEEE: New York, NY, USA, 2024; pp. 1–7. [Google Scholar]
- Liu, Q.; Zhao, X.; Wang, Y.; Wang, Y.; Zhang, Z.; Sun, Y.; Li, X.; Wang, M.; Jia, P.; Chen, C.; et al. Large language model enhanced recommender systems: Taxonomy, trend, application and future. arXiv 2024, arXiv:2412.13432. [Google Scholar] [CrossRef]
- Singh, S. BERT Algorithm used in Google Search. Math. Stat. Eng. Appl. 2021, 70, 1641–1650. [Google Scholar] [CrossRef]
- Sun, F.; Liu, J.; Wu, J.; Pei, C.; Lin, X.; Ou, W.; Jiang, P. BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 1441–1450. [Google Scholar]
- Fine-Tune and Host Hugging Face Bert Models on Amazon Sagemaker|AWS Machine Learning Blog. Available online: https://aws.amazon.com/blogs/machine-learning/fine-tune-and-host-hugging-face-bert-models-on-amazon-sagemaker/ (accessed on 23 October 2025).
| Advantage | Description |
|---|---|
| Contextual Understanding | Bidirectional encoding allows the model to develop a contextual understanding of words. For example, the word “bank” takes on different meanings depending on the context in which it is used [7]. |
| Transfer Learning | Because it is pre-trained on large amounts of text, the model can be fine-tuned for specific tasks with small or limited datasets. This improves performance across different applications without the need to train a model from scratch [8]. |
| Performance on Benchmarks | BERT consistently outperformed earlier models on a wide range of NLP benchmarks, such as SQuAD (Stanford Question Answering Dataset), demonstrating its robustness and adaptability. |
| Research Questions (RQ) | Motivations |
|---|---|
| (RQ1) How has BERT been applied to web service selection and recommendation in the context of QoS? | To examine how BERT has been applied to web service selection, recommendation, and QoS-aware tasks. |
| (RQ2) What are the advantages and limitations of using BERT models for web service selection and recommendation? | To critically assess the advantages and limitations of BERT applications, especially in the context of QoS. |
| (RQ3) How does BERT compare to traditional methods? | To compare BERT-based approaches against traditional service selection and recommendation techniques. |
| (RQ4) What are the challenges faced when using BERT models for tasks such as web selection, recommendation, and QoS predictions? | To identify the challenges encountered when applying BERT to these tasks. |
| (RQ5) What are the future directions of using BERT in the context of web service selection and recommendation as well as QoS predictions? | To outline future research directions for BERT in this area. |
| Digital Library | Search String |
|---|---|
| IEEE | ((“Full Text & Metadata”:“BERT” OR “Full Text & Metadata”:“Bidirectional Encoder Representations”)) AND ((“Full Text & Metadata”:“web service recommendation” OR “Full Text & Metadata”:“Service selection”)) AND ((“Full Text & Metadata”:“QoS” OR “Quality of Service”)) AND ((“Full Text & Metadata”:“prediction” OR “optimization”)) |
| ACM | [[All: “bert”] OR [All: “bidirectional encoder representations”]] AND [[All: “web service recommendation”] OR [All: “service selection”]] AND [[All: “qos”] OR [All: “quality of service”]] AND [[All: “prediction”] OR [All: “optimization”]] |
| Science Direct | (“BERT” OR “Bidirectional Encoder Representations”) AND (“web service recommendation” OR “Service selection”) AND (“QoS” OR “Quality of Service”) |
| Google Scholar | web service recommendation and qos prediction using bert |
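The four database-specific strings above share the same boolean skeleton: four concept groups, with synonyms OR-ed within a group and groups AND-ed together. A small sketch (our own helper, not part of the review's tooling) makes that structure explicit; only the field-prefix syntax differs per digital library.

```python
# The four concept groups underlying the search strings in the table above.
CONCEPT_GROUPS = [
    ["BERT", "Bidirectional Encoder Representations"],
    ["web service recommendation", "Service selection"],
    ["QoS", "Quality of Service"],
    ["prediction", "optimization"],
]

def build_query(groups):
    """Join synonyms with OR inside each group, and groups with AND."""
    return " AND ".join(
        "(" + " OR ".join(f'"{term}"' for term in group) + ")" for group in groups
    )

print(build_query(CONCEPT_GROUPS))
```

Note that the ScienceDirect string in the table drops the fourth concept group, since that engine limits the number of boolean connectors per query.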
| Inclusion | Exclusion |
|---|---|
| Data Point | Description | Relevant Research Question |
|---|---|---|
| BERT Methods or techniques | Methods and techniques used in the article | RQ1 |
| Advantages | Advantages identified in the articles | RQ2 |
| Techniques for web service selections and recommendation | Techniques or methods used in the traditional service selection and recommendation | RQ3 |
| Challenges or Limitations | Challenges and limitations of BERT | RQ4 |
| Future directions | Future research ideas or techniques that can be considered for BERT | RQ5 |
| Study | Year | Model Type | Findings |
|---|---|---|---|
| Kumar, S. et al. [13] | 2024 | Transformer, Deep Learning, Graph Neural Networks | Integrates graph convolution and collaborative filtering for temporal QoS prediction and addresses data sparsity and temporal dependencies in QoS prediction |
| Liu, H. et al. [14] | 2024 | Large Language Models (LLMs), NLP Transformer Models | Introduces llmQoS model using LLMs (RoBERTa, Phi3mini) for QoS prediction. The proposed method overcomes data sparsity issue without relying only on historical interactions. |
| Atzeni et al. [15] | 2022 | Machine Learning | Reviews the integration of Wi-Fi with machine learning and topic modeling. |
| Xu, Z. et al. [16] | 2024 | Deep Learning | Uses pre-trained BERT for semantic understanding of API descriptions |
| Li et al. [17] | 2022 | Deep Learning | Presents a deep learning model for personalized QoE/QoS correlation in composite services. |
| Long et al. [18] | 2025 | Large Language Models (LLMs) | Surveys the use of LLMs for intelligent network operations. |
| Koudouridis et al. [19] | 2024 | Knowledge-Based Approaches | Surveys knowledge-based approaches in telecommunications. |
| Alsayed, A.S. et al. [20] | 2024 | Large Language Models (LLMs), Deep Learning | Provides context-aware recommendations for microservice discovery |
| Liu et al. [21] | 2022 | Context-Aware, QoS Prediction | Proposes a context-aware and QoS prediction-based method for microservice instance discovery. |
| Meghazi, H.M. et al. [22] | 2024 | Natural Language Processing (NLP), Deep Learning | Proposes DeepLAB-WSC using word embeddings (Word2Vec, GloVe, BERT) which outperforms state-of-the-art web service classification methods. |
| Zeng et al. [23] | 2021 | Lightweight BERT | Proposes a lightweight BERT-based method for dynamic service recommendation in edge computing. |
| Zhang, P. et al. [24] | 2024 | Deep Learning, Neural Networks | Deep learning model with feature mapping for QoS prediction which addresses the challenges in service quality prediction. |
| Alam et al. [25] | 2024 | Fine-tuned BERT | Evaluates fine-tuned BERT models for recommending Web APIs, focusing on semantic enrichment. |
| Karapantelakis et al. [26] | 2024 | Generative AI (Survey) | Surveys the application of generative AI in mobile networks. |
| Bhanage et al. [27] | 2021 | Machine Learning, Deep Learning (Review) | Reviews ML/DL techniques for anomaly detection and failure handling in IT infrastructure. |
| Qu et al. [28] | 2025 | Large Language Models (LLMs) | Surveys the use of LLMs in mobile edge intelligence to enhance performance. |
| Hameed, A. et al. [29] | 2024 | Deep Learning | Combines federated learning with sparse transformer architecture and preserves privacy while enabling collaborative QoS prediction. |
| Huang, W.J. et al. [30] | 2022 | Deep Neural Networks, Cloud Service Modeling | Presents deep learning-based QoS prediction model for cloud services and addresses QoS prediction challenges |
| Le et al. [31] | 2022 | Foundation Models | Explores the potential of foundation models for network traffic analysis and management. |
| Jeong, C. [32] | 2019 | Graph Convolutional Networks (GCN) | Proposes a context-aware citation recommendation method that combines BERT with Graph Convolutional Networks (GCN). |
| Liu, M. et al. [33] | 2023 | Deep Learning | Proposes QoSGNN, a QoS prediction method based on Graph Neural Networks. |
| Lian, H. et al. [34] | 2023 | Deep Learning | Proposes PMT, a personalized multi-task learning framework that improves prediction accuracy. |
| Jirsik, T. et al. [35] | 2019 | Deep Learning | Proposes Long Short-Term Memory (LSTM) neural networks for QoS attribute forecasting. |
| Guo, C. et al. [36] | 2022 | Deep Learning | Proposes a service distance-based attention mechanism that embeds users in the model for QoS-aware selection. |
| Boulakbech, M. et al. [37] | 2023 | Deep Learning, Multi-modal Learning | Proposes two attention mechanisms, functional (tag-based) and non-functional (QoS-based), improving recommendation quality through dual attention. |
| Study | Dataset | Size (Services) | Domain | QoS Attributes | Availability |
|---|---|---|---|---|---|
| Kumar et al. [13]—TPMCF | WSDREAM-2 | Temporal data | Temporal QoS Prediction | Temporal QoS metrics | Research dataset |
| Liu et al. [14]—llmQoS | WS-DREAM | 5825 | Web Services QoS | Throughput, Response Time | Publicly available |
| Atzeni et al. [15] | Wi-Fi datasets | Not available | Wi-Fi Networks | Wi-Fi performance | Various |
| Xu et al. [16]—WARBERT | Web API Collection | Large scale APIs | API Recommendation | Not available | Not specified |
| Li et al. [17] | Composite Services | Composite services | Composite Services | QoE, QoS correlation | Not specified |
| Long et al. [18] | Survey (various) | Not available (survey) | Network Operations | Not available | Not available |
| Koudouridis et al. [19] | Survey (various) | Not available (survey) | Telecommunications | Not available | Not available |
| Alsayed et al. [20]—MicroRec | Stack Overflow + API Corpus | API docs | Microservices | Not available | Stack Overflow (public) |
| Liu et al. [21] | Microservice Instances | Microservices | Microservices | QoS + Context | Not specified |
| Meghazi et al. [22]—DeepLAB-WSC | ProgrammableWeb | 8400+ services | Web Service Classification | Not available | Publicly available |
| Zeng et al. [23]—Lightweight BERT | Edge Services | Edge services | Edge Computing | Real-time constraints | Not specified |
| Zhang et al. [24] | Cloud Services | Cloud services | Cloud Computing | Service quality metrics | Not specified |
| Alam et al. [25] | Web API Repository | Web APIs | API Discovery | Not available | Not specified |
| Karapantelakis et al. [26] | Survey (various) | Not available (survey) | Mobile Networks | Not available | Not available |
| Bhanage et al. [27] | Review (various) | Not available (review) | IT Infrastructure | Anomalies | Not available |
| Qu et al. [28] | Survey (various) | Not available (survey) | Mobile Edge Intelligence | Not available | Not available |
| Hameed et al. [29]—FeD-TST | IoT Networks/B5G | IoT services | IoT/B5G Networks | Network QoS | Not specified |
| Huang et al. [30] | Cloud Services | Cloud services | Cloud Services | Cloud QoS | Not specified |
| Le et al. [31] | Network Traffic | Network logs | Network Traffic | Not available | Not specified |
| Jeong [32]—BERT+GCN | FullTextPeerRead | Citation data | Citation Recommendation | Not available | Publicly available |
| Liu et al. [33]—QoSGNN | WSDream | Not specified | QoS Prediction | Various QoS | Publicly available |
| Lian et al. [34]—PMT | QoS Dataset | Not specified | QoS Prediction | Multi-attribute QoS | Not specified |
| Jirsik et al. [35] | Network QoS Metrics | Network data | Network QoS | Network QoS | Not specified |
| Guo et al. [36] | Service Selection | Services | Service Selection | QoS attributes | Not specified |
| Boulakbech et al. [37] | Service Recommendation | Services | Service Recommendation | Not available | Not specified |
| Study | Task | RMSE | MAE | Precision | NDCG | Accuracy | Improvement |
|---|---|---|---|---|---|---|---|
| Kumar [13]—TPMCF | Temporal QoS | Improved | - | - | - | - | Significant |
| Liu [14]—llmQoS (T, 20%) | QoS Throughput | 0.101 (±0.004) | 0.095 (±0.003) | - | - | - | 10.2% RMSE ↓ |
| Liu [14]—llmQoS (T, 5%) | QoS Throughput | 0.083 (±0.003) | 0.079 (±0.002) | - | - | - | 7.2% RMSE ↓ |
| Liu [14]—llmQoS (RT, 20%) | QoS Response Time | 0.077 (±0.005) | 0.073 (±0.004) | - | - | - | 21.4% RMSE ↓ |
| Liu [14]—llmQoS (RT, 5%) | QoS Response Time | 0.066 (±0.003) | 0.064 (±0.002) | - | - | - | 2.1% RMSE ↓ |
| Xu [16]—WARBERT | API Recommendation | - | - | Better | Improved | 0.83 | Better precision@k |
| Alsayed [20]—MicroRec | Microservices | - | - | Improved | - | Better | Superior context |
| Meghazi [22]—DeepLAB-WSC | Classification | - | - | - | - | ~0.85 | ~15–20% accuracy |
| Zeng [23]—Lightweight | Edge Recommendation | - | - | - | - | Similar | 40% faster |
| Alam [25]—BERT Variants | API Recommendation | - | - | - | 0.87 (RoBERTa) | 0.87 | RoBERTa best |
| Hameed [29]—FeD-TST | QoS (IoT) | - | - | - | - | - | Privacy preserving |
| Jeong [32]—BERT+GCN | Citations | - | - | - | - | MAP: 0.84 | +28% MAP |
| Liu [33]—QoSGNN | QoS Prediction | Improved | - | - | - | Better | Superior GNN |
| Lian [34]—PMT | Personalized QoS | Improved | - | - | - | Better | Multi-task |
| Jirsik [35]—LSTM | QoS Forecasting | Better | - | - | - | Better | Better granularity |
| Guo [36]—QoS-Aware | Service Selection | - | - | Improved | - | - | Enhanced diversity |
| Boulakbech [37]—Dual | Personalized Rec | - | - | ~0.78 | - | - | Better personal |
| QoS Attribute | Papers Addressing | Coverage Level | Gap Severity | Future Directions |
|---|---|---|---|---|
| Response Time | Liu, Kumar, Hameed, Jirsik | Well Studied (16%) | Low | Multi-modal temporal prediction |
| Throughput | Liu, Kumar | Well Studied (8%) | Low | Cross-domain generalization |
| Latency | Hameed, Zeng | Moderately Studied (8%) | Medium | Real-time edge optimization |
| Accuracy (Classification) | Multiple implicit | Moderately Studied (20%) | Medium | Explainable predictions |
| Availability | General mentions, no specific studies | Under Studied (0%) | High | BERT for uptime prediction |
| Reliability | Limited coverage | Under Studied (0%) | High | Sentiment analysis of reviews |
| Scalability | Implicit in some studies | Under Studied (0%) | High | BERT for auto-scaling |
| Security | No dedicated studies | Critically Under Studied (0%) | Critical | Security policy understanding |
| Cost | No dedicated studies | Critically Under Studied (0%) | Critical | Cost–benefit analysis |
| Usability | No dedicated studies | Critically Under Studied (0%) | Critical | UX improvement suggestions |
| Study (Year) | Model/Approach | Data and Task | Key Results |
|---|---|---|---|
| Liu et al., 2024 [14] | llmQoS: RoBERTa + CF for QoS prediction | WS-DREAM (user × service QoS matrix) | RMSE↓ by ~7–10% (throughput), RMSE↓ by 2–21% (response time) vs. CF baselines. Consistent MAE/RMSE gains at all sparsity levels. |
| Huang et al., 2023 [39] | WSR-DRL: BERT (service name) + 2D-CNN+BiLSTM (description) + disentangled interactions | Real user-service rating data (service rec) | Outperforms DMF, DeepFM, DKN, GCMC, etc., on Precision@10, Recall@10, NDCG@10. BERT-name + CNN/LSTM provides richer features. |
| Wang et al., 2021 [40] | ServiceBERT: BERT pre-trained (MLM+RTD+contrastive) for service text | ProgrammableWeb APIs/mashup tasks | Higher accuracy on API tagging and mashup recommendation vs. prior methods. (Reported “better performance” on two service tasks.) |
| Yang et al., 2020 [41] | ServeNet-BERT: BERT embeddings of service name and description + NN | Service description classification (OpenAPI) | Achieved much higher classification accuracy than 10 ML baselines (e.g., LDA+SVM, LSTM). |
| Study | Base Model | Input Representation | Fusion Method | Output Layer | Training Strategy | Dataset |
|---|---|---|---|---|---|---|
| Liu et al., 2024 [14] llmQoS | RoBERTa-base (Pre-trained LLM) | - User descriptive attributes (text) - Service descriptive attributes (text) - Historical QoS values (numerical) | - LLM embeddings (768-dim) - Concatenated with user-service ID embeddings - Fed into collaborative filtering network | - Fully connected layers - Regression output for QoS prediction - Separate outputs for throughput and response time | - Two-stage: (1) Pre-trained RoBERTa frozen (2) Fine-tune CF network - Adam optimizer - MSE loss function | WS-DREAM - 339 users—5825 services - Throughput and response time metrics - Multiple sparsity levels (5%, 10%, 15%, 20%) |
| Huang et al., 2023 [39] WSR-DRL | BERT-base (Service name encoder) + CNN-BiLSTM (Description encoder) | - Service name—BERT tokenization - Service description -Word embeddings - User-service interaction matrix - Disentangled latent factors | - BERT embeddings for service names - 2D-CNN extracts local features from descriptions - BiLSTM captures sequential context - Concatenate name + description embeddings - Disentangled user-service interactions | - Disentangled representation layer - Rating prediction layer - Softmax for ranking | - End-to-end training with disentanglement loss - Separate user and service latent factors - Regularization for interpretability - Cross-entropy + MSE loss | - Real-world user-service ratings - Service descriptions from repositories - User interaction history - Multi-domain services |
| Wang et al., 2021 [40] ServiceBERT | BERT-base (Custom pre-trained on service corpus) | - Service textual descriptions - API documentation - Service tags and categories - Mashup descriptions - [CLS] token for classification | - Multi-task pre-training: - Masked Language Modeling (MLM) - Replaced Token Detection (RTD) - Contrastive learning for service pairs - Domain-specific vocabulary | - Multi-task heads: - Service tagging (classification) - Mashup recommendation (ranking) - Service similarity (contrastive) | - Three-stage pre-training: (1) MLM on service corpus (2) RTD for token detection (3) Contrastive learning - Fine-tuning for downstream tasks - AdamW optimizer | - ProgrammableWeb - 16,000+ APIs - 6000+ mashups - Service descriptions and tags - API-mashup relationships |
| Yang et al., 2020 [41] ServeNet-BERT | BERT-base (Sentence embeddings) | - Service name - Service description text - OpenAPI specifications - Concatenated text features | - BERT generates sentence embeddings - Embeddings fed to feedforward neural network - Pooling of token embeddings ([CLS] + mean pooling) - Dense layers for feature transformation | - Multi-layer perceptron (MLP) - Softmax layer for classification - Output: Service category labels | - Transfer learning from pre-trained BERT - Fine-tuning on service classification - Cross-entropy loss - Dropout for regularization (0.1–0.3) - Early stopping | - OpenAPI dataset - 2000+ web services - 10 service categories - Service descriptions and specifications |
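To make the llmQoS-style fusion row above concrete, the following is a minimal forward-pass sketch, assuming frozen 768-dimensional text embeddings (per the table) concatenated with ID embeddings and passed through a small MLP regressor. The ID embedding size, hidden width, and random stand-in weights are our own illustrative choices, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# 768-dim text embeddings (per the table); ID embedding size is illustrative.
TEXT_DIM, ID_DIM, HIDDEN = 768, 32, 64

# Stand-ins for a frozen encoder's outputs and learned ID embeddings.
user_text = rng.standard_normal(TEXT_DIM)   # embedding of user descriptive attributes
svc_text = rng.standard_normal(TEXT_DIM)    # embedding of service descriptive attributes
user_id_emb = rng.standard_normal(ID_DIM)
svc_id_emb = rng.standard_normal(ID_DIM)

# Fusion: concatenate text and ID embeddings, then a small MLP regressor.
x = np.concatenate([user_text, svc_text, user_id_emb, svc_id_emb])
W1 = rng.standard_normal((HIDDEN, x.size)) * 0.01
b1 = np.zeros(HIDDEN)
W2 = rng.standard_normal(HIDDEN) * 0.01

h = np.maximum(0.0, W1 @ x + b1)   # ReLU hidden layer
qos_pred = float(W2 @ h)           # scalar QoS estimate (e.g., response time)
print(qos_pred)
```

In training, only the fusion network's weights would be updated against an MSE loss on observed QoS values, with the text encoder kept frozen, matching the two-stage strategy described in the table.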
| Study | Service Spec type | Fields Used from WSDL/OpenAPI | Typical Text Length (tokens) | Encoder Type |
|---|---|---|---|---|
| Liu et al., 2024 [14] | WSDL | Operation name, documentation, portType | 30–80 | BERT-base (CLS pooling) |
| Huang et al., 2023 [39] | OpenAPI | Path, operation summary, description, tags | 40–120 | Sentence-BERT (mean pooling) |
| Wang et al., 2021 [40] | Swagger/OpenAPI | API name, description, parameter names | 50–150 | RoBERTa-base (CLS pooling) |
| Yang et al., 2020 [41] | WSDL + free text | Operation name, documentation + manual notes | 60–90 | BERT-base (CLS pooling) |
| Wang, Z. et al., 2025 [42] | Mixed APIs | Service name, description, category | 40–100 | Domain-tuned BERT/SBERT |
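The "Fields Used" column above reflects a common preprocessing step: selected specification fields are concatenated into a single text and trimmed to the encoder's token budget. The sketch below illustrates this for a hypothetical OpenAPI operation; the field names mirror the table, but the example operation and the whitespace token count are illustrative assumptions, since real systems would use the model's own subword tokenizer.

```python
# Hypothetical OpenAPI fragment; field choices mirror the table above
# (path, operation summary, description, tags).
operation = {
    "path": "/weather/current",
    "summary": "Get current weather",
    "description": "Returns temperature and humidity for a given city.",
    "tags": ["weather", "forecast"],
}

def build_encoder_input(op, max_tokens=120):
    """Concatenate spec fields into one text, truncated to a rough
    whitespace-token budget (real systems apply the BERT tokenizer)."""
    text = " ".join(
        [op["path"], op["summary"], op["description"], " ".join(op["tags"])]
    )
    tokens = text.split()
    return " ".join(tokens[:max_tokens]), len(tokens)

text, n_tokens = build_encoder_input(operation)
print(n_tokens, text)
```

Note that the 30–150 token ranges reported in the table fit comfortably within BERT's 512-token limit, which is one reason truncation policies are rarely discussed in these studies.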
| Aspects | Advantages | Limitations |
|---|---|---|
| Text Understanding | Captures deep semantic/contextual meaning from service descriptions, user reviews, API docs. | Requires large, domain-specific corpora for effective fine-tuning. |
| User Preference Modeling | Learns nuanced user preferences from reviews and mashup queries, outperforming keyword-based models. | Transformer models are computationally intensive during training and inference. |
| QoS Prediction Accuracy | Enriches feature vectors with descriptive text, reducing RMSE/MAE in service quality prediction (e.g., throughput, response time). | Gains may be marginal for some metrics or datasets (e.g., response time). |
| Service Recommendation | Improves top-k ranking metrics (Precision@10, Recall, NDCG) via BERT embeddings of service name/description. | BERT alone cannot model numerical QoS—needs hybridization with collaborative filtering or neural models. |
| Cold-Start Problem Handling | Alleviates matrix sparsity using textual side information; improves predictions when interaction data are sparse. | Still limited if no descriptive data are available for new services/users. |
| Model Interpretability | Some models improve interpretability with disentangled latent spaces or hybrid attention mechanisms. | Most BERT-based models are still black boxes with limited explainability. |
| Generalization | BERT can transfer to various tasks (e.g., classification, tagging, recommendation) with minimal modification. | Pre-trained BERT may underperform if not adapted to domain-specific terminology. |
| Integration Potential | Easily combined with CNNs, LSTMs, or CF layers to form flexible architectures for QoS-aware recommendations. | Increases design complexity and hyperparameter tuning overhead. |
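The limitation noted above, that BERT alone cannot model numerical QoS and is typically hybridized with collaborative filtering, can be sketched as a prediction score that mixes a latent-factor term with a text-similarity term. All matrices below are random placeholders for learned CF factors and (dimension-reduced) encoder outputs; the weighting scheme is a hypothetical, minimal example of such a fusion.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_services, k = 5, 8, 4

# Collaborative-filtering latent factors (would be learned from a QoS matrix).
P = rng.normal(scale=0.1, size=(n_users, k))      # user factors
Q = rng.normal(scale=0.1, size=(n_services, k))   # service factors

# Hypothetical BERT embeddings of service descriptions, reduced to k dims
# and L2-normalized so the dot product acts as a similarity.
text_emb = rng.normal(size=(n_services, k))
text_emb /= np.linalg.norm(text_emb, axis=1, keepdims=True)

def predict_qos(u, i, alpha=0.5, user_profile=None):
    """Hybrid score: CF dot product plus a text-similarity term.
    A cold-start user can supply a text-derived profile via user_profile
    instead of relying on (unavailable) interaction-based factors."""
    cf_score = P[u] @ Q[i]
    profile = user_profile if user_profile is not None else P[u]
    text_score = float(profile @ text_emb[i])
    return alpha * cf_score + (1 - alpha) * text_score

score = predict_qos(0, 3)
print(round(score, 4))
```

Setting `alpha` closer to 0 leans on the textual side information, which is exactly the cold-start mitigation described in the table; with `alpha = 1` the model reduces to plain matrix factorization.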
| Aspect | BERT-Family Encoders (BERT, RoBERTa) | GPT-Style LLMs (GPT-4/5, Llama-2/3, etc.) |
|---|---|---|
| Typical parameter scale | ~110–400 M parameters (base–large) (fine-tunable on a single GPU) | Billions to >100 B parameters (often multi-GPU or hosted) |
| Typical inference latency | Low, milliseconds to low tens of milliseconds per sequence on commodity GPUs/CPUs | Higher, tens to hundreds of milliseconds per call, often with API overhead |
| Deployment model | Frequently deployed on premise or private cloud; easy to containerize and co-locate with QoS engines | Commonly accessed via external cloud APIs; full on-prem deployment is costly and complex |
| Primary role | Encoder: produces fixed-length embeddings for services/users; integrates with CF/MLP QoS predictors | Generator/reasoning agent: produces text, explanations, or decisions from prompts |
| Key Direction | Academic Focus | Industry/Application Trends |
|---|---|---|
| Domain/Transfer Learning [45] | Adapting BERT to new domains via fine-tuning only parts of the model or adding adapter layers; multi-task contrastive pretraining for cross-domain recommendation. | Cross-domain recommenders using light models, for example, DistilBERT with simple fusion; industry emphasis on reusable models across product lines. |
| Multi-Modal Fusion | Combining text with images/audio, for example, using CLIP for vision and BERT for text, and fusing them in a joint model. | Multimedia recommendation engines that ingest captions, reviews, and images; multi-modal BERT variants in practice. |
| Interpretability and Hybrid | Hybrid models combining CF/GCN and BERT semantic signals, for example, BeLightRec; disentangled representations, for example, WSR-DRL; LLMs generating natural explanations. | Use of review snippets or keywords to explain recommendations; commercial explainable AI platforms, for example, Watson, X.ai. |
| Knowledge and Context [46,47] | Enhancing recommendation systems with LLM-generated summaries; using knowledge graphs for improved semantic matching. | Use of knowledge graph APIs, for example, Neo4j combined with BERT features for recommendation. Search engines (Google, Bing) already fuse BERT with KG for richer results. |
| Prompt-based Learning | Using pre-trained BERT with minimal tuning via prompts, for example, cloze tasks; soft prompt tuning for personalization. | On-demand recommendation via LLM APIs; prompt-based language modeling replacing full retraining. |
| Model Efficiency | Distillation, quantization, pruning of BERT for low-latency applications. | Use of DistilBERT, MobileBERT in production; ONNX/TensorRT deployment; edge recommendation systems. |
| Tools and Platforms [48,49] | Libraries like RecBole, Hugging Face Transformers; benchmarking transformer-based recommendation. | One-click deployment with AWS SageMaker + Hugging Face; BigQuery ML and TensorFlow Hub integrations. |
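As a concrete instance of the Model Efficiency direction above, the sketch below applies simple symmetric int8 post-training quantization to a hypothetical service-embedding matrix, trading a bounded reconstruction error for a 4x memory reduction. The matrix shape and values are illustrative; production deployments would typically rely on toolchains such as ONNX or TensorRT rather than hand-rolled quantization.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical float32 service-embedding matrix from a BERT-style encoder.
emb = rng.normal(size=(1000, 768)).astype(np.float32)

# Symmetric per-row quantization to int8: store one scale per row so each
# value fits in [-127, 127]; storage shrinks 4x versus float32.
scales = np.abs(emb).max(axis=1, keepdims=True) / 127.0
q = np.round(emb / scales).astype(np.int8)

# Dequantize and measure the worst-case reconstruction error.
deq = q.astype(np.float32) * scales
err = float(np.abs(emb - deq).max())

print(emb.nbytes, q.nbytes, err)
```

The maximum error is bounded by half a quantization step per row, which is usually negligible for cosine-similarity retrieval; this is the kind of trade-off distilled and quantized encoders (DistilBERT, MobileBERT) exploit for low-latency serving.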
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Mahanra Rao, V.; Ramasamy, R.K.; Sayeed, M.S. BERT-Based Approaches for Web Service Selection and Recommendation: A Systematic Review with a Focus on QoS Prediction. Future Internet 2025, 17, 543. https://doi.org/10.3390/fi17120543

