Integrating Large Language Models with near Real-Time Web Crawling for Enhanced Job Recommendation Systems
Abstract
1. Introduction
1.1. Problem Statement
1.2. Research Goal and Questions
- What are the key advantages of using LLMs to process job-related data in near real-time recommendations?
- How can web crawling technologies be integrated into job recommendation systems to utilize near real-time data collection from online sources?
- How can the effectiveness of a job recommendation system that uses these technologies be evaluated?
- How does the proposed system compare with traditional job recommendation systems in terms of accuracy and relevance?
2. Literature Review
2.1. Related Work
2.1.1. Approaches Involving Large Language Models
2.1.2. Approaches Involving Web Crawling
2.2. Research Gap
3. Research Design
3.1. Research Strategy
3.2. Prototype and Used Technologies
- Web crawling
- LLM-based keyword extraction
- LLM-based ranking of job postings
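The three components above can be pictured as stages of one pipeline. The sketch below is a toy illustration under strong assumptions, not the prototype's implementation: the crawler is replaced by a hardcoded list, and both LLM steps are replaced by simple token matching so that the example runs without network access or an API key. All function names and sample data are hypothetical.

```python
import re

# Assumed stop-word list for the toy keyword extractor.
STOP_WORDS = {"a", "and", "in", "of", "the", "with", "for"}


def crawl_postings():
    """Stand-in for web crawling: returns fixed sample postings."""
    return [
        {"title": "Data Engineer", "description": "Python and SQL pipelines"},
        {"title": "Nurse", "description": "Patient care in a hospital"},
        {"title": "Backend Developer", "description": "Python APIs with PostgreSQL"},
    ]


def extract_keywords(text):
    """Stand-in for LLM-based keyword extraction: set of lowercase words."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def rank_postings(cv_keywords, postings):
    """Stand-in for LLM-based ranking: sort by keyword overlap, best first."""
    def score(posting):
        text = posting["title"] + " " + posting["description"]
        return len(cv_keywords & extract_keywords(text))
    return sorted(postings, key=score, reverse=True)


cv_text = "Developer with experience in Python, SQL and data pipelines"
ranked = rank_postings(extract_keywords(cv_text), crawl_postings())
print([p["title"] for p in ranked])
# ['Data Engineer', 'Backend Developer', 'Nurse']
```

Keeping each stage behind its own function means a stand-in can later be swapped for a real crawler or an LLM call without changing the other stages.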
3.3. Evaluation with User Tests
4. Prototype Solution
4.1. Architecture and Implementation
4.2. Configurations
4.2.1. Keyword Extraction Configuration
4.2.2. Web Crawler
- Job Title (String)
- Description (String)
- Location (String)
- Link (URL)
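The four fields listed above could be carried through the pipeline as a small typed record. The following is a minimal sketch, not the prototype's actual code: the class name `JobPosting`, the snake_case attribute names, and the URL check are assumptions made for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class JobPosting:
    """One crawled job posting (hypothetical record type, not from the paper)."""

    job_title: str    # Job Title (String)
    description: str  # Description (String)
    location: str     # Location (String)
    link: str         # Link (URL), kept as a string and validated below

    def __post_init__(self) -> None:
        # Assumed sanity check: accept only absolute http(s) URLs.
        parsed = urlparse(self.link)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {self.link!r}")


# Illustrative values only; this is not a real posting.
example = JobPosting(
    job_title="Data Engineer",
    description="Build and maintain data pipelines.",
    location="Basel",
    link="https://www.jobs.ch/",
)
```

Rejecting malformed links at construction time keeps broken crawler output from reaching later stages.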
4.3. Functionalities
4.4. Prototype Limitations
- JobScout24 (www.jobscout24.ch)
- Jobs.ch (www.jobs.ch)
5. User Tests and Evaluation
5.1. Structure of the Tests and Limitations
5.2. User Test Questions
5.2.1. Quantitative Evaluation: Prototype
- How accurate was the output? This question evaluated the relevance of the job recommendations generated by the prototype. Participants rated how well the recommended jobs matched their personal job interests. A high score indicated that the prototype successfully identified relevant job listings, while a low score highlighted potential shortcomings in the CV analysis, web crawling, or recommendation.
- How satisfied were you with the functionalities of the prototype? Participants rated their satisfaction with the prototype’s features and usability. This included aspects such as the ease of uploading a CV and the intuitiveness of the interface. The goal was to understand how effectively the prototype met user expectations in terms of functionality and ease of use.
- How satisfied were you with the prototype in general? This question provided a holistic view of the participant’s overall experience with the prototype. It combined their perceptions of recommendation accuracy, usability, and general satisfaction to provide a broader understanding of the prototype’s effectiveness.
5.2.2. Quantitative Evaluation: JobScout24.ch
- How accurate was the output? Participants rated the precision of the job recommendations provided by JobScout24.ch. This served as a baseline for comparing the relevance and quality of the recommendations from the prototype, highlighting areas where the prototype either excelled or needed improvement.
- How satisfied were you with the functionalities of JobScout24.ch? This question focused on the usability of JobScout24.ch, including features such as search filters and overall interaction. Participants rated how effectively these functionalities supported their job search experience.
- How satisfied were you with JobScout24.ch in general? This question captured the participants’ overall impressions of JobScout24.ch. By comparing these scores with those of the prototype, it became possible to assess which system provided a better overall user experience.
5.2.3. Qualitative Evaluation
- What did you like about the prototype? This question focused on identifying the strengths of the prototype. Participants shared what they found most appealing. Positive feedback helped highlight the aspects of the prototype that resonated well with users and should be retained or further developed.
- What did you not like about the prototype? Participants were asked to pinpoint specific issues or shortcomings. Identifying these areas provided a clear direction for improvement.
- What could we improve in the prototype? This question encouraged participants to suggest improvements. These suggestions offered guidance for future iterations of the prototype.
- Did you prefer the output from the prototype or from JobScout24.ch, and why? Participants were asked to state their preference between the two systems and justify their choice. Their answers provided insight into the prototype’s competitive strengths and weaknesses relative to JobScout24.ch and helped pinpoint areas where the prototype could better meet or exceed that benchmark.
5.3. Evaluation Results
5.3.1. Quantitative Results
5.3.2. Qualitative Results
5.3.3. User Suggestions for Improvement
6. Discussion
6.1. Keyword Extraction
6.2. Ranking and Contextual Matching
6.3. Recommendation Accuracy and User Satisfaction
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Tester | Prototype: Accuracy | Prototype: Functionality Satisfaction | Prototype: General Satisfaction | JobScout24.ch: Accuracy | JobScout24.ch: Functionality Satisfaction | JobScout24.ch: General Satisfaction |
|---|---|---|---|---|---|---|
| Tester 1 | 4 | 9 | 6 | 7 | 7 | 7 |
| Tester 2 | 5 | 9 | 6 | 7 | 9 | 8 |
| Tester 3 | 6 | 10 | 7 | 8 | 7 | 7 |
| Tester 4 | 2 | 8 | 4 | 8 | 7 | 7 |
| Tester 5 | 2 | 8 | 5 | 9 | 9 | 9 |
| Tester 6 | 2 | 7 | 4 | 7 | 7 | 7 |
| Tester 7 | 4 | 9 | 6 | 8 | 6 | 7 |
| Tester 8 | 1 | 6 | 3 | 5 | 8 | 7 |
| Tester 9 | 2 | 6 | 4 | 6 | 8 | 7 |
| Tester 10 | 2 | 10 | 5 | 9 | 8 | 8 |
| Tester 11 | 5 | 9 | 7 | 7 | 7 | 7 |
| Tester 12 | 6 | 9 | 8 | 8 | 8 | 8 |
| Tester 13 | 2 | 8 | 5 | 9 | 8 | 8 |
| Average | 3.31 | 8.31 | 5.38 | 7.54 | 7.62 | 7.46 |
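
The averages in the last row can be reproduced from the thirteen individual ratings. The sketch below recomputes them in plain Python; the variable and function names are illustrative, and the scores are copied from the table.

```python
from statistics import mean

# Ratings copied from the table: one tuple per tester, in the order
# (accuracy, functionality satisfaction, general satisfaction).
prototype = [
    (4, 9, 6), (5, 9, 6), (6, 10, 7), (2, 8, 4), (2, 8, 5), (2, 7, 4),
    (4, 9, 6), (1, 6, 3), (2, 6, 4), (2, 10, 5), (5, 9, 7), (6, 9, 8),
    (2, 8, 5),
]
jobscout24 = [
    (7, 7, 7), (7, 9, 8), (8, 7, 7), (8, 7, 7), (9, 9, 9), (7, 7, 7),
    (8, 6, 7), (5, 8, 7), (6, 8, 7), (9, 8, 8), (7, 7, 7), (8, 8, 8),
    (9, 8, 8),
]


def column_means(rows):
    """Return the per-column mean, rounded to two decimals as in the table."""
    return [round(mean(col), 2) for col in zip(*rows)]


prototype_avg = column_means(prototype)
jobscout24_avg = column_means(jobscout24)
print(prototype_avg)   # [3.31, 8.31, 5.38]
print(jobscout24_avg)  # [7.54, 7.62, 7.46]
```

The recomputed values match the Average row: the prototype scores higher on functionality satisfaction (8.31 vs. 7.62) and lower on accuracy (3.31 vs. 7.54) and general satisfaction (5.38 vs. 7.46).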
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gauhl, D.; Kakkanattu, K.; Mukkattu, M.; Hanne, T. Integrating Large Language Models with near Real-Time Web Crawling for Enhanced Job Recommendation Systems. Computers 2025, 14, 387. https://doi.org/10.3390/computers14090387

