Article

Evaluating Proprietary and Open-Weight Large Language Models as Universal Decimal Classification Recommender Systems

Faculty of Electrical Engineering and Computer Science, University of Maribor, 2000 Maribor, Slovenia
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(14), 7666; https://doi.org/10.3390/app15147666
Submission received: 10 June 2025 / Revised: 5 July 2025 / Accepted: 7 July 2025 / Published: 8 July 2025
(This article belongs to the Special Issue Advanced Models and Algorithms for Recommender Systems)

Abstract

Manual assignment of Universal Decimal Classification (UDC) codes is time-consuming and inconsistent as digital library collections expand. This study evaluates 17 large language models (LLMs) as UDC classification recommender systems, including ChatGPT variants (GPT-3.5, GPT-4o, and o1-mini), Claude models (3-Haiku and 3.5-Haiku), the Gemini series (1.0-Pro, 1.5-Flash, and 2.0-Flash), and Llama, Gemma, Mixtral, and DeepSeek architectures. Models were evaluated zero-shot on 900 English and Slovenian academic theses manually classified by professional librarians. Classification prompts followed the RISEN framework, and outputs were evaluated with Levenshtein and Jaro–Winkler similarity as well as a novel adjusted hierarchical similarity metric that captures UDC's faceted structure. Proprietary systems consistently outperformed open-weight alternatives by 5–10% across metrics. GPT-4o achieved the highest hierarchical alignment, while open-weight models showed progressive improvements but remained behind commercial systems. Performance was comparable between languages, demonstrating robust multilingual capabilities. The results indicate that LLM-powered recommender systems can enhance library classification workflows. Future research incorporating fine-tuning and retrieval-augmented approaches may enable fully automated, high-precision UDC assignment systems.
Keywords: universal decimal classification; large language models; conversational systems; recommender systems; prompt engineering; zero-shot classification; hierarchical similarity
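To illustrate the kind of string-similarity scoring the abstract describes, the sketch below implements normalized Levenshtein similarity over UDC notations, plus a hypothetical depth-weighted prefix overlap standing in for a hierarchical measure. Note that `hierarchical_prefix_similarity` is an assumed illustration, not the paper's adjusted hierarchical similarity metric, whose exact formulation is not given here.

```python
def levenshtein_similarity(a: str, b: str) -> float:
    """Normalized Levenshtein similarity: 1 - edit_distance / max_length."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances for the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def hierarchical_prefix_similarity(pred: str, gold: str) -> float:
    """Hypothetical depth-weighted overlap on the main UDC number
    (digits before any facet punctuation), for illustration only."""
    def main_digits(code: str) -> str:
        return code.split(":")[0].split("(")[0].replace(".", "")
    p, g = main_digits(pred), main_digits(gold)
    depth = max(len(p), len(g))
    if depth == 0:
        return 1.0
    matched = 0
    for x, y in zip(p, g):
        if x != y:
            break
        matched += 1
    return matched / depth


# e.g. a predicted code one subdivision deeper than the gold label
print(levenshtein_similarity("004.8", "004.9"))           # 0.8
print(hierarchical_prefix_similarity("004.85", "004.8"))  # 0.8
```

A prefix-based measure of this sort rewards predictions that land in the correct branch of the UDC hierarchy even when the leaf subdivision differs, which plain edit distance does not distinguish.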

Share and Cite

MDPI and ACS Style

Borovič, M.; Tomovski, E.; Li Dobnik, T.; Majninger, S. Evaluating Proprietary and Open-Weight Large Language Models as Universal Decimal Classification Recommender Systems. Appl. Sci. 2025, 15, 7666. https://doi.org/10.3390/app15147666


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
