Author Contributions
Conceptualization, G.B. and J.D.C.; Data curation, S.L.; Formal analysis, G.B.; Funding acquisition, J.D.C.; Investigation, G.B.; Methodology, G.B. and J.D.C.; Project administration, S.L. and J.D.C.; Resources, S.L.; Software, G.B. and N.C.; Supervision, J.D.C.; Validation, S.L. and N.C.; Visualization, G.B.; Writing—original draft, G.B.; Writing—review and editing, G.B., S.L., N.C. and J.D.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data are not publicly available due to confidentiality agreements with Hyundai Motor Company.
Acknowledgments
We gratefully acknowledge the support of the Hyundai Motor Company. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of Hyundai Motor Company.
Conflicts of Interest
Author Shinsun Lee was employed by Hyundai Motor Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| LLM | Large Language Model |
| RAG | Retrieval-Augmented Generation |
| BGE | BAAI (Beijing Academy of Artificial Intelligence) General Embedding |
| SecMulti-RAG | Secure Multifaceted-RAG |
References
- Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.; Küttler, H.; Lewis, M.; Yih, W.-t.; Rocktäschel, T.; et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS ’20), Vancouver, BC, Canada, 6–12 December 2020; Curran Associates Inc.: Red Hook, NY, USA, 2020.
- OpenAI; Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; et al. GPT-4 Technical Report. arXiv 2024, arXiv:2303.08774.
- Anthropic. The Claude 3 Model Family: Opus, Sonnet, Haiku. 2024. Available online: https://api.semanticscholar.org/CorpusID:268232499 (accessed on 5 January 2025).
- DeepSeek-AI; Guo, D.; Yang, D.; Zhang, H.; Song, J.; Zhang, R.; Xu, R.; Zhu, Q.; Ma, S.; Wang, P.; et al. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv 2025, arXiv:2501.12948.
- Zhou, Y.; Liu, Z.; Dou, Z. AssistRAG: Boosting the Potential of Large Language Models with an Intelligent Information Assistant. arXiv 2024, arXiv:2411.06805.
- Gutiérrez, B.J.; Shu, Y.; Gu, Y.; Yasunaga, M.; Su, Y. HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models. arXiv 2025, arXiv:2405.14831.
- Jeong, S.; Baek, J.; Cho, S.; Hwang, S.J.; Park, J.C. Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity. arXiv 2024, arXiv:2403.14403.
- Yu, W.; Iter, D.; Wang, S.; Xu, Y.; Ju, M.; Sanyal, S.; Zhu, C.; Zeng, M.; Jiang, M. Generate rather than Retrieve: Large Language Models are Strong Context Generators. arXiv 2023, arXiv:2209.10063.
- Wu, R.; Chen, S.; Su, X.; Zhu, Y.; Liao, Y.; Wu, J. A Multi-Source Retrieval Question Answering Framework Based on RAG. arXiv 2024, arXiv:2405.19207.
- Zhang, S.; Ye, L.; Yi, X.; Tang, J.; Shui, B.; Xing, H.; Liu, P.; Li, H. “Ghost of the past”: Identifying and resolving privacy leakage from LLM’s memory through proactive user interaction. arXiv 2024, arXiv:2410.14931.
- Kim, S.; Yun, S.; Lee, H.; Gubri, M.; Yoon, S.; Oh, S.J. ProPILE: Probing Privacy Leakage in Large Language Models. arXiv 2023, arXiv:2307.01881.
- Hayes, J.; Melis, L.; Danezis, G.; Cristofaro, E.D. LOGAN: Evaluating Privacy Leakage of Generative Models Using Generative Adversarial Networks. arXiv 2017, arXiv:1705.07663.
- Lukas, N.; Salem, A.; Sim, R.; Tople, S.; Wutschitz, L.; Zanella-Béguelin, S. Analyzing Leakage of Personally Identifiable Information in Language Models. arXiv 2023, arXiv:2302.00539.
- Chong, C.J.; Hou, C.; Yao, Z.; Talebi, S.M.S. Casper: Prompt Sanitization for Protecting User Privacy in Web-Based Large Language Models. arXiv 2024, arXiv:2408.07004.
- Zhou, X.; Lu, Y.; Ma, R.; Gui, T.; Zhang, Q.; Huang, X. TextMixer: Mixing Multiple Inputs for Privacy-Preserving Inference. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, 6–10 December 2023; Bouamor, H., Pino, J., Bali, K., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2023; pp. 3749–3762.
- Yang, A.; Yang, B.; Zhang, B.; Hui, B.; Zheng, B.; Yu, B.; Li, C.; Liu, D.; Huang, F.; Wei, H.; et al. Qwen2.5 Technical Report. arXiv 2024, arXiv:2412.15115.
- Douze, M.; Guzhva, A.; Deng, C.; Johnson, J.; Szilvasy, G.; Mazaré, P.E.; Lomeli, M.; Hosseini, L.; Jégou, H. The Faiss library. arXiv 2025, arXiv:2401.08281.
- Choi, N.; Byun, G.; Chung, A.; Paek, E.S.; Lee, S.; Choi, J.D. Reference-Aligned Retrieval-Augmented Question Answering over Heterogeneous Proprietary Documents. arXiv 2025, arXiv:2502.19596.
- Mizuno, K. Crash Safety of Passenger Vehicles; Translated from Japanese by Kyungwon Song; Reviewed by Inhwan Han, Jongjin Park, Sungjin Kim, Jongchan Park, and Namgyu Park; Bomyung Books: Seoul, Republic of Korea, 2016.
- Chen, J.; Xiao, S.; Zhang, P.; Luo, K.; Lian, D.; Liu, Z. M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation. In Proceedings of the Findings of the Association for Computational Linguistics ACL 2024, Bangkok, Thailand and Virtual Meeting, 11–16 August 2024; Ku, L.W., Martins, A., Srikumar, V., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 2318–2335.
- Xiao, S.; Liu, Z.; Zhang, P.; Xing, X. LM-Cocktail: Resilient Tuning of Language Models via Model Merging. arXiv 2023, arXiv:2311.13534.
- Zhang, P.; Xiao, S.; Liu, Z.; Dou, Z.; Nie, J.Y. Retrieve Anything To Augment Large Language Models. arXiv 2023, arXiv:2310.07554.
- Xiao, S.; Liu, Z.; Zhang, P.; Muennighoff, N. C-Pack: Packaged Resources To Advance General Chinese Embedding. arXiv 2023, arXiv:2309.07597.
- Zheng, L.; Chiang, W.L.; Sheng, Y.; Zhuang, S.; Wu, Z.; Zhuang, Y.; Lin, Z.; Li, Z.; Li, D.; Xing, E.P.; et al. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. arXiv 2023, arXiv:2306.05685.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).