Advancing Large Language Models with Enhanced Retrieval-Augmented Generation: Evidence from Biological UAV Swarm Control
Abstract
:1. Introduction
- Designing an enhanced RAG framework with hybrid retrieval and reranking modules;
- Constructing a structured domain knowledge base by parsing multimodal documents using an element-based chunking strategy;
- Evaluating the proposed domain-specific LLM using both automatic metrics and expert feedback in the context of biological UAV swarm control.
2. Related Work
2.1. Domain-Specific LLMs
2.2. Document Chunking
- Fixed-size Chunking: In this method, text is divided into chunks of a predefined word count. It is a straightforward and computationally efficient approach, particularly in cases where semantic context is relatively uniform across the document. However, fixed-size chunking may fail to preserve important context or capture the logical structure of the document;
- Content-based Chunking: This approach divides text based on its inherent content, such as punctuation (e.g., periods or commas) or sentence boundaries. It is often used in combination with natural language processing tools like Natural Language Toolkit or spaCy for sentence segmentation. While content-based chunking can better respect natural language boundaries, it may still overlook document structure, which limits its utility in more complex tasks such as question-answering;
- Recursive Chunking: A more flexible and adaptive approach, recursive chunking involves applying chunking rules iteratively. For example, splitting a document by paragraph breaks (“\n” or “\r”) first and checking if chunks exceed a predefined size. If necessary, the process continues by applying finer rules (e.g., sentence-level breaks) to reduce chunk size further. Recursive chunking balances chunk size and context preservation, but designing precise rules that ensure effective segmentation across diverse document types remains a technical challenge;
- From Small to Large Chunking: This method involves generating chunks of varying sizes, from small to large, and storing them in a vector database with hierarchical relationships to support recursive search. While this technique allows the flexibility to handle different levels of granularity, it requires significantly more storage and may introduce redundancy due to the overlapping content across multiple chunk sizes;
- Special Structure Chunking: This approach involves creating custom chunkers tailored to specific document types (e.g., Markdown, LaTeX, or programming code). These chunkers aim to preserve the integrity of the document’s structure while breaking it into meaningful segments. Special structure chunking is ideal for handling highly structured content but requires significant effort to design chunkers for each document type and ensure they handle variations in format correctly.
3. Framework for Biological UAV Swarm Control Based on Retrieval-Augmented Generation (RAG)
3.1. The Base Model—Retrieval-Augmented Generation (RAG)
3.1.1. Retrieval Stage
3.1.2. Augmentation Stage
- Input-level fusion: This method combines the retrieved documents with the original input to form a single sequence, which is then input into the generator. While effective, this approach is limited by the number of documents, as overly long sequences may exceed the model’s processing capacity. For domain-specific LLMs, input-level fusion can enhance zero-shot capabilities by automatically retrieving appropriate natural language prompts;
- Output-level fusion: During the prediction phase, the subsequent token distribution generated by the language model is combined with the distribution from the retrieved corpus. This approach is flexible and easy to integrate but may limit the model’s ability to perform deep reasoning;
- Intermediate-level fusion: By designing a semi-parameterized module, retrieval results are integrated into the internal layers of the generative model. This method may enhance performance but increases complexity and requires high access to the model’s internals, which may not be feasible in practical applications.
3.1.3. Generation Stage
- White-box models: These models allow access to model parameters and mainly include encoder–decoder models and pure decoder models. Encoder–decoder models like T5 and BART connect input and target tokens through cross-attention mechanisms. Pure decoder models process the concatenation of input and target tokens and gradually build representations. White-box generators support parameter optimization and adapt to different retrieval and augmentation methods;
- Black-box models: These generators, such as the GPT series, Codex, and Claude, do not allow access to their parameters. They only support basic input queries and response receipts without internal structure modification or parameter updates. Black-box generators focus on the retrieval and augmentation process, enhancing performance by adding knowledge, guidance, or examples to the input to compensate for the limitations of not being able to directly optimize model parameters.
3.2. Hybrid Retrieval Based on Vectors and Keywords
3.3. Reranking Based on Topic Modeling and Cross-Encoding
- (1)
- Obtain two independent ranking results from the CrossEncoder method and the LDA topic model, denoted as and respectively;
- (2)
- Set a tuning parameter , typically with a value of 60;
- (3)
- For each block , compute the reciprocal rank in each ranking result;
- (4)
- Perform a weighted sum of the reciprocal ranks for each block to calculate its RRF score;
- (5)
- Finally, sort all documents based on their RRF scores and return the final reranked results. Select the top K blocks from the final ranking as the ultimate block set .
4. Domain Knowledge Base Construction
4.1. Data Collection
4.2. Data Preprocessing
4.3. Element-Based Chunking Strategy
- (1)
- If the element’s text length is less than 512 characters, attempt to merge it with the subsequent element;
- (2)
- Iteratively merge the element texts in the above step until the desired length is achieved, without disrupting the individual elements;
- (3)
- If a heading element is found, start a new chunk;
- (4)
- If a table element or image element is found, start a new chunk and save the entire table or image.
4.4. Embedding Model
5. Experiment and Evaluation
5.1. Experiment in Public Professional Domain
5.1.1. Experimental Dataset
5.1.2. Data Preprocessing and LDA Modeling
5.2. Experimental Setup
5.2.1. Experimental Environment
5.2.2. Parameter Settings
5.2.3. Evaluation Metrics
5.3. Results and Analysis
5.3.1. Main Experiment Results
5.3.2. Ablation Experiment
5.3.3. Parameter Sensitivity Analysis
5.4. Expert Evaluation on Specialized UAV Domain
5.4.1. Face Validity with Typical Cases
“Swarm instability is a phenomenon that has been observed in some natural systems, but it is not a well-established concept with clear definitions or criteria.”
“Swarm instability refers to a situation where a group of individuals, whether they are insects, animals, or even autonomous robots, exhibit chaotic or unpredictable behavior that leads to a breakdown in the overall structure or coordination of the swarm. This concept is most commonly associated with swarms of insects like bees or ants, but it can also apply to other contexts, such as robotic swarms.”
“One example of a system that has been studied for swarm instability is the flocking behavior of birds. In some cases, the flock may become disorganized and exhibit erratic behavior, such as flying in random directions or changing altitude. This behavior has been attributed to swarm instability, but it is not clear whether this is a general phenomenon or specific to certain types of flocks or conditions.”
“Robotic Swarms: In the context of robotics, swarm instability can occur when a group of robots fails to achieve its intended collective behavior due to communication errors, environmental changes, or unexpected interactions. For example, a robotic swarm designed to search for objects in a disaster area might become unstable if one or more robots fail to communicate effectively with the others 3.
Fish Schools: Fish schools can sometimes become unstable, leading to the formation of ‘schools within schools’ or the breakdown of the overall school structure. This can happen due to environmental changes, predation, or other factors that disrupt the school’s normal patterns 4.”
“A larger number of neighbors can lead to faster convergence and more stable behavior, but may also require more computational resources and may be less effective in dealing with large swarms.”
“The number of neighbors considered in UAV swarm control varies depending on the algorithm, communication constraints, and specific application requirements. 1. Algorithm Type: Topological Neighborhood: Fixed number of nearest neighbors (e.g., 3–8), regardless of distance. Inspired by biological systems like bird flocks or fish schools 1…2. Common Ranges: Many practical implementations use 4–6 neighbors for balance between cohesion and computational efficiency…3. Design Factors: Communication Bandwidth: Limits the number of neighbors to prevent overload; Scalability: Smaller fixed numbers (e.g., 3–5) ensure efficient computation in large swarms… Typical Answer: …UAV swarms often consider 3 to 8 neighbors, with 4–6 being a common range in many algorithms. The exact number is tailored to the system’s design goals and constraints.”
5.4.2. Expert Evaluation Procedure
5.4.3. Result
6. Conclusions, Limitations, and Future Directions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix A.1
Case Question 1. On what sensory information do swarm entities base their decisions, and how does this relate to biological organisms? |
Raw model’s response to question 1: Swarm entities, such as insects or birds, often make decisions based on sensory information from their surroundings. For example, an insect might use its sense of vision to detect food sources or potential threats, or a bird might use its sense of smell to locate a mate or identify a predator. In some cases, swarm entities may also use information from the actions or behaviors of other members of their swarm. For example, a bird might follow the lead of other birds in a flock, or an insect might rely on the signals or behaviors of its fellow insects to make decisions about where to go or what to do. This approach to decision-making is similar to that used by some biological organisms, such as social insects or fish. For example, some species of ants rely on chemical signals from other ants to coordinate their activities and make decisions about where to forage for food. Similarly, some fish use visual cues, such as the movement of other fish or the shape of the water column, to navigate and make decisions about where to go. Overall, swarm entities and other biological organisms often rely on a combination of sensory information from their surroundings and the actions and behaviors of other members of their group to make decisions about how to behave. |
Enhanced model’s response to question 1: Swarm entities base their decisions on various sensory information that allows them to interact with their environment and each other. This sensory information can include, but is not limited to, the following:
Notes:
|
Case question 2. How many neighbors are taken into account in UAV swarm control? |
Raw model’s reponse to question 2: In UAV swarm control, the number of neighbors that are taken into account can vary depending on the specific application and control algorithm. Generally, the number of neighbors is a hyperparameter that can be set by the controller. For example, in a decentralized control algorithm, the controller may use a fixed number of neighbors or may dynamically adjust the number of neighbors based on the current state of the swarm. In a centralized control algorithm, the number of neighbors may be set to a fixed value or may be adjusted based on the current state of the swarm. The number of neighbors can affect the performance of the control algorithm, as it determines the level of cooperation between the UAVs in the swarm. A larger number of neighbors can lead to faster convergence and more stable behavior, but may also require more computational resources and may be less effective in dealing with large swarms. A smaller number of neighbors can be more efficient, but may require more careful tuning and may be more susceptible to outliers and other disturbances in the swarm. |
Enhanced model’s response to question 2: The number of neighbors considered in UAV swarm control varies depending on the algorithm, communication constraints, and specific application requirements. Here’s a concise breakdown: 1. Algorithm Type: Topological Neighborhood: Fixed number of nearest neighbors (e.g., 3–8), regardless of distance. Inspired by biological systems like bird flocks or fish schools 1. Metric Neighborhood: All neighbors within a defined radius. The actual count depends on swarm density and communication range 2. 2. Common Ranges: Many practical implementations use 4–6 neighbors for balance between cohesion and computational efficiency. Some systems dynamically adjust based on swarm density or task demands (e.g., collision avoidance vs. formation keeping) 3. 3. Design Factors:
Notes:
|
Case question 3. What is the speed of behavior propagation throughout the swarm? |
Raw model’s response to question 3: The speed of behavior propagation throughout a swarm can vary depending on the specific swarm and the behavior being propagated. In general, the speed of behavior propagation can be influenced by factors such as the size of the swarm, the density of the swarm, the communication channels used by the swarm, and the behavior being propagated itself. For example, in some swarms, behavior may propagate quickly due to high density and efficient communication channels. In other swarms, behavior may propagate more slowly due to lower density or less efficient communication channels. It is also worth noting that the speed of behavior propagation can be influenced by external factors such as environmental changes or disturbances. Overall, the speed of behavior propagation throughout a swarm can be difficult to predict and can vary significantly depending on the specific swarm and situation. |
Enhanced model’s response to question 3: The speed of behavior propagation throughout a UAV swarm can depend on several factors, including the communication protocol used, the size of the swarm, and the characteristics of the swarm dynamics. In general, the speed of behavior propagation is typically much faster than the speed of information dissemination. For example, in a decentralized control algorithm, the behavior of each UAV is determined locally and then broadcast to its neighboring UAVs, which can propagate the behavior to other neighboring UAVs quickly 1. However, the speed of behavior propagation can be limited by several factors, such as the communication bandwidth, the network latency, and the computational resources available to the UAVs 2. For example, if the communication bandwidth is low, the behavior of UAVs may not propagate quickly enough to respond to changes in the swarm dynamics. If the network latency is high, the behavior of UAVs may not propagate quickly enough to respond to changes in the swarm dynamics in real-time 3. Overall, the speed of behavior propagation is an important consideration in UAV swarm control, as it determines how quickly the swarm can adapt to changes in the environment and how effectively the controller can coordinate the behavior of the UAVs. Notes:
|
Case question 4. Are there known cases of swarm instability? |
Raw model’s response to question 4: Swarm instability is a phenomenon that has been observed in some natural systems, but it is not a well-established concept with clear definitions or criteria. One example of a system that has been studied for swarm instability is the flocking behavior of birds. In some cases, the flock may become disorganized and exhibit erratic behavior, such as flying in random directions or changing altitude. This behavior has been attributed to swarm instability, but it is not clear whether this is a general phenomenon or specific to certain types of flocks or conditions. Other examples of systems that have been studied for swarm instability include the movement of crowds and the behavior of fish schools. However, these phenomena are still under investigation and it is not clear whether they exhibit swarm instability or some other form of instability. Overall, the concept of swarm instability is still a topic of ongoing research and it is not clear whether it exists in all systems or under all conditions. |
Enhanced model’s response to question 4: Swarm instability refers to a situation where a group of individuals, whether they are insects, animals, or even autonomous robots, exhibit chaotic or unpredictable behavior that leads to a breakdown in the overall structure or coordination of the swarm. This concept is most commonly associated with swarms of insects like bees or ants, but it can also apply to other contexts, such as robotic swarms. Here are some known cases of swarm instability: 1. Bee Swarms: One of the most famous examples of swarm instability is the “Bee Disorientation” phenomenon. During certain times of the year, especially in urban areas, bee swarms can become disoriented and fail to find their way back to their hive. This can lead to a loss of the swarm’s structure and the death of the bees 1. 2. Ant Colonies: Ant colonies can also experience instability. For example, if an ant colony is disturbed or if the queen is removed, the colony may become unstable, with worker ants not performing their tasks efficiently 2. 3. Robotic Swarms: In the context of robotics, swarm instability can occur when a group of robots fails to achieve its intended collective behavior due to communication errors, environmental changes, or unexpected interactions. For example, a robotic swarm designed to search for objects in a disaster area might become unstable if one or more robots fail to communicate effectively with the others 3. 4. Fish Schools: Fish schools can sometimes become unstable, leading to the formation of “schools within schools” or the breakdown of the overall school structure. This can happen due to environmental changes, predation, or other factors that disrupt the school’s normal patterns 4. 5. Animal Herds: Similar to fish schools, animal herds can also become unstable. For example, during a migration, if a sudden change in weather or the presence of predators disrupts the herd’s cohesion, the herd may break apart into smaller groups 1. The causes of swarm instability can be complex and multifaceted, often involving a combination of environmental factors, individual behaviors, and interactions between individuals. Understanding and managing swarm instability is an important area of research in fields such as ecology, robotics, and computer science, as it can have significant implications for the survival and efficiency of swarms. Notes:
|
Appendix B
Symbol/Parameter | Description |
---|---|
k1 | Term frequency saturation parameter in BM25 |
b | Field length normalization parameter in BM25 |
Relevance score between document and query in BM25 | |
Inverse Document Frequency of term | |
Frequency of term in document | |
Length of document (total number of terms) | |
avgdl | Average length of all documents in the corpus |
Symbol/Parameter | Description |
---|---|
Topic distribution over documents in LDA | |
z | Latent topic assignment in LDA |
w | Observed words |
Dirichlet prior for topic distribution | |
Dirichlet prior for word distribution | |
The observed word at position n | |
The latent topic assignment for the n-th word |
Symbol/Parameter | Description |
---|---|
Reciprocal Rank Fusion score for document | |
Rank of document by CrossEncoder | |
Rank of document by LDA model | |
Tuning parameter |
References
- Huang, L.; Yu, W.; Ma, W.; Zhong, W.; Feng, Z.; Wang, H.; Chen, Q.; Peng, W.; Feng, X.; Qin, B.; et al. A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions. ACM Trans. Inf. Syst. 2025, 43, 1–55. [Google Scholar] [CrossRef]
- Ge, Y.; Hua, W.; Mei, K.; Tan, J.; Xu, S.; Li, Z.; Zhang, Y. OpenAGI: When LLM Meets Domain Experts. Adv. Neural Inf. Process. Syst. 2023, 36, 5539–5568. [Google Scholar]
- Shahzad, M.M.; Saeed, Z.; Akhtar, A.; Munawar, H.; Yousaf, M.H.; Baloach, N.K.; Hussain, F. A Review of Swarm Robotics in a Nutshell. Drones 2023, 7, 269. [Google Scholar] [CrossRef]
- Houtman, N.R.; Yakimishyn, J.; Collyer, M.; Sutherst, J.; Robinson, C.L.; Costa, M. Experimentally Determining Optimal Conditions for Mapping Forage Fish with RPAS. Drones 2022, 6, 426. [Google Scholar] [CrossRef]
- Jiang, Y.; Bai, T.; Wang, Y. Formation Control Algorithm of Multi-UAVs Based on Alliance. Drones 2022, 6, 431. [Google Scholar] [CrossRef]
- Campion, M.; Ranganathan, P.; Faruque, S. UAV Swarm Communication and Control Architectures: A Review. J. Unmanned Veh. Syst. 2019, 7, 93–106. [Google Scholar] [CrossRef]
- Zhao, P.; Zhang, H.; Yu, Q.; Wang, Z.; Geng, Y.; Fu, F.; Yang, L.; Zhang, W.; Jiang, J.; Cui, B. Retrieval-Augmented Generation for AI-Generated Content: A Survey. arXiv 2024, arXiv:2402.19473. [Google Scholar]
- Zhao, H.; Chen, H.; Yang, F.; Liu, N.; Deng, H.; Cai, H.; Wang, S.; Yin, D.; Du, M. Explainability for Large Language Models: A Survey. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–38. [Google Scholar] [CrossRef]
- Dang, Y.; Huang, K.; Huo, J.; Yan, Y.; Huang, S.; Liu, D.; Gao, M.; Zhang, J.; Qian, C.; Wang, K.; et al. Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey. arXiv 2024, arXiv:2412.02104. [Google Scholar]
- Singh, C.; Inala, J.P.; Galley, M.; Caruana, R.; Gao, J. Rethinking Interpretability in the Era of Large Language Models. arXiv 2024, arXiv:2402.01761. [Google Scholar]
- Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. arXiv 2021, arXiv:2106.09685. [Google Scholar]
- Balaguer, A.; Benara, V.; Cunha, R.L.d.F.; Filho, R.d.M.E.; Hendry, T.; Holstein, D.; Marsman, J.; Mecklenburg, N.; Malvar, S.; Nunes, L.O.; et al. RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture. arXiv 2024, arXiv:2401.08406. [Google Scholar]
- Javed, S.; Hassan, A.; Ahmad, R.; Ahmed, W.; Ahmed, R.; Saadat, A.; Guizani, M. State-of-the-Art and Future Research Challenges in UAV Swarms. IEEE Internet Things J. 2024, 11, 19023–19045. [Google Scholar] [CrossRef]
- Bu, Y.; Yan, Y.; Yang, Y. Advancement Challenges in UAV Swarm Formation Control: A Comprehensive Review. Drones 2024, 8, 320. [Google Scholar] [CrossRef]
- Cheng, Q.; Zhang, Z.; Du, Y.; Li, Y. Research on Particle Swarm Optimization-Based UAV Path Planning Technology in Urban Airspace. Drones 2024, 8, 701. [Google Scholar] [CrossRef]
- Deng, C.; Zeng, G.; Cai, Z.; Xiao, X. A Survey of Knowledge Based Question Answering with Deep Learning. J. Artif. Intell. 2020, 2, 157–166. [Google Scholar] [CrossRef]
- Liu, N.F.; Lin, K.; Hewitt, J.; Paranjape, A.; Bevilacqua, M.; Petroni, F.; Liang, P. Lost in the Middle: How Language Models Use Long Contexts. Trans. Assoc. Comput. Linguist. 2024, 12, 157–173. [Google Scholar] [CrossRef]
- Salton, G. Recent Trends in Automatic Information Retrieval. In Proceedings of the 9th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval-SIGIR ’86, Palazzo dei Congressi, Pisa, Italy, 1 September 1986. [Google Scholar]
- Aizawa, A. An Information-Theoretic Perspective of Tf–Idf Measures q. Inf. Process. Manag. 2003, 39, 45–65. [Google Scholar] [CrossRef]
- Robertson, S.; Zaragoza, H. The Probabilistic Relevance Framework: BM25 and Beyond. Found. Trends® Inf. Retr. 2009, 3, 333–389. [Google Scholar] [CrossRef]
- Zhao, S.; Yang, Y.; Wang, Z.; He, Z.; Qiu, L.K.; Qiu, L. Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make Your LLMs Use External Data More Wisely. arXiv 2024, arXiv:2409.14924. [Google Scholar]
- Zhang, S.; Li, X.; Zong, M.; Zhu, X.; Cheng, D. Learning k for KNN Classification. ACM Trans. Intell. Syst. Technol. (TIST) 2017, 8, 1–19. [Google Scholar] [CrossRef]
- Maier, D.; Waldherr, A.; Miltner, P.; Wiedemann, G.; Niekler, A.; Keinert, A.; Pfetsch, B.; Heyer, G.; Reber, U.; Häussler, T.; et al. Applying LDA Topic Modeling in Communication Research: Toward a Valid and Reliable Methodology. Commun. Methods Meas. 2018, 12, 93–118. [Google Scholar] [CrossRef]
- Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet Allocation. J. Mach. Learn. Res. 2003, 3, 993–1022. [Google Scholar]
- Agarwal, C.; Sharifzadeh, M.; Schonfeld, D. CrossEncoders: A Complex Neural Network Compression Framework. Electron. Imaging 2018, 30, 153-1–153-5. [Google Scholar] [CrossRef]
- Ni, J.; Qu, C.; Lu, J.; Dai, Z.; Ábrego, G.H.; Ma, J.; Zhao, V.Y.; Luan, Y.; Hall, K.B.; Chang, M.-W.; et al. Large Dual Encoders Are Generalizable Retrievers. arXiv 2021, arXiv:2112.07899. [Google Scholar]
- Cormack, G.V.; Clarke, C.L.A.; Buettcher, S. Reciprocal Rank Fusion Outperforms Condorcet and Individual Rank Learning Methods. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Boston MA, USA, 19–23 July 2009. [Google Scholar]
- Hossain, N.; Ghazvininejad, M.; Zettlemoyer, L. Simple and Effective Retrieve-Edit-Rerank Text Generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020. [Google Scholar]
- Chang, Y.; Wang, X.; Wang, J.; Wu, Y.; Yang, L.; Zhu, K.; Chen, H.; Yi, X.; Wang, C.; Wang, Y.; et al. A Survey on Evaluation of Large Language Models. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–45. [Google Scholar] [CrossRef]
- Li, H.; Shao, Y.; Wu, Y.; Ai, Q.; Ma, Y.; Liu, Y. LeCaRDv2: A Large-Scale Chinese Legal Case Retrieval Dataset 2023. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington DC, USA, 14–18 July 2024. [Google Scholar]
- Reiter, E. A Structured Review of the Validity of BLEU. Comput. Linguist. 2018, 44, 393–401. [Google Scholar] [CrossRef]
- Hardesty, D.M.; Bearden, W.O. The Use of Expert Judges in Scale Development: Implications for Improving Face Validity of Measures of Unobservable Constructs. J. Bus. Res. 2004, 57, 98–107. [Google Scholar] [CrossRef]
- Zhou, Y.; Rao, B.; Wang, W. UAV Swarm Intelligence: Recent Advances and Future Trends. IEEE Access 2020, 8, 183856–183878. [Google Scholar] [CrossRef]
- Asai, A.; Wu, Z.; Wang, Y.; Sil, A.; Hajishirzi, H. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. In Proceedings of the Twelfth International Conference on Learning Representations, Vienna, Austria, 7–11 May 2024. [Google Scholar]
- Zhang, T.; Patil, S.G.; Jain, N.; Shen, S.; Zaharia, M.; Stoica, I.; Gonzalez, J.E. RAFT: Adapting Language Model to Domain Specific RAG. In Proceedings of the First Conference on Language Modeling, Philadelphia, PA, USA, 7–9 October 2024. [Google Scholar]
Study | Theme(s) | Findings |
---|---|---|
Huang et al. (2025) [1] | Hallucination | This comprehensive survey explores the phenomenon of hallucinations in LLMs, proposing a taxonomy and discussing contributing factors and challenges. |
Zhao et al. (2024) [8] | Interpretability | This survey introduces a taxonomy of interpretability techniques and provides a structured overview of methods for explaining Transformer-based language models, categorizing techniques based on training paradigms and summarizing approaches for generating local and global explanations. |
Dang et al. (2024) [9] | Interpretability and hallucination | This paper makes a comprehensive survey on interpretability in multimodal LLMs and provides a unified view of explanation methods, and links token alignment and confidence control to hallucination reduction. |
Singh et al. (2024) [10] | Interpretability and hallucination | This paper reviews existing methods to evaluate the emerging field of LLM interpretation, discussing the potential of LLMs to redefine interpretability and highlighting new challenges such as hallucinated explanations and computational costs. |
Ge et al. (2023) [2] | Hallucination | This paper highlights the domain accuracy limitations of general-purpose LLMs, noting their difficulty in handling specialized tasks that require precise tool usage, multi-step reasoning, or domain knowledge. |
Dataset | LeCaRDv1 | CAIL2019 -SCM | CAIL2022 -LCR | COLIEE 2020 | COLIEE 2021 | LeCaRDv2 |
---|---|---|---|---|---|---|
Number of queries | 107 | 8264 | 130 | 650 | 900 | 800 |
Number of candidate cases per query | 100 | 2 | 100 | 200 | 4415 | 55,192 |
Avg. length per case document (words) | 8275 | 676 | 2707 | 3232 | 1274 | 4766 |
Number of avg. relevant cases per query | 10.33 | 1 | 11.53 | 5.15 | 4.73 | 20.89 |
Recall | F1 | Cosine | BLEU | |
---|---|---|---|---|
Baseline (w/o RAG) | 0.348 | 0.292 | 0.489 | 0.040 |
Raw RAG | 0.498 | 0.476 | 0.570 | 0.097 |
Our RAG | 0.517 | 0.490 | 0.585 | 0.105 |
Recall | F1 | Cosine | BLEU | |
---|---|---|---|---|
Baseline (w/o RAG) | 0.348 | 0.292 | 0.489 | 0.040 |
+KNN | 0.498 | 0.476 | 0.570 | 0.097 |
+KNN + Rerank | 0.522 | 0.504 | 0.586 | 0.103 |
+BM25 | 0.379 | 0.315 | 0.512 | 0.061 |
+BM25 + Rerank | 0.381 | 0.329 | 0.527 | 0.077 |
+Hybrid Search | 0.511 | 0.474 | 0.582 | 0.092 |
Our RAG | 0.517 | 0.490 | 0.585 | 0.105 |
Dimension | Definition | Focus |
---|---|---|
Accuracy | The precision and correctness of the generated text | The extent to which the information generated by the language model aligns with factual knowledge, avoiding errors and inaccuracies. |
Relevance | The appropriateness and significance of the generated | The extent to which the text addresses a specific context or query, ensuring that the provided information is relevant and directly applicable. |
Human Alignment | The alignment with human values, preferences, and expectations | The extent to which the text respects ethics, societal norms, and user expectations, promoting a positive interaction with users. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hao, J.-X.; Chen, L.; Meng, L. Advancing Large Language Models with Enhanced Retrieval-Augmented Generation: Evidence from Biological UAV Swarm Control. Drones 2025, 9, 361. https://doi.org/10.3390/drones9050361
Hao J-X, Chen L, Meng L. Advancing Large Language Models with Enhanced Retrieval-Augmented Generation: Evidence from Biological UAV Swarm Control. Drones. 2025; 9(5):361. https://doi.org/10.3390/drones9050361
Chicago/Turabian StyleHao, Jin-Xing, Lei Chen, and Luyao Meng. 2025. "Advancing Large Language Models with Enhanced Retrieval-Augmented Generation: Evidence from Biological UAV Swarm Control" Drones 9, no. 5: 361. https://doi.org/10.3390/drones9050361
APA StyleHao, J.-X., Chen, L., & Meng, L. (2025). Advancing Large Language Models with Enhanced Retrieval-Augmented Generation: Evidence from Biological UAV Swarm Control. Drones, 9(5), 361. https://doi.org/10.3390/drones9050361