Context-Aware Multi-Agent Architecture for Wildfire Insights
Abstract
1. Introduction
- (RQ1) How can an orchestrator-based MAS dynamically integrate heterogeneous data sources (such as UAV imagery, satellite observations, and tabular environmental data) into a unified reasoning framework for wildfire response?
- (RQ2) What advantages does a multimodal MAS integrated with RAG pipelines offer over traditional wildfire prediction and management approaches?
Statement of Novelty and Contributions
- Architectural Determinism via Decay-Weighted Routing: Unlike standard autonomous agents that rely on open-ended loops (which are prone to “getting stuck”), we introduce a novel orchestration policy governed by a decay factor. This mathematically enforces task convergence, a critical requirement for safety-critical response systems.
- Lossless Multimodal RAG: We address the information loss inherent in standard “caption-based” retrieval systems. By retrieving and processing raw visual artifacts (Base64) rather than text descriptions, our pipeline preserves the forensic granularity required to distinguish between smoke plumes and cloud cover.
- Formalized State Engineering: We replace standard natural language prompting with a rigorous “Context Tuple” framework (p). This formalization constrains the stochastic nature of Large Multimodal Models (LMMs), ensuring that agent behavior is reproducible and auditable, a feature largely absent in generic generative AI frameworks.
2. Related Work
3. Methodology
3.1. Process Overview
- CSV data retrieval pipeline: retrieves structured meteorological and historical wildfire data from a vector database using high-dimensional text embeddings.
- Image data retrieval pipeline: retrieves visual analogs from satellite or aerial imagery databases, utilizing visual embeddings to identify patterns similar to the input image.
- Multimodal RAG pipeline: synchronizes the retrieval of both text and image data when the query necessitates a combined environmental view.
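The combined retrieval described in the last item can be pictured as two independent lookups issued together and merged into one context. The following is a minimal sketch under stated assumptions, not the implementation used in this work: `retrieve_tabular`, `retrieve_images`, and `multimodal_retrieve` are hypothetical names, the records and image bytes are placeholders, and a real system would query vector indices instead of returning fixed values.

```python
import base64
from concurrent.futures import ThreadPoolExecutor


def retrieve_tabular(query: str, k: int = 3) -> list[str]:
    """Hypothetical stand-in for the CSV data retrieval pipeline."""
    records = [
        "Temperature 34 C, humidity 21 %, wind 28 km/h",
        "Temperature 29 C, humidity 40 %, wind 12 km/h",
    ]
    return records[:k]


def retrieve_images(query: str, k: int = 3) -> list[dict]:
    """Hypothetical stand-in for the image pipeline; returns Base64 payloads."""
    raw_images = {"uav_001.png": b"\x89PNG-fake-bytes"}  # placeholder bytes
    return [
        {"name": name, "image_b64": base64.b64encode(data).decode("ascii")}
        for name, data in list(raw_images.items())[:k]
    ]


def multimodal_retrieve(query: str, k: int = 3) -> dict:
    """Run both retrieval pipelines concurrently and merge their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        tab_future = pool.submit(retrieve_tabular, query, k)
        img_future = pool.submit(retrieve_images, query, k)
        return {"tabular": tab_future.result(), "images": img_future.result()}


context = multimodal_retrieve("Is there active fire near the ridge?")
```

Keeping the image as a Base64 string, rather than a caption, means the bytes can be decoded unchanged when they are passed to the LMM.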
3.2. Materials and Datasets
3.3. Data Preprocessing and Retrieving
3.3.1. Preprocessing and Retrieval Algorithms
| Algorithm 1 Multimodal RAG pipeline: preprocessing phase |
| Require: Tabular dataset , Image dataset |
| Ensure: Populated indices , |
| Hyperparameters: |
|
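The tabular part of the preprocessing phase can be sketched with the functions listed in Appendix B (missing-value filtering, imputation, outlier removal, Z-score normalization, and text serialization). The code below is an illustrative sketch only: the function names, the `<=` comparison against the threshold, the use of population standard deviation, and the sample rows are assumptions; only the default values 0.40 and 1.5 come from Appendix B.

```python
import statistics

TAU = 0.40     # missing-value threshold for row removal (Appendix B default)
LAMBDA = 1.5   # IQR multiplier for outlier detection (Appendix B default)


def missing_ratio(row: dict) -> float:
    """Fraction of fields in a row that are missing (None)."""
    return sum(value is None for value in row.values()) / len(row)


def impute_mean(values: list) -> list:
    """Replace missing numeric entries with the mean of the observed ones."""
    mean = statistics.mean(v for v in values if v is not None)
    return [mean if v is None else v for v in values]


def remove_outliers(values: list) -> list:
    """Keep values inside [Q1 - LAMBDA*IQR, Q3 + LAMBDA*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if q1 - LAMBDA * iqr <= v <= q3 + LAMBDA * iqr]


def zscore(values: list) -> list:
    """Standardize values to zero mean and unit (population) variance."""
    mu, sigma = statistics.mean(values), statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


def serialize_to_text(row: dict) -> str:
    """Turn a structured record into a sentence-like string for embedding."""
    return ", ".join(f"{key} is {value}" for key, value in row.items())


rows = [
    {"temperature": 31.0, "humidity": 40.0, "wind_speed": 12.0},
    {"temperature": None, "humidity": None, "wind_speed": 9.0},   # 2/3 missing
    {"temperature": 29.0, "humidity": None, "wind_speed": 14.0},  # 1/3 missing
]
kept = [row for row in rows if missing_ratio(row) <= TAU]
documents = [serialize_to_text(row) for row in kept]
```

In a full pipeline the serialized strings would then be embedded and written to the tabular index; that step is omitted here because it depends on the embedding model and vector store in use.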
| Algorithm 2 Multimodal RAG pipeline: retrieval and generation phase |
| Require: User query q, Populated indices , |
| Ensure: Generated answer A |
| Hyperparameters: |
|
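The retrieval and generation phase can be illustrated with a small top-k cosine-similarity search followed by prompt augmentation. This is a sketch under stated assumptions: the three-dimensional vectors stand in for real text and CLIP embeddings, the in-memory lists stand in for the two vector indices, the helper names are hypothetical, and the final LMM call is deliberately left out.

```python
import math


def cosine(u: list, v: list) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(index: list, query_vec: list, k: int) -> list:
    """Return the payloads of the k entries most similar to query_vec."""
    ranked = sorted(index, key=lambda item: cosine(item[0], query_vec), reverse=True)
    return [payload for _, payload in ranked[:k]]


def augment_prompt(query: str, tabular_ctx: list, visual_ctx: list) -> str:
    """Combine the query and retrieved contexts into one structured prompt."""
    lines = ["Question: " + query, "Meteorological records:"]
    lines += [f"- {record}" for record in tabular_ctx]
    lines.append(f"Attached images: {len(visual_ctx)}")
    return "\n".join(lines)


# Toy indices: (embedding, payload) pairs with made-up values.
tabular_index = [
    ([1.0, 0.0, 0.0], "record A"),
    ([0.0, 1.0, 0.0], "record B"),
    ([0.9, 0.1, 0.0], "record C"),
]
visual_index = [([0.0, 0.0, 1.0], "image-1-base64"), ([1.0, 0.0, 0.0], "image-2-base64")]

query_vec = [1.0, 0.0, 0.0]  # would come from the embedding models in practice
tabular_ctx = top_k(tabular_index, query_vec, k=2)
visual_ctx = top_k(visual_index, query_vec, k=1)
prompt = augment_prompt("Is the fire spreading?", tabular_ctx, visual_ctx)
```

In the actual pipeline the query would be embedded twice, once per index, since Appendix B lists separate text and multimodal embedding functions with different dimensionalities.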
3.3.2. Pipeline Execution Flow
3.4. Structured Prompt Engineering and Context Formalization
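One way to picture a formalized context, as opposed to a free-form prompt, is an immutable record whose fields are rendered into the prompt and checked before any tool call. The sketch below is hypothetical: the field names (`role`, `goal`, `constraints`, `allowed_tools`) are chosen here for illustration, loosely following the agent attributes named in this paper, and are not the paper's tuple definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextTuple:
    """Hypothetical immutable context record for one agent."""
    role: str
    goal: str
    constraints: tuple[str, ...]
    allowed_tools: tuple[str, ...]

    def render(self) -> str:
        """Render the tuple as a fixed-layout prompt header."""
        rules = "; ".join(self.constraints)
        tools = ", ".join(self.allowed_tools)
        return (
            f"Role: {self.role}\nGoal: {self.goal}\n"
            f"Constraints: {rules}\nTools: {tools}"
        )

    def permits(self, tool_name: str) -> bool:
        """Check whether the agent is allowed to call the given tool."""
        return tool_name in self.allowed_tools


acquisition_context = ContextTuple(
    role="Data acquisition agent",
    goal="Retrieve wildfire records and imagery relevant to the query",
    constraints=("Return retrieved data only", "Do not draw conclusions"),
    allowed_tools=("csv_retrieval", "image_retrieval", "multimodal_rag"),
)
header = acquisition_context.render()
```

Because the record is frozen, the same tuple always renders to the same header, which is what makes a run easier to reproduce and audit.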
3.5. Proposed Multi-Agent Architecture
- Task-Based Approach
- Orchestrator-Based Approach
- Agent denotes the set of agents in the system. An agent is an intelligent, automated unit powered by an LMM that performs specific tasks. In addition to the LMM, each agent is given a role, a goal describing its instructions, a set of tasks to achieve that goal, and a set of tools it can use to perform those tasks. Agents can communicate with other agents while maintaining their own memory of interactions. Because an agent's output depends on the LMM it is supplied with, the most suitable LMM varies with the tasks assigned to the agent. The proposed solution has three key agents.
  - The data acquisition agent gathers data for processing. It is supplied with the data retrieval tools (the CSV data retrieval tool, the image data retrieval tool, and the multimodal RAG tool) and invokes them when necessary.
  - The reasoning agent analyzes complex spatial–temporal patterns in wildfire behavior, drawing on previous wildfire cases. It is generally invoked after the data acquisition agent, which supplies the data retrieved from its RAG pipelines. The reasoning agent then reasons about the current situation using that data.
  - The orchestrator agent coordinates tasks and their execution within the MAS. As the central coordinator of the proposed framework, it connects the other agents as tools. As shown in Figure 1, the data acquisition agent serves as the data acquisition tool, and the reasoning agent serves as the reasoning tool. The orchestrator agent mediates communication between these sub-agents and delegates tasks to them. Its plan-and-execute design lets it adapt to dynamic environments, which suits domains such as wildfire management.
  - Structurally, the system adopts a hub-and-spoke (star) topology that enforces strict isolation between subordinate agents. The data acquisition agent and the reasoning agent operate in distinct environments and are invoked via independent API calls. Consequently, they have no shared memory or lateral communication channels, and the state of the orchestrator is opaque to them. This design makes the orchestrator agent the sole source of truth for conversation history and global context, prevents information leakage, and ensures that all inter-agent data flow is explicitly filtered and routed through the central policy function.
- Task represents the set of tasks, which are dynamically planned and executed by the Orchestrator Agent.
- Tool represents the set of functions available within the system. Each tool is a specialized capability that agents can invoke to execute specific actions. Beyond pre-built tools, custom tools can be developed and assigned to agents to extend their functionality. As shown in Figure 1, the tools are primarily used by the data acquisition agent and are custom-built as follows.
  - The CSV data retrieval tool is an application interface that uses a CSV-based RAG pipeline to retrieve raw metadata from indexed meteorological CSV data.
  - The image data retrieval tool is an application interface that uses an image data retrieval pipeline. It accepts both text and image inputs as queries and retrieves relevant images in Base64 format along with their metadata.
  - The multimodal RAG tool runs the two retrieval pipelines above in parallel, so that the reasoning agent receives context that is both semantically and visually grounded.
- Rule R represents the set of interaction rules governing agent coordination. These rules are specified independently of the agent prompts.
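The hub-and-spoke arrangement described above can be sketched as an orchestrator that holds the only copy of the conversation history and calls the two subordinate agents as plain tools. This is a minimal illustration, not the system's code: the agent functions return placeholder strings instead of calling an LMM, and the class and method names are assumptions.

```python
from typing import Callable


def data_acquisition_agent(request: str) -> str:
    """Hypothetical spoke: sees only the request string it is handed."""
    return f"[retrieved context for: {request}]"


def reasoning_agent(request: str) -> str:
    """Hypothetical spoke: reasons over whatever the hub passes in."""
    return f"[analysis of: {request}]"


class Orchestrator:
    """Hub that owns the history and routes every inter-agent message."""

    def __init__(self, tools: dict[str, Callable[[str], str]]):
        self._tools = tools
        self.history: list[tuple[str, str, str]] = []  # visible to the hub only

    def call(self, tool_name: str, payload: str) -> str:
        result = self._tools[tool_name](payload)
        self.history.append((tool_name, payload, result))
        return result

    def answer(self, query: str) -> str:
        # The reasoning agent receives only the filtered payload built here,
        # never the orchestrator's history or the other agent's state.
        context = self.call("data_acquisition", query)
        return self.call("reasoning", f"{query} | context: {context}")


hub = Orchestrator(
    {"data_acquisition": data_acquisition_agent, "reasoning": reasoning_agent}
)
response = hub.answer("Is the ridge fire spreading?")
```

Since the spokes are stateless functions of their input, any data one agent needs from the other has to pass through `Orchestrator.call`, which is where filtering and routing rules would be applied.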
3.5.1. Task-Based Architecture
- Error propagation: since the coupling is linear (), any hallucination in the initial retrieval stage is propagated to the reasoning stage, which can lead to erroneous outcomes.
- Computational inefficiency: as per (1), this approach forces the execution of for every query. This results in an inevitable computational cost , even when the retrieved data might not be of use (e.g., for general knowledge queries).
3.5.2. Orchestrator-Based Architecture
| Algorithm 3 Orchestrator dynamic policy optimization |
| Require: User query Q, Set of available agents , Initial context |
| Ensure: Final system response R |
| Parameters: Decay rate , Threshold |
|
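A minimal sketch of how a decay rate and a threshold can bound the routing loop of Algorithm 3 is given below. It is illustrative rather than the authors' implementation: the single data-need weight `w_data`, the default values, the rule of decaying after every retrieval call, and the callables `acquire` and `reason` are all assumptions introduced here.

```python
import math


def orchestrate(query, acquire, reason, w_data=1.0, decay=0.5, threshold=0.2):
    """Route to data acquisition while its weight stays at or above threshold."""
    context, trace = [], []
    while w_data >= threshold:
        trace.append("data_acquisition")
        context.extend(acquire(query))
        w_data *= decay  # each call lowers the priority of further retrieval
    trace.append("reasoning")
    return reason(query, context), trace


def max_retrieval_calls(w_data: float, decay: float, threshold: float) -> int:
    """Number of retrieval calls implied by geometric decay (0 < decay < 1)."""
    if w_data < threshold:
        return 0
    return math.floor(math.log(threshold / w_data) / math.log(decay)) + 1


def fake_acquire(query):
    """Placeholder retrieval returning one record per call."""
    return ["record"]


def fake_reason(query, context):
    """Placeholder reasoning step that only reports the context size."""
    return f"{len(context)} records used"


answer, trace = orchestrate("Is the fire spreading?", fake_acquire, fake_reason)
```

Because the weight shrinks geometrically, the number of retrieval calls is bounded regardless of what the agents return, which is the convergence property the routing policy is meant to provide.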
3.5.3. Comparison of Architecture
3.5.4. Application Implementation
4. Results
4.1. Applied Task Formulation
4.2. Evaluation Metrics
4.3. Web Application with a Visual Question–Answer System
4.4. Comparative Analysis of LLM
4.5. Ablation Study and Architectural Validation
5. Discussion
5.1. Lessons Learned
5.2. Comparative Analysis with State-of-the-Art Models
5.3. Challenges and Future Works
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Multimodal Wildfire Reasoning and Response Tasks
| Task ID | Task Name | Definition and Scope |
|---|---|---|
| Task 1 (Detection) | Hazard Identification | Objective: Detect the presence of active fire threats by cross-referencing visual artifacts with meteorological data. Input: Satellite/UAV imagery + Wind/Temperature data. Output: Binary Classification (Fire/No Fire) and confidence score. |
| Task 2 | Spread Prediction | Objective: Forecast the future direction and rate of spread (ROS) of the identified fire front. Input: Wind direction, wind speed, and terrain features. Output: Directional vector and estimated ROS (e.g., “rapid downslope expansion”). |
| Task 3 | Response Planning | Objective: Formulate specific mitigation action plans based on the synthesized threat assessment. Input: Synthesized outputs from Task 1 and Task 2. Output: Actionable recommendations (e.g., “initiate evacuation in Sector 4”). |
Appendix B. Mathematical Notations
| Category | Notation | Description |
|---|---|---|
| Data Domains | , where each represents a meteorological record with d features (temperature, humidity, wind speed, etc.). | |
| , where each denotes satellite or aerial imagery with height H, width W, and C color channels. | ||
| Hyperparameters | Missing value threshold for row removal (default: 0.40). Range . | |
| Interquartile Range (IQR) multiplier for outlier detection (default: 1.5). . | ||
| Embedding Models | Text embedding function () using BAAI/bge-base-en-v1.5, mapping text strings to 768-dimensional vectors. | |
| Multimodal embedding function () using OpenAI CLIP, mapping images or text to 512-dimensional vectors. | ||
| Vector Indices | Vector database storing tabular embeddings with associated metadata. | |
| Vector database storing visual embeddings with associated metadata. | ||
| Retrieval Params | k | Top-k retrieval count (), specifying the number of most similar items to retrieve per search query. |
| Context Variables | q | User query in natural language (). |
| Text embedding of query q for tabular search (). | ||
| CLIP embedding of query q for visual search (). | ||
| Retrieved tabular context (top-k meteorological records). | ||
| Retrieved visual context (top-k similar images). | ||
| Augmented prompt combining q, , and . | ||
| A | Final generated answer. | |
| Functions | Computes fraction of missing values in row . Output . | |
| Fills missing values using statistical methods (mean for numerical, mode for categorical). | ||
| RemoveOutliers | Removes values outside . | |
| Applies Z-score normalization: . | ||
| SerializeToText | Converts structured row data to natural language text . | |
| IntegrityCheck | Validates image file integrity. Returns . | |
| Detects duplicate images via perceptual hashing. Returns . | ||
| Normalizes image resolution to standard dimensions. | ||
| Retrieves k nearest neighbors to vector v using cosine similarity. | ||
| PromptAugment | Constructs structured prompt template combining query q and retrieved contexts . | |
| Large Multimodal Model that generates final reasoning A from augmented context. |
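For reference, the standard textbook forms of four of the operations named in the table are written out below. The symbols are assumptions chosen for this note ($\rho$ for the missing-value ratio, $\lambda$ for the IQR multiplier, $Q_1$ and $Q_3$ for the first and third quartiles, $\mu$ and $\sigma$ for the feature mean and standard deviation); they may differ from the notation used in the original table.

```latex
\begin{align}
\rho(x_i) &= \frac{1}{d}\sum_{j=1}^{d} \mathbb{1}\left[x_{ij}\ \text{is missing}\right],
  \qquad \rho(x_i) \in [0, 1] \\
\mathrm{IQR} &= Q_3 - Q_1,
  \qquad x\ \text{is kept iff}\ Q_1 - \lambda\,\mathrm{IQR} \le x \le Q_3 + \lambda\,\mathrm{IQR} \\
z &= \frac{x - \mu}{\sigma} \\
\operatorname{sim}(u, v) &= \frac{u \cdot v}{\lVert u \rVert\,\lVert v \rVert}
\end{align}
```

With the default multiplier of 1.5, the second line is the usual Tukey fence, and the last line is the cosine similarity used for nearest-neighbor retrieval.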
References
- National Interagency Fire Center. Wildland Fire Summary and Statistics Annual Report 2024. 2024. Available online: https://www.nifc.gov/fire-information/statistics (accessed on 1 December 2024).
- Wasserman, T.N.; Mueller, S.E. Climate influences on future fire severity: A synthesis of climate-fire interactions and impacts on fire regimes, high-severity fire, and forests in the western United States. Fire Ecol. 2023, 19, 43.
- Naser, M.; Kodur, V. Vulnerability of structures and infrastructure to wildfires: A perspective into assessment and mitigation strategies. Nat. Hazards 2025, 121, 9995–10015.
- Sharma, D.; Kashyap, M.P.; Das, D.; Chatterji, B.; Modi, M.; Talukdar, N. Assessing fire-induced tree cover loss and its contribution to carbon emission in BRICS+ nations. Discov. Environ. 2026, 4, 7.
- Von Scheffer, C.; Mauquoy, D.; Theurer, T.; Coathup, D.; Muirhead, D. ‘Fire Islands’: Holocene wildfire intensity as a critical determinant of carbon accumulation in South Atlantic peatlands. Quat. Sci. Rev. 2026, 374, 109759.
- Tavakol Sadrabadi, M.; Peiró, J.; Innocente, M.S.; Rein, G. Conceptual design of a wildfire emergency response system empowered by swarms of unmanned aerial vehicles. Int. J. Disaster Risk Reduct. 2025, 124, 105493.
- Duarte, M.; da Silva, T.A.; de Sousa, J.P.; de Castro, A.L.; Lourenço, R. Fuzzy Inference System for Mapping Forest Fire Susceptibility in Northern Rondônia, Brazil. Geogr. Environ. Sustain. 2024, 17, 83–94.
- Toledo-Castro, J.; Caballero-Gil, P.; Rodríguez Pérez, N.; Santos-González, I.; Hernández-Goya, C.; Aguasca, R. Forest Fire Prevention, Detection, and Fighting Based on Fuzzy Logic and Wireless Sensor Networks. Complexity 2018, 2018, 1639715.
- Cao, J.; Liu, X.; Xue, R. FireMM-IR: An Infrared-Enhanced Multi-Modal Large Language Model for Comprehensive Scene Understanding in Remote Sensing Forest Fire Monitoring. Sensors 2026, 26, 390.
- Binlajdam, R.; Meedeniya, D.; Jayaweera, C.; Karakus, O.; Rana, O.; Ter Wengel, P.; Goossens, B.; Lertsinsrubtavee, A.; Mekbungwan, P.; Mishra, D.; et al. Review on Sustainable Forestry with Artificial Intelligence. ACM J. Comput. Sustain. Soc. 2025, 3, 1–48.
- Andrianarivony, H.S.; Akhloufi, M.A. Machine learning and deep learning for wildfire spread prediction: A review. Fire 2024, 7, 482.
- Saleh, A.; Zulkifley, M.A.; Harun, H.H.; Gaudreault, F.; Davison, I.; Spraggon, M. Forest fire surveillance systems: A review of deep learning methods. Heliyon 2024, 10, e23127.
- Jayanetti, A.; Meedeniya, D.; Dilini, N.; Wickramapala, M.; Madushanka, H. Enhanced land cover and land use information generation from satellite imagery and foursquare data. In Proceedings of the 6th International Conference on Software and Computer Applications (ICSCA); ACM: Bangkok, Thailand, 2017; pp. 149–153.
- Dewangan, A.; Pande, Y.; Braun, H.W.; Vernon, F.; Perez, I.; Altintas, I.; Cottrell, G.W.; Nguyen, M.H. FIgLib & SmokeyNet: Dataset and deep learning model for real-time wildland fire smoke detection. Remote Sens. 2022, 14, 1007.
- Xie, Y.; Jiang, B.; Mallick, T.; Bergerson, J.D.; Hutchison, J.K.; Verner, D.R.; Branham, J.; Alexander, M.R.; Ross, R.B.; Feng, Y.; et al. WildfireGPT: Tailored Large Language Model for Wildfire Analysis. arXiv 2025, arXiv:2402.07877.
- Du, S.; Li, J.; Noto, M. Wildfire scene recognition based on qwen2-wildfire. In Proceedings of the 2025 8th International Conference on Software Engineering and Information Management; ACM: Singapore, 2025; pp. 254–262.
- Meedeniya, D.; Jayaweera, C. Blazing Trails: Cutting-Edge Technologies Revolutionizing Forest Fire Screening. In Pioneering Autonomous Technology A Deep Dive into Hyper Automation, 1st ed.; Swain, K., Pattnaik, P., Poonia, R., Nayak, S., Eds.; Elsevier Academic Press: Amsterdam, The Netherlands, 2026; Volume 143, Chapter 3; pp. 1–16.
- Mawanza, L. Fault-tolerant dynamic formation control of the heterogeneous multi-agent system for cooperative wildfire tracking. Syst. Sci. Control Eng. 2025, 12, 2294991.
- Kouzehgar, M.; Meghjani, M.; Bouffanais, R. Multi-Agent Reinforcement Learning for Dynamic Ocean Monitoring by a Swarm of Buoys. arXiv 2020, arXiv:2012.11641.
- Zadeh, R.B.; Elmi, A.; Moghaddam, V.; MahmoudZadeh, S. A Conceptual High-Level Multiagent System for Wildfire Management. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5911415.
- UNDP. Sustainable Development Goals. 2015. Available online: https://www.undp.org/sustainable-development-goals (accessed on 25 December 2025).
- Ángel Javaloyes, M.; Pendás-Recondo, E.; Sánchez, M. A general model for wildfire propagation with wind and slope. SIAM J. Appl. Algebra Geom. 2023, 7, 414–439.
- Gao, X.; Cao, C.; Wang, S.; Xu, M.; Li, J.; Yang, X.; Yang, Y.; Hu, R.; Zhang, Y.; Wu, S.; et al. Remote sensing diagnosis of Forest fire risk based on state-trend characteristics using machine learning models. Ecol. Indic. 2026, 182, 114527.
- Yel, S.G.; Küçüker, D.M.; Görmüş, E.T. Wildfire susceptibility mapping with multiple machine learning algorithms utilizing forest inventory and FIRMS data: A case study in Arsin, Trabzon, Türkiye. Int. J. Appl. Earth Obs. Geoinf. 2026, 146, 105091.
- Uma Maheswara Rao, R.; Waila, P.; Mammen, P.C.; Muthu, R. Enabling Artificial Intelligence (AI) and Machine Learning (ML) Techniques for Managing Forest Fires. In Application of Machine Learning in Earth Sciences: A Practical Approach; Springer: Berlin/Heidelberg, Germany, 2026; pp. 429–459.
- Mousa, M.H.; Algamdi, A.M.; Fouad, Y.; Elshewey, A.M. CNN-MLP framework for forest burned areas prediction using PSO-WOA algorithm. Sci. Rep. 2026, 16, 4982.
- Saranya, A.; Subhashini, R. A systematic review of Explainable Artificial Intelligence models and applications: Recent developments and future trends. Decis. Anal. J. 2023, 7, 100230.
- Ahangama, I.; Meedeniya, D.; Pradhan, B. Explainable Image Segmentation for Spatio-Temporal and Multivariate Image Data in Precipitation Nowcasting. Results Eng. 2025, 26, 105595.
- Klotz, J.; Burgert, T.; Demir, B. On the Effectiveness of Methods and Metrics for Explainable AI in Remote Sensing Image Scene Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 27764–27780.
- Pan, Y.; Yang, J.; Lu, M.; Bao, Q.; Zhu, T.; Yao, Q.; New, S.; Chen, D.; Shi, C.; Chen, L. Bridging the “Last-mile Gap” in Climate Services Delivery: A Dynamical-AI Hybrid Framework for Next-Month Wildfire Danger Prediction and Emergency Action. Adv. Atmos. Sci. 2026, 43, 706–722.
- Aththanayake, S.; Mallikarachchi, C.; Wickramasinghe, J.; Kugarajah, S.; Meedeniya, D.; Pradhan, B. ResQConnect: An AI-Powered Multi-Agentic Platform for Human-Centered and Resilient Disaster Response. Sustainability 2026, 18, 1014.
- Xie, Y.; Jiang, B.; Mallick, T.; Bergerson, J.; Hutchison, J.K.; Verner, D.R.; Branham, J.; Alexander, M.R.; Ross, R.B.; Feng, Y.; et al. MARSHA: Multi-agent RAG system for hazard adaptation. npj Clim. Action 2025, 4, 70.
- Kalatzis, N.; Avgeris, M.; Dechouniotis, D.; Papadakis-Vlachopapadopoulos, K.; Roussaki, I.; Papavassiliou, S. Edge Computing in IoT ecosystems for UAV-enabled Early Fire Detection. In Proceedings of the 2018 IEEE International Conference on Smart Computing (SMARTCOMP), Taormina, Italy, 18–20 June 2018; pp. 106–114.
- Abid, F. Algerian Forest Fires Dataset. 2019. Available online: https://archive.ics.uci.edu/ml/datasets/Algerian+Forest+Fires+Dataset (accessed on 12 December 2025).
- Hopkins, B.; ONeill, L.; Marinaccio, M.; Rowell, E.; Parsons, R.; Flanary, S.; Nazim, I.; Seielstad, C.; Afghah, F. FLAME 3 Dataset: Unleashing the Power of Radiometric Thermal UAV Imagery for Wildfire Management. arXiv 2024, arXiv:2412.02831.
- Stavros, N.; Tane, Z.; Kane, V.; Veraverbeke, S.; McGaughey, R.; Lutz, J.A.; Ramirez, C.; Schimel, D.S. Remote Sensing Data Before and After California Rim and King Forest Fires, 2010–2015. 2016. Available online: https://www.earthdata.nasa.gov/data/catalog/ornl-cloud-king-rim-fire-analysis-1288-1 (accessed on 12 December 2025).
- Data, G. National USFS Fire Occurrence Point. 2025. Available online: https://catalog.data.gov/dataset/national-usfs-fire-occurrence-point-feature-layer-d3233 (accessed on 7 August 2025).
- Hua, Q.; Ye, L.; Fu, D.; Xiao, Y.; Cai, X.; Wu, Y.; Lin, J.; Wang, J.; Liu, P. Context Engineering 2.0: The Context of Context Engineering. arXiv 2025, arXiv:2510.26493.
- Wu, S.; Qiao, Y.; He, S.; Zhou, J.; Wang, Z.; Li, X.; Wang, F. FireCLIP: Enhancing Forest Fire Detection with Multimodal Prompt Tuning and Vision-Language Understanding. Fire 2025, 8, 237.
- Zhang, Z.; Yao, Y.; Zhang, A.; Tang, X.; Ma, X.; He, Z.; Wang, Y.; Gerstein, M.; Wang, R.; Liu, G.; et al. Igniting language intelligence: The hitchhiker’s guide from chain-of-thought reasoning to language agents. ACM Comput. Surv. 2025, 57, 1–39.
- Sandeep, A.; Samarappuli, V.; Jayarathne, S.; Sandaruwan, S. AI-Powered Insight Engine for Wildfire Data Reasoning. 2025. Available online: https://sites.google.com/cse.mrt.ac.lk/fusionsense (accessed on 12 December 2025).
- Cruz, M.G.; Alexander, M.E. The 10% wind speed rule of thumb for estimating a wildfire’s forward rate of spread in forests and shrublands. Ann. For. Sci. 2019, 76, 44.
- Andrews, P.L. The Rothermel Surface Fire Spread Model and Associated Developments: A Comprehensive Explanation and Guide; Technical Report RMRS-GTR-371; USDA Forest Service, Rocky Mountain Research Station: Fort Collins, CO, USA, 2018.
- Meedeniya, D.; Ariyarathne, I.; Bandara, M.; Jayasundara, R.; Perera, C. A Survey on Deep Learning-based Forest Environment Sound Classification at the Edge. ACM Comput. Surv. 2023, 56, 66.
- Faria, F.T.J.; Baniata, L.H.; Choi, A.; Kang, S. Towards Robust Chain-of-Thought Prompting with Self-Consistency for Remote Sensing VQA: An Empirical Study Across Large Multimodal Models. Mathematics 2025, 13, 3046.
- Zhao, X.; Wang, H.; Dai, C.; Tang, J.; Deng, K.; Zhong, Z.; Kong, F.; Wang, S.; Morikawa, S. Multi-Stage Simulation of Residents’ Disaster Risk Perception and Decision-Making Behavior: An Exploratory Study on Large Language Model-Driven Social–Cognitive Agent Framework. Systems 2025, 13, 240.
- Paranayapa, T.; Ranasinghe, P.; Ranmal, D.; Meedeniya, D.; Perera, C. A Comparative Study of Preprocessing and Model Compression Techniques in Deep Learning for Forest Sound Classification. Sensors 2024, 24, 1149.
- Karim, M.M.; Van, D.H.; Khan, S.; Qu, Q.; Kholodov, Y. AI agents meet blockchain: A survey on secure and scalable collaboration for multi-agents. Future Internet 2025, 17, 57.
- Ranmal, D.; Ranasinghe, P.; Paranayapa, T.; Meedeniya, D.; Perera, C. ESC-NAS: Environment Sound Classification Using Hardware-Aware Neural Architecture Search for the Edge. Sensors 2024, 24, 3749.
- Han, Z.; Wang, J.; Yan, X.; Jiang, Z.; Zhang, Y.; Liu, S.; Gong, Q.; Song, C. CoReaAgents: A Collaboration and Reasoning Framework Based on LLM-Powered Agents for Complex Reasoning Tasks. Appl. Sci. 2025, 15, 5663.
- Chen, Z.; Asadi Shamsabadi, E.; Jiang, S.; Shen, L.; Dias-da Costa, D. Integration of large vision language models for efficient post-disaster damage assessment and reporting. Nat. Commun. 2026.










| Feature | Standard Agentic RAG | Proposed Framework (Ours) | Architectural Advantage |
|---|---|---|---|
| Orchestration Logic | Recursive ReAct Loops: Relies on open-ended ‘Reason + Act’ cycles that often loop indefinitely or hallucinate tools in complex scenarios. | Decay-Weighted Routing: Implements a policy function with a decay factor that mathematically forces task convergence. | Prevents infinite loops and ensures deterministic latency for safety-critical response. |
| Multimodal Data | Intermediate Captioning: Converts images to text descriptions before processing, causing loss of granular visual details (e.g., smoke density). | Lossless Artifact Injection: Retrieves and injects raw Base64 visual artifacts directly into the LMM context window. | Preserves forensic visual fidelity required for distinguishing similar hazards (e.g., cloud vs. smoke). |
| Context Management | Unstructured Logs: Appends raw conversation history to the prompt, leading to context drift and unauthorized tool usage. | Formalized State Tuples: Uses a rigid tuple structure to strictly define role boundaries and constraints. | Guarantees reproducibility and prevents agents from acting outside safety guardrails. |
| Study | Description | Approach: Rule-Based | Approach: Fuzzy Logic | Approach: AI | Approach: Other | Region |
|---|---|---|---|---|---|---|
| SmokeyNet (2022) [14] | Multimodal smoke detection | – | – | CNN, LSTM, ViT | – | USA (HPWREN sites) |
| UAV Swarms for WER (2025) [6] | Autonomous UAV wildfire suppression | Simple rules | – | MAS, Swarm Robotics | – | United Kingdom |
| High-Level MAS with DRL (2025) [20] | MAS with DRL for fire tracking | – | – | MAS, Deep RL | – | Global |
| Heterogeneous MAS (2025) [18] | UAV/ground robot monitoring | – | – | MAS | – | – |
| MARL-based Systems (2020) [19] | Swarm ocean monitoring | – | – | Multi-Agent RL | – | Bedok Reservoir, Singapore |
| WildfireGPT (2025) [15] | RAG-based LLM decision support | – | – | MAS, LLM, RAG | – | United States |
| Fuzzy Fire Mapping (2024) [7] | Fire susceptibility mapping | – | Fuzzy Inference System | – | – | Brazil (Rondônia) |
| WSN Fire Controller (2018) [8] | IoT/WSN Fire Controller | – | Dynamic Fuzzy Logic | – | IoT, WSN | Spain |
| Edge-UAV System (2018) [33] | UAV Early Detection | – | – | – | Edge Computing | European South Region |
| Study | MAS Architecture | UAV/Drone Swarms | Multi Modal Fusion | VQA | Context Awareness | Historical Data | RAG | MAS Orchestration | Explainable Decisions |
|---|---|---|---|---|---|---|---|---|---|
| SmokeyNet [14] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| UAV Swarms for WER [6] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| High-Level MAS with DRL [20] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| Heterogeneous MAS [18] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| WildfireGPT [15] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| MARL-based Systems [19] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| Fuzzy Fire Mapping [7] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| WSN Fire Controller [8] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| Edge-UAV System [33] | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| Proposed VQA-MAS System (Ours) | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
Satisfies;
Does Not Satisfy.

| Dataset | Type | Description | Features |
|---|---|---|---|
| Algerian Forest Fires [34] | Tabular | 244 instances from Bejaia (northeast) and Sidi Bel-Abbes (northwest) regions in Algeria, 122 per region. | Contains meteorological data in CSV format, indexed in the vector DB and retrieved through the CSV retrieval pipeline. |
| Remote Sensing Data Before and After California Rim and King Forest Fires, 2010–2015 [36] | Satellite and thermal images | High-resolution surface reflectance, thermal imagery, burn severity metrics, and LiDAR-derived structural measures from Sierra Nevada Mountains, California, USA, collected before/after 2013 Rim and 2014 King fires. | Provides high-resolution multi-spectral and thermal imagery, indexed in vector DB and utilized in image retrieval pipeline. |
| National USFS Fire Occurrence Point [37] | Tabular | Ignition points for USFS wildland fires, maintained at Forest/District level to track occurrence and origin. | Provides historical US wildfire data for VQA to understand fire patterns and geographical risks. |
| FLAME 3 - Radiometric Thermal UAV Imagery for Wildfire Management [35] | UAV images/thermal | 622 image quartets labeled Fire and 116 labeled No Fire from the surrounding forestry of the prescribed burn plot. | Gives RGB-thermal image pairs for multimodal fusion algorithms. |
| Query Condition | Dominant Weight Factor | Selected Agent | Execution Action Order |
|---|---|---|---|
| System Initialization (e.g., “New User Query q Received”) | Policy Initialization () Orchestrator evaluates: is (Data Need) or (Intent Match) the priority? | Orchestrator Agent | 1. Parse semantic intent of query q. 2. Instantiate dynamic weights (). 3. Determine initial . 4. Route control to the selected agent (typically Data Acquisition for complex queries). |
| Data Retrieval (External Context Required) | High Data Need () | Data Acquisition Agent | 1. Execute RAG pipelines. 2. Update global context (). 3. Data validation: If data is null, re-invoke Data Acquisition Agent immediately. 4. Once data is sufficient, decay () to shift control. |
| Context Reasoning (Sufficient Data Available) | High Intent Match () Prioritizes logic synthesis and answer generation. | Reasoning Agent | 1. Ingest grounded context (). 2. Perform Chain-of-Thought analysis. 3. Draft response (R). 4. Submit R to Orchestrator Agent. |
| Ambiguous/Conflict (e.g., Conflicting visual vs. textual data) | Recursive Correction Orchestrator detects low confidence or format violation. | Recursive Loop | 1. Trigger Supervisor–Worker Protocol 2. Re-invoke subordinate agent with refined constraints 3. Filter Hallucinations |
| Ability | Task-Based Approach | Orchestrator-Based Approach |
|---|---|---|
| Dynamic task planning | No | Yes |
| Number of crews needed | Single | Multiple |
| Agent invocation | Sequential, fixed | Flexible, orchestrated |
| Maintainability | Direct, less flexible | Modular, more maintainable |
| Task execution flow | Crew executes tasks | Orchestrator agent manages flow |
| Use case suitability | Simple, linear flows | Complex, adaptive workflows |
| Model | Accuracy | Precision | Recall | F1-Score | Inference Time (s) |
|---|---|---|---|---|---|
| GPT-4.1-Mini | 0.700 | 0.700 | 0.700 | 0.700 | 213.02 |
| GPT-4.1-Nano | 0.632 | 0.662 | 0.632 | 0.647 | 160.21 |
| GPT-4o | 0.700 | 0.752 | 0.700 | 0.725 | 190.57 |
| GPT-5 | 0.450 | 0.604 | 0.450 | 0.515 | 155.25 |
| GPT-5-Nano | 0.684 | 0.797 | 0.684 | 0.736 | 176.70 |
| Gemini-2.5-Flash-Lite | 0.600 | 0.662 | 0.600 | 0.629 | 176.70 |
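The F1-scores in the table above appear consistent with the harmonic mean of the reported precision and recall. The short check below illustrates this for three rows; it is a sanity check written for this note, not the evaluation script used in the study, and the dictionary simply copies values from the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the comparison table.
reported = {
    "GPT-4.1-Nano": (0.662, 0.632, 0.647),
    "GPT-4o": (0.752, 0.700, 0.725),
    "GPT-5-Nano": (0.797, 0.684, 0.736),
}

recomputed = {
    model: round(f1_score(precision, recall), 3)
    for model, (precision, recall, _) in reported.items()
}
```

For these rows the recomputed values agree with the reported F1-scores to three decimals.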
| Ablation Type | Accuracy | Precision | Recall | F1-Score | Inference Time (s) |
|---|---|---|---|---|---|
| Without CSV Data Retrieval tool (Only Image Data Retrieval tool and Multimodal RAG tool) | 0.667 | 0.722 | 0.667 | 0.693 | 94.50 |
| Without Image Data Retrieval tool (Only CSV Data Retrieval tool and Multimodal RAG tool) | 0.643 | 0.651 | 0.648 | 0.649 | 107.96 |
| Without Multimodal RAG tool (Only Image Data Retrieval tool and CSV Data Retrieval tool) | 0.471 | 0.569 | 0.471 | 0.515 | 116.94 |
| With CSV Data Retrieval tool, Image Data Retrieval tool and Multimodal RAG tool | 0.684 | 0.797 | 0.684 | 0.736 | 176.70 |
| Approach | Performance | Detection Speed | Monitoring Capability | Management Support | Scalability |
|---|---|---|---|---|---|
| WSN Fire Controller (2018) [8] | N/A (real-time alerts) | Real-time (WSN) | Sensor-based variables | Risk alerts | High (multi-hop routing) |
| Edge-UAV System (2018) [33] | N/A (qualitative efficient management of CPU/RAM, battery life, and network resources based on initial experiments) | Real-time (edge/fog) | UAV detection | Resource allocation | High (hierarchical) |
| MARL-based Systems (2020) [19] | Learning convergence | Real-time inference | Spatial coverage | Monitoring | Medium (dataset dependent) |
| SmokeyNet (2022) [14] | Accuracy: 83.49% | Real-time inference | Image | Smoke Detection | Medium (dataset dependent) |
| Fuzzy Fire Mapping (2024) [7] | AUC 0.879 | N/A (static mapping) | Susceptibility via GIS | Prevention actions | Medium (climate-sensitive) |
| UAV Swarms for WER (2025) [6] | N/A (conceptual) | N/A (conceptual) | Real-time via UAVs | Evacuation/suppression | High (swarm-based) |
| High-Level MAS with DRL (2025) [20] | N/A (conceptual) | Real-time (DRL algorithms) | UAV/IoT tracking | DSS for decision-making | Medium (integrated data) |
| Heterogeneous MAS (2025) [18] | Finite-time convergence (simulations) | Finite-time tracking | Cooperative air-ground | Fault-tolerant tracking | High (heterogeneous agents) |
| WildfireGPT (2025) [15] | Correctness: 97.73% (case studies) | Real-time inference (LLM-based) | Data synthesis (climate projections/literature) | Risk insights/decision-making | High (LLM scalable) |
| Proposed MAS | Precision 0.797, F1-score 0.736 | Inference time 176.70 s (orchestrated) | Multimodal synthesis | Context-aware reasoning | High (agentic scalability) |
| Wind Speed (km/h) | Est. Rate of Spread (ROS) [m/min] ¹ | Inference Latency [s] | Spatial Error (Baseline) [m] ² | Spatial Error (Ours) [m] |
|---|---|---|---|---|
| 20 | ≈33 | 176.70 | 97.1 | Residual Variance |
| 40 | ≈66 | 176.70 | 194.3 | Residual Variance |
| 60 (Benchmark) | ≈100 | 176.70 | 294.5 | Residual Variance |
| 80 | ≈133 | 176.70 | 391.6 | Residual Variance |
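
The baseline spatial error in the table above appears to be the distance the fire front travels during one inference, that is, the rate of spread multiplied by the latency. This relationship is inferred from the tabulated numbers rather than stated in this section, so the sketch below should be read as an assumption; `drift_m` and the other names are illustrative.

```python
def drift_m(ros_m_per_min: float, latency_s: float) -> float:
    """Distance (m) a front moving at ros_m_per_min covers in latency_s seconds."""
    return ros_m_per_min * latency_s / 60.0


LATENCY_S = 176.70  # inference latency reported in the table

# wind speed (km/h) -> (tabulated ROS in m/min, reported baseline error in m)
rows = {
    20: (33, 97.1),
    40: (66, 194.3),
    60: (100, 294.5),
    80: (133, 391.6),
}

# Gap between the assumed formula and each reported baseline error.
gaps = {
    wind: abs(drift_m(ros, LATENCY_S) - reported)
    for wind, (ros, reported) in rows.items()
}
```

With the tabulated ROS values, the assumed formula lands within 0.1 m of every reported baseline error, which suggests the reported figures were truncated to one decimal place.
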
| Approach | Architecture | Data Handling Strategy | Reasoning Process | Performance Metrics |
|---|---|---|---|---|
| WSN Fire Controller (2018) [8] | WSN with fuzzy controller | Sensor (meteorological/gases) | Fuzzy logic for alerts | N/A (real-time alerts) |
| Edge-UAV System (2018) [33] | Edge/fog/cloud hierarchy | UAV sensor data | Dynamic allocation | N/A (qualitative: efficient management of CPU/RAM, battery life, and network resources in initial experiments) |
| MARL-based Systems (2020) [19] | MARL networks | Continuous 2D position states and movement actions | Reinforcement learning with centralized training | Convergence in 5000 episodes (CR-MARL) |
| SmokeyNet (2022) [14] | Single-model CNNs | Static sequential image frames from fixed cameras | Fixed inference/Binary classification | Precision: 89.84%, Recall: 76.45%, F1-score: 82.59%, Accuracy: 83.49% |
| Fuzzy Fire Mapping (2024) [7] | Fuzzy inference with GIS | Remote sensing (temp/rainfall) | Rule-based susceptibility | AUC 0.879 |
| UAV Swarms for WER (2025) [6] | Systems engineering with swarms | Multimodal (sensors/UAVs) | Collaborative self-organization | N/A (conceptual) |
| High-Level MAS with DRL (2025) [20] | Hierarchical MAS | Integrated historical/real-time | DRL for tracking/estimation | N/A (conceptual) |
| Heterogeneous MAS (2025) [18] | Fault-tolerant formation control | Sensor inputs for tracking | FO-NFTSM/HOSMO | Finite-time convergence |
| WildfireGPT (2025) [15] | LLM Agent with RAG framework | Multi-modal data sources | Multi-round conversational reasoning | Correctness: 97.73%, Relevance: 98.20%, Entailment: 93.75%, Accessibility: 95.49% |
| Proposed Orchestrator-Based MAS (Ours) | Orchestrator-based with LMM/RAG | Dynamic multimodal (text/image) | Context-aware via agents | Precision 0.797, F1-score 0.736 (Section 5, Table 7) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Sandeep, A.; Jayarathna, S.; Sandaruwan, S.; Samarappuli, V.; Meedeniya, D.; Perera, C. Context-Aware Multi-Agent Architecture for Wildfire Insights. Sensors 2026, 26, 1070. https://doi.org/10.3390/s26031070

