A GraphRAG-Based Question-Answering System for Explainable and Advanced Reasoning over Air Quality Insights
Abstract
1. Introduction
- A structured semantic representation of the key entities of the indoor air quality (IAQ) domain and their relationships, expressed as a flexible and scalable knowledge graph (KG) schema that covers air pollutants and climatic conditions, sensor and building infrastructure, environmental and health guidelines, and IAQ predictive models.
- A dual-source retrieval architecture with a unified orchestration layer, in which the KG transforms natural-language intent into deterministic queries for a time-series database (TSDB). This design differs from generic GraphRAG frameworks in that it uses the ontological structure of the KG to parameterize data retrieval rather than simply retrieving static text nodes.
- A GraphRAG architecture that integrates a KG and a time-series database for the retrieval of IAQ domain knowledge and historical measurements to support context-aware reasoning in question answering.
- A performance evaluation of the GraphRAG QA system, measuring its retrieval precision, contextual understanding, and the quality of generated answers.
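
The dual-source idea in the second contribution can be illustrated with a minimal sketch, assuming the KG lookup has already resolved a question to a device, a pollutant, and an exposure window. The `KGContext` class, the `measurements` table, and its column names are hypothetical and chosen only for illustration; the article does not specify the TSDB schema or query language.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple


@dataclass(frozen=True)
class KGContext:
    """Hypothetical result of a KG lookup for one user question."""
    device_id: str     # e.g., a staticDeviceID resolved from the user's room
    pollutant: str     # e.g., a pollutantName
    window_hours: int  # e.g., the exposureTimeWindow of the matching threshold
    aggregation: str   # e.g., the exposureAggregationType


# Assumed mapping from KG aggregation labels to SQL aggregate functions.
_AGGREGATES = {"Average": "AVG", "Maximum": "MAX"}


def build_tsdb_query(ctx: KGContext, now: datetime) -> Tuple[str, Dict[str, object]]:
    """Turn KG-derived parameters into a deterministic, parameterized query."""
    func = _AGGREGATES[ctx.aggregation]  # raises KeyError for unknown labels
    sql = (
        f"SELECT {func}(value) FROM measurements "
        "WHERE device_id = :device_id AND pollutant = :pollutant "
        "AND ts >= :start AND ts < :end"
    )
    params = {
        "device_id": ctx.device_id,
        "pollutant": ctx.pollutant,
        "start": now - timedelta(hours=ctx.window_hours),
        "end": now,
    }
    return sql, params


if __name__ == "__main__":
    ctx = KGContext("SDEV01", "CO2", 8, "Average")
    print(build_tsdb_query(ctx, datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)))
```

Only the aggregate function is interpolated into the query text, and it comes from a fixed whitelist; every user- or KG-derived value is passed as a bound parameter, which keeps the generated query deterministic for a given KG context.
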
2. Methodology
2.1. Problem Identification
2.2. Definition of the QA System’s Objectives
2.3. Design of the Knowledge Graph (KG) Schema
2.4. Data Management Platform for Context-Aware Retrieval of IAQ Measurements and Predictions
2.5. Design of the Intention Detection Mechanism
2.6. Design of the GraphRAG Context Retrieval Mechanism
2.7. Design of the GraphRAG Augmented Generation Mechanism
3. Results
3.1. Evaluation Metrics for the GraphRAG QA System’s Performance
3.2. Evaluation Dataset for the QA System
3.3. Evaluation Results for the QA System—Baseline
3.4. Evaluation Results for the QA System—GraphRAG
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
| Entity | Property | Data Type | Description and Constraints |
|---|---|---|---|
| Building | buildingID | String | Unique identifier of the building entity (e.g., “BLD01”) |
| | buildingType | String | Functional building type constrained to a predefined set (e.g., “Residential”, “Office”) |
| | buildingLocation | List<Float> | Geographical coordinates (latitude, longitude) of the building |
| Apartment | apartmentID | String | Unique identifier of the apartment entity (e.g., “APRT01”) |
| | apartmentFloor | Integer | Vertical level of the apartment within the building |
| | apartmentArea | Float | Total surface area of the apartment in square meters |
| Room | roomID | String | Unique identifier of the room entity (e.g., “RM01”) |
| | roomType | String | Functional room type constrained to a predefined set (e.g., “Bedroom”) |
| | roomArea | Float | Total surface area of the room in square meters |
| User | userID | String | Unique identifier of the individual occupant (e.g., “USER01”) |
| | userFullName | String | Full name of the user for personalized responses (e.g., “John Doe”) |
| | userRespiratoryProfile | JSON | Health metadata specifying user respiratory conditions and sensitivity level |
| StaticDevice | staticDeviceID | String | Unique identifier of the static sensing device (e.g., “SDEV01”) |
| | staticDeviceSpecs | JSON | Metadata defining static device specifications (e.g., MAC address, connectivity protocol) |
| | manufacturerSpecs | JSON | Metadata defining manufacturer specifications (e.g., manufacturer name, model version) |
| PortableDevice | portableDeviceID | String | Unique identifier of the portable sensing device (e.g., “PDEV01”) |
| | portableDeviceSpecs | JSON | Metadata defining portable device specifications (e.g., MAC address, connectivity protocol) |
| | manufacturerSpecs | JSON | Metadata defining manufacturer specifications (e.g., manufacturer name, model version) |
| Pollutant | pollutantID | String | Unique identifier of the pollutant (e.g., “POL01”) |
| | pollutantName | String | Standardized nomenclature of the pollutant (e.g., “CO2”) |
| | pollutantAlias | List<String> | Alternative nomenclature of the pollutant (e.g., “Carbon dioxide”) |
| | unitsOfMeasure | String | Standardized units of measure for the pollutant (e.g., “ppm”) |
| ClimaticCondition | climaticConditionID | String | Unique identifier of the climatic condition (e.g., “CCON01”) |
| | climaticConditionName | String | Standardized nomenclature of climatic condition (e.g., “Air Temperature”) |
| | climaticConditionAlias | List<String> | Alternative nomenclature of the climatic condition (e.g., “Temperature”) |
| | unitsOfMeasure | String | Standardized units of measure of the climatic condition (e.g., “°C”) |
| PollutantThreshold | thresholdCategory | String | Qualitative associated risk level (e.g., “High”, “Moderate”) |
| | limitValue | Float | Concentration boundary for the risk level category (e.g., “1000”) |
| | limitUnit | String | Standardized measurement unit for the threshold (e.g., “ppm”) |
| ExposureDuration | exposureTimeWindow | Integer | Temporal window for pollutant exposure measured in hours |
| | exposureAggregationType | String | Statistical aggregation method for exposure (e.g., “Average”, “Maximum”) |
| HealthEffect | healthEffectID | String | Unique identifier of the health outcome attributed to pollutant exposure (e.g., “HL01”) |
| | healthEffectDescription | String | Description of the health outcome attributed to pollutant exposure |
| | severityLevel | String | Classification of the health impact (e.g., “Acute”, “Chronic”) |
| Authority | authorityName | String | Health or environmental authority name that issued the guideline (e.g., “WHO”, “EPA”) |
| | documentTitle | String | Title of the environmental or health guideline issued by the authority |
| | documentPubYear | Integer | Year when the guideline was officially issued (e.g., 2021) |
| | documentPubYear | Integer | Year when the guideline was last ratified (e.g., 2021) |
| PredictiveModel | modelID | String | Unique identifier of the instance of the forecasting model (e.g., “PRD01”) |
| | modelType | String | High-level architecture of the forecasting model (e.g., “LSTM”, “ARIMA”) |
| | modelTemporalResolution | Integer | Duration of the forecasting model’s single temporal step measured in minutes (e.g., 5) |
| | trainingDate | Datetime | Timestamp of the forecasting model’s last training cycle |
| PredictionHorizon | horizonValue | Integer | Numerical value of the method’s forecast lead time (e.g., “60”) |
| | horizonUnit | String | Temporal unit for the lead time (e.g., minutes) |
| PredictionUncertainty | confidenceInterval | List<Float> | Probabilistic bounds of the prediction confidence intervals (e.g., [0.85, 0.95]) |
| | errorMetricName | String | Type of error metric used (e.g., “RMSE”, “MAE”) |
| | errorMetricValue | Float | Numerical value for the specified error metric used |
| PredictiveFeature | featureName | String | Name of the predictor variable (e.g., “lagged_CO2”) |
| | featureLag | Integer | Number of discrete temporal steps relative to prediction time t (e.g., 1, 2) |
| ExplanationMethod | methodID | String | Unique identifier of the diagnostic framework (e.g., “EXMETH01”) |
| | methodType | String | Classification of the explanation approach (e.g., “Global”, “Local”) |
| | algorithmName | String | Specific technique used to generate the explanation (e.g., “SHAP”, “LIME”) |
| ExplanationArtifact | artifactID | String | Unique identifier of the specific explanatory artifact (e.g., “EXAR01”) |
| | featureWeights | JSON | Mapping of input features to their specific contribution scores |
| | baselineValue | Float | Reference value from which the explanation starts (e.g., SHAP base value) |
| | artifactFormat | String | Data representation type (e.g., “Vector”, “Contribution Map”) |
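
As a rough illustration of how the entity properties above could be carried in application code, the following sketch models two of the entities as Python dataclasses. It is not the authors' implementation: the set of allowed building types contains only the two example values given in the table (the full predefined set is not listed), and the `matches` helper is a hypothetical convenience for resolving a pollutant mention against its name and aliases.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: only the two example values from the table are known here.
BUILDING_TYPES = {"Residential", "Office"}


@dataclass
class Building:
    buildingID: str
    buildingType: str
    buildingLocation: List[float]  # [latitude, longitude]

    def __post_init__(self) -> None:
        # Enforce the "constrained to a predefined set" rule from the table.
        if self.buildingType not in BUILDING_TYPES:
            raise ValueError(f"unknown buildingType: {self.buildingType!r}")
        if len(self.buildingLocation) != 2:
            raise ValueError("buildingLocation must be [latitude, longitude]")


@dataclass
class Pollutant:
    pollutantID: str
    pollutantName: str
    unitsOfMeasure: str
    pollutantAlias: List[str] = field(default_factory=list)

    def matches(self, mention: str) -> bool:
        """Case-insensitive match against the name or any alias."""
        names = [self.pollutantName, *self.pollutantAlias]
        return mention.strip().lower() in {n.lower() for n in names}


if __name__ == "__main__":
    co2 = Pollutant("POL01", "CO2", "ppm", ["Carbon dioxide"])
    print(co2.matches("carbon dioxide"))
```
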
| Relationship | Source → Target | Attribute | Type | Description |
|---|---|---|---|---|
| hasApartment | Building → Apartment | isPrivateApartment | Boolean | Flag indicating whether a spatial unit is private |
| hasRoom | Apartment → Room | isPrimaryRoom | Boolean | Flag indicating the room’s prioritization |
| occupies | User → Apartment | isPermanentOccupant | Boolean | Flag indicating permanent occupation |
| | | occupiesSince | Datetime | Timestamp when the occupancy period started |
| hasInstalledDevice | Room → StaticDevice | installationHeight | Float | Vertical distance from the floor in meters |
| | | installationDate | Datetime | Timestamp when the sensing device was installed |
| | | installationContext | String | Description of the device’s spatial environment (e.g., “Near HVAC”) |
| hasPortableDevice | Room → PortableDevice | deploymentDate | Datetime | Timestamp when the sensing device was deployed |
| | | deploymentContext | String | Description of the device’s deployment environment (e.g., “at the bedside table”) |
| measuresPollutant | Device → Pollutant | isPrimarySource | Boolean | Flags whether the sensor is the main reference for a specific pollutant in a room |
| | | samplingFrequency | Integer | Measurement interval in seconds (e.g., 60) |
| | | accuracy | Float | Measurement precision specific to the pollutant |
| measuresClimaticCondition | Device → ClimaticCondition | isPrimarySource | Boolean | Flags whether the sensor is the main reference for a specific climatic condition near a user |
| | | samplingFrequency | Integer | Measurement interval in seconds (e.g., 60) |
| | | accuracy | Float | Measurement precision specific to the climatic condition |
| hasThreshold | Pollutant → PollutantThreshold | isRecommended | Boolean | Flag specifying whether the threshold is an advisory guideline or a legally binding standard |
| hasHealthEffect | PollutantThreshold → HealthEffect | causalityType | String | Defines the link nature (e.g., “Symptomatic”) |
| usesFeature | PredictiveModel → PredictiveFeature | influenceScore | Float | Contribution score of the specific predictor to the model output |
| | | significanceMetric | String | Metric type used for the score (e.g., “p-value”, “Weight”, “Importance”) |
| | | featureLag | Integer | Number of discrete temporal steps relative to prediction time t (e.g., 1, 2) |
| hasUncertainty | PredictiveModel → PredictionUncertainty | confidenceLevel | Float | Confidence level for a prediction (e.g., 0.95) |
| Category | Cypher Query |
|---|---|
| Spatial Navigation | MATCH (b:Building {buildingID: "BLD01"})-[:hasApartment]->(a:Apartment)-[:hasRoom]->(r:Room {roomType: "Bedroom"})-[:hasInstalledDevice]->(d:StaticDevice) RETURN d.staticDeviceID, r.roomID |
| Pollutant Threshold and Health Effects | MATCH (p:Pollutant)-[:hasThreshold]->(pt:PollutantThreshold)-[:appliesToDuration]->(ed:ExposureDuration)-[:hasHealthEffect]->(he:HealthEffect) WHERE pt.thresholdCategory = "High" RETURN pt.limitValue, pt.limitUnit, he.healthEffectDescription, he.severityLevel |
| Predictive Model Feature Analysis | MATCH (pm:PredictiveModel {modelType: "LSTM"})-[:usesFeature]->(pf:PredictiveFeature) RETURN pf.featureName, pf.featureLag ORDER BY pf.featureLag ASC |
| User Exposure Context | MATCH (u:User {userID: "USER01"})-[:occupies]->(a:Apartment)-[:hasRoom]->(r:Room)-[:hasInstalledDevice]->(d:StaticDevice)-[:measuresPollutant]->(p:Pollutant)-[:hasThreshold]->(pt:PollutantThreshold)-[:appliesToDuration]->(ed:ExposureDuration)-[:hasHealthEffect]->(he:HealthEffect) WHERE he.severityLevel = "Chronic" RETURN u.userFullName, r.roomID, p.pollutantName, pt.limitValue, pt.limitUnit, ed.exposureTimeWindow, he.healthEffectDescription, he.severityLevel |
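
The queries above embed literal identifiers such as "BLD01". When such a query is generated from a detected user intent, one common option is to keep the query text fixed and pass the identifiers as Cypher parameters (`$name`). The sketch below applies this to the Spatial Navigation query; the function name and its Python wrapper are hypothetical, and only the query text and the parameter map are built, without contacting a graph database.

```python
from typing import Dict, Tuple


def devices_in_rooms(building_id: str, room_type: str) -> Tuple[str, Dict[str, str]]:
    """Build the Spatial Navigation query with Cypher parameters.

    The identifiers are never interpolated into the query text; they are
    returned separately so a Cypher client can bind them at execution time.
    """
    query = (
        "MATCH (b:Building {buildingID: $buildingID})-[:hasApartment]->(a:Apartment)"
        "-[:hasRoom]->(r:Room {roomType: $roomType})"
        "-[:hasInstalledDevice]->(d:StaticDevice) "
        "RETURN d.staticDeviceID, r.roomID"
    )
    return query, {"buildingID": building_id, "roomType": room_type}


if __name__ == "__main__":
    text, params = devices_in_rooms("BLD01", "Bedroom")
    print(text)
    print(params)
```

Keeping the text constant means the same query shape serves every building and room type, and values extracted from a natural-language question cannot alter the structure of the query.
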

| Parameter | Type | Sensing Layer |
|---|---|---|
| Air temperature | Climatic | Static sensors, Portable sensors |
| Relative humidity | Climatic | Static sensors, Portable sensors |
| CO2 | IAQ | Static sensors, Portable sensors |
| PM1, PM2.5, PM4, PM10 | IAQ | Static sensors, Portable sensors |
| VOC index | IAQ | Static sensors, Portable sensors |
| Formaldehyde | IAQ | Static sensors |
| Category | Example of Competency Question | Data Sources * |
|---|---|---|
| Factual | What is the 8 h exposure threshold for indoor CO2 according to the WHO? | KG |
| | Which health effects are associated with long-term PM2.5 exposure? | KG |
| | Which pollutants are measured by my portable device? | KG |
| Relational | Did the CO2 in my bedroom exceed the short-term exposure limits yesterday? | KG + TSDB |
| | Provide me with a prediction for the PM2.5 in my living room for the next 1 h. | KG + TSDB |
| | Based on my exposure to PM2.5 this morning, what’s the respiratory risk? | KG + TSDB |
| Analytical | What was the peak PM2.5 concentration in my bedroom during the last day? | TSDB |
| | How many hours this week did the CO2 in the kitchen stay below 1000 ppm? | TSDB |
| | What was my average exposure to PM2.5 over the past 30 days? | TSDB |
| Summarizing | Provide me with a weekly compliance summary for PM2.5 across all rooms. | KG + TSDB |
| | Summarize the overall health risk profile for my apartment during the last week. | KG + TSDB |
| | Report the differences in CO2 levels across all rooms yesterday. | KG + TSDB |
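
The category-to-source mapping in this table can be read as a small routing rule: once a question's category is known, it determines which back ends need to be queried. The sketch below encodes exactly the four rows of the table; the `Source` flag type and the function names are illustrative assumptions, and how the category itself is detected is outside the scope of this example.

```python
from enum import Flag, auto


class Source(Flag):
    KG = auto()
    TSDB = auto()


# Data sources per competency-question category, as listed in the table.
ROUTES = {
    "Factual": Source.KG,
    "Relational": Source.KG | Source.TSDB,
    "Analytical": Source.TSDB,
    "Summarizing": Source.KG | Source.TSDB,
}


def sources_for(category: str) -> Source:
    """Return the data sources required for a question category."""
    try:
        return ROUTES[category]
    except KeyError:
        raise ValueError(f"unknown question category: {category!r}") from None


def needs(category: str, source: Source) -> bool:
    """True if answering this category requires the given source."""
    return bool(sources_for(category) & source)


if __name__ == "__main__":
    for name in ROUTES:
        print(name, needs(name, Source.KG), needs(name, Source.TSDB))
```
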
| Category | Description of Category | Variables | Questions |
|---|---|---|---|
| Temporal Analysis | Historical measurements of air pollutants and climatic conditions across the occupant’s space. | Target pollutant or climatic condition (e.g., CO2, PM2.5, air temperature) | 35 |
| | | Temporal window (e.g., timestamp, hour, days, week, month) | |
| | | Temporal aggregation (e.g., average, standard deviation, count, max, min) | |
| | | Temporal operators (e.g., before, now, this morning) | |
| Predictive Analytics | Short-term forecasting for IAQ conditions across the occupant’s space. | Target pollutant (e.g., CO2 levels, PM2.5 levels) | 20 |
| | | Temporal horizon (e.g., minutes, hour) | |
| Compliance Monitoring | Compare historical IAQ measurements against established thresholds and guidelines for exposure. | Target pollutant (e.g., CO2 levels, PM2.5 levels) | 20 |
| | | Target guideline (e.g., WHO thresholds) | |
| | | Exposure duration (e.g., hour, days, week, month) | |
| Health Impact | Compare historical IAQ measurements against established guidelines for health risk assessment. | Target pollutant (e.g., CO2 levels, PM2.5 levels) | 20 |
| | | Target guideline (e.g., WHO health guidelines) | |
| | | Exposure duration (e.g., hour, days, week, month) | |
| Explainable Predictions | Interpret predictions based on individual feature contribution, confidence, and explainability artifacts. | Target model (e.g., ARIMA, Random Forest, Gradient Boosting, LSTM) | 25 |
| | | Predictive factors (e.g., rolling CO2 mean, lagged PM2.5 concentrations) | |
| | | Prediction uncertainty and model error metrics | |
| | | Explainability artifacts (e.g., feature importance, correlation coefficients) | |
| Domain Knowledge | General knowledge on fundamental IAQ concepts. | IAQ and environmental guidelines (e.g., EPA guidelines) | 20 |
| | | IAQ and health guidelines (e.g., WHO guidelines) | |
| | | IAQ exposure and health effects (e.g., long-term PM2.5 exposure) | |
| Sensing Devices | Sensor infrastructure specifications. | Device type (e.g., portable device, static device) | 15 |
| | | Device spatial allocation (e.g., apartment-level, room-level) | |
| | | Sensing capabilities (e.g., pollutants, climatic conditions) | |
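
To make the Compliance Monitoring category concrete, the following sketch compares a series of equally spaced readings against a threshold using a rolling average over an exposure window, in the spirit of the `exposureTimeWindow`, `exposureAggregationType`, and `limitValue` properties of the KG schema. It is an illustrative assumption about how such a check could be computed, not the evaluation procedure of the article, and the readings in the usage example are made-up values.

```python
from typing import List, Sequence


def rolling_means(values: Sequence[float], window: int) -> List[float]:
    """Mean of every run of `window` consecutive readings."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return []
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window by one reading: add the new one, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


def exceedances(values: Sequence[float], window: int, limit: float) -> List[int]:
    """Start indices of the windows whose mean is above `limit`."""
    return [i for i, m in enumerate(rolling_means(values, window)) if m > limit]


if __name__ == "__main__":
    hourly_co2_ppm = [900.0, 1000.0, 1100.0, 1200.0]  # made-up readings
    print(exceedances(hourly_co2_ppm, window=2, limit=1000.0))
```

With hourly readings, `window` corresponds to the exposure duration in hours and `limit` to the threshold value retrieved from the KG for the pollutant in question.
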
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Mountzouris, C.; Protopsaltis, G.; Gialelis, J. A GraphRAG-Based Question-Answering System for Explainable and Advanced Reasoning over Air Quality Insights. Air 2026, 4, 6. https://doi.org/10.3390/air4010006

