From Volume to Mass: Transforming Volatile Organic Compound Detection with Photoionization Detectors and Machine Learning
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Sites and Equipment
2.2. Experimental Design
2.2.1. Equipment Installation, Calibration, and Comparison
2.2.2. Algorithm Optimization
2.2.3. Data Flow and Pseudocode
2.2.4. Computational Resources
2.3. Data Analysis
2.3.1. Data Preprocessing
2.3.2. Correlation Analysis
2.3.3. Model Building and Validation
3. Results
3.1. Comparison of PID and GC-FID Systems and the Impact of Meteorological Factors
3.1.1. Comparison of PID and GC-FID Systems
3.1.2. Impact of Meteorological Parameters on VOC Concentrations
3.2. Comparison of Model Performance and Analysis of the Impact of Meteorological Factors
3.2.1. Comparison of Model Performance
3.2.2. Analysis of the Impact of Meteorological Factors
3.3. Case Analysis of High-Concentration Pollution Events
3.4. Model Generalizability and Cross-Scenario Challenges
3.4.1. Cross-Site and Cross-Seasonal Validation
3.4.2. Generalizability Analysis—Pollutant Concentration
4. Discussion
5. Conclusions
5.1. Summary of Conclusions
5.2. Future Outlook
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Algorithm | Core Advantage | Hyperparameter Settings (GridSearchCV Optimization) | Optimization Strategy |
---|---|---|---|
SVR | Multivariate linear fitting | Kernel function (linear), penalty coefficient (C = 10) | Maintained linear kernel stability with grid-selected optimal penalty coefficient |
PR | Nonlinear fitting capability | Degree (degree = 2) | Optimal degree selected through model complexity testing |
DT | High model interpretability | Maximum depth (10), minimum samples split (2) | Pre-pruning strategy to balance depth and overfitting risk |
GBDT | Residual iterative optimization | Learning rate (0.1), number of trees (100), maximum depth (4) | Early stopping for iteration control |
RF | High stability and anti-overfitting | Number of trees (100), maximum depth (10), maximum features (2) | Feature subsampling to enhance diversity; tree depth optimization for feature interaction |
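The settings in the table map naturally onto scikit-learn estimators, which the GridSearchCV column header suggests was the toolkit used. The sketch below is one plausible instantiation and not the authors' code. The function names `build_models` and `tune_random_forest`, the `random_state` value, the choice of a `PolynomialFeatures` plus `LinearRegression` pipeline for PR, and the candidate grid values are all assumptions made for illustration. Early stopping for GBDT is not configured because the table gives no settings for it.

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def build_models(random_state=0):
    """Return the five regressors with the hyperparameters listed in the table.

    random_state is an assumption added for reproducibility; the table does
    not report a seed.
    """
    return {
        "SVR": SVR(kernel="linear", C=10),
        # Polynomial regression expressed as degree-2 feature expansion
        # followed by ordinary least squares (an assumed construction).
        "PR": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
        "DT": DecisionTreeRegressor(
            max_depth=10, min_samples_split=2, random_state=random_state
        ),
        "GBDT": GradientBoostingRegressor(
            learning_rate=0.1, n_estimators=100, max_depth=4,
            random_state=random_state,
        ),
        "RF": RandomForestRegressor(
            n_estimators=100, max_depth=10, max_features=2,
            random_state=random_state,
        ),
    }


def tune_random_forest(X, y, cv=5):
    """Grid-search a random forest; the candidate values are hypothetical."""
    param_grid = {
        "n_estimators": [50, 100],
        "max_depth": [5, 10],
        "max_features": [1, 2],
    }
    search = GridSearchCV(
        RandomForestRegressor(random_state=0), param_grid, cv=cv, scoring="r2"
    )
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```

Note that `max_features=2` requires at least two input columns, for example a PID reading plus one or more meteorological variables.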
Device Name | Effective Data | Mean (nmol mol⁻¹) | R² | Slope | Intercept | RMSE (nmol mol⁻¹) | MAE (nmol mol⁻¹) |
---|---|---|---|---|---|---|---|
GC-FID | 915 | 23.4 ± 36.4 | 1.00 | / | / | / | / |
Y-2 | 858 | 50.4 ± 34.7 | 0.76 | 0.94 | −22.9 | 30.9 | 27.0 |
Z-1 | 915 | 74.3 ± 5.7 | 0.92 | 6.07 | −427.8 | 59.6 | 50.9 |
A-1 | 915 | 32.5 ± 40.0 | 0.81 | 0.82 | −3.3 | 19.4 | 9.1 |
A-2 | 915 | 28.2 ± 38.5 | 0.79 | 0.84 | −0.2 | 18.7 | 4.8 |
A-3 | 915 | 33.5 ± 40.2 | 0.85 | 0.83 | −4.5 | 18.7 | 10.1 |
S-1 | 915 | 22.5 ± 19.0 | 0.67 | 1.58 | −12.1 | 23.3 | −0.9 |
S-2 | 915 | 24.4 ± 19.2 | 0.69 | 1.58 | −15.2 | 22.9 | 1.0 |
S-3 | 915 | 20.7 ± 18.8 | 0.71 | 1.63 | −10.3 | 23.1 | −2.7 |
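The slope, intercept, and R² columns are what an ordinary least squares fit between paired hourly readings would produce. The tabulated means are consistent with the GC-FID value being regressed on the sensor reading (for A-1, 0.82 × 32.5 − 3.3 ≈ 23.4, the GC-FID mean), so the sketch below takes the sensor as `x` and the reference as `y`. The function name `fit_line` is hypothetical and the code illustrates the calculation only; it is not taken from the paper.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept.

    x: sensor readings, y: reference readings (paired, same length).
    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

A device with a high R² but a slope far from 1, such as Z-1 in the table, tracks the reference well but needs rescaling before its readings can be compared directly.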
Model Name | MSE | RMSE (μg m⁻³) | MAE (μg m⁻³) | R² | MAPE (%) | SMAPE (%) | RMSE_norm |
---|---|---|---|---|---|---|---|
SVR | 2570.33 | 50.70 | 27.96 | 0.79 | 221.43 | 81.79 | 4.23 × 10⁻² |
PR | 3445.14 | 58.70 | 29.52 | 0.72 | 231.03 | 106.85 | 2.29 × 10⁻² |
DT | 4254.76 | 65.23 | 30.48 | 0.65 | 220.86 | 77.52 | 4.32 × 10⁻² |
GBDT | 2504.75 | 50.05 | 24.40 | 0.80 | 235.28 | 82.25 | 2.08 × 10⁻² |
RF | 2326.23 | 48.23 | 20.25 | 0.81 | 129.95 | 62.47 | 2.07 × 10⁻² |
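The columns in this table are standard regression metrics. As a minimal sketch of how they could be computed from paired reference and predicted concentrations, the function below uses common textbook definitions. Several choices are assumptions rather than the paper's stated formulas: the symmetric form used for SMAPE, normalizing RMSE by the range of the reference values, and skipping zero reference values in MAPE. The name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, R^2, MAPE, SMAPE and range-normalized RMSE."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")

    # MAPE is undefined where the reference value is zero; skip those points.
    ape = [abs(e) / abs(t) for t, e in zip(y_true, errors) if t != 0]
    mape = 100.0 * sum(ape) / len(ape) if ape else float("nan")

    # Symmetric MAPE with the 2|e| / (|y| + |y_hat|) form (assumed variant).
    sape = [
        2.0 * abs(e) / (abs(t) + abs(p))
        for t, p, e in zip(y_true, y_pred, errors)
        if abs(t) + abs(p) > 0
    ]
    smape = 100.0 * sum(sape) / len(sape) if sape else float("nan")

    # RMSE divided by the range of the reference values (assumed normalization).
    value_range = max(y_true) - min(y_true)
    rmse_norm = rmse / value_range if value_range > 0 else float("nan")

    return {
        "MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2,
        "MAPE": mape, "SMAPE": smape, "RMSE_norm": rmse_norm,
    }
```

Because MAPE divides by the reference value, a handful of hours with very low reference concentrations can push it well above 100% even when RMSE and R² look reasonable, which is one reason to report SMAPE alongside it.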
Time (date, hour) | PID (nmol mol⁻¹) | Predicted FID (μg m⁻³) | FID (μg m⁻³) | Main Pollutant Concentration (μg m⁻³) | Main Pollutant Type |
---|---|---|---|---|---|
4 March 2022 20 | 117.44 | 447.9 | 445.95 | 128.37 | Toluene |
4 March 2022 21 | 869.76 | 1223.4 | 1967.68 | 787.30 | 1,3-Butadiene |
4 March 2022 22 | 91.99 | 420.1 | 305.35 | 65.09 | Isoprene |
4 March 2022 23 | 30.28 | 76.5 | 119.83 | 22.23 | Xylene |
5 March 2022 00 | 106.31 | 434.1 | 462.49 | 253.18 | Isopentane |
5 March 2022 01 | 82.61 | 319.6 | 362.61 | 74.84 | Xylene |
5 March 2022 02 | 96.58 | 459.6 | 426.27 | 103.66 | Xylene |
5 March 2022 03 | 161.16 | 708.6 | 698.46 | 272.94 | Isopentane |
5 March 2022 04 | 191.02 | 858.8 | 803.06 | 386.08 | Isopentane |
5 March 2022 05 | 117.00 | 453.4 | 494.36 | 98.79 | Isopentane |
Site Name | Site A | Site B | Site C |
---|---|---|---|
Prediction Device | A-3 | A-1 | A-2 |
GC-FID Mean (μg m⁻³) | 47.9 ± 110.6 | 28.3 ± 31.3 | 22.3 ± 27.6 |
Predicted Mean (μg m⁻³) | 38.6 ± 82.6 | 28.2 ± 22.6 | 20.4 ± 17.6 |
Training Set | 914 | 1364 | 1253 |
Testing Set | 914 | 1364 | 1253 |
R² | 0.81 | 0.69 | 0.68 |
RMSE (μg m⁻³) | 48.3 | 17.4 | 20.4 |
MAE (μg m⁻³) | 20.5 | 8.5 | 10.6 |
MeanCV | 0.80 ± 0.02 | 0.67 ± 0.04 | 0.66 ± 0.05 |
Atm | 1024.3 ± 5.2 | 1013.4 ± 4.1 | 1011.9 ± 4.3 |
Temp (°C) | 6.9 ± 3.7 | 20.2 ± 3.9 | 20.1 ± 4.1 |
Hum | 74.0 ± 18.1 | 68.2 ± 14.0 | 70.8 ± 15.7 |
WS | 2.2 ± 1.2 | 0.5 ± 0.5 | 1.2 ± 0.7 |
WD Standard Deviation | 1.25 | 1.8 | 1.22 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cai, Y.; Che, X.; Duan, Y. From Volume to Mass: Transforming Volatile Organic Compound Detection with Photoionization Detectors and Machine Learning. Sensors 2025, 25, 5314. https://doi.org/10.3390/s25175314