Deep Hybrid AI Models Applied to Predict, Model, and Forecast the Next Upcoming Periods of Ozone in Craiova City
Abstract
1. Introduction
- Short-term forecasting: most existing models focus on short-term predictions, providing only a limited number of future values per air pollutant.
- Absence of certain variables: existing research often lacks comprehensive studies that incorporate all relevant variables for air pollutant prediction and forecasting.
- Model limitations: many models are constrained by the specific location and dataset used, making them difficult to generalize to other areas.
- Model selection: in some cases, models are chosen and applied without a thorough assessment or justification for their selection.
- Statistical evaluation: some studies rely on a narrow set of statistical metrics and indices, limiting the robustness of model accuracy assessments.
2. Data Description
2.1. Area of Interest and Climate Description
2.2. Air Quality Monitoring Stations and the Dataset
2.3. Data Pre-Processing
- Monthly trends (first subplot): this plot illustrates average ozone concentrations for each month. Higher ozone levels are observed during summer months (June, July, and August) due to increased solar radiation and higher temperatures, which enhance photochemical reactions. Conversely, lower ozone concentrations are recorded in winter months (December and January) due to reduced sunlight and colder temperatures.
- Seasonal trends (second subplot): a bar chart aggregates monthly ozone data into four seasons (winter, spring, summer, and fall). The highest ozone levels are observed in summer (season 3), driven by higher temperatures and intense sunlight. In contrast, the lowest ozone levels occur in winter (season 1), when photochemical activity is minimal.
- Outlier detection (third subplot): a scatter plot highlights extreme ozone values (outliers) over time. Outliers, marked in red, may indicate pollution spikes, measurement errors, or unusual meteorological conditions. The overall time series exhibits periodic fluctuations, likely influenced by daily and seasonal cycles.
- Correlation analysis (fourth subplot): a horizontal bar chart presents the correlation strength between ozone and other variables. Positive correlations are found with temperature and sun brightness, suggesting that ozone levels increase with higher temperatures and solar radiation. Negative correlations appear with variables like relative humidity, indicating that higher moisture content can suppress ozone formation.
- Ozone distribution by month (fifth subplot): a box plot displays the distribution of ozone concentrations for each month, capturing medians, quartiles, and outliers. The summer months exhibit a wider range of ozone values, likely due to higher photochemical activity, whereas winter months display more stable and lower concentrations.
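The exploratory steps behind these subplots (monthly aggregation, IQR-based outlier flagging, and correlation with ozone) can be sketched in a few lines of pandas. The snippet below uses synthetic data; the column names, the 1.5 × IQR rule, and the one-predictor correlation are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np
import pandas as pd

# Synthetic hourly records standing in for the Craiova dataset.
rng = np.random.default_rng(0)
n = 24 * 365
month = rng.integers(1, 13, size=n)
temperature = 10 + 10 * np.sin((month - 1) / 12 * 2 * np.pi) + rng.normal(0, 2, n)
ozone = 40 + 1.5 * temperature + rng.normal(0, 5, n)
df = pd.DataFrame({"month": month, "temperature": temperature, "ozone": ozone})

# Monthly trend (first subplot): mean ozone per calendar month.
monthly_mean = df.groupby("month")["ozone"].mean()

# Outlier detection (third subplot): flag values outside 1.5 * IQR.
q1, q3 = df["ozone"].quantile([0.25, 0.75])
iqr = q3 - q1
df["outlier"] = (df["ozone"] < q1 - 1.5 * iqr) | (df["ozone"] > q3 + 1.5 * iqr)

# Correlation analysis (fourth subplot): strength of each predictor vs. ozone.
corr = df[["temperature", "ozone"]].corr().loc["temperature", "ozone"]
```

Plotting the resulting series (bar chart of `monthly_mean`, scatter of `df["outlier"]`, horizontal bars of the correlations) reproduces the layout described above.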
3. Materials and Methods
3.1. AI-Based Models Employed
- Equation (1) serves as a conceptual framework connecting AI models with classical system identification theory.
- Equation (2) expresses the functional mapping structure that data-driven models (e.g., NARX, NARMAX) learn during training.
- The innovation lies in integrating this general dynamic representation with AI-based learning and hybrid feature selection to improve model generalization and interpretability.
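The functional mapping that NARX-type models learn can be illustrated by building a regression matrix of lagged outputs and exogenous inputs, y(t) = f(y(t−1), y(t−2), u(t−1)), and fitting any nonlinear regressor to it. The sketch below uses a synthetic series and a random forest purely for illustration; the lag orders, data, and model choice are assumptions, not the paper's configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic NARX-like process: output depends on its own lags and
# a lagged exogenous input u (e.g., a meteorological driver).
rng = np.random.default_rng(1)
T = 500
u = rng.normal(size=T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.6 * y[t - 1] - 0.2 * y[t - 2] + 0.5 * u[t - 1] + rng.normal(0, 0.1)

# Regression matrix for the mapping y(t) = f(y(t-1), y(t-2), u(t-1)).
X = np.column_stack([y[1:-1], y[:-2], u[1:-1]])
target = y[2:]

# Any data-driven learner can approximate f; a random forest is one option.
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, target)
score = model.score(X, target)  # in-sample R^2
```

The same lagged-matrix construction underlies the Deep-NARMAX formulation, with the learner replaced by a deep network and the lag set chosen by feature selection.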
3.2. Evaluation Criteria and Statistical Indices
3.3. Methodology
4. Results and Discussion
4.1. Best-Performing Model for Ozone Prediction
4.2. Best-Performing Model for Ozone Forecasting
5. Conclusions
5.1. Key Conclusions
- Among 27 ML and DL models evaluated, the hybrid RF–treebag model emerged as the top-performing model for predicting hourly ozone concentrations in both training and testing. This model achieved excellent results in the testing stage, with R2 = 0.96615, MBE = 0.002468 μg/m3, RMSE = 6.2576 μg/m3, MAPE = 0.087766 μg/m3, and σ = 6.2582 μg/m3.
- From 50 FS techniques assessed, the GEO technique was identified as the best feature selection method, providing the most effective predictor combination for ozone concentration prediction. The analysis indicated that the most relevant predictors for ozone concentration are C6H6, isomers of xylene, NO, NOx, precipitation, RH, SO2, sun brightness, C7H8, and wind velocity. When coupled with the hybrid RF–treebag model, this approach yielded strong predictive accuracy, with R2 = 0.95656, MBE = 0.043114 μg/m3, RMSE = 7.2145 μg/m3, MAPE = 0.10317 μg/m3, and σ = 7.2151 μg/m3.
- A novel Deep-NARMAX model was successfully developed for forecasting and modeling future ozone concentrations. The model relies on key predictors, including ozone concentration at time (t − 1), sun brightness at time (t − 1), and month at time (t − 2). Its statistical evaluation confirmed robust predictive performance, achieving R2 = 0.93614, MBE = −0.26051 μg/m3, RMSE = 8.279 μg/m3, MAPE = 0.032824 μg/m3, and σ = 8.2752 μg/m3.
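The indices reported above can be computed from observed and predicted concentrations using their standard definitions. A minimal sketch (the formulas follow the usual conventions and may differ slightly from the paper's exact definitions, e.g., MAPE is dimensionless here):

```python
import numpy as np

def evaluate(obs, pred):
    """Standard error metrics between observed and predicted series."""
    err = pred - obs
    r2 = 1 - np.sum(err**2) / np.sum((obs - obs.mean())**2)  # coeff. of determination
    mbe = err.mean()                    # mean bias error
    rmse = np.sqrt(np.mean(err**2))     # root mean square error
    mape = np.mean(np.abs(err / obs))   # mean absolute percentage error
    sigma = err.std(ddof=0)             # standard deviation of the error
    return {"R2": r2, "MBE": mbe, "RMSE": rmse, "MAPE": mape, "sigma": sigma}

# Toy ozone values (micrograms per cubic metre), purely illustrative.
obs = np.array([60.0, 55.0, 70.0, 65.0])
pred = np.array([58.0, 57.0, 69.0, 66.0])
metrics = evaluate(obs, pred)
```

Applying `evaluate` to the testing-stage outputs of each candidate model yields the comparison tables from which the best performers were selected.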
5.2. Challenges
- There is no long-term research applying hybrid ML-DL approaches to ground-level ozone based on the dataset supplied by the Romanian Agency for Environmental Protection (a member of the EEA network). The extent of missing data in this agency's network still has to be clarified and properly understood.
- No study compares the quality of air pollution data from different open sources (independent low-cost sensor networks, the EEA network, and satellite data), or combines these sources so that one dataset can fill the gaps in another.
- Improving air pollution prediction and forecasting techniques would allow decision-makers to select the best solutions for the benefit of their local communities.
5.3. Limitations
5.4. Perspectives
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Method | Description | Key Parameters | Advantages | Disadvantages | Reference |
|---|---|---|---|---|---|
| Correlation-based feature selection (CFS) | Selects features with high correlation to target and low inter-feature correlation. | Correlation threshold | Simple and computationally inexpensive. | May miss nonlinear relationships. | [27] |
| ANOVA F-Test | Measures statistical significance of each feature with the target variable. | p-value threshold | Well suited for identifying significant features. | Limited to linear dependencies. | [66] |
| Variance threshold | Removes features with low variance. | Variance threshold | Simple and eliminates uninformative features. | Does not consider the target variable. | [67] |
| Chi-square test | Measures dependence between categorical features and the target variable. | None | Effective for categorical data. | Not suitable for continuous features. | [26] |
| Lasso regularization | Shrinks coefficients of less important features to zero. | Regularization parameter λ | Performs feature selection and regression simultaneously. | Assumes linear relationships. | [68] |
| Principal component analysis (PCA) | Reduces dimensionality by projecting data onto principal components. | Number of components | Captures variance and reduces redundancy. | Does not select original features directly. | [69] |
| Particle swarm optimization (PSO) | Simulates particle movement to optimize feature subsets. | Number of particles, inertia weight | Efficient and widely applicable. | May converge to local optima. | [70] |
| Genetic algorithm (GA) | Uses evolutionary principles like crossover and mutation. | Population size, mutation rate | Handles discrete and continuous search spaces. | Can be computationally intensive. | [71] |
| Ant colony optimization (ACO) | Simulates pheromone-based movement to optimize feature selection. | Number of ants, evaporation rate | Good exploration of search space. | Requires parameter tuning and computationally intensive. | [72] |
| Coyote optimization algorithm (COA) | Models social behavior of coyotes in packs. | Number of coyotes | Simple and adaptive. | Limited theoretical validation. | [73] |
| Imperialist competitive algorithm (ICA) | Simulates imperialistic competition among nations to optimize selection. | Number of nations, assimilation rate | Strong global search ability. | May converge slowly. | [74] |
| Whale optimization algorithm (WOA) | Mimics humpback whales’ bubble-net hunting behavior. | Number of whales, search space size | Balances exploration and exploitation. | Requires careful parameter tuning. | [75] |
| Grey wolf optimizer (GWO) | Simulates leadership hierarchy and hunting mechanism of wolves. | Number of wolves | Simple and effective. | Can be slow to converge for large feature sets. | [76] |
| Simulated annealing (SA) | Combines stochastic exploration with cooling schedule. | Initial temperature, cooling rate | Effective for avoiding local optima. | Sensitive to parameter settings. | [77] |
| Firefly algorithm (FA) | Mimics fireflies’ light-based attraction. | Light absorption, randomness factor | Balances exploration and exploitation. | May be slower for high-dimensional data. | [78] |
| Harmony search (HS) | Inspired by the process of musical improvisation to find optimal solutions. | Harmony memory size, pitch adjustment rate | Simple and effective for continuous optimization. | Requires parameter tuning. | [79] |
| Bat algorithm (BA) | Mimics echolocation behavior of bats for optimization. | Loudness, pulse rate, frequency | Balances exploration and exploitation. | Can become trapped in local optima. | [80] |
| Artificial bee colony (ABC) | Simulates foraging behavior of honeybees. | Number of bees, limit for abandonment | Effective for global search. | May converge slowly for large datasets. | [81] |
| Crow search algorithm (CSA) | Models crows’ intelligence in hiding and retrieving solutions. | Awareness probability, flight length | Effective for global optimization. | May converge slowly. | [82] |
| Dragonfly algorithm (DA) | Simulates static and dynamic swarming behavior of dragonflies. | Separation, alignment, cohesion weights | Effective for feature-rich datasets. | Computationally intensive for large-scale problems. | [83] |
| Raven roosting optimization (RRO) | Inspired by ravens’ intelligence and cooperation for optimization. | Awareness probability, social behavior | Effective for combinatorial problems. | Limited theoretical validation. | [84] |
| Shuffled frog-leaping algorithm (SFLA) | Mimics frog leaping behavior for cooperative optimization. | Number of frogs, memeplex division | Good exploration properties. | Convergence speed depends on parameters. | [85] |
| Seagull optimization algorithm (SOA) | Mimics the behavior of seagulls in their search for food. | None | Balances exploration and exploitation. | May require parameter fine-tuning. | [86] |
| Salp swarm algorithm (SSA) | Simulates salps’ chain-based movement in the search space. | Number of salps, adaptive parameters | Good convergence properties. | May require parameter tuning. | [87] |
| Elephant herding optimization (EHO) | Simulates the social behavior of elephants in herds. | Clan size, migration rate | Handles multimodal problems effectively. | Sensitive to parameter settings. | [88] |
| Harris hawks optimization (HHO) | Mimics cooperative hunting strategy of Harris hawks. | None | Good balance of exploration and exploitation. | May require parameter fine-tuning for large feature sets. | [77] |
| Grasshopper optimization algorithm (GOA) | Mimics swarming behavior of grasshoppers for optimization. | Control parameters | Efficient for nonlinear problems. | May converge slowly for complex datasets. | [89] |
| Sine cosine algorithm (SCA) | Uses sine and cosine functions to guide search. | Amplitude control | Effective in global optimization. | Requires parameter tuning. | [90] |
| Ant lion optimization (ALO) | Simulates ant lions’ hunting mechanism to find optimal solutions. | Elite rate, trap radius | Strong exploration and exploitation. | Computationally expensive. | [91] |
| Glowworm swarm optimization (GSO) | Inspired by glowworms’ luminescence for decision-making. | Luciferin update, vision range | Effective for multimodal optimization. | Slower convergence in large search spaces. | [92] |
| Monkey search algorithm (MSA) | Models monkeys’ climbing behavior to search for solutions. | Climbing rate, jumping rate | Good convergence rate. | Sensitive to parameter settings. | [93] |
| Biogeography-based optimization (BBO) | Models species migration among habitats for optimization. | Migration rate, mutation rate | Good for constrained optimization problems. | May converge prematurely. | [94] |
| Cultural algorithm (CA) | Incorporates cultural evolution principles for optimization. | Knowledge space, acceptance criteria | Enhances search efficiency. | Complexity in implementation. | [95] |
| Wolf colony algorithm (WCA) | Simulates hunting and social behavior of wolves. | Number of wolves, hunting radius | Effective for large-scale problems. | Requires careful tuning. | [96] |
| Flower pollination algorithm (FPA) | Simulates pollination process of flowers. | Switch probability, step size | Handles multimodal optimization. | Requires parameter tuning. | [97] |
| Whale optimization algorithm with adaptive parameters (WOA-AP) | Enhanced whale optimization algorithm with adaptive parameters. | Adaptation rate | Improves performance of standard WOA. | Computationally expensive. | [98] |
| Artificial immune system (AIS) | Mimics immune response mechanisms to optimize feature selection. | Clonal selection, mutation rate | Handles high-dimensional problems well. | Complex algorithm structure. | [99] |
| Eagle strategy (ES) | Models eagles’ soaring and hunting behavior for optimization. | Search radius, soaring phase | Strong balance between local and global search. | Requires computational resources. | [100] |
| Quantum particle swarm optimization (QPSO) | Incorporates quantum behavior to enhance PSO. | Contraction–expansion coefficient β | Efficient in balancing exploration and exploitation. | Requires parameter tuning and is computationally heavy. | [101] |
| Gray wolf optimizer with adaptive mechanism (GWO-AM) | Enhanced gray wolf optimizer with adaptive parameters. | Adaptation function | Improved convergence and efficiency. | Parameter dependency. | [102] |
| Beetle antennae search (BAS) | Models beetle behavior using antennae for directional search. | Step size, antennae length | Lightweight and easy to implement. | Limited exploration capabilities in large feature spaces. | [103] |
| Adaptive firefly algorithm (FA-C) | Introduces chaos into FA for enhanced exploration. | Light absorption, randomness factor | Avoids premature convergence. | Can be slower for large search spaces. | [104] |
| Hybrid genetic algorithm and simulated annealing (GA-SA) | Integrates GA’s global search with SA’s local refinement. | Population size, mutation rate, temperature | Excellent global and local search. | Computationally intensive. | [105] |
| Black hole optimization algorithm (BHO) | Models stars being pulled toward the best solution (the black hole). | Event horizon | Simple and effective in feature-rich datasets. | May not handle multimodal problems effectively. | [106] |
| Dragonfly algorithm with adaptive mechanism (DA-AM) | Modified dragonfly algorithm with adaptive learning. | Adaptive learning rate | Better adaptation to changing environments. | Higher computational complexity. | [107] |
| Hybrid ant colony and PSO (ACO-PSO) | Combines ACO’s pheromone-based approach with PSO’s movement optimization. | Number of ants, particles, inertia weight | Captures global and local search effectively. | May require significant computational resources. | [108] |
| Dynamic crow search algorithm (DCSA) | Enhances CSA with dynamic flight length and awareness. | Awareness probability, flight length | Improved convergence properties. | May converge slowly without proper tuning. | [31] |
| Coyote optimization algorithm (CoyoteOA) | Models coyote social behavior for optimization. | Social adaptability, pack size | Adaptive and robust in dynamic environments. | Requires more empirical validation. | [109] |
| Salp swarm algorithm with adaptive parameters (SSA-AP) | Enhanced SSA with parameter adaptation. | Adaptive parameter function | Faster convergence than standard SSA. | Sensitive to initial parameters. | [110] |
| Golden eagle optimizer (GEO) | Mimics predatory behavior of golden eagles. | Flight radius | Efficient for large feature spaces. | Requires parameter tuning. | [111] |
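As a concrete illustration of the simplest filter method in the table, the variance threshold, the sketch below applies scikit-learn's `VarianceThreshold` to synthetic data in which one feature is nearly constant; the cutoff value and the data are illustrative assumptions:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Four synthetic features; the last is near-constant and thus uninformative.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))
X[:, 3] = 0.001 * rng.normal(size=200)

# Variance-threshold filter: drop features whose variance falls below the cutoff.
selector = VarianceThreshold(threshold=0.01)
X_reduced = selector.fit_transform(X)
kept = selector.get_support()  # boolean mask of retained features
```

The wrapper and metaheuristic methods in the table (PSO, GA, GEO, etc.) replace this fixed cutoff with a search over feature subsets scored by a predictive model, at correspondingly higher computational cost.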
References
- Siriopoulos, C.; Samitas, A.; Dimitropoulos, V.; Boura, A.; AlBlooshi, D.M. Chapter 12-Health economics of air pollution. In Pollution Assessment for Sustainable Practices in Applied Sciences and Engineering; Mohamed, A.-M.O., Paleologos, E.K., Howari, F.M., Eds.; Butterworth-Heinemann: Oxford, UK, 2021; pp. 639–679. [Google Scholar] [CrossRef]
- Mondal, A.; Mondal, S.; Ghosh, P.; Das, P. Analyzing the interconnected dynamics of domestic biofuel burning in India: Unravelling VOC emissions, surface-ozone formation, diagnostic ratios, and source identification. RSC Sustain. 2024, 2, 2150–2168. [Google Scholar] [CrossRef]
- Manisalidis, I.; Stavropoulou, E.; Stavropoulos, A.; Bezirtzoglou, E. Environmental and Health Impacts of Air Pollution: A Review. Front. Public Health 2020, 8, 14. [Google Scholar] [CrossRef] [PubMed]
- Udristioiu, M.T.; EL Mghouchi, Y.; Yildizhan, H. Prediction, modelling, and forecasting of PM and AQI using hybrid machine learning. J. Clean. Prod. 2023, 421, 138496. [Google Scholar] [CrossRef]
- El Mghouchi, Y.; Udristioiu, M.T.; Yildizhan, H.; Brancus, M. Forecasting ground-level ozone and fine particulate matter concentrations at Craiova city using a meta-hybrid deep learning model. Urban Clim. 2024, 57, 102099. [Google Scholar] [CrossRef]
- El Mghouchi, Y.; Udristioiu, M.T.; Yildizhan, H. Multivariable Air-Quality Prediction and Modelling via Hybrid Machine Learning: A Case Study for Craiova, Romania. Sensors 2024, 24, 1532. [Google Scholar] [CrossRef]
- Rich, D.Q.; Balmes, J.R.; Frampton, M.W.; Zareba, W.; Stark, P.; Arjomandi, M.; Hazucha, M.J.; Costantini, M.G.; Ganz, P.; Hollenbeck-Pringle, D.; et al. Cardiovascular function and ozone exposure: The Multicenter Ozone Study in oldEr Subjects (MOSES). Environ. Int. 2018, 119, 193–202. [Google Scholar] [CrossRef]
- Emberson, L. Effects of ozone on agriculture, forests and grasslands. Philos. Trans. A Math. Phys. Eng. Sci. 2020, 378, 20190327. [Google Scholar] [CrossRef]
- Du, J.; Qiao, F.; Lu, P.; Yu, L. Forecasting ground-level ozone concentration levels using machine learning. Resour. Conserv. Recycl. 2022, 184, 106380. [Google Scholar] [CrossRef]
- Zaini, N.; Ahmed, A.N.; Ean, L.W.; Chow, M.F.; Malek, M.A. Forecasting of fine particulate matter based on LSTM and optimization algorithm. J. Clean. Prod. 2023, 427, 139233. [Google Scholar] [CrossRef]
- Karthikeyan, A.; Priyakumar, U.D. Artificial intelligence: Machine learning for chemical sciences. J. Chem. Sci. 2022, 134, 2. [Google Scholar] [CrossRef] [PubMed]
- Zhang, H.; Zhang, W.; Palazoglu, A.; Sun, W. Prediction of ozone levels using a Hidden Markov Model (HMM) with Gamma distribution. Atmos. Environ. 2012, 62, 64–73. [Google Scholar] [CrossRef]
- Kaur, J.; Parmar, K.S.; Singh, S. Autoregressive models in environmental forecasting time series: A theoretical and application review. Environ. Sci. Pollut. Res. 2023, 30, 19617–19641. [Google Scholar] [CrossRef]
- Gao, W.; Xiao, T.; Zou, L.; Li, H.; Gu, S. Analysis and Prediction of Atmospheric Environmental Quality Based on the Autoregressive Integrated Moving Average Model (ARIMA Model) in Hunan Province, China. Sustainability 2024, 16, 8471. [Google Scholar] [CrossRef]
- Saravanan, D.; Kumar, K.S. IoT based improved air quality index prediction using hybrid FA-ANN-ARMA model. Mater. Today: Proc. 2022, 56, 1809–1819. [Google Scholar] [CrossRef]
- Muzakki, N.F.; Putri, A.Z.; Maruli, S.; Kartiasih, F. Forecasting the Air Quality Index by Utilizing Several Meteorological Factors Using the ARIMAX Method (Case Study: Central Jakarta City). J. JTIK J. Teknol. Inf. Dan Komun. 2024, 8, 569–586. [Google Scholar] [CrossRef]
- Bhatti, U.A.; Yan, Y.; Zhou, M.; Ali, S.; Hussain, A.; Qingsong, H.; Yu, Z.; Yuan, L. Time Series Analysis and Forecasting of Air Pollution Particulate Matter (PM2.5): An SARIMA and Factor Analysis Approach. IEEE Access 2021, 9, 41019–41031. [Google Scholar] [CrossRef]
- Yi, M.; Lin, F. A Hybrid Air Quality Prediction Method Based on VAR and Random Forest. J. Comput. Commun. 2025, 13, 142–154. [Google Scholar] [CrossRef]
- Xu, X.; Zhang, Y. Soybean and Soybean Oil Price Forecasting through the Nonlinear Autoregressive Neural Network (NARNN) and NARNN with Exogenous Inputs (NARNN–X). Intell. Syst. Appl. 2022, 13, 200061. [Google Scholar] [CrossRef]
- Maleki, H.; Sorooshian, A.; Goudarzi, G.; Baboli, Z.; Tahmasebi Birgani, Y.; Rahmati, M. Air pollution prediction by using an artificial neural network model. Clean Techn. Environ. Policy 2019, 21, 1341–1352. [Google Scholar] [CrossRef]
- Naveen, S.; Upamanyu, M.S.; Chakki, K.M.C.; Hariprasad, P. Air Quality Prediction Based on Decision Tree Using Machine Learning. In Proceedings of the 2023 International Conference on Smart Systems for Applications in Electrical Sciences (ICSSES), Tumakuru, India, 7–8 July 2023. [Google Scholar]
- Liu, C.C.; Lin, T.C.; Yuan, K.Y.; Chiueh, P.T. Spatio-temporal prediction and factor identification of urban air quality using support vector machine. Urban Clim. 2022, 41, 101055. [Google Scholar] [CrossRef]
- Baran, B. Prediction of Air Quality Index by Extreme Learning Machines. In Proceedings of the 2019 International Artificial Intelligence and Data Processing Symposium (IDAP), Malatya, Turkey, 21–22 September 2019. [Google Scholar] [CrossRef]
- Shyamala, K.; Sujatha, R. Modified Extreme Gradient Boosting Algorithm for Prediction of Air Pollutants in Various Peak Hours. In Advancements in Smart Computing and Information Security; Springer: Cham, Switzerland, 2024; pp. 125–141. [Google Scholar] [CrossRef]
- Reddy, P.D.; Parvathy, L.R. Prediction Analysis using Random Forest Algorithms to Forecast the Air Pollution Level in a Particular Location. In Proceedings of the 2022 3rd International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 20–22 October 2022; pp. 1585–1589. [Google Scholar] [CrossRef]
- Abellán, J.; Masegosa, A.R. Bagging Decision Trees on Data Sets with Classification Noise. In Foundations of Information and Knowledge Systems; Link, S., Prade, H., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 248–265. [Google Scholar] [CrossRef]
- Lesaffre, E.; Marx, B.D. Collinearity in generalized linear regression. Commun. Stat.-Theory Methods 1993, 22, 1933–1952. [Google Scholar] [CrossRef]
- Liu, H.; Yang, C.; Huang, M.; Wang, D.; Yoo, C. Modeling of subway indoor air quality using Gaussian process regression. J. Hazard. Mater. 2018, 359, 266–273. [Google Scholar] [CrossRef]
- Sonu, S.B.; Suyampulingam, A. Linear Regression Based Air Quality Data Analysis and Prediction using Python. In Proceedings of the 2021 IEEE Madras Section Conference (MASCON), Chennai, India, 27–28 August 2021; pp. 1–7. [Google Scholar] [CrossRef]
- Ravindra, K.; Rattan, P.; Mor, S.; Aggarwal, A.N. Generalized additive models: Building evidence of air pollution, climate change and human health. Environ. Int. 2019, 132, 104987. [Google Scholar] [CrossRef] [PubMed]
- Vovk, V. Kernel Ridge Regression. In Empirical Inference: Festschrift in Honor of Vladimir, N. Vapnik; Schölkopf, B., Luo, Z., Vovk, V., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 105–116. [Google Scholar] [CrossRef]
- Al-Eidi, S.; Amsaad, F.; Darwish, O.; Tashtoush, Y.; Alqahtani, A.; Niveshitha, N. Comparative Analysis Study for Air Quality Prediction in Smart Cities Using Regression Techniques. IEEE Access 2023, 11, 115140–115149. [Google Scholar] [CrossRef]
- Tella, A.; Balogun, A.L.; Adebisi, N.; Abdullah, S. Spatial assessment of PM10 hotspots using Random Forest, K-Nearest Neighbour and Naïve Bayes. Atmos. Pollut. Res. 2021, 12, 101202. [Google Scholar] [CrossRef]
- Su, Y. Prediction of air quality based on Gradient Boosting Machine Method. In Proceedings of the 2020 International Conference on Big Data and Informatization Education (ICBDIE), Zhangjiajie, China, 23–25 April 2020; pp. 395–397. [Google Scholar] [CrossRef]
- Xie, X.; Zuo, J.; Xie, B.; Dooling, T.A.; Mohanarajah, S. Bayesian network reasoning and machine learning with multiple data features: Air pollution risk monitoring and early warning. Nat. Hazards 2021, 107, 2555–2572. [Google Scholar] [CrossRef]
- Athira, V.; Geetha, P.; Vinayakumar, R.; Soman, K.P. DeepAirNet: Applying Recurrent Networks for Air Quality Prediction. Procedia Comput. Sci. 2018, 132, 1394–1403. [Google Scholar] [CrossRef]
- Krishan, M.; Jha, S.; Das, J.; Singh, A.; Goyal, M.K.; Sekar, C. Air quality modelling using long short-term memory (LSTM) over NCT-Delhi, India. Air Qual. Atmos. Health 2019, 12, 899–908. [Google Scholar] [CrossRef]
- Rao, K.S.; Devi, G.L.; Ramesh, N. Air Quality Prediction in Visakhapatnam with LSTM based Recurrent Neural Networks. Int. J. Intell. Syst. Appl. 2019, 11, 18–24. [Google Scholar] [CrossRef]
- Chakma, A.; Vizena, B.; Cao, T.; Lin, J.; Zhang, J. Image-based air quality analysis using deep convolutional neural network. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3949–3952. [Google Scholar] [CrossRef]
- Gong, X.; Liu, L.; Huang, Y.; Zou, B.; Sun, Y.; Luo, L.; Lin, Y. A pruned feed-forward neural network (pruned-FNN) approach to measure air pollution exposure. Environ. Monit. Assess. 2023, 195, 1183. [Google Scholar] [CrossRef]
- Zaini, N.; Ean, L.W.; Ahmed, A.N.; Malek, M.A. A systematic literature review of deep learning neural network for time series air quality forecasting. Environ. Sci. Pollut. Res. 2022, 29, 4958–4990. [Google Scholar] [CrossRef]
- Zhang, B.; Zou, G.; Qin, D.; Ni, Q.; Mao, H.; Li, M. RCL-Learning: ResNet and convolutional long short-term memory-based spatiotemporal air pollutant concentration prediction model. Expert Syst. Appl. 2022, 207, 118017. [Google Scholar] [CrossRef]
- Liu, J.; Xing, J. Identifying Contributors to PM2.5 Simulation Biases of Chemical Transport Model Using Fully Connected Neural Networks. J. Adv. Model. Earth Syst. 2023, 15, e2021MS002898. [Google Scholar] [CrossRef]
- Chang, S.W.; Chang, C.L.; Li, L.T.; Liao, S.W. Reinforcement Learning for Improving the Accuracy of PM2.5 Pollution Forecast Under the Neural Network Framework. IEEE Access 2020, 8, 9864–9874. [Google Scholar] [CrossRef]
- Samal, K.K.R.; Babu, K.S.; Das, S.K. Temporal convolutional denoising autoencoder network for air pollution prediction with missing values. Urban Clim. 2021, 38, 100872. [Google Scholar] [CrossRef]
- Kavasidis, I.; Lallas, E.; Gerogiannis, V.C.; Charitou, T.; Karageorgos, A. Predictive maintenance in pharmaceutical manufacturing lines using deep transformers. Procedia Comput. Sci. 2023, 220, 576–583. [Google Scholar] [CrossRef]
- Dong, J.; Zhang, Y.; Hu, J. Short-term air quality prediction based on EMD-transformer-BiLSTM. Sci. Rep. 2024, 14, 20513. [Google Scholar] [CrossRef]
- Tian, J.; Liang, Y.; Xu, R.; Chen, P.; Guo, C.; Zhou, A.; Pan, L.; Rao, Z.; Yang, B. Air Quality Prediction with Physics-Informed Dual Neural ODEs in Open Systems. arXiv 2025, arXiv:2410.19892. [Google Scholar] [CrossRef]
- Luo, J.; Gong, Y. Air pollutant prediction based on ARIMA-WOA-LSTM model. Atmos. Pollut. Res. 2023, 14, 101761. [Google Scholar] [CrossRef]
- Park, J.; Seo, Y.; Cho, J. Unsupervised outlier detection for time-series data of indoor air quality using LSTM autoencoder with ensemble method. J. Big Data 2023, 10, 66. [Google Scholar] [CrossRef]
- Anas, H.; El Mghouchi, Y.E.; Halima, Y.; Nawal, A.; Mohamed, C. Novel climate classification based on the information of solar radiation intensity: An application to the climatic zoning of Morocco. Energy Convers. Manag. 2021, 247, 114770. [Google Scholar] [CrossRef]
- Hector, I.; Panjanathan, R. Predictive maintenance in Industry 4.0: A survey of planning models and machine learning techniques. PeerJ Comput. Sci. 2024, 10, e2016. [Google Scholar] [CrossRef] [PubMed]
- Condurache-Bota, S.; Draşovean, R.M.; Tigau, N. Analysis of the particulate matter long term emissions in Romania by sectors of activities. Math. Phys. Theor. Mech. 2023, 46, 90–100. [Google Scholar] [CrossRef]
- Moldovan, C.S.; Mateescu, M.D.; Sbirna, L.S.; Ionescu, C.; Sbirna, S. Study regarding concentration of main air pollutants in Craiova (Romania)–Comparison between the first half of 2011 and the similar period of 2010. In SESAM; INSEMEX: Petrosani, Romania, 2011; p. 368. [Google Scholar]
- Buzatu, G.D.; Dodocioiu, A.M. Air quality study in Craiova municipality based on data provided by uRADm independent sensor network. Ann. Univ. Craiova Biol. Hortic. Food Prod. Process. Technol. Environ. Eng. 2022, 27, 63. [Google Scholar] [CrossRef]
- Velea, L.; Udriștioiu, M.T.; Puiu, S.; Motișan, R.; Amarie, D. A Community-Based Sensor Network for Monitoring the Air Quality in Urban Romania. Atmosphere 2023, 14, 840. [Google Scholar] [CrossRef]
- Udristioiu, M.T.; Velea, L.; Motisan, R. First results given by the independent air pollution monitoring network from Craiova city Romania. AIP Conf. Proc. 2023, 2843, 040001. [Google Scholar] [CrossRef]
- Pekdogan, T.; Udriștioiu, M.T.; Yildizhan, H.; Ameen, A. From Local Issues to Global Impacts: Evidence of Air Pollution for Romania and Turkey. Sensors 2024, 24, 1320. [Google Scholar] [CrossRef]
- Yildizhan, H.; Udriștioiu, M.T.; Pekdogan, T.; Ameen, A. Observational study of ground-level ozone and climatic factors in Craiova, Romania, based on one-year high-resolution data. Sci. Rep. 2024, 14, 26733. [Google Scholar] [CrossRef]
- Dunea, D.; Pohoata, A.; Iordache, S. Using wavelet–feedforward neural networks to improve air pollution forecasting in urban environments. Environ. Monit. Assess. 2015, 187, 477. [Google Scholar] [CrossRef] [PubMed]
- Dudáš, A.; Udristioiu, M.T.; Alkharusi, T.; Yildizhan, H.; Sampath, S.K. Examining effects of air pollution on photovoltaic systems via interpretable random forest model. Renew. Energy 2024, 232, 121066. [Google Scholar] [CrossRef]
- Velea, L.; Bojariu, R.; Burada, C.; Udristioiu, M.T.; Paraschivu, M.; Burce, R.D. Characteristics of extreme temperatures relevant for agriculture in the near future (2021–2040) in Romania. Sci. Papers Ser. E Land Reclam. Earth Obs. Surv. Environ. Eng. 2021, 10, 70–75. [Google Scholar]
- Brâncuș, M.M.; Burada, C.; Mănescu, C. The impact of late and early snowfall on urban areas in southwestern Romania. In Proceedings of the Climate, Water and Society: Changes and Challenges, 36th Conference of the International Association of Climatology, Bucharest, Romania, 3–7 July 2023; pp. 3–7. [Google Scholar]
- Udristioiu, M.T.; Velea, L.; Bojariu, R.; Sararu, S.C. Assessment of urban heat Island for Craiova from satellite-based LST. AIP Conf. Proc. 2017, 1916, 040004. [Google Scholar] [CrossRef]
- Badescu, V. Assessing the performance of solar radiation computing models and model selection procedures. J. Atmos. Sol.-Terr. Phys. 2013, 105, 119–134. [Google Scholar] [CrossRef]
- Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106. [Google Scholar] [CrossRef]
- Variance Threshold as Early Screening to Boruta Feature Selection for Intrusion Detection System. Available online: https://ieeexplore.ieee.org/abstract/document/9608852 (accessed on 15 March 2025).
- Wang, J.; Zhang, H.; Wang, J.; Pu, Y.; Pal, N.R. Feature Selection Using a Neural Network with Group Lasso Regularization and Controlled Redundancy. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 1110–1123. [Google Scholar] [CrossRef] [PubMed]
- Rahmat, F.; Zulkafli, Z.; Ishak, A.J.; Abdul Rahman, R.Z.; Stercke, S.D.; Buytaert, W.; Tahir, W.; Ab Rahman, J.; Ibrahim, S.; Ismail, M. Supervised feature selection using principal component analysis. Knowl. Inf. Syst. 2024, 66, 1955–1995. [Google Scholar] [CrossRef]
- Ahmad, I. Feature Selection Using Particle Swarm Optimization in Intrusion Detection. Int. J. Distrib. Sens. Netw. 2015, 11, 806954. [Google Scholar] [CrossRef]
- Bai, Y.; Xie, J.; Liu, C.; Tao, Y.; Zeng, B.; Li, C. Regression modeling for enterprise electricity consumption: A comparison of recurrent neural network and its variants. Int. J. Electr. Power Energy Syst. 2021, 126, 106612. [Google Scholar] [CrossRef]
- Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
- Najibi, F.; Apostolopoulou, D.; Alonso, E. Enhanced performance Gaussian process regression for probabilistic short-term solar output forecast. Int. J. Electr. Power Energy Syst. 2021, 130, 106916. [Google Scholar] [CrossRef]
- Mousavirad, S.J.; Ebrahimpour-Komleh, H. Feature selection using modified imperialist competitive algorithm. In Proceedings of the ICCKE 2013, Mashhad, Iran, 31 October–1 November 2013; pp. 400–405. [Google Scholar] [CrossRef]
- Got, A.; Moussaoui, A.; Zouache, D. Hybrid filter-wrapper feature selection using whale optimization algorithm: A multi-objective approach. Expert. Syst. Appl. 2021, 183, 115312. [Google Scholar] [CrossRef]
- Kubalík, J.; Derner, E.; Žegklitz, J.; Babuška, R. Symbolic Regression Methods for Reinforcement Learning. IEEE Access 2021, 9, 139697–139711. [Google Scholar] [CrossRef]
- Abdel-Basset, M.; Ding, W.; El-Shahat, D. A hybrid Harris Hawks optimization algorithm with simulated annealing for feature selection. Artif. Intell. Rev. 2021, 54, 593–637. [Google Scholar] [CrossRef]
- van der Heide, E.M.M.; Veerkamp, R.F.; van Pelt, M.L.; Kamphuis, C.; Athanasiadis, I.; Ducro, B.J. Comparing regression, naive Bayes, and random forest methods in the prediction of individual survival to second lactation in Holstein cattle. J. Dairy Sci. 2019, 102, 9409–9421. [Google Scholar] [CrossRef]
- Moayedikia, A.; Ong, K.L.; Boo, Y.L.; Yeoh, W.G.; Jensen, R. Feature selection for high dimensional imbalanced class data using harmony search. Eng. Appl. Artif. Intell. 2017, 57, 38–49. [Google Scholar] [CrossRef]
- Rodrigues, D.; Pereira, L.A.M.; Nakamura, R.Y.M.; Costa, K.A.P.; Yang, X.S.; Souza, A.N.; Papa, J.P. A wrapper approach for feature selection based on Bat Algorithm and Optimum-Path Forest. Expert. Syst. Appl. 2014, 41, 2250–2258. [Google Scholar] [CrossRef]
- Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501. [Google Scholar] [CrossRef]
- Maulud, D.; Abdulazeez, A.M. A Review on Linear Regression Comprehensive in Machine Learning. J. Appl. Sci. Technol. Trends 2020, 1, 140–147. [Google Scholar] [CrossRef]
- Hastie, T.J. Generalized Additive Models. In Statistical Models in S; Routledge: New York, NY, USA, 1992. [Google Scholar]
- Brabazon, A.; Cui, W.; O’Neill, M. The raven roosting optimisation algorithm. Soft Comput. 2016, 20, 525–545. [Google Scholar] [CrossRef]
- Maaroof, B.B.; Rashid, T.A.; Abdulla, J.M.; Hassan, B.A.; Alsadoon, A.; Mohammadi, M.; Khishe, M.; Mirjalili, S. Current Studies and Applications of Shuffled Frog Leaping Algorithm: A Review. Arch. Comput. Methods Eng. 2022, 29, 3459–3474. [Google Scholar] [CrossRef]
- Jia, H.; Xing, Z.; Song, W. A New Hybrid Seagull Optimization Algorithm for Feature Selection. IEEE Access 2019, 7, 49614–49631. [Google Scholar] [CrossRef]
- Hegazy, A.H.E.; Makhlouf, M.A.; El-Tawel, G.H.S. Improved salp swarm algorithm for feature selection. J. King Saud. Univ.-Comput. Inf. Sci. 2020, 32, 335–344. [Google Scholar] [CrossRef]
- Song, Y.; Liang, J.; Lu, J.; Zhao, X. An efficient instance selection algorithm for k nearest neighbor regression. Neurocomputing 2017, 251, 26–34. [Google Scholar] [CrossRef]
- Moe, S.J.; Wolf, R.; Xie, L.; Landis, W.G.; Kotamäki, N.; Tollefsen, K.E. Quantification of an Adverse Outcome Pathway Network by Bayesian Regression and Bayesian Network Modeling. Integr. Environ. Assess. Manag. 2021, 17, 147–164. [Google Scholar] [CrossRef]
- Zivkovic, M.; Jovanovic, L.; Ivanovic, M.; Krdzic, A.; Bacanin, N.; Strumberger, I. Feature Selection Using Modified Sine Cosine Algorithm with COVID-19 Dataset. In Evolutionary Computing and Mobile Sustainable Networks; Suma, V., Fernando, X., Du, K.L., Wang, H., Eds.; Springer: Singapore, 2022; pp. 15–31. [Google Scholar] [CrossRef]
- Emary, E.; Zawbaa, H.M. Feature selection via Lèvy Antlion optimization. Pattern Anal. Appl. 2019, 22, 857–876. [Google Scholar] [CrossRef]
- Shahrom, M.A.S.M.; Zainal, N.; Aziz, M.F.A.; Mostafa, S.A. A Review of Glowworm Swarm Optimization Meta-Heuristic Swarm Intelligence and its Fusion in Various Applications. Fusion: Pract. Appl. 2023, 13, 89–102. [Google Scholar] [CrossRef]
- Hafez, A.I.; Hassanien, A.E.; Zawbaa, H.M.; Emary, E. Hybrid Monkey Algorithm with Krill Herd Algorithm optimization for feature selection. In Proceedings of the 2015 11th International Computer Engineering Conference (ICENCO), Cairo, Egypt, 29–30 December 2015; pp. 273–277. [Google Scholar] [CrossRef]
- Rostami, O.; Kaveh, M. Optimal feature selection for SAR image classification using biogeography-based optimization (BBO), artificial bee colony (ABC) and support vector machine (SVM): A combined approach of optimization and machine learning. Comput. Geosci. 2021, 25, 911–930. [Google Scholar] [CrossRef]
- Sarbazi-Azad, S.; Saniee Abadeh, M.; Mowlaei, M.E. Using data complexity measures and an evolutionary cultural algorithm for gene selection in microarray data. Soft Comput. Lett. 2021, 3, 100007. [Google Scholar] [CrossRef]
- Li, Z. A local opposition-learning golden-sine grey wolf optimization algorithm for feature selection in data classification. Appl. Soft Comput. 2023, 142, 110319. [Google Scholar] [CrossRef]
- Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 2013, 7, 21. [Google Scholar] [CrossRef]
- Nadimi-Shahraki, M.H.; Zamani, H.; Mirjalili, S. Enhanced whale optimization algorithm for medical feature selection: A COVID-19 case study. Comput. Biol. Med. 2022, 148, 105858. [Google Scholar] [CrossRef]
- Ming, L.; Zhao, J. Feature selection for chemical process fault diagnosis by artificial immune systems. Chin. J. Chem. Eng. 2018, 26, 1599–1604. [Google Scholar] [CrossRef]
- Dhal, K.G.; Das, A.; Sasmal, B.; Ghosh, T.K.; Sarkar, K. Eagle Strategy in Nature-Inspired Optimization: Theory, Analysis, Applications, and Comparative Study. Arch. Comput. Methods Eng. 2024, 31, 1213–1232. [Google Scholar] [CrossRef]
- Wu, Q.; Ma, Z.; Fan, J.; Xu, G.; Shen, Y. A Feature Selection Method Based on Hybrid Improved Binary Quantum Particle Swarm Optimization. IEEE Access 2019, 7, 80588–80601. [Google Scholar] [CrossRef]
- Zhang, L.; Chen, X. A Velocity-Guided Grey Wolf Optimization Algorithm with Adaptive Weights and Laplace Operators for Feature Selection in Data Classification. IEEE Access 2024, 12, 39887–39901. [Google Scholar] [CrossRef]
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar] [CrossRef]
- Chakraborty, K.; Mehrotra, K.; Mohan, C.K.; Ranka, S. Forecasting the behavior of multivariate time series using neural networks. Neural Netw. 1992, 5, 961–970. [Google Scholar] [CrossRef]
- Perez, M.; Marwala, T. Microarray data feature selection using hybrid genetic algorithm simulated annealing. In Proceedings of the 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, Eilat, Israel, 14–17 November 2012; pp. 1–5. [Google Scholar] [CrossRef]
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
- Chen, Y.; Gao, B.; Lu, T.; Li, H.; Wu, Y.; Zhang, D.; Liao, X. A Hybrid Binary Dragonfly Algorithm with an Adaptive Directed Differential Operator for Feature Selection. Remote Sens. 2023, 15, 3980. [Google Scholar] [CrossRef]
- Menghour, K.; Souici-Meslati, L. Hybrid ACO-PSO Based Approaches for Feature Selection. Int. J. Intell. Eng. Syst. 2016, 9, 65–79. [Google Scholar] [CrossRef]
- Chin, V.J.; Salam, Z. Coyote optimization algorithm for the parameter extraction of photovoltaic cells. Sol. Energy 2019, 194, 656–670. [Google Scholar] [CrossRef]
- Ahmed, S.; Mafarja, M.H.; Aljarah, I. Feature Selection Using Salp Swarm Algorithm with Chaos. In Proceedings of the 2nd International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence, New York, NY, USA, 24–25 March 2018; Association for Computing Machinery: New York, NY, USA, 2019; pp. 65–69. [Google Scholar] [CrossRef]
- Panahi, M.; Sadhasivam, N.; Pourghasemi, H.R.; Rezaie, F.; Lee, S. Spatial prediction of groundwater potential mapping based on convolutional neural network (CNN) and support vector regression (SVR). J. Hydrol. 2020, 588, 125033. [Google Scholar] [CrossRef]
| Category | Model/Method | Key Features | Primary Application(s) | Advantages | Limitations | Reference |
|---|---|---|---|---|---|---|
| Statistical models | Hidden Markov model (HMM) | Models and predicts state transitions in machine operations | Failure state prediction | Well suited to sequential data such as time series sensor readings, making it effective for predicting state transitions in machinery | Assumes conditional independence of observations given the hidden state, which may not hold in complex systems | [12] |
| | Autoregressive (AR) model | Models a time series as a linear function of its own past values | Time series forecasting | Simple, interpretable, suitable for stationary data | Limited to linear relationships, assumes stationarity | [13] |
| | Autoregressive integrated moving average (ARIMA) | Combines AR and MA components with differencing to handle non-stationary data | Forecasting of economic and industrial time series | Effective for non-stationary data, well established | Requires parameter tuning, assumes linearity | [14] |
| | Autoregressive moving average (ARMA) | Combines AR and MA components to model a series from its past values and errors | Stationary time series forecasting | Captures both trend and cycle, useful for stationary series | Not suitable for non-stationary data | [15] |
| | Autoregressive integrated moving average with exogenous inputs (ARIMAX) | Extends ARIMA by including exogenous (external) variables | Forecasting with external variables | Incorporates additional influencing factors, flexible | Complex model fitting; data preprocessing is needed | [16] |
| | Seasonal ARIMA (SARIMA) | ARIMA model with seasonal components for periodic data | Seasonal time series forecasting | Good for data with clear seasonality | Complex parameter tuning, longer training time | [17] |
| | Nonlinear autoregressive moving average with exogenous inputs (NARMAX) | Extends ARMA with nonlinear dynamics and exogenous inputs | Complex system modeling, process forecasting | Models nonlinear relationships, robust for dynamic systems | Complex to implement, computationally intensive | [4] |
| | Vector autoregression (VAR) | Models interdependencies among multiple time series variables | Multivariate time series forecasting | Captures relationships between series, effective for macroeconomic data | Assumes all series are stationary, complex modeling | [18] |
| | Nonlinear autoregressive neural network (NARNN) | Uses a neural network architecture for autoregressive forecasting | Nonlinear time series forecasting | Captures complex, nonlinear relationships | Requires extensive data and computational resources | [19] |
| Machine learning | Artificial neural networks (ANN) | Layers of interconnected neurons that can model complex, nonlinear relationships | Regression, classification, pattern recognition | High accuracy, flexible modeling | Computationally intensive, requires large datasets | [20] |
| | Decision tree (DT) | Hierarchical structure whose nodes represent decisions/tests on features; interpretable results | Classification, regression, feature selection | Simple, interpretable, handles both numerical and categorical data | Prone to overfitting | [21] |
| | Support vector machine (SVM) | Separates classes with a margin-maximizing hyperplane; kernel functions extend it to nonlinear boundaries | Classification, regression, anomaly detection | Effective for high-dimensional data | Sensitive to parameter selection and scaling | [22] |
| | Extreme learning machine (ELM) | Single hidden layer with randomly assigned weights, enabling very fast training | Fast neural network training, regression | Quick training time, straightforward implementation | Less flexible in architecture | [23] |
| | Extreme gradient boosting (XGBoost) | Builds decision trees sequentially, correcting previous errors; uses gradient-based optimization | Classification, regression, ensemble learning | High efficiency, strong performance in competitions | Can be complex to tune | [24] |
| | Random forest (RF) | Ensemble of decision trees trained on different data subsets; combines their results for improved accuracy | Classification, regression, anomaly detection | Robust against overfitting, handles large datasets | Large ensembles can be resource-intensive | [25] |
| | Tree bagger (TB) | Ensemble of decision trees trained with bootstrapping; combines predictions for a final output | Ensemble learning, bagging, classification | Reduces overfitting, reliable performance | Not ideal for very high-dimensional data | [26] |
| | Generalized linear regression model (GLRM) | Extends linear regression with a link function to handle non-normally distributed responses | Regression, predictive modeling | Useful for complex distributions | Assumes linear predictor relationships | [27] |
| | Gaussian process regression (GPR) | Models the output as a distribution over functions, yielding probabilistic predictions | Non-parametric regression | Provides uncertainty estimates | Computationally expensive for large datasets | [28] |
| | Linear regression (LR) | Models the relationship between dependent and independent variables as a linear equation | Basic regression | Simple, interpretable, easy to implement | Limited to linear relationships | [29] |
| | Generalized additive model (GAM) | Extends GLRMs with additive, nonlinear effects for each predictor | Regression, nonlinear modeling | Flexible for nonlinear relationships | Computationally more complex than linear regression | [30] |
| | Kernelized ridge regression model (KRRM) | Combines ridge regularization with the kernel trick for nonlinear transformations | Nonlinear regression | Good for nonlinear relationships, mitigates overfitting | Kernel choice is critical | [31] |
| | Linear ridge regression (LRR) | Adds a penalty to linear regression coefficients to reduce multicollinearity issues | Linear regression with regularization | Reduces overfitting, simple solution | Limited to linear solutions | [32] |
| | K-nearest neighbors (KNN) | Instance-based learning that predicts from the k nearest data points (majority class or average) | Classification, regression | Simple, non-parametric, easy to implement | Computationally expensive for large datasets, sensitive to feature scaling | [33] |
| | Bayesian linear regression (BayesLR) | A probabilistic classifier based on Bayes’ theorem with strong independence assumptions | Classification, text analysis | Fast, works well with small datasets | Assumes feature independence, which can be unrealistic | [33] |
| | Gradient boosting machine (GBM) | Sequentially builds decision trees that minimize prediction errors via gradient descent | Classification, regression, ensemble learning | High accuracy, adaptable to different data types | Sensitive to overfitting, requires careful tuning | [34] |
| | Bayesian network (BN) | Graphical model representing probabilistic relationships between variables | Probabilistic inference, diagnosis, decision support | Handles uncertainty well, interpretable | Computationally intensive for large networks | [35] |
| Deep learning | Recurrent neural networks (RNNs) | Sequential data modeling for time series patterns | Predicting RUL and anomaly detection in sensor data. | They are designed to process sequential data such as time series sensor readings, making them effective for maintenance forecasting | Training RNNs is computationally expensive, especially for large datasets. | [36] |
| | Long short-term memory (LSTM) | Advanced RNN that captures long-term dependencies in time series | Time series forecasting | Effective for both short-term and long-term predictions, making it highly applicable for RUL estimation and fault detection | Computationally intensive, requires significant training time compared to simpler models | [37] |
| | LSTM-based recurrent neural networks (LSTM-RNNs) | Incorporates memory cells and cyclic connections for sequential data | Time series prediction, sequence modeling | Good for sequential tasks, language modeling | Training can be slow; vanishing gradient problem | [38] |
| | Convolutional neural networks (CNNs) | Applies convolutional layers for automatic feature extraction from spatial data | Image and video analysis | Effective for high-dimensional image data | Requires large amounts of data and high computational power | [39] |
| | Feedforward neural network (FNN) | Unidirectional flow of information (no cycles); input, hidden, and output layers; typically trained using backpropagation | Pattern recognition, function approximation, classification and regression | Simple and easy to implement; effective for structured data | Prone to overfitting if too complex; cannot model temporal dependencies | [40] |
| | Deep neural network (DNN) | Multiple hidden layers (deep architecture); nonlinear activation functions; trained by backpropagation with optimization techniques | Image and speech recognition, natural language processing, complex pattern recognition | Models complex relationships in data; high learning capacity | Requires large amounts of data and computational power; prone to vanishing/exploding gradients | [41] |
| | Residual neural network (ResNet) | Skip (residual) connections that bypass layers; reduces vanishing gradient issues; supports very deep architectures | Computer vision (image classification, object detection), speech recognition, time series forecasting | Allows training of very deep networks; mitigates the vanishing gradient problem; improves accuracy in deep learning tasks | Higher computational cost; requires careful tuning of architecture | [42] |
| | Fully connected neural network (FCNN) | Every neuron in one layer connects to every neuron in the next; dense layers without specialized structures such as convolution or recurrence | General-purpose modeling, classification and regression, feature extraction | Simple and widely applicable; works well for structured data | Computationally expensive for large inputs; prone to overfitting due to the large number of parameters | [43] |
| | Reinforcement learning models (RL) | Learn through trial and error, receiving rewards for actions in an environment | Robotics, optimization, game AI | Adaptable, can learn complex tasks | Training can be time-consuming and data-hungry | [44] |
| | Autoencoders | Learn compressed representations of the data | Anomaly detection | Effectively reduce the dimensionality of high-dimensional sensor data, simplifying analysis | Highly sensitive to the quality and variability of training data, which may lead to false positives/negatives | [45] |
| | Transformer models | Handles long-term dependencies in time series data | Advanced time series analysis | Can capture long-term dependencies in time series data | Requires significant computational resources, especially for large datasets | [46,47] |
| | Physics-informed models | Integrates domain knowledge with AI for improved predictions | Complex mechanical systems modeling | Incorporates physical laws and domain knowledge, improving prediction accuracy and interpretability | Requires significant domain knowledge to develop, which can limit accessibility | [48] |
| Hybrid models | LSTM + ARIMA | Combines deep learning with statistical methods for improved predictions | Long-term forecasting. | Combines LSTM’s ability to model nonlinear and long-term dependencies with ARIMA’s strength in handling linear and short-term trends | Requires careful preprocessing to align both models’ requirements such as stationarity for ARIMA and scaled data for LSTM | [49] |
| | Autoencoders + clustering | Detects and classifies anomalies | Anomaly detection and classification | Autoencoders identify anomalies by reconstructing normal behavior, while clustering classifies these anomalies into meaningful categories | Combining autoencoders with clustering increases computational overhead during both training and inference | [50] |
| Feature engineering | K-means clustering | Partitions data into K clusters based on distance to centroids; the elbow plot guides the choice of K | Unsupervised anomaly detection, segmentation of data into clusters, identifying abnormal operating conditions | Simple, efficient, and widely used for large datasets | Sensitive to initialization, assumes spherical clusters | [51] |
| | Average silhouette plot | Measures how similar an object is to its own cluster compared to other clusters | Evaluating the quality of clusters | Helps in choosing the optimal cluster count | Less informative for very complex clustering structures | |
| | Hierarchical clustering | Creates nested clusters by merging or splitting data points; visualized with a dendrogram | Understanding data structure, hierarchical clustering | Does not require pre-specifying the number of clusters, provides a visual hierarchy | Computationally intensive for large datasets, sensitive to linkage method | |
| | DBSCAN | Groups points into density-based clusters, treating sparse points as noise | Clustering operational anomalies | Effectively identifies and handles noise or outliers, which are common in maintenance data | Highly sensitive to the choice of epsilon and minimum points, which may require trial-and-error tuning | [52] |
| | PCA, t-SNE | Dimensionality reduction techniques | Reducing data complexity for predictive maintenance (PdM) models | Reduce the complexity of high-dimensional data while retaining most of the variance | Require proper scaling of input data for meaningful results | |
| | SHAP, LIME | Model-agnostic methods for identifying key features | Selecting critical variables such as vibration, temperature, etc. | Work with any machine learning model, making them versatile for predictive maintenance applications | Can be computationally expensive, especially for complex models and large datasets | |
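To make the statistical family in the table above concrete, the AR model row can be illustrated with a minimal AR(1) fit. This is a sketch on synthetic data, not the study's actual pipeline; `fit_ar1` and the series are illustrative names.

```python
# Minimal AR(1) illustration (synthetic data; not the study's pipeline).
# An AR(1) model assumes x[t] = phi * x[t-1] + noise; phi is estimated
# here by ordinary least squares on lagged pairs.
import random

def fit_ar1(series):
    """Estimate the AR(1) coefficient phi by least squares."""
    x = series[:-1]   # lagged values x[t-1]
    y = series[1:]    # current values x[t]
    num = sum(a * b for a, b in zip(x, y))
    den = sum(a * a for a in x)
    return num / den

# Simulate 5000 steps of an AR(1) process with phi = 0.8.
random.seed(0)
true_phi = 0.8
x = [0.0]
for _ in range(5000):
    x.append(true_phi * x[-1] + random.gauss(0, 1))

phi_hat = fit_ar1(x)
print(round(phi_hat, 2))  # close to 0.8
```

In practice one would use an established implementation (e.g. an ARIMA routine) rather than hand-rolled least squares; the point is only to show the linear dependence on past values that the table's "Key Features" column describes.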
| Station ID | Coordinates (Latitude; Longitude; Altitude) | Type | Location | Measured Parameters |
|---|---|---|---|---|
| DJ 1 | 44.3266373; 23.7967072; 113.00 | Traffic | Calea Bucuresti | C6H6, CO, C8H10, m-, o- and p-Xylene, NO2, NO, NOx, PM10, SO2, C7H8 |
| DJ 2 | 44.3266373; 23.7967072; 113.00 | Urban background | City Hall, A.I. Cuza Street | T; P; RH; wind direction and velocity; precipitation; sun brightness; C6H6; CO; NO; NO2; NOx; SO2; C7H8; PM2.5; PM10; m-, o- and p-xylene |
| DJ 3 | 44.3267708; 23.8044415; 83.00 | Mixed (industrial and traffic) | Maria Tanase Street | NH3, NO, NO2, NOx, Ni, O3, PM10, SO2 |
| DJ 5 | 44.3422203; 23.7197227; 85.00 | Suburban background | Breasta Water Station | CO, NO, NO2, NOx, O3, PM10, SO2 |
| Variable Number | Parameter | Unit |
|---|---|---|
| Variable 1 | Benzene (C6H6) | μg/m3 |
| Variable 2 | Carbon monoxide (CO) | μg/m3 |
| Variable 3 | Ethylbenzene (C6H5CH2CH3 or C8H10) | μg/m3 |
| Variable 4 | m-Xylene (C6H4(CH3)2) | μg/m3 |
| Variable 5 | Nitric oxide (NO) | μg/m3 |
| Variable 6 | Nitrogen dioxide (NO2) | μg/m3 |
| Variable 7 | Oxides of nitrogen (NOx) | μg/m3 |
| Variable 8 | o-Xylene (C6H4(CH3)2) | μg/m3 |
| Variable 9 | p-Xylene (C6H4(CH3)2) | μg/m3 |
| Variable 10 | Precipitation | mm |
| Variable 11 | Relative humidity (RH) | % |
| Variable 12 | Sulfur dioxide (SO2) | μg/m3 |
| Variable 13 | Solar brightness | W/m2 |
| Variable 14 | Temperature (T) | °C |
| Variable 15 | Toluene (C6H5CH3 or C7H8) | μg/m3 |
| Variable 16 | Wind direction | grN |
| Variable 17 | Wind velocity | m/s |
| Output 1 | Ozone (O3) | μg/m3 |
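The outlier screening described in the pre-processing step (third subplot) can be sketched with a simple interquartile-range rule on the ozone series. The 1.5×IQR threshold and the crude index-based quartiles below are common conventions assumed for illustration, not necessarily the criterion used in the study.

```python
# IQR-based outlier flagging for an ozone series (illustrative only;
# the study's exact outlier criterion may differ).

def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR].

    Quartiles are taken as simple order statistics, which is adequate
    for a sketch but cruder than interpolated quantiles.
    """
    s = sorted(values)
    n = len(s)
    q1 = s[n // 4]
    q3 = s[(3 * n) // 4]
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lo or v > hi]

# Hypothetical hourly ozone values in μg/m3, with one pollution spike.
ozone = [42.0, 45.0, 40.0, 44.0, 43.0, 41.0, 120.0, 46.0, 44.0, 39.0]
print(iqr_outliers(ozone))  # → [6], the index of the 120 μg/m3 spike
```

Flagged points would then be inspected and either corrected or removed before model training, consistent with the pre-processing workflow described above.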
| Model | R | MBE | MAE | RMSE | MAPE | σ | Rank |
|---|---|---|---|---|---|---|---|
| Deep-NARMAX | 0.92615 | 0.035727 | 5.8357 | 8.8693 | 0.03528 | 8.8695 | 1 |
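The indices reported in the table (MBE, MAE, RMSE, MAPE) follow standard definitions; a quick reference implementation is sketched below, with MAPE expressed as a fraction as in the table. The variable names and toy series are illustrative, not the study's data.

```python
# Standard agreement indices between observed and predicted series.
import math

def error_metrics(obs, pred):
    n = len(obs)
    diffs = [p - o for o, p in zip(obs, pred)]
    mbe = sum(diffs) / n                              # mean bias error
    mae = sum(abs(d) for d in diffs) / n              # mean absolute error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)   # root mean square error
    # mean absolute percentage error, as a fraction of the observed values
    mape = sum(abs(d) / abs(o) for d, o in zip(diffs, obs)) / n
    return {"MBE": mbe, "MAE": mae, "RMSE": rmse, "MAPE": mape}

# Toy example: four observed vs. predicted ozone values in μg/m3.
obs = [50.0, 60.0, 55.0, 70.0]
pred = [52.0, 58.0, 57.0, 69.0]
m = error_metrics(obs, pred)
print({k: round(v, 4) for k, v in m.items()})
```

MBE keeps the sign of the errors (over- vs. under-prediction), while MAE and RMSE measure their magnitude; RMSE penalizes large errors more heavily, which is why it exceeds MAE whenever the errors are uneven.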
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Udristioiu, M.T.; El Mghouchi, Y. Deep Hybrid AI Models Applied to Predict, Model, and Forecast the Next Upcoming Periods of Ozone in Craiova City. Appl. Sci. 2025, 15, 12187. https://doi.org/10.3390/app152212187

