1. Introduction
In indoor environments, avoiding high concentrations of aerosols (microscopic particles exhaled when speaking or breathing) is critical, as these can degrade air quality and increase health risks. Poorly ventilated closed spaces exacerbate this issue, as the accumulation of aerosols and CO2 rises with the number of occupants and the time spent in such environments [1]. This situation is common in classrooms during face-to-face sessions, where ventilation is often insufficient to maintain moderate CO2 levels. Continuous CO2 monitoring is essential to assess air quality, along with predicting or forecasting CO2 concentration over time. This allows us to estimate how long it will take for a given space to reach CO2 levels that could pose a significant risk, enabling proactive air quality management.
Clean, unpolluted outdoor air contains approximately 400 parts per million (ppm) of CO2. In the literature, minimum reference levels between 412 ppm and 420 ppm are reported, depending on the source [2]. Air with this concentration of CO2 is considered not to have been breathed recently. CO2 concentrations above the reference level indicate that part of the air has already been exhaled by someone, as shown in Table 1. For instance, when the CO2 concentration reaches 1000 ppm, it is estimated that approximately 1.5% of the air has previously been exhaled. Concentrations above 1000 ppm not only reflect reduced air quality but also pose a potential health risk, as elevated CO2 levels can be toxic [3,4].
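The link between indoor CO2 and previously exhaled air can be illustrated with a short calculation. The sketch below assumes that exhaled breath adds roughly 38,000 ppm of CO2 to the outdoor level, a commonly cited approximation that is not stated in this study; the function name and default values are illustrative only.

```python
def rebreathed_fraction(indoor_ppm, outdoor_ppm=400.0, exhaled_excess_ppm=38_000.0):
    """Estimate the fraction of indoor air that has already been exhaled.

    outdoor_ppm        -- reference outdoor level (about 400 ppm in the text)
    exhaled_excess_ppm -- assumed CO2 added by exhaled breath (not from the paper)
    """
    if indoor_ppm <= outdoor_ppm:
        return 0.0  # at or below the outdoor reference, no rebreathed air is inferred
    return (indoor_ppm - outdoor_ppm) / exhaled_excess_ppm


if __name__ == "__main__":
    for ppm in (400, 800, 1000, 2000):
        print(f"{ppm} ppm -> {100 * rebreathed_fraction(ppm):.2f}% previously exhaled")
```

Under these assumptions, 1000 ppm corresponds to roughly 1.6% of rebreathed air, which is in line with the approximate 1.5% quoted above.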
To assess air quality, the appropriate sensor must be selected. NDIR (non-dispersive infrared) sensors are suitable for measuring the concentration of CO2 because the molecules of this gas readily absorb infrared light. NDIR sensors have already been evaluated as a low-cost option for CO2 measurement. One such evaluation, performed in a laboratory environment, demonstrated that, without any calibration or correction, NDIR sensors achieve RMS errors between 5 ppm and 21 ppm compared to a precision sensor [6]. CO2 measurements can be managed and analyzed by various methods. Typically, as a complement to monitoring systems, diagnosis/prognosis applications are developed in which the data are processed through specialized programs (e.g., MATLAB) or cloud computing services, such as ThingSpeak, Microsoft Azure, and Amazon Web Services, among others [7]. Cloud computing offers data storage and analysis services to forecast various physical variables using computational intelligence techniques such as neural networks. Robin et al. [8] evaluated convolutional neural networks to monitor air quality, and Altikat et al. [9] used neural networks to predict the flow of CO2 from the ground to the atmosphere. Recently, Kapoor et al. [10] designed a pilot monitoring system for CO2 using neural networks and support vector machines. However, real-time CO2 monitoring alone is insufficient for effective air quality management. Predictive modeling is necessary to estimate future CO2 concentrations and optimize ventilation strategies.
Monitoring CO2 levels is essential for ensuring indoor air quality (IAQ) and occupant well-being. Kwon et al. [11] classify CO2 sensors into two main types: chemical sensors, which are energy-efficient and compact but suffer from short lifespan and low durability, and non-dispersive infrared (NDIR) sensors, which offer higher accuracy and are commonly used for air quality monitoring. The integration of Internet of Things (IoT) technologies has significantly improved CO2 monitoring by enabling real-time data acquisition and remote accessibility. Marques and Pitarma [12] introduced iAQ WiFi, an IoT-based system that collects environmental data using low-cost sensors and transmits them via WiFi for real-time visualization and analysis. Marques et al. [13] expanded on this work with iAir CO2, an advanced IoT solution designed for continuous CO2 monitoring. Their study emphasizes the importance of real-time air quality tracking to anticipate and mitigate potential health risks.
Machine learning techniques have also been applied to CO2 forecasting, allowing for more efficient and proactive air quality management. Kallio et al. [14] investigated multiple machine learning models, including ridge regression, decision trees, random forest, and multilayer perceptron, to predict indoor CO2 concentration. Arsiwala et al. [15] developed a digital twin system integrating IoT, artificial intelligence, and Building Information Modeling (BIM) to automate CO2 emissions tracking. Alsamrai et al. [16] provided a comprehensive review of IoT-based air quality monitoring systems, emphasizing the growing use of low-cost sensors and microcontrollers such as ESP8266 and ESP32. Their findings confirm that IoT applications offer a cost-effective and scalable alternative for pollution monitoring.
Building upon these advancements, this study integrates predictive modeling and real-time data collection to enhance CO2 monitoring solutions. By leveraging machine learning and IoT technologies, our approach improves forecasting accuracy, supports proactive air quality management, and contributes to healthier indoor environments.
This study proposes an IoT-based CO2 monitoring and forecasting system, integrating low-cost monitoring stations equipped with NDIR sensors and ESP32 microcontrollers to provide real-time CO2 measurements. These devices are strategically deployed in classrooms, offices, and laboratories within Tecnológico Nacional de México campus Tuxtla Gutiérrez. The collected CO2 data are processed using an LSTM autoregressive neural network, trained to predict future CO2 concentrations up to eight hours in advance. Unlike traditional mathematical forecasting models, this approach allows the neural network to learn patterns directly from sensor data, enhancing adaptability to different environmental conditions. The results of this study suggest an alternative for scheduling classroom sessions to ensure safe air quality conditions. The main contributions of this study are summarized as follows:
The LSTM network analyzes historical data to forecast CO2 levels up to eight hours in advance, eliminating the need for explicit models.
A network of affordable sensors and wireless transmitters enables cost-effective deployment and easy maintenance, making the system highly scalable.
Predictive insights allow proactive ventilation control, improving air quality, health, and cognitive performance in indoor environments.
Real-time monitoring and forecasting optimize space utilization and enhance safety in educational institutions, supporting data-driven decision making.
The remainder of this document is organized as follows: Section 2 presents the materials and methods used for the monitoring system and the configuration of the LSTM network for CO2 concentration forecasting. Section 3 describes the results obtained in different configurations and the comparison with different methods reported in the literature. Finally, Section 4 presents the conclusions.
2. Materials and Methods
The CO2 monitoring system, intended to help prevent COVID-19 infection, uses an NDIR sensor and an ESP32-Core2 microcontroller board to monitor CO2 levels in indoor environments, as seen in Figure 1. The NDIR sensor is an optical sensor that detects the concentration of CO2 in the air by measuring light absorption at a specific wavelength.
The ESP32-Core2 microcontroller board reads data from the NDIR sensor and sends them to the ThingSpeak cloud over WiFi. ThingSpeak is an IoT platform that provides data storage, analysis, and visualization tools. The collected CO2 data are then analyzed in MATLAB, a popular data analysis and modeling tool. Using these data, a CO2 level prediction algorithm can be developed that estimates the CO2 level in the near future based on current measurements.
The results of the LSTM-based CO2 prediction algorithm can be displayed to users on their mobile devices through an application. The application can show real-time monitoring graphs and alert users if the CO2 level exceeds a certain threshold, indicating that the indoor environment may be poorly ventilated and potentially hazardous to human health.
This system has the potential to prevent the spread of airborne diseases, such as COVID-19, by providing a tool for monitoring indoor air quality and identifying poorly ventilated environments that could increase the risk of pathogen transmission. The following sections describe the method in detail, including device connections for collecting sensor data, the neural network used, the training process, and the tested configurations.
2.1. Measurement of CO2 Concentration
To measure the concentration of CO2 in the air, an NDIR sensor is used, which is quite precise and easy to calibrate. This sensor consists of a tube, an optical filter, an emitter, and an infrared (IR) detector, as shown in Figure 2. The emitter produces IR light waves that travel through the air sample tube. The IR waves move toward the optical filter in front of the detector. The detector measures the amount of IR light that passes through the filter.
The emitter's radiation band coincides with the absorption band of CO2, located around 4.26 μm. This absorption spectrum is unique, so it serves as a signature or fingerprint to identify the CO2 molecule.
As IR light travels through the tube, the CO2 gas molecules absorb the characteristic 4.26 μm band while letting other wavelengths pass. At the detector end, the remaining light is incident on an optical filter that absorbs all wavelengths of light except the wavelength absorbed by the CO2 molecules in the tube containing the air sample.
Finally, the detector receives the remaining amount of IR light not absorbed by the CO2 molecules or the optical filter. To calculate the CO2 concentration, the difference between the amount of IR light radiated by the emitter and the amount of IR light received by the detector is measured. Since this difference results from light absorption by the CO2 molecules in the tube, it is directly proportional to the number of CO2 molecules in the air sample.
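The absorption principle described above is usually formalized with the Beer–Lambert law. The symbols below ($I_0$, $I$, $\varepsilon$, $c$, $\ell$) are introduced here for illustration and are not taken from the original text or from the sensor documentation:

```latex
I = I_0 \, e^{-\varepsilon c \ell}
\qquad\Longrightarrow\qquad
c = \frac{1}{\varepsilon \ell} \,\ln\!\left(\frac{I_0}{I}\right)
```

Here $I_0$ is the IR intensity radiated by the emitter in the 4.26 μm band, $I$ is the intensity reaching the detector, $\varepsilon$ is the absorption coefficient of CO2 in that band, $\ell$ is the optical path length of the tube, and $c$ is the CO2 concentration. For weak absorption, $I_0 - I \approx I_0\,\varepsilon c \ell$, which gives the approximately proportional relationship between the intensity difference and the number of CO2 molecules.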
In the monitoring station where the sensor is embedded, several aspects are taken into account so that the measurements are as accurate as possible. One of them is the warm-up time, which lasts approximately 60 s; during this time, the data are unreliable and are not recorded. In addition, the sensor must be calibrated using a process that references the lowest concentration recorded outdoors over a period of time. To verify the proper operation and accuracy of each NDIR sensor, its readings are compared to those of a precision CO2 meter to validate the calibration.
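The warm-up and calibration steps can be sketched as follows. This is a minimal illustration and not the stations' firmware: it assumes a simple offset correction against an outdoor reference of 400 ppm, and all function names and sample values are hypothetical.

```python
WARMUP_SECONDS = 60            # warm-up time reported for the sensor
OUTDOOR_REFERENCE_PPM = 400.0  # assumed outdoor reference level


def drop_warmup(samples, warmup_s=WARMUP_SECONDS):
    """Discard (seconds_since_power_on, ppm) samples taken during warm-up."""
    return [(t, ppm) for t, ppm in samples if t >= warmup_s]


def baseline_offset(outdoor_readings_ppm, reference_ppm=OUTDOOR_REFERENCE_PPM):
    """Offset that maps the lowest outdoor reading onto the reference level."""
    return reference_ppm - min(outdoor_readings_ppm)


def apply_offset(ppm, offset):
    """Apply the calibration offset to a raw reading."""
    return ppm + offset


if __name__ == "__main__":
    raw = [(0, 5000), (30, 1200), (60, 450), (75, 455)]
    print(drop_warmup(raw))                      # only samples from t >= 60 s remain
    offset = baseline_offset([431, 425, 440])    # hypothetical outdoor readings
    print(apply_offset(825, offset))
```

The corrected readings would then be compared against the precision CO2 meter, as described above, before the station is deployed.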
The CO2 concentrations recorded by the sensor vary depending on where it is placed within the monitored space. For a reliable reading that accounts for the influence of ventilation, the sensors are placed at least 120 cm from the ground, 60 cm from air flows (windows), and 2 m from the people inside the room, as suggested by previous studies [17].
2.2. Monitoring Stations
The monitoring stations capture the measurements from the sensors and transmit them over an IoT network in which the measurements of all the monitored classrooms and offices converge. Each station is made up of the following elements:
An ESP32 microcontroller module with WiFi, Bluetooth, and LoRa wireless connections; it also allows wired connections using I2C, UART, and SPI protocols.
An NDIR sensor model MH-Z19D with UART-type serial interface; its detection range goes from 400 ppm to 10,000 ppm, with a maximum error of 50 ppm.
Connection to an IoT network by WiFi or LoRa (long-range radio frequency), depending on the wireless connectivity available in each station.
The microcontroller captures the values the NDIR sensor detects and sends the data wirelessly for recording and processing in the cloud. The electrical diagram of the M5 Tough device is shown in Figure 3, which specifies the connections between the microcontroller, the sensor, and the visual/sound indicators that act as an alarm when the concentration of CO2 exceeds safe values. The device specifications are presented in Table 2.
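Reading the sensor over UART amounts to sending a fixed command and decoding a 9-byte reply. The sketch below follows the frame layout commonly documented for the MH-Z19 family; it is assumed, not confirmed by this study, that the MH-Z19D uses the same layout, so the datasheet should be checked. Only the frame handling is shown, and the serial port access is left out.

```python
# "Read CO2 concentration" command as commonly documented for MH-Z19 sensors.
READ_CO2_COMMAND = bytes([0xFF, 0x01, 0x86, 0x00, 0x00, 0x00, 0x00, 0x00, 0x79])


def checksum(frame):
    """Checksum over bytes 1..7: invert the 8-bit sum and add one."""
    return (0xFF - (sum(frame[1:8]) & 0xFF) + 1) & 0xFF


def parse_co2_response(frame):
    """Return the CO2 concentration in ppm from a 9-byte response frame."""
    if len(frame) != 9:
        raise ValueError("response must be exactly 9 bytes")
    if frame[0] != 0xFF or frame[1] != 0x86:
        raise ValueError("unexpected start byte or command echo")
    if checksum(frame) != frame[8]:
        raise ValueError("checksum mismatch")
    return frame[2] * 256 + frame[3]  # high byte, low byte


if __name__ == "__main__":
    # Hypothetical reply encoding 1000 ppm (0x03E8).
    reply = bytes([0xFF, 0x86, 0x03, 0xE8, 0x00, 0x00, 0x00, 0x00, 0x8F])
    print(parse_co2_response(reply))
```

Rejecting frames with a bad checksum keeps corrupted serial reads from reaching the cloud as spurious concentration values.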
The monitoring stations form an IoT network managed by ThingSpeak, a cloud computing service operated by MathWorks. The data are stored in the cloud and can be updated and viewed through the ThingSpeak API on computers or mobile devices connected to the internet. The final prototype is presented in Figure 4, which shows the three CO2 concentration levels in ppm together with their visual indicators (green, yellow, and red). The design includes a 2 cm hole that allows air to flow freely through the enclosure, facilitating readings from the CO2 sensor. The marked levels are defined in Table 3.
The screenshot in Figure 5 shows the remote interface of one of the stations. The sensor sampling period is one second, although the cloud data are updated every 15 s due to limitations in bandwidth and available storage space. The communication latency is approximately one second, which is acceptable given the update frequency and the slow dynamics of the monitored area, where the concentration of CO2 does not change suddenly.
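The relationship between the 1 s sampling period and the 15 s cloud update can be sketched as follows. The text does not state whether the stations average the samples or send only the latest one, so the averaging shown here is an assumption; the API key is a placeholder, and the request uses the ThingSpeak update endpoint with a `field1` parameter.

```python
from statistics import mean
from urllib.parse import urlencode

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"
UPDATE_PERIOD_S = 15  # cloud update period reported in the text


def aggregate_for_upload(samples_ppm, period=UPDATE_PERIOD_S):
    """Average consecutive blocks of 1 s samples; an incomplete last block is dropped."""
    return [
        mean(samples_ppm[i:i + period])
        for i in range(0, len(samples_ppm) - period + 1, period)
    ]


def build_update_url(write_api_key, co2_ppm, field=1):
    """Build the GET request URL that writes one CO2 value to a channel field."""
    query = urlencode({"api_key": write_api_key, f"field{field}": f"{co2_ppm:.0f}"})
    return f"{THINGSPEAK_UPDATE_URL}?{query}"


if __name__ == "__main__":
    one_second_samples = [800 + (i % 5) for i in range(30)]  # hypothetical readings
    for value in aggregate_for_upload(one_second_samples):
        print(build_update_url("YOUR_WRITE_API_KEY", value))
```

Averaging before upload smooths sensor noise at no extra bandwidth cost, which suits the slow CO2 dynamics described above.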
2.3. Prognosis of CO2 Concentration with LSTM Network
Autoregressive networks and Long Short-Term Memory (LSTM) networks are deep learning architectures frequently used in time-series forecasting. From the CO2 concentrations recorded by the stations, a time series is created for each monitored classroom and used to estimate future CO2 concentrations based on the most recent measurements. LSTM networks process time series using loops in their network diagram, which allow them to remember or forget previous states and use this information to decide the next one. The LSTM comprises a cell state that carries the data to be processed through the network. One gate decides what information is discarded and another updates the memory, as shown in Figure 6 and as expressed in Equations (1)–(6), where $x_t$ are the input data and $f_t$, $i_t$, and $o_t$ are the outputs of each gate, enabled by the activation function, $\sigma$ or tanh. The subscripts f, i, and o indicate the corresponding gate: forget, input, and output. In addition, there are short- and long-term memories, $h_t$ and $c_t$.
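For reference, the gate operations of a standard LSTM cell are commonly written as shown below. This is the textbook formulation, given as a sketch of what the gate equations typically look like. The weight matrices $W$ and $U$, the biases $b$, the candidate state $\tilde{c}_t$, and the element-wise product $\odot$ are notation assumed here and may differ from the authors' Equations (1)–(6).

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

In this form, the forget gate $f_t$ scales the previous long-term memory $c_{t-1}$, the input gate $i_t$ controls how much of the candidate $\tilde{c}_t$ is written to it, and the output gate $o_t$ determines the short-term memory $h_t$ passed to the next time step.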
Conventional recurrent networks are used to model short-term dependencies (i.e., close relationships in time series), whereas LSTMs are useful for modeling long-term dependencies. The LSTM architecture is a block comprising three neural networks, better known as gates, which weigh the data in order to remember, discard, and update information as the application requires. The long-term memory provided by these gates enables the network to make longer-range predictions. Several LSTM block configurations were proposed to analyze the network's performance with the dataset described above. The configurations were trained with 70% of the available data, and the remaining 30% was used for the forecast tests. The Adam learning algorithm was used with an initial learning rate of 0.005, and the number of training iterations was varied to assess its impact on the performance of the network.
The number of hidden units and the number of training epochs were varied during the tuning of the neural architecture. The first configuration had 128 hidden units and was trained for 200 epochs. The second had 30 hidden units and was trained for 1000 epochs. The third had 208 hidden units and was trained for 1000 epochs.
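The data split and the autoregressive use of the network can be sketched as follows. This is an illustration under stated assumptions and not the authors' MATLAB code: the split is taken to be chronological, and `drift_predictor` is a hypothetical stand-in for a trained LSTM so that the example runs without a deep learning library.

```python
def chronological_split(series, train_percent=70):
    """Split a time series in order: first part for training, the rest for testing."""
    cut = len(series) * train_percent // 100
    return series[:cut], series[cut:]


def closed_loop_forecast(predict_next, history, horizon, window):
    """Forecast `horizon` steps by feeding each prediction back as the next input."""
    buffer = list(history[-window:])
    forecast = []
    for _ in range(horizon):
        next_value = predict_next(buffer)
        forecast.append(next_value)
        buffer = buffer[1:] + [next_value]  # slide the window forward
    return forecast


def drift_predictor(window_values):
    """Hypothetical one-step model: continue the most recent slope."""
    return window_values[-1] + (window_values[-1] - window_values[-2])


if __name__ == "__main__":
    series = [400 + 10 * k for k in range(10)]  # hypothetical CO2 readings in ppm
    train, test = chronological_split(series)
    print(len(train), len(test))
    print(closed_loop_forecast(drift_predictor, train, horizon=len(test), window=2))
```

Because every prediction becomes an input for the next step, errors can accumulate over long horizons. If the model ran at the 15 s cloud update period, which the text does not state, an 8 h forecast would correspond to 1920 such steps.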
3. Results and Discussion
The previously mentioned configurations of the LSTM block (128, 30, and 208 hidden units) registered the best performances, producing accurate forecasts with a competitive RMSE that was lower than that obtained with the NAR architecture. The first configuration, with 128 hidden units, took 200 epochs to train and obtained an RMSE of 57.4396 ppm, an MAD of 27.67 ppm, and an MAPE of 0.026887%. Figure 7 shows the network output, Figure 8 contrasts the measured and forecast data, and Figure 9 presents the error between them.
The second configuration, with 30 hidden units, took 1000 epochs to train, obtaining an RMSE of 68.17 ppm, an MAD of 31.33 ppm, and an MAPE of 0.02871%. Figure 10 shows the network output, Figure 11 contrasts the measured and forecast data, and Figure 12 presents the error between them.
The third configuration, with 208 hidden units, took 1000 epochs to train, obtaining an RMSE of 69.86 ppm, an MAD of 29.1748 ppm, and an MAPE of 0.017992%. Figure 13 shows the network performance, Figure 14 contrasts the measured and forecast data, and Figure 15 presents the error between them.
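The three indices reported above can be computed as in the following sketch. The definitions are the usual ones and are assumed here, since the text does not spell them out: MAD is taken as the mean absolute deviation between measured and forecast values, and MAPE is returned as a fraction that would be multiplied by 100 to express it as a percentage. The sample values are hypothetical.

```python
from math import sqrt


def rmse(actual, forecast):
    """Root mean square error, in the units of the data (ppm)."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mad(actual, forecast):
    """Mean absolute deviation between measured and forecast values (ppm)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


if __name__ == "__main__":
    measured = [400.0, 500.0, 600.0]  # hypothetical measured CO2 in ppm
    predicted = [410.0, 490.0, 630.0]  # hypothetical forecast in ppm
    print(f"RMSE = {rmse(measured, predicted):.2f} ppm")
    print(f"MAD  = {mad(measured, predicted):.2f} ppm")
    print(f"MAPE = {100 * mape(measured, predicted):.2f} %")
```

RMSE penalizes large deviations more heavily than MAD, so a configuration can rank differently under the two indices, as happens with the second and third configurations above.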
Validation of the Proposed LSTM
Different architecture configurations were analyzed to validate the LSTM; they are presented in Table 4, Table 5, Table 6 and Table 7, which highlight the best performances. The tables are separated according to the percentages of data selected for training and testing, in order to analyze the impact of the split on each configuration, and they report the statistical indices used to measure performance. From this analysis, it was concluded that the best results were obtained with 70% of the data for training and 30% for testing, which maintained the lowest RMSE on average.
The results obtained are compared with previous research by other authors, who have analyzed the concentration of CO2 using various machine learning and deep learning techniques. These approaches are detailed in Table 8. A nonlinear autoregressive network (NAR) was tested with a configuration similar to that of the LSTM in this work; the NAR obtained lower performance than the LSTM. The results were also compared with the SVM, the linear regressive network (LR), and the artificial neural network (ANN) reported in [19], although the topology of the ANN is not presented there. As can be seen, the proposed LSTM configuration obtained the lowest RMSE for predicting CO2.
Table 9 expands this comparison by summarizing the key differences between the study by Liu et al. [19] and the present research. Although Liu et al. achieved a lower RMSE (16.77 ppm), their model is limited to a 1 min prediction window, whereas the proposed approach extends the forecast to 8 h, making it more suitable for long-term air quality management. In addition, the present methodology provides a detailed description of the IoT implementation, specifying the use of NDIR sensors and ESP32 microcontrollers, whereas Liu et al. do not specify their hardware components. Finally, Liu et al. validated their study in a residential environment, while the present research was tested in various indoor spaces, such as classrooms, offices, and laboratories, demonstrating broader applicability.