The measurements were made along two lines of the city bus transport in the city of Yambol, Bulgaria. They were carried out along part of the routes of lines #5 and #25. These bus lines have been chosen as representative samples for the entire city bus transport system. By selecting routes that cover diverse areas, such as city centers, outskirts, and different types of neighborhoods, the collected data can provide a comprehensive understanding of pollution patterns across the city.
The public transport buses are refurbished Graf und Stift models (MAN Truck & Bus SE, Munich, Germany), delivered from Vienna to the transport operator serving the city lines. The buses are equipped with security systems, passenger comfort features, and an electronic information system. The vehicles are regarded as environmentally friendly, as they comply with the Euro 5 standard. They are powered by a six-cylinder internal combustion engine with a displacement of 12.8 L that uses liquefied petroleum gas (LPG) as its main fuel.
The length of the routes, the speed of the vehicle, and the time spent at the bus stops were determined from GPS data. The total length of both routes is 6 km. The total travel time is 120 min. The average speed of the buses is 35 km/h, and the maximum speed is 45 km/h. The total time spent at the stops is 60 min.
The measurements were taken in an urban environment under both heavy and light pedestrian and vehicle traffic.
The data from the automatic station are compared with those from the measurement system proposed in this work. The differences in the readings between the two measuring devices do not exceed 10%.
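The 10% agreement criterion can be illustrated with a minimal sketch. It assumes that the difference is expressed relative to the reference station reading, which the text does not state explicitly; the function names are hypothetical.

```python
def relative_difference_percent(sensor_value, reference_value):
    """Absolute deviation of the sensor reading from the reference, in percent."""
    if reference_value == 0:
        raise ValueError("reference value must be non-zero")
    return abs(sensor_value - reference_value) / abs(reference_value) * 100.0


def within_tolerance(sensor_values, reference_values, tolerance_percent=10.0):
    """True if every paired reading deviates by no more than the tolerance."""
    return all(
        relative_difference_percent(s, r) <= tolerance_percent
        for s, r in zip(sensor_values, reference_values)
    )
```

Paired readings are expected to be taken at the same time and place, so that both lists have the same length and order.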
3.1. Sensor Devices
Sensor for temperature and relative air humidity. A DHT22 digital sensor (Aosong (Guangzhou) Electronics Co., Ltd., Guangzhou, China) was selected. The sensor has an operating voltage of 3.3–5 V DC. It measures relative air humidity in the range 0–99% RH, with an accuracy of ±2% (at 25 °C) and a resolution of 0.1%. The temperature is measured in the range of −40–80 °C, with an accuracy of ±0.5 °C and a resolution of 0.1 °C. The refresh rate is 1 Hz (one reading every 1 s). The sensor uses the “One Wire” protocol. The “DHT.h” library is used to read the sensor.
Sensor for TVOC, eCO2, H2 and ethanol. A digital sensor SGP30 (Pimoroni Ltd., Sheffield, UK) was selected. It is presented in position b. The sensor uses an I2C digital communication interface, with address 0x58. It uses the libraries <Wire.h> and “Adafruit_SGP30.h”. The device can measure the concentration of four gases simultaneously. TVOC and eCO2 are preset by the manufacturer. H2 and ethanol are obtained as a decimal value from the ADC of the sensor. The sensor has automatic humidity and temperature compensation. In order to be able to use the data from the last two gases, a conversion is required, which is specified in the technical specification (Datasheet) of the sensor:
where
C, ppm is the concentration of the corresponding gas;
Sref is the ADC reference value at 0.5 ppm of the respective gas; and
Sout is the data from the ADC when measuring (dimensionless quantity).
For ethanol, Cref = 0.4 ppm, and for H2, Cref = 0.5 ppm. Reference values of Sref have been determined at 0.5 ppm of the respective gas. For H2, Sref = 14,055, and for ethanol, Sref = 19,831. The reference data is dimensionless because it represents an output value from the sensor’s ADC.
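The conversion of the raw H2 and ethanol signals can be sketched as follows. The sketch assumes the exponential relation C = Cref · exp((Sref − Sout)/512) given in the SGP30 documentation; the scaling constant 512 and the function name are not taken from the text above and should be checked against the datasheet revision in use. The Cref and Sref values are those quoted above.

```python
import math

# Reference values quoted in the text above.
C_REF_PPM = {"H2": 0.5, "ethanol": 0.4}
S_REF = {"H2": 14055, "ethanol": 19831}

# Assumption: scaling constant of the exponential relation (not stated in the text).
SCALE = 512.0


def raw_to_ppm(gas, s_out):
    """Convert a raw SGP30 signal to ppm: C = Cref * exp((Sref - Sout) / SCALE)."""
    return C_REF_PPM[gas] * math.exp((S_REF[gas] - s_out) / SCALE)
```

With this form, a raw reading equal to Sref returns exactly Cref, and a raw reading below Sref corresponds to a concentration above the reference.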
Accelerometer. An ADXL345 accelerometer (Analog Devices, Inc., Wilmington, NC, USA) was selected. The module is based on the ADXL345 integrated circuit, a three-axis digital accelerometer. The module is small, thin, and has low power consumption, allowing for measurement with a sufficiently high resolution (13-bit) up to ±16 g. The digital output data is formatted as 16-bit two's complement and is accessible via the SPI or I2C serial interfaces. The ADXL345 is suitable for portable device applications. Its high resolution (3.9 mg/LSB) allows for measurement of tilt changes of less than 1.0°. The module works with a supply voltage of 4–6 V DC. The DC current consumption is 23 µA in measurement mode. Library <Adafruit_ADXL345_U.h> is used to work with the module.
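As an illustration of the output format described above, the following minimal sketch converts one axis reading from two data bytes into acceleration in g. It assumes that the low byte is transmitted first and that the 3.9 mg/LSB scale factor applies to the configured mode; the function name is hypothetical.

```python
G_PER_LSB = 0.0039  # 3.9 mg/LSB, as stated above


def raw_axis_to_g(low_byte, high_byte):
    """Combine two data bytes (two's complement, low byte first) into acceleration in g."""
    raw = int.from_bytes(bytes([low_byte, high_byte]), byteorder="little", signed=True)
    return raw * G_PER_LSB
```

For example, a raw value of 256 counts corresponds to roughly 1 g, which is what an axis aligned with gravity would report at rest.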
Noise sensor. A Sound Sensor module (Waveshare Electronics, Shenzhen, China) was used. The module is built around an electret microphone with a frequency range of 50–20,000 Hz. The microphone signal is amplified via an LM386 audio amplifier. The module has a gain factor of 200. The microphone has a sensitivity of 52 dB. The supply voltage of the sensor module is 3.3–5.3 V DC. The loudness in dB was determined using the program code proposed in [24]. The noise level in dB is calculated after reading the analog output of the sensor in a window of 50 ms.
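A windowed level estimate of this kind can be sketched as below. This is not the code from [24]; it is a generic illustration that takes the peak-to-peak amplitude of the ADC samples collected in one 50 ms window and expresses it in dB relative to a hypothetical reference amplitude of one ADC count. The result is a relative level and would still need calibration against a sound level meter.

```python
import math

# Hypothetical reference: one ADC count of peak-to-peak amplitude.
REFERENCE_AMPLITUDE = 1


def window_level_db(samples):
    """Relative level in dB from the peak-to-peak amplitude of one sampling window."""
    peak_to_peak = max(samples) - min(samples)
    # Clamp to the reference so that a silent window yields 0 dB instead of -infinity.
    amplitude = max(peak_to_peak, REFERENCE_AMPLITUDE)
    return 20.0 * math.log10(amplitude / REFERENCE_AMPLITUDE)
```

The peak-to-peak measure is chosen here because it is insensitive to the DC offset of the amplifier output.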
Sensor for NOx, SOx and O3. An MQ-135 sensor (Waveshare Electronics, Shenzhen, China) was used. Through this sensor, the total amount of sulfur and nitrogen oxides, as well as ozone, was determined. The data from the analog output of the sensor is fed to the analog input of the single-board microcomputer. The values obtained from the ADC are converted into voltage values in the range 0–5 V. From the measured voltage obtained at the analog output of the sensor, the resistance of the sensor can be calculated according to the following dependencies:
where
ADC is the output value of the analog-to-digital converter (0–1023);
Rs is the resistance of the sensor;
U is the voltage from the analog output of the sensor; and
RL is the load resistance of the sensor.
According to the technical documentation of the sensor, the load resistance of the sensor is RL = 20.1 kΩ. The resistance of the sensor in clean air is R0 = 10 kΩ.
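The chain from ADC reading to sensor resistance can be sketched as follows. The sketch assumes the usual voltage-divider relations for MQ-series modules, U = ADC · 5/1023 and Rs = RL · (5 − U)/U, with a 5 V supply; these forms and the function names are assumptions, while RL and R0 are the values given above.

```python
V_SUPPLY = 5.0      # V, upper end of the 0-5 V range given above
ADC_MAX = 1023      # full-scale count of the ADC
R_LOAD_KOHM = 20.1  # load resistance RL from the text
R0_KOHM = 10.0      # clean-air resistance R0 from the text


def adc_to_voltage(adc):
    """Voltage U at the analog output for a given ADC count."""
    return adc * V_SUPPLY / ADC_MAX


def sensor_resistance_kohm(adc):
    """Sensor resistance Rs from the voltage divider formed by Rs and RL."""
    u = adc_to_voltage(adc)
    if u <= 0:
        raise ValueError("ADC reading must be positive to compute Rs")
    return R_LOAD_KOHM * (V_SUPPLY - u) / u


def resistance_ratio(adc):
    """Ratio Rs/R0 used as input to the concentration model."""
    return sensor_resistance_kohm(adc) / R0_KOHM
```

For instance, an ADC reading of 341 corresponds to one third of the supply voltage, which gives Rs = 2 · RL.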
A humidity and temperature correction factor has been determined for the MQ-135 sensor. Temperature and relative humidity were measured with a DHT22 sensor. According to the technical documentation of the sensor, the correction equation has the following form:
where
cf is a correction factor;
T, °C is ambient temperature; and
H, %RH is relative air humidity.
To determine the concentration of the relevant gas, a model commonly used with sensors of the MQ-xxx series is applied. According to the technical documentation of the sensors, these models have the following general form:
where
Cx, ppm is the concentration of the corresponding measured gas, and variables a and b are the model coefficients. The corrected values of (Rs/R0) were used in the calculations. The models for the relevant gas are determined via the data from the MQ-135 sensor technical specification.
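A minimal sketch of such a model is given below, assuming the power-law form Cx = a · (Rs/R0)^b that is typical for MQ-series sensitivity curves. The way the correction factor is applied (dividing the ratio by cf) is an assumption, and the coefficients in the usage note are illustrative placeholders, not values from the MQ-135 specification.

```python
import math


def fit_power_model(point_1, point_2):
    """Fit C = a * x**b through two (x, C) points read from a sensitivity curve."""
    (x1, c1), (x2, c2) = point_1, point_2
    b = math.log(c2 / c1) / math.log(x2 / x1)
    a = c1 / x1 ** b
    return a, b


def concentration_ppm(ratio, a, b, cf=1.0):
    """Concentration from Rs/R0; the ratio is divided by cf before the model is applied."""
    corrected = ratio / cf
    return a * corrected ** b
```

With the placeholder points (1, 100 ppm) and (10, 1 ppm), the fit yields a = 100 and b = −2, so a corrected ratio of 2 maps to 25 ppm. On a log-log plot this model is a straight line, which is why two well-separated points from the curve are enough to determine a and b.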
The MQ-135 sensor relies on a specific sensing mechanism where its sensitive layer interacts with gases, causing changes in its electrical resistance. While the sensor itself does not inherently separate gases, it can be calibrated and configured to detect specific gases based on their unique response patterns. By carefully tuning the sensor’s parameters and employing advanced signal processing techniques, it is possible to discern different gases and their concentrations, enabling the sensor to effectively differentiate between gases like NOx, SOx, and O3.
CO sensor. An MQ-9 sensor (Waveshare Electronics, Shenzhen, China) was used. Through this sensor, the carbon monoxide concentration was determined. The data from the analog output of the sensor is fed to the analog input of the single-board microcomputer. The values received from the ADC are converted into voltage values in the range 0–5 V. The load resistance and the clean-air resistance have the same values as for the MQ-135, since both sensors come from the same manufacturer, and the sensor resistance is calculated in the same way. A humidity and temperature correction factor has been determined for the MQ-9 sensor. Temperature and relative humidity were measured with a DHT22 sensor.
According to the technical documentation of the sensor, the correction equation has the following form:
where
cf is a correction factor;
T, °C is ambient temperature; and
H, %RH is relative air humidity.
To determine the concentration of the relevant gas, a second-degree model, commonly used with sensors of the MQ-xxx series, is applied. For an MQ-9 sensor, this model has the form:
where
C, ppm is the concentration of the corresponding measured gas (CO). The corrected values of (Rs/R0) were used in the calculations. The CO model is determined from the data from the MQ-9 sensor technical specification.
The gas sensing process with the MQ series sensors occurs as follows:
The sensor’s heater raises the temperature of the sensing material to an optimal level, typically around 200–400 °C, depending on the specific MQ sensor model.
In the presence of the target gas, the gas molecules adsorb onto the surface of the sensing material, leading to a change in its electrical conductivity.
As the conductivity of the sensing material changes, the resistance between the sensor electrode and the heater electrode also varies.
The resistance change is then measured and converted into an electrical signal, which is proportional to the concentration of the target gas.
The output signal can be further processed via a microcontroller or other electronic components for data interpretation or display.
Particulate matter (PM) sensor. An SDS011 sensor (Shandong NOVA Technology Co., Ltd., Jinan, China) was used, which has a digital output. The SDS011 sensor device has a supply voltage of 5 V DC, and its current consumption is 70 mA ± 10 mA. Its UART interface, at TTL levels, is used. The refresh rate is 1 Hz (reports are every 1 s). The device measures PM2.5 and PM10 separately, with a range of 0–999.9 μg/m3. Library <SDS011-select-serial.h> is used for its work.
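Reading the sensor over UART can be illustrated by decoding a single measurement frame. The sketch assumes the commonly documented 10-byte SDS011 frame layout (header 0xAA, command 0xC0, PM2.5 low and high byte, PM10 low and high byte, two ID bytes, checksum, tail 0xAB), with values reported in tenths of µg/m3. The layout is not described in the text above, and the function name is hypothetical.

```python
def parse_sds011_frame(frame):
    """Return (pm25, pm10) in ug/m3 from one 10-byte measurement frame."""
    if len(frame) != 10 or frame[0] != 0xAA or frame[1] != 0xC0 or frame[9] != 0xAB:
        raise ValueError("not a measurement frame")
    # The checksum is the low byte of the sum of the six data bytes.
    if sum(frame[2:8]) % 256 != frame[8]:
        raise ValueError("checksum mismatch")
    pm25 = (frame[3] * 256 + frame[2]) / 10.0
    pm10 = (frame[5] * 256 + frame[4]) / 10.0
    return pm25, pm10
```

Frames that fail the header, tail, or checksum test are rejected rather than decoded, so that corrupted serial data does not enter the measurement record.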
By conducting a thorough and rigorous validation of the sensor device data against the reference station measurements, the accuracy of the sensor data was verified, and any discrepancies or issues can be identified and addressed promptly. Validating data accuracy is crucial for using sensor devices effectively in environmental monitoring and decision-making processes.
3.8. Assessment of Classification Accuracy
In order to correctly classify the unknown input data, i.e., to evaluate the performance of a classifier on the class models created on the basis of the training samples, it is necessary to apply different approaches and quantitative evaluations [30].
For example, the input data processed via the classifier can be assigned to the groups: correctly (Positive P) and incorrectly (Negative N) classified.
Table 1 shows the representation of the groups of class labels when classified into two classes.
The classification accuracy is assessed on descriptions of the objects that differ in the type and number of features: the classifier is trained for each individual description, and the test sample is then classified. Finally, the proportion of incorrectly recognized objects relative to their total number is assessed. From here, the basic, actual, and total classification errors for m classes were calculated:
The basic error indicates what fraction of the data from class i is misclassified into the other classes, where FN is the number of the data from class i misassigned to other classes, and TP is the number of correctly classified data from class i.
The actual error indicates the relative proportion of data from other classes incorrectly assigned by the classifier to a given class i, where FP is the number of the data from other classes associated with class i.
The total error shows the misclassified data relative to all the data in the sample.
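The three errors can be computed from a confusion matrix, as in the sketch below. It assumes the forms FN/(TP + FN) for the basic error, FP/(TP + FP) for the actual error, and the sum of all FN divided by the sample size for the total error; the denominators are inferred from the verbal definitions above, and the function name is hypothetical.

```python
def classification_errors(confusion):
    """Basic, actual, and total error from a confusion matrix.

    confusion[i][j] is the number of samples of true class i assigned to class j.
    """
    m = len(confusion)
    total = sum(sum(row) for row in confusion)
    basic, actual = [], []
    misclassified = 0
    for i in range(m):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                       # class i assigned elsewhere
        fp = sum(confusion[k][i] for k in range(m)) - tp  # other classes assigned to i
        basic.append(fn / (tp + fn) if tp + fn else 0.0)
        actual.append(fp / (tp + fp) if tp + fp else 0.0)
        misclassified += fn
    return basic, actual, misclassified / total
```

For the two-class matrix [[8, 2], [1, 9]], the basic errors are 0.2 and 0.1, the actual errors are 1/9 and 2/11, and the total error is 3/20 = 0.15.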