1. Introduction
Aerosols are airborne particulates composed of both solid and liquid particles [1]. These aerosols, commonly known as Particulate Matter (PM), are linked to a multitude of issues pertaining to global climate effects and human health [2,3,4,5]. The concentration of particulate matter in the air is generally measured using two key parameters: PM2.5 and PM10. PM2.5 refers to the total mass of all airborne particles with an aerodynamic diameter of up to 2.5 microns, known as fine particulate matter, while PM10 encompasses all airborne particles with an aerodynamic diameter of up to 10 microns, known as coarse particulate matter.
The physiological effects of exposure to PM can be profound and varied, depending on the size and composition of the particles. Fine particulate matter (PM2.5) can penetrate deep into the lungs [6] and enter the bloodstream [7], causing short-term effects such as respiratory irritation [8], exacerbation of asthma [9], and acute cardiovascular events [10]. Long-term exposure to PM has been associated with chronic health issues, including chronic obstructive pulmonary disease (COPD) [11], lung cancer [12], and cardiovascular diseases [13], leading to significant morbidity and mortality [14,15]. Moreover, vulnerable populations (such as children, the elderly, and individuals with pre-existing health conditions) face heightened risks [16,17]. Children are particularly susceptible due to their developing respiratory systems, while the elderly often have compromised health that makes them more vulnerable to the deleterious effects of air pollution [18,19,20].
In addition to direct health impacts, PM exposure carries substantial socio-economic effects [21]. The economic burden associated with PM-related health issues can be significant, encompassing healthcare costs for treatment and management of chronic diseases, lost productivity due to illness, and diminished quality of life for affected individuals and families [22]. This burden disproportionately affects low-income communities [23], where access to healthcare may be limited. Economic disparities can hinder these populations’ ability to mitigate the impacts of PM exposure, making it essential to develop targeted public health interventions that address these inequalities.
Indoor PM, especially fine particles (PM2.5), has been linked to various adverse health effects, including respiratory and cardiovascular diseases [24]. Long-term exposure can worsen asthma, reduce lung function, and increase the risk of heart disease and lung cancer [25]. Major sources of indoor PM include cooking, heating, and tobacco smoke [26].
Geographic disparities in PM exposure further complicate the health landscape. Urban areas, particularly those with high traffic and industrial activities, often experience higher concentrations of PM, resulting in worse health outcomes compared to rural areas [27,28]. Studies have shown that low-income neighborhoods in urban centers frequently encounter greater PM exposure due to environmental injustices, where socio-economic factors influence both the location of pollution sources and access to resources for health mitigation [29]. For instance, research has highlighted that communities situated near highways or industrial zones are more likely to suffer from higher rates of respiratory diseases and other health complications linked to PM exposure. Understanding these geographic disparities highlights the importance of implementing effective, low-cost monitoring systems (such as LoRaWAN IoT, Wi-Fi-based, or LTE/GSM-based solutions) to identify pollution hot-spots and enable targeted interventions aimed at protecting vulnerable populations. The current article provides a comprehensive description of a LoRaWAN-based sensor illustrated in Figure 1.
Measurements that sufficiently capture the spatial and temporal variability of atmospheric components, like particulate matter, are essential for accurately characterizing these components at the neighborhood scale. In most cases, rather than the actual physical scales of variability, the available fiscal and computational resources determine the resolution of atmospheric observations and simulation. In medium and large cities, for example, only a small number of monitoring stations are typically present. The Dallas-Fort Worth metroplex, which has a population of seven million people, has only three regulatory grade airborne particulate sensors. The ease of sensor placement is influenced by a given location’s access to power and network resources.
The spatial scales are dependent on variables such as local weather, distribution of sources, and terrain, as evidenced by street level surveys conducted with data at less than a meter resolution [30]. The smart city paradigm is synonymous with having an increased number of devices that can provide adequate coverage within a given geographic area [31,32].
As low-cost sensing systems gain traction, environmental monitoring initiatives are being expanded to engage the public more effectively. Traditional air quality monitoring systems typically utilize a wider array of sensors, eschew low-power networks, and tend to be more expensive. Low-power wireless technologies become increasingly valuable in scenarios where energy sources are unavailable or present challenges. Recently, research has shifted towards low-power wide area network (LPWAN) technologies, including Long Range Wide Area Network (LoRaWAN), Narrowband IoT (NB-IoT), LTE-M, and Sigfox [33,34,35,36]. Although all these technologies are energy-efficient, we have chosen LoRaWAN for its scalability [37], robustness [38], and cost-effectiveness [39]. In fact, LoRaWAN has been adopted for many IoT-related applications since its inception in 2013.
2. Design
The main elements of the sensing configuration are depicted in Figure 2.
2.1. Main Controller Unit (MCU)
The MCU is where the main computing of the system takes place. It comprises three main components:
ATSAMD21G18 (32-bit ARM Cortex-M0+): This microcontroller serves as the end-device controller, managing communication between the LoRaWAN radio module (RHF76-052DM) and the application, handling sensor data acquisition, and processing the data for transmission to the LoRaWAN module.
RHF76-052DM: This LoRaWAN module acts as the communication interface between the microcontroller (ATSAMD21G18) and the LoRaWAN network.
L70: This GPS module delivers precise location data.
The integrated unit is marketed as the Seeeduino LoRaWAN with GPS by the vendor, Seeed Technology Co., Ltd. (Seeed Studio, Shenzhen, Guangdong, China), and is considered an Arduino development board that uses the LoRaWAN protocol.
The ATSAMD21G18 operates at a maximum clock frequency of 48 MHz and a supply voltage of 3.3 V [40].
The incorporated LoRaWAN module (RHF76-052DM) is dual-band, providing a maximum power of 20 dBm at 434–474 MHz and 14 dBm at 868–915 MHz. The module supports LoRaWAN Class A and Class C operation. These characteristics make it suitable for ultra-long-range communication (up to 7 km in rural areas) with low power consumption [41]. However, the range of LoRaWAN is typically limited in urban settings due to high-density environments and signal attenuation caused by obstacles such as buildings and walls. This limitation becomes even more significant in non-line-of-sight (NLOS) conditions, where signal propagation is further degraded by diffraction, reflection, and scattering. Studies have shown that while LoRaWAN can achieve ranges of several kilometers in rural or line-of-sight (LOS) conditions, urban NLOS scenarios often reduce this range significantly [42,43]. The design presented in this paper communicates in the 902.3–914.9 MHz band (USA LoRaWAN frequency spectrum) at 14 dBm using LoRaWAN Class A.
Additionally, the unit incorporates a low-power, small-sized GPS module. The L70 reports its output using NMEA 0183 (a combined electrical and data specification) [44], which provides the location information for the current application.
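To make the NMEA 0183 output concrete, the following is a minimal Python sketch of how a GGA position sentence could be validated and converted to decimal degrees. It is illustrative only and is not the firmware used on the node: the function names are hypothetical, and the sample sentence in the test is a widely circulated textbook example rather than data from this deployment.

```python
from functools import reduce


def nmea_checksum_ok(sentence: str) -> bool:
    """Return True if the two-digit hex checksum after '*' matches the XOR
    of all characters between '$' and '*'."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].partition("*")
    computed = reduce(lambda acc, ch: acc ^ ord(ch), body, 0)
    try:
        return computed == int(given[:2], 16)
    except ValueError:
        return False


def _to_degrees(value: str, hemisphere: str) -> float:
    """Convert NMEA (d)ddmm.mmmm plus a hemisphere letter to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - degrees * 100
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence: str):
    """Parse a GGA sentence into (latitude, longitude, altitude_m), or None
    if the checksum fails, the sentence is not GGA, or there is no fix."""
    if not nmea_checksum_ok(sentence):
        return None
    fields = sentence.strip()[1:].partition("*")[0].split(",")
    if len(fields) < 10 or not fields[0].endswith("GGA"):
        return None
    if fields[6] in ("", "0") or not fields[9]:
        return None  # fix quality 0 means no valid position
    latitude = _to_degrees(fields[2], fields[3])
    longitude = _to_degrees(fields[4], fields[5])
    return latitude, longitude, float(fields[9])
```

Rejecting sentences with a failed checksum or a fix quality of zero keeps invalid coordinates out of the location packet.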
2.2. Power Control Unit (PCU)
The PCU is marketed as the Sunny Buddy - Maximum Power Point Tracking (MPPT) Solar Charger by the vendor, SparkFun Electronics (SparkFun, Niwot, CO 80503, USA). It features the LT3652 circuit, a complete monolithic step-down battery charger that operates within an input voltage range of 4.95 V to 32 V. The PCU is connected both to the solar panel and to the battery, and it regulates the power to the MCU while keeping the battery charged throughout the day. An input voltage regulation loop in the LT3652 reduces the charge current when the input voltage falls below a programmed level, which is determined by a resistor divider that we have set. When the charger is powered by a solar panel, this loop keeps the panel close to its peak output power. Charging is terminated when the charge current falls below one tenth of the programmed maximum charge current. Once charging is terminated, the LT3652 enters a low-current (85 μA) standby mode. If the battery voltage falls 2.5% below the programmed float voltage, an auto-recharge feature initiates a new charging cycle [45].
2.3. INA219s—Power Sensors
The INA219 is a current shunt and power monitor with an I2C- or SMBus-compatible interface. It reports shunt voltage, bus voltage (0 to 26 V), current, and power by measuring the voltage drop across a shunt resistor. It operates from a 3–5.5 V supply over a temperature range of −40 to 125 °C and communicates with the MCU over I2C. Two separate INA219s are connected to the battery and the solar panel, and the current flowing from both sources is fed through to the PCU. On the I2C bus, each INA219 is assigned a distinct slave address so that the two devices can be told apart.
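The relationship between the shunt measurement and the reported current and power can be sketched as follows. The register resolutions (10 μV per bit for shunt voltage, 4 mV per bit for bus voltage) follow the INA219 datasheet. The 0.1 Ω shunt value and the function names are assumptions for illustration and are not taken from the node's firmware.

```python
SHUNT_LSB_V = 10e-6  # INA219 shunt voltage register resolution: 10 uV per bit
BUS_LSB_V = 4e-3     # INA219 bus voltage register resolution: 4 mV per bit
R_SHUNT_OHM = 0.1    # ASSUMPTION: 0.1 ohm shunt, common on breakout boards


def shunt_voltage_v(raw_signed: int) -> float:
    """Convert the signed shunt voltage register value to volts."""
    return raw_signed * SHUNT_LSB_V


def bus_voltage_v(raw_register: int) -> float:
    """Convert the bus voltage register to volts.
    The voltage field occupies bits 15..3, so the three status bits are dropped."""
    return (raw_register >> 3) * BUS_LSB_V


def current_a(v_shunt: float, r_shunt: float = R_SHUNT_OHM) -> float:
    """Ohm's law across the shunt resistor."""
    return v_shunt / r_shunt


def power_w(v_bus: float, current: float) -> float:
    """Power delivered on the monitored bus."""
    return v_bus * current
```

For example, a shunt reading of 20 mV across the assumed 0.1 Ω resistor corresponds to 0.2 A, which at a 4.0 V bus amounts to 0.8 W.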
2.4. TPL5110—Low-Power Timer
Designed for battery-operated or duty-cycled applications, the TPL5110 nano timer is a low-power timer with an integrated MOSFET driver. Drawing only 35 nA, the TPL5110 gates the power supply line so as to drastically reduce the overall system standby current during sleep time. Because of its low consumption, it can also be used with smaller batteries, which makes it well suited for energy harvesting and wireless sensor applications, such as the one being implemented [46].
The TPL5110 shuts down the MCU after a predetermined period of time and reactivates it after another predetermined period. This is particularly helpful if the system enters a sleep cycle when manual intervention is not possible in the field. However, the main purpose of the TPL5110 is to preserve energy efficiently.
The TPL5110 serves as a bridge between the PCU and the MCU. In the design architecture of the system, embedded firmware runs inside the MCU during its life cycle. The TPL5110 allows us to choose the period after which the system resets and reruns the firmware. This period can be altered with a potentiometer; in the current configuration, we used a fixed external resistor instead. The external resistor is set to limit the life cycle to a maximum of 15 min. However, in most cases, the timer will cut off power to the MCU much sooner, depending on the battery’s stored energy and the solar panel’s real-time output. This is accomplished through a signal sent to the TPL5110 from the MCU after the power metrics are measured via the INA219s.
2.5. External Sensing Unit (ESU)
The ESU is housed within a solar radiation shield, which safeguards the sensing ensemble from climatic factors while ensuring accurate readings by promoting airflow around the sensing elements. The ESU incorporates two sensors: the IPS7100 for particulate matter measurements and the BME280 for climate-related measurements.
IPS7100: This sensor is an Optical Particle Counter (OPC), also known as an Intelligent Particle Sensor (IPS). It provides particle counts for sizes ranging from 0.1 to 10 μm. Although it cannot directly count particles ≤0.1 μm, it estimates particle counts for 0.1 μm diameters through extrapolation. Operating at a rate of two cycles per second, the sensor classifies particles into seven size bins within this range. Additionally, the IPS7100 provides conventional particle mass fractions used to quantify air pollution, including PM1, PM2.5, and PM10, as well as measurements for PM0.1, PM0.3, PM0.5, and PM5. The IPS7100 can be used with both UART and I2C hardware communication protocols. In the present application, the I2C protocol is used. With a low power consumption of 0.325 W during measurements and only 0.001365 W in sleep mode, this device is well-suited for solar and battery applications, such as the one described here. The sensor draws air using a fan and is calibrated to the U.S. EPA-approved FEM (Federal Equivalent Method) reference instrument, the GRIMM EDM180, through the Piera Automated Calibration (PASCAL) system under a constant temperature (25 °C) and relative humidity (50%) [47].
BME280: Temperature, pressure, and humidity are usually regarded as bias factors for optical particle counters. We added the Bosch BME280 sensor to account for such biases as well as to report climate data. It is designed specifically for mobile and wearable applications. Compact and power-efficient, it provides us with crucial climate data. The BME280 is a digital humidity, pressure, and temperature sensor that supports both SPI and I2C interfaces. It operates within a voltage range of 1.71–3.6 V [48]. In our configuration the device is set to run at 3.3 V using the I2C interface.
3. Operation
The sensing system is intended to obtain sensor measurements and transmit the data in a sustainable manner. The duration of a single firmware life cycle is 15 min, and its operation is depicted in Figure 3. During a single life cycle, the first step is to read the current battery and solar power voltages using the two INA219s and select a power mode. Figure 4 illustrates the different power modes. In the event that the sensor does not receive sufficient power, power mode 0 is set and the TPL5110 is used to set the system to sleep for the remaining duration of the current life cycle.
In the event that the sensor has sufficient power, a LoRaWAN connection is established between the sensor node and the central gateway. Such a life cycle involves setting one of the following power modes: 1, 2, 3, 4 or 5. In order to ensure maximum security, each node has a unique ID and an application key that is specific to that ID. If the specifics match, the connection is made. By default, the data rate is set to 2 (3125 bits per second). In some cases, the network may adjust the data rate automatically in response to transmission characteristics. Once a successful connection has been established, power mode information is transmitted as the initial transmission of the current life cycle. Sensors are then initialized and the system is set in motion to collect data. For a given life cycle, the power modes also govern the number of sampling cycles and frequency of sampling for each sensor—for instance, the primary sensor (IPS7100) will vary its sampling frequency between 1 and 2 min based on the power mode selected for a given life cycle.
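The quoted figure of 3125 bits per second can be reproduced from the standard LoRa bit-rate relation, R = SF × (BW / 2^SF) × CR. The sketch below assumes that data rate 2 corresponds to spreading factor 8 at 125 kHz bandwidth, as in the US915 regional parameters, with the usual 4/5 coding rate. The function name and the lookup table are illustrative.

```python
def lora_bit_rate(spreading_factor: int, bandwidth_hz: int, cr_denominator: int = 5) -> float:
    """Nominal LoRa bit rate in bits per second.

    Implements SF * (BW / 2**SF) * (4 / cr_denominator), arranged as a single
    division so that exact cases stay exact in floating point.
    """
    return spreading_factor * bandwidth_hz * 4 / (2 ** spreading_factor * cr_denominator)


# ASSUMPTION: US915 uplink data rates DR0..DR3 mapped to (SF, bandwidth in Hz).
US915_UPLINK_DR = {0: (10, 125_000), 1: (9, 125_000), 2: (8, 125_000), 3: (7, 125_000)}

if __name__ == "__main__":
    for dr, (sf, bw) in US915_UPLINK_DR.items():
        print(f"DR{dr}: SF{sf}/{bw // 1000} kHz -> {lora_bit_rate(sf, bw):.1f} bit/s")
```

Each step down in spreading factor roughly doubles the bit rate, which is why a network that adjusts the data rate can shorten the airtime of a packet when link conditions allow.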
After the 15 min of the current life cycle is exhausted, the TPL5110 resets the system so that it initiates a new life cycle for the system.
4. Power Management
Careful steps have been taken to ensure sufficient power to run the system throughout its field deployment. This is predominantly controlled via the power mode. Figure 4 shows how the power mode is set for each life cycle. In power mode 5, which applies when the battery is fully charged and sufficient sunlight is present, the firmware runs continuously without halting the system. In all other power modes, the sensing system is halted after a predefined number of sensing cycles so that measurements can continue throughout a given day in all weather conditions. For example, in power modes 1 and 2 the system shuts itself down after one sensing cycle, which takes only around 7 min; the system then sleeps for the remaining 8 (15 − 7) min of its single life cycle. Likewise, in power modes 3 and 4 the system shuts itself down after two sensing cycles, which take only around 10 min. Although power modes 1 and 2, as well as 3 and 4, are operationally similar in the current application, they are kept distinct to accommodate future developments of the system, which is intended to have more sensing units. A summary of the sensing cycles for each power mode is provided in
Table 1.
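The duty cycle implied by the description above can be summarized in a short sketch. The cycle counts and approximate active times come from the text. Treating power mode 0 as sleeping for the full 15 min (ignoring the brief power check) is a simplifying assumption, and the names are hypothetical.

```python
LIFE_CYCLE_MIN = 15

# Sensing cycles per life cycle for each power mode (None = runs continuously).
SENSING_CYCLES = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: None}

# Approximate active time in minutes, as stated in the text.
ACTIVE_MIN = {1: 7, 2: 7, 3: 10, 4: 10}


def sleep_minutes(power_mode: int) -> int:
    """Minutes the node spends asleep within one 15 min life cycle."""
    if power_mode not in SENSING_CYCLES:
        raise ValueError(f"unknown power mode: {power_mode}")
    if power_mode == 5:
        return 0  # continuous operation, no halt
    if power_mode == 0:
        return LIFE_CYCLE_MIN  # ASSUMPTION: the brief power check is neglected
    return LIFE_CYCLE_MIN - ACTIVE_MIN[power_mode]
```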
Power mode 3 is the most commonly allocated mode for the sensor. In this mode, a single 15-min life cycle consists of several sensor packets. Specifically, the sensor transmits two packets of IPS7100 data (56 bytes each), two packets of BME280 data (24 bytes each), one packet containing the pair of INA219 readings (32 bytes), and one packet of L70 data (55 bytes). This totals 247 bytes transmitted every 15 min. The resulting data rate of 16.47 bytes per minute refers only to the sensor data payload, excluding additional information such as headers and metadata within each packet. The packet size of each sensor payload is listed in
Table 2.
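The payload budget for power mode 3 can be checked with a few lines of arithmetic. The packet sizes and counts are those given in the text, while the variable names are illustrative.

```python
LIFE_CYCLE_MIN = 15

# (sensor, payload bytes per packet, packets per life cycle) in power mode 3
PACKETS_MODE_3 = [
    ("IPS7100", 56, 2),
    ("BME280", 24, 2),
    ("INA219 pair", 32, 1),
    ("L70", 55, 1),
]

total_bytes = sum(size * count for _, size, count in PACKETS_MODE_3)
bytes_per_minute = total_bytes / LIFE_CYCLE_MIN

if __name__ == "__main__":
    print(f"{total_bytes} bytes per life cycle, {bytes_per_minute:.2f} bytes/min")
```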
The power modes are designed to prevent the battery from being completely depleted. In conditions of very low power, when the battery drops below 10%, and while in power mode 0 (see Figure 4), the sensor predominantly stays in sleep mode for the majority of the 15-min cycle, only briefly waking to check the battery status. In the unlikely event that the sensor operates for an extended period without sunlight, causing the battery to reach 0%, the sensor will remain inactive until the solar panels recharge the battery sufficiently. Once the battery reaches a sufficient power level, the firmware will reactivate the system, enabling the sensor to resume operation.
5. Communication
This system is considered to be an end node within the LoRaWAN network. A central communication gateway, a component of a more comprehensive sensing suite referred to as a ‘Central Node’, receives the data collected by the system. Integrated into the Central Node is a Dragino LIG16, manufactured by Dragino Technology Co., Limited (Dragino, Shenzhen 518116, China), featuring an SX1302 LoRa concentrator. In this system, the nodes and the gateway are interconnected through a LoRaWAN network operating in Sub-Band 8 (913.5–914.9 MHz) for uplinks.
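The Sub-Band 8 frequency range follows from the US915 uplink channel plan, in which the 64 narrowband (125 kHz) channels start at 902.3 MHz and are spaced 200 kHz apart, with eight channels per sub-band. The sketch below reproduces the quoted range; the function names are illustrative.

```python
def us915_uplink_mhz(channel: int) -> float:
    """Centre frequency in MHz of a 125 kHz US915 uplink channel (0..63)."""
    if not 0 <= channel <= 63:
        raise ValueError("125 kHz uplink channels are numbered 0..63")
    # 902.3 MHz + 0.2 MHz per channel, computed in tenths of a MHz to stay exact
    return (9023 + 2 * channel) / 10


def sub_band_channels(sub_band: int) -> range:
    """Channel indices belonging to a sub-band numbered 1..8."""
    if not 1 <= sub_band <= 8:
        raise ValueError("sub-bands are numbered 1..8")
    return range(8 * (sub_band - 1), 8 * sub_band)


if __name__ == "__main__":
    freqs = [us915_uplink_mhz(ch) for ch in sub_band_channels(8)]
    print(f"Sub-Band 8: {freqs[0]} to {freqs[-1]} MHz")
```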
In this architecture, the gateway is placed at the center, while the end nodes are arranged around it in a star topology. To activate end devices, the network uses Over-the-Air Activation (OTAA), since this is the most secure and reliable method. The system data is transmitted in packets whose size and contents are set by the sensor type. Details regarding the specific data packets are provided in Table 2. LoRaWAN gateways gather data packets and forward them to a LoRaWAN cloud, where they are sorted and distributed using an MQTT pipeline. The MQTT pipeline serves both the SharedAirDFW public portal and an extensive analytical toolbox. Figure 5 provides a comprehensive overview of the complete network architecture.
6. Analytical Toolbox
Over the past three years, our sensor network has accumulated approximately 6 TB of data, with storage demands expected to grow as new sensors are deployed each month. Of this total, only a small fraction (10%) originates from the LoRaWAN nodes examined in this study.
Most other sensing systems in our network, unconstrained by LoRaWAN’s bandwidth limitations, generate substantially larger data volumes. This is primarily due to their integration of a greater number of sensors and higher-frequency data collection. Our infrastructure includes extensive outdoor monitoring systems that operate continuously, measuring key environmental parameters such as particulate matter, meteorological variables (temperature, humidity, pressure, and precipitation), and gas concentrations (e.g., carbon dioxide and ozone). Additionally, our network includes light sensing suites and acoustic monitoring systems, with the latter utilizing edge computing for the analysis of bird calls. Beyond stationary monitoring, similar sensing technologies are deployed on mobile platforms, including an electric vehicle, a fully autonomous drone, and a fully autonomous boat, to enable dynamic data collection. We also develop wearable devices designed to capture biometric data, including heart rate, heart rate variability, and skin temperature, for physiological monitoring applications.
The high degree of spatial variability in environmental conditions necessitates a robust computational infrastructure capable of managing thousands of individual time series. This infrastructure must support real-time analysis, pollution alerts, and, ultimately, particle tracking and forecasting capabilities. To that end, we have developed a containerized solution comprising three open-source Docker images: Node-RED, InfluxDB, and Grafana [49,50,51]. Node-RED allows us to easily ingest and process incoming sensor data in a simple graphical pipeline. This data is then injected by Node-RED into InfluxDB, a time-series database optimized for handling high-frequency data from thousands of sources. Finally, Grafana is connected to the database to enable dynamic, highly configurable, real-time visualization dashboards. These dashboards provide an interactive web interface to the data beyond what the current website enables.
6.1. Node-RED
Node-RED is a graphical programming tool that provides a collection of interfaces to hardware, data APIs, and more. Using a selection of processing nodes, the user generates a flow diagram representing the data ingestion and processing pipeline, which can be easily deployed with one click. A key advantage worth highlighting is that Node-RED’s low/no-code approach allows for an easily maintainable and scalable solution that can be adapted to new sensors and architectures. In Figure 6, the menu of nodes is visible on the left, while the deployment flow is shown in the center.
As discussed in the previous section, sensor data is published to a collection of MQTT topics in JSON format. The first step of the processing pipeline is to define a set of MQTT listeners that identify packets unique to each sensor type. When a message is identified, the packet is processed into a JSON dictionary and subsequently parsed using a JavaScript node that allows for custom processing code. In this step, the relevant entries are parsed from the JSON dictionary and organized to be injected into the database. For instance, when a data packet from the BME280 sensor is received, it is first structured as a JSON dictionary that includes the payload data, FPort information, timestamp, and other metadata. The process begins by extracting and isolating the relevant objects from the packet, focusing on the payload data and associated details. A null check is performed to ensure the availability of data. Next, the flow uses the FPort field to identify the type of sensor data contained in the packet. For the BME280, the FPort value is 21, as indicated in Table 2. Once identified, the payload is decoded using a JavaScript function to convert the raw byte data into meaningful measurements, such as temperature, pressure, or humidity. The decoded data is then assigned a unique node ID derived from the sensor’s MAC address. Finally, the processed data is stored in the InfluxDB database for further analysis and visualization. The full pipeline is illustrated in
Figure 6.
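The decoding steps described above (null check, FPort dispatch, byte decoding, record assembly) can be sketched in Python for readers who prefer to prototype outside Node-RED. This is not the JavaScript used in the deployed flow. The packet keys (`payload`, `fport`, `node_id`, `time`) are hypothetical, and the byte layout of the 24-byte BME280 payload is an assumption made purely for illustration (three little-endian 64-bit floats in the order temperature, pressure, humidity); the actual layout is defined by the node firmware.

```python
import struct

BME280_FPORT = 21  # FPort assigned to BME280 packets (Table 2)


def decode_bme280(payload: bytes):
    """Decode a BME280 payload.

    ASSUMPTION: 24 bytes holding three little-endian float64 values in the
    order temperature (C), pressure, relative humidity (%).
    """
    if len(payload) != 24:
        return None
    temperature, pressure, humidity = struct.unpack("<3d", payload)
    return {"temperature_c": temperature, "pressure": pressure, "humidity_pct": humidity}


# FPort -> (sensor name, decoder); further sensors would be registered here.
DECODERS = {BME280_FPORT: ("BME280", decode_bme280)}


def process_packet(packet: dict):
    """Return a record ready for database insertion, or None if unusable."""
    payload = packet.get("payload")
    fport = packet.get("fport")
    if payload is None or fport not in DECODERS:  # null check and FPort lookup
        return None
    sensor, decoder = DECODERS[fport]
    fields = decoder(payload)
    if fields is None:
        return None
    return {
        "node_id": packet.get("node_id"),
        "sensor": sensor,
        "time": packet.get("time"),
        "fields": fields,
    }
```

Keeping the decoders in a dictionary keyed by FPort means a new sensor type only needs one additional entry, mirroring how a new branch would be added to the graphical flow.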
Credentials for connecting to the MQTT broker as well as database access tokens are automatically loaded from user-supplied configuration files. The flows themselves are saved in their entirety as JSON files, enabling easy version control via git. Once a pipeline is developed in the web interface, the relevant files are stored and loaded automatically by Docker upon deployment.
6.2. InfluxDB
InfluxDB was chosen as the sensor network database for its ease of use, time-series optimizations, and open-source status. In particular, a machine with a 4 core CPU and 4 GB of RAM can support up to 5000 writes per second. Like Node-RED, InfluxDB has a web interface to enable data exploration and query generation. During the build process, our Docker implementation generates a default bucket for the time series as well as default tokens to enable writing by Node-RED and reading by Grafana. A sample of the data explorer web interface is shown in Figure 7.
More complex queries can be constructed using InfluxDB’s Flux query language, which can apply mathematical functions to compute statistical values and parameters. Queried data can be downloaded directly into CSV files for further analysis, or users can connect directly to the database with an access token from their programming language of choice. Official client libraries exist for most popular languages, including Python, C++, and JavaScript.
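As an illustration of the kind of Flux query meant here, the following sketch assembles a query that computes windowed means of one field. The bucket, measurement, and field names are placeholders and do not reflect the schema of our database.

```python
def build_flux_query(bucket: str, measurement: str, field: str,
                     start: str = "-7d", window: str = "1h") -> str:
    """Build a Flux query returning the windowed mean of one field.

    All identifiers passed in are placeholders chosen by the caller.
    """
    return (
        f'from(bucket: "{bucket}")\n'
        f'  |> range(start: {start})\n'
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")\n'
        f'  |> filter(fn: (r) => r._field == "{field}")\n'
        f'  |> aggregateWindow(every: {window}, fn: mean, createEmpty: false)'
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_flux_query("sensors", "BME280", "temperature"))
```

The resulting string can be pasted into the data explorer or submitted through a client library, for example via the `query_api().query(...)` method of the official Python client.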
6.3. Grafana
The final piece of the analysis toolbox is Grafana. When connected to InfluxDB, this tool enables the creation of a suite of real-time dashboards that visualize sensor data with a wide selection of graph types.
The dashboard is shown in Figure 8, in which a variety of visualizations have been selected to show real-time data for 7 consecutive days. Users can select a desired time range to query, or the dashboard can be set to update automatically at a set rate (in the case of Figure 8, every 5 s). The size and location of each panel can be arranged by the user to suit their individual liking, and each panel can also be expanded for a full-screen view. Currently, dashboards exist for both Central and LoRaWAN nodes. Dashboards depicting data from multiple nodes are also available.
These three tools combine to form a comprehensive analysis and visualization platform. Future work will incorporate automatic alerts for pollution exceedances as well as additional external data sources, giving users broader access to local air quality data.
7. Deployment
Each LoRaWAN device is designed for deployment in real-world field environments. For the upcoming deployment, 10 LoRaWAN End Nodes will be set up with a single central gateway. The system is intended for use across the Dallas-Fort Worth metroplex, with plans to deploy 10 such systems, totaling 100 Nodes. The system has been publicly utilized in the city of Joppa, TX, as part of the Shared Air DFW initiative. Figure 9 shows the first two sensors deployed for public use in Joppa, TX, visible on a digital map available to the public at sharedairdfw.com (accessed on 18 May 2022). Additionally, Figure 10 depicts a LoRaWAN Node deployment at the University of Texas at Dallas.
8. Sensor Calibration
In this study, we focus on the use of low-cost air quality sensing systems. Governmental agencies primarily rely on Federal Reference Method (FRM) or Federal Equivalent Method (FEM) instruments to enforce air quality regulations due to their high accuracy. However, these systems typically cost between $25,000 and $100,000 USD, making widespread deployment financially challenging [52]. Additionally, FRM and FEM instruments often have lower sampling rates. This work introduces a low-cost sensing solution that is calibrated by co-locating an identical sensing package with an FEM system. This approach aims to provide a cost-effective alternative for air quality monitoring while maintaining data reliability and frequency.
A significant challenge for low-cost particulate matter sensors is the lack of integrated heaters to prevent hygroscopic growth, which often leads to the overestimation of particle sizes and, consequently, PM mass concentrations. This issue is particularly pronounced when ambient temperature and dew point are closely aligned. Thus, the initial phase of our calibration process involves applying a humidity correction to the particle counts, followed by adjustments to the PM concentrations to ensure accurate measurement under varying humidity conditions. The humidity correction is based on the methodology developed in [53]. The entire sensor calibration process is illustrated in Figure 11.
Previous research indicates that low-cost particulate matter sensors are affected by a range of environmental factors, such as humidity, temperature, and atmospheric pressure [54,55,56]. To improve the accuracy of these sensors, advanced machine learning algorithms are employed to calibrate humidity-adjusted readings, resulting in a more accurate representation of actual PM concentrations. The calibration process involved a low-cost sensing system with a sensing package similar to that of the LoRaWAN nodes, referred to as a MINTS Node (developed at the University of Texas at Dallas as part of the Multi-Scale Integrated Intelligent Interactive Sensing (MINTS) research program). This node was co-located near the Hawks Athletic Center in Fort Worth, Texas, alongside a Federal Equivalent Method (FEM) device, specifically the BAM 1022 Beta Attenuation Mass Monitor, which continuously collected particulate data for comparison.
The dataset, consisting of input features gathered from the BME280 and IPS7100 sensors of the MINTS Node, along with the corresponding target variable (PM2.5 values from the BAM 1022), is first consolidated into a unified structure to optimize processing efficiency. To address the potential bias due to the unequal distribution of values, particularly the over-representation of smaller PM2.5 values, the data is divided into bins based on the range of each feature and the target variable. Within each bin, random sampling is employed to ensure a roughly equal number of data points across bins, guaranteeing a representative sample of the overall dataset, especially for underrepresented values. The dataset is then shuffled and split into training and independent validation subsets, with 80% of the data used for training and 20% reserved for validation. This approach ensures that the validation set remains separate for assessing model performance on unseen data, thereby providing an unbiased evaluation of the model’s generalization capabilities.
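The balancing and splitting steps can be sketched as follows. For brevity, this illustration bins on the target variable only, whereas the procedure described above bins on each feature as well; the bin count, the per-bin cap, and the function names are assumptions.

```python
import random
from collections import defaultdict


def balanced_sample(rows, target_index, n_bins=10, per_bin=100, seed=0):
    """Draw up to `per_bin` rows from each equal-width bin of the target,
    then shuffle. Rows are tuples; `target_index` selects the target column."""
    rng = random.Random(seed)
    targets = [row[target_index] for row in rows]
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant target
    bins = defaultdict(list)
    for row in rows:
        index = min(int((row[target_index] - lo) / width), n_bins - 1)
        bins[index].append(row)
    sample = []
    for members in bins.values():
        sample.extend(rng.sample(members, min(per_bin, len(members))))
    rng.shuffle(sample)
    return sample


def train_validation_split(rows, train_fraction=0.8):
    """Split already-shuffled rows into training and validation subsets."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```

Capping the number of rows per bin prevents the abundant low-concentration samples from dominating the training set, at the cost of discarding some data from the most populated bins.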
A random data split, rather than a sequential one, was performed to prevent potential bias, as the dataset does not yet cover the full range of atmospheric and particulate matter concentrations. Given the significant seasonal climate variations in Texas, a sequential split might have skewed the dataset and compromised the model’s performance. Once sufficient data is collected, a sequential split approach will be considered.
The machine learning model, trained on a sensing package similar to that of LoRaWAN nodes, can be applied to data from these nodes, enabling it to map to the reference-grade monitor as well. The machine learning pipeline is deployed on the University of Texas at Dallas (UTD) cloud infrastructure via an MQTT pipeline endpoint. The corrected data is subsequently fed back into the analytical toolbox as an additional data point. The dataset, collected from December 2023 to July 2024, will continue to grow, allowing for ongoing model refinement. As new data from Fort Worth’s co-located sensors becomes available, we will retrain the model to improve its coverage of climate and particulate matter variability.
A comparative analysis of several machine learning methods was conducted, covering non-linear, non-parametric approaches (neural networks, support vector regression, and ensembles of decision trees) alongside a baseline linear regression model. The Degrees of Freedom (DoF) of each model were considered to provide insight into its complexity and generalization capacity. Hyperparameter optimization was applied to select models, and the results, shown in Table 3, indicate that neural networks and decision tree ensembles were the top performers. The best performance was achieved by the hyperparameter-optimized ensemble of decision trees, specifically the RandomForestRegressor algorithm from the scikit-learn library, which employs the bagging ensemble method. Key hyperparameters influencing model performance, such as the number of estimators (trees), the minimum samples required to split a node, the minimum samples required at a leaf node, the maximum number of features considered for each split, and the maximum depth of the trees, were tuned using a randomized search approach. The optimal values were determined to be 400 estimators, 2 samples to split a node, 1 sample at a leaf node, 1 feature per split, and a maximum tree depth of 70.
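A randomized search over the hyperparameters named above might look like the sketch below. The search ranges, iteration count, fold count, scoring metric, and synthetic data are assumptions made for illustration; the text reports only which hyperparameters were tuned and their final values. Note that in scikit-learn an integer `max_features=1` means one feature per split, whereas the float `1.0` means all features; the example assumes the integer reading.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for 11 input features and a PM2.5-like target (not the study's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 11))
y = 3.0 * X[:, 4] + X[:, 7] ** 2 + rng.normal(scale=0.2, size=300)

# Shuffled 80/20 split, mirroring the random split described in the text.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0
)

# Hypothetical search space, kept small so the example runs quickly.
param_distributions = {
    "n_estimators": [25, 50, 100],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": [1, 3, 6, 11],
    "max_depth": [10, 30, 70, None],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,                               # number of sampled configurations (assumed)
    cv=3,                                   # cross-validation folds (assumed)
    scoring="neg_root_mean_squared_error",  # assumed selection metric
    random_state=0,
)
search.fit(X_train, y_train)

best_model = search.best_estimator_
val_r2 = best_model.score(X_val, y_val)  # R^2 on the held-out split

# Configuration using the optimal values reported in the text.
reported = RandomForestRegressor(
    n_estimators=400,
    min_samples_split=2,
    min_samples_leaf=1,
    max_features=1,
    max_depth=70,
)
```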
Eleven input variables were used in the multivariate non-linear machine learning regression model. These included humidity-corrected particle counts for particle sizes of 0.1 μm, 0.3 μm, 0.5 μm, 1 μm, 2.5 μm, 5 μm, and 10 μm, along with key environmental factors such as atmospheric pressure, temperature, dew point, and relative humidity, collected using the BME280 sensor. The model’s performance, evaluated using the independent validation dataset, is illustrated in Figure 12, with key variables highlighted from the calibration process.
Figure 12 presents the machine learning model’s performance, as assessed on the independent validation dataset. Alongside this, a quantile-quantile plot illustrates model accuracy, and an additional plot highlights the most influential variables identified during the calibration phase.
Results
Figure 12 illustrates the results of a multivariate, non-linear, non-parametric machine learning regression for PM2.5. The scatter plot (Figure 12a) shows the relationship between the PM2.5 measurements from the BAM 1022 reference instrument (x-axis) and the PM2.5 levels predicted by the machine learning calibration of the low-cost instrument (y-axis). For the training dataset, the model achieved a coefficient of determination (R²) of 0.97 and a root mean square error (RMSE) of 1.72 μg/m³. On the independent validation dataset, the model attained an R² of 0.82 and an RMSE of 11.88 μg/m³, indicating that it retains good predictive performance on unseen data, albeit with a larger error than on the training data. The validation RMSE of 11.88 μg/m³ represents approximately 12.75% of the dynamic range of PM2.5, which suggests that the model’s error is a small proportion of the overall variability in PM2.5 measurements during the relevant time period.
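For readers who want to reproduce these summary statistics on their own data, the metrics can be computed as in the sketch below. The function names are hypothetical, and "dynamic range" is assumed here to mean the maximum minus the minimum of the reference values, which the text does not state explicitly. Under that assumption, the reported figures imply a range of roughly 11.88 / 0.1275 ≈ 93 μg/m³.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse_percent_of_range(y_true, y_pred):
    """RMSE as a percentage of the reference range (max - min, an assumed definition)."""
    return 100.0 * rmse(y_true, y_pred) / float(np.max(y_true) - np.min(y_true))

# Tiny illustrative arrays (hypothetical values, not measurements from the study).
reference = np.array([0.0, 10.0, 20.0, 30.0])
predicted = np.array([2.0, 8.0, 22.0, 28.0])

example_rmse = rmse(reference, predicted)
example_r2 = r_squared(reference, predicted)
example_pct = rmse_percent_of_range(reference, predicted)

# Dynamic range implied by the reported validation RMSE and percentage.
implied_range = 11.88 / 0.1275
```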
Figure 12b presents quantile–quantile (Q-Q) plots that compare the probability distribution functions (PDFs) of PM2.5 data from the BAM 1022 reference instrument with those from the machine learning-calibrated low-cost instrument. A straight line in a Q-Q plot indicates identical distributions, as it compares the percentiles of one PDF with the corresponding percentiles of the other. In this instance, the plot shows a straight line from the 25th to the 75th percentiles, confirming the successful calibration of the low-cost instrument.
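The percentile comparison behind a Q-Q plot can be sketched numerically as follows. The function names, the 1st to 99th percentile grid, and the synthetic data are assumptions for illustration; the text does not specify how the plot was generated.

```python
import numpy as np

def qq_points(reference, calibrated, percentiles=None):
    """Return matching percentiles of the two samples (the points of a Q-Q plot)."""
    if percentiles is None:
        percentiles = np.arange(1, 100)  # assumed grid: 1st to 99th percentile
    ref_q = np.percentile(reference, percentiles)
    cal_q = np.percentile(calibrated, percentiles)
    return percentiles, ref_q, cal_q

def max_deviation_in_band(reference, calibrated, low=25, high=75):
    """Largest departure from the 1:1 line between the `low` and `high` percentiles."""
    p, ref_q, cal_q = qq_points(reference, calibrated)
    band = (p >= low) & (p <= high)
    return float(np.max(np.abs(cal_q[band] - ref_q[band])))

# Synthetic skewed sample standing in for reference PM2.5 (hypothetical data).
reference = np.random.default_rng(0).gamma(shape=2.0, scale=5.0, size=2000)

identical_dev = max_deviation_in_band(reference, reference.copy())
shifted_dev = max_deviation_in_band(reference, reference + 3.0)
```

Identical samples lie exactly on the 1:1 line, while a constant offset of 3 moves every percentile by 3, which is the kind of departure a Q-Q plot makes visible.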
Figure 12c illustrates the relative importance of the input variables used in the machine learning calibration of the low-cost optical particle counter (IPS7100) as well as the climate sensor (BME280) found within the LoRaWAN System. The importance metric quantifies the increase in error when a specific input variable is omitted. The bar plots are arranged in descending order of importance, with the top bar representing the most influential variable. This analysis highlights the expected significance of the particle count for 2.5-micron particles, along with environmental factors such as temperature, pressure, dew point, and humidity, all of which contribute to the accurate calibration of the low-cost instrument against the reference instrument (BAM 1022).
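One common way to measure an "increase in error" of this kind is permutation importance, in which a single input column is shuffled so that its information is effectively removed, and the resulting rise in RMSE is recorded. The sketch below assumes that approach; the text does not state which importance algorithm was used. The function name, the linear least-squares stand-in model, and the synthetic data are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def permutation_importance_rmse(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in RMSE when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between this feature and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(rmse(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

# Synthetic data: feature 0 matters most, feature 1 less, feature 2 not at all.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Stand-in model: ordinary least squares (the study used a random forest).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(data):
    return data @ coef

scores = permutation_importance_rmse(predict, X, y)
ranking = np.argsort(scores)[::-1]  # indices in descending order of importance
```

Sorting the scores in descending order gives the same ordering convention as the bar plot, with the most influential variable first.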
After developing the machine learning model, we applied it successfully to data collected from sensors across our network. Figure 13 shows a time series from a LoRaWAN sensor deployed at the University of Texas at Dallas in Richardson, Texas, for October 2024. The figure compares the raw PM2.5 measurements from the IPS7100, the humidity-corrected PM2.5, and the machine learning-corrected PM2.5 values.
9. Empowering Policy Control and Urban Health
Air quality management using low-cost PM2.5 sensor systems, as discussed in this article, presents an innovative approach to enhancing policy control and improving urban health outcomes. The system outlined here provides real-time, high-resolution air quality data, coupled with climate data and machine learning outputs, which significantly enrich the data stacks used by policymakers. Traditionally, policy frameworks have relied on expensive, large-scale monitoring systems. In contrast, the proposed system offers granular spatial and temporal assessments of air quality, capturing episodic pollution spikes, such as those caused by wildfires, temperature inversions, or urban industrial pollution. Similar approaches have been successfully implemented in California and Sarajevo, where properly calibrated low-cost sensors aligned with regulatory networks, demonstrating their utility in localized air quality monitoring and public engagement [57,58].
The LoRaWAN sensors, being small and cost-effective, enable widespread deployment and maximum ground coverage. This extensive network allows for proactive identification and simulation of air pollution hot-spots, aiding in urban planning and pollution control measures. Furthermore, the system’s ability to simulate real-time scenarios, such as the spread of a biological agent through a city, mimics plume dispersion dynamics with high accuracy. This capability enhances emergency response strategies and public safety measures, offering critical insights during potential crises.
Our infrastructure has been further enhanced to include real-time forecasting by integrating data from additional environmental monitoring systems built and deployed by our laboratory. These systems provide valuable inputs, such as wind speed and direction. Combined with data from the LoRaWAN sensor networks discussed here and with open-source climate and pollution data, they create a dynamic and responsive platform for environmental forecasting. This integration supports proactive environmental health strategies, potentially leading to community-level interventions. For example, correlating high PM2.5 levels with asthma incidents in schools could trigger early warnings, allowing authorities to adjust traffic or industrial activities to mitigate exposure. This approach not only reduces health risks but could also improve key educational outcomes, for example by lowering student absenteeism, because it minimizes environmental triggers that exacerbate respiratory conditions [59]. By emphasizing human-environment interactions and promoting citizen science, the method advances sustainable development goals and improves access to air quality monitoring.
Furthermore, the use of LoRaWAN sensor networks addresses equity concerns by filling gaps in regulatory networks, particularly in under-monitored areas. The enhanced data can also refine atmospheric models and guide urban planning efforts to reduce pollution sources. Ultimately, these low-cost sensors bridge technological and economic gaps, enabling communities and governments to make informed, data-driven decisions that promote healthier environments.
Our sensor network is already being utilized by organizations such as Downwinders at Risk, the City of Plano, and the City of Gunter, which actively use the sensors to monitor and assess air quality in their communities. Through the integration of these sensors into local environmental monitoring frameworks, these organizations gain real-time data to better understand pollution patterns and identify key areas for intervention. Downwinders at Risk, in particular, advocates for the use of this data to support environmental justice, ensuring vulnerable communities are shielded from the health impacts of air pollution. The City of Plano and the City of Gunter contribute to more informed decision-making, leveraging insights from these low-cost sensors to implement policies that reduce exposure to harmful particulate matter, particularly among vulnerable populations such as children with asthma.
10. Discussion
This study presents a self-powered IoT network leveraging LoRaWAN technology for air quality monitoring, addressing the critical need for scalable, cost-effective environmental sensing systems that offer open-source access to high-frequency data through public portals, alongside transparent software and firmware availability. Although the existing literature [60,61,62,63,64] provides insights into some components of these technologies, this work integrates all essential elements into a comprehensive framework.
Through LoRaWAN, we enable both scientists and the public to collect data and securely transmit it to high-performance computing infrastructures for advanced analysis. This ‘citizen science’ model has gained momentum with the availability of affordable, high-precision sensors that reliably measure pollutants such as particulate matter. Many of these LoRaWAN nodes are now deployed within community areas, making real-time air quality monitoring accessible to local residents.
While FEM/FRM sensor networks offer reliable but low-frequency monitoring, they are costly and can introduce latency. Our objective was to create a low-cost alternative that caters to resource-conscious settings, maintaining comparable data quality through machine learning to achieve higher frequency and minimal latency. Furthermore, our devices utilize LoRaWAN for data transmission, avoiding the need for complex and expensive networking solutions like Wi-Fi.
In this study, we developed and trained a machine learning model using data collected from a specific location, which was then applied to a different geographic region. This approach introduces a geographic variability constraint, as the training and application locations differ, potentially influencing model performance. Atmospheric conditions vary across regions, which can affect the model’s accuracy when applied to new areas. However, our methodology considers exposure to a wide range of air masses originating from various geographic regions. Continuous, second-by-second air quality measurements recorded at a fixed location for training data collection allow us to capture atmospheric conditions shaped by multiple sources, including long-range trans-boundary pollution. For example, air masses reaching the Dallas-Fort Worth (DFW) area are influenced by sources such as the Sahara Desert, the Gulf of Mexico, wildfires in California, and local emissions within the metroplex. This variability in atmospheric transport introduces significant fluctuations in the dataset, which, rather than being a limitation, is an essential characteristic of the data, reflecting the complexity and dynamic nature of real-world atmospheric conditions.
11. Challenges and Future Directions
One of the key challenges encountered during the deployment of these sensors was the limited communication range in urban environments, where access to mounting locations with a clear line of sight was constrained. Consequently, the communication range between the LoRaWAN node and the gateway was restricted to less than one mile. To overcome this limitation, ongoing research is exploring the use of high-powered LoRaWAN radios with a maximum output of 20 dBm, in contrast to the 14 dBm radios used in the initial deployment. In addition, consultations with industry experts have been conducted to identify optimal antenna configurations aimed at improving signal strength and range.
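As a rough illustration of what the move from 14 dBm to 20 dBm could mean, the sketch below converts both power levels to milliwatts and estimates the range multiplier under a log-distance path-loss model. That model and the path-loss exponents are assumptions introduced here, not values from the deployment; real urban links also depend on antenna gain, obstructions, regulatory limits, and receiver sensitivity.

```python
def dbm_to_mw(power_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (power_dbm / 10)

def range_multiplier(delta_db, path_loss_exponent):
    """Range gain from `delta_db` of extra link budget under log-distance path loss.

    Path loss grows as 10 * n * log10(d), so an extra delta_db of budget
    stretches the distance by a factor of 10 ** (delta_db / (10 * n)).
    """
    return 10 ** (delta_db / (10 * path_loss_exponent))

initial_mw = dbm_to_mw(14)   # about 25.1 mW
upgraded_mw = dbm_to_mw(20)  # 100 mW
delta_db = 20 - 14           # 6 dB of additional transmit power

# n = 2 is free space; n = 3.5 is an assumed, illustrative urban exponent.
free_space_gain = range_multiplier(delta_db, 2.0)
urban_gain = range_multiplier(delta_db, 3.5)
```

Under these assumptions, the extra 6 dB roughly doubles the range in free space but yields a smaller gain in cluttered urban settings, which is consistent with also investigating antenna configurations.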
Looking ahead, efforts are underway to expand the proposed system by integrating larger batteries and more energy-efficient solar panels. Currently, the battery life spans from 6 to 12 months, limiting the deployment and data collection capacity. The planned upgrades will enable the integration of additional sensors, allowing for the collection of more comprehensive measurements and enhancing the overall quality of the data. Furthermore, we are advancing our machine learning training processes, which are being conducted in both controlled laboratory environmental chambers and a mobile laboratory mounted on an electric vehicle. This approach is intended to improve the accuracy and reliability of the model, complementing ongoing training efforts with the sensing system deployed in Fort Worth, TX.
Supplementary Materials
The firmware for the LoRaWAN end nodes is publicly available in the Zenodo data store: https://doi.org/10.5281/zenodo.6727816 (accessed on 23 June 2022). The SharedAirDFW public portal, which provides up-to-date particulate matter data from the LoRaWAN end nodes, can be accessed at https://www.sharedairdfw.com/ (accessed on 1 March 2025).
Author Contributions
Conceptualization: L.O.H.W., D.K. and D.J.L.; sensor construction: L.O.H.W., D.K., P.D., G.B., M.I., A.A., B.F., M.L., R.P., N.D., H.Z., D.X., V.A. and S.L.; methodology: L.O.H.W., D.K. and D.J.L.; data curation: L.O.H.W.; software: L.O.H.W., D.K., J.W. and D.J.L.; investigation: L.O.H.W., D.K. and D.J.L.; resources: D.J.L., C.S., L.O.H.W. and J.W.; writing—original draft preparation: L.O.H.W.; writing—review and editing: L.O.H.W., D.J.L., J.W. and V.S.; visualization: L.O.H.W., J.W., V.S. and R.P.; supervision: D.J.L.; project administration: D.J.L.; funding acquisition: D.J.L. and C.S. All authors have read and agreed to the published version of the manuscript.
Funding
We acknowledge funding from the following sources: Earth Day Texas; Downwinders at Risk; the City of Plano; TRECIS CC* Cyberteam (NSF #2019135); NSF OAC-2115094 Award; AFWERX AFX23D-TCSO1 Proposal # F2-17492; The US Army (Dense Urban Environment Dosimetry for Actionable Information and Recording Exposure, U.S. Army Medical Research Acquisition Activity, BAA CDMRP Grant Log #BA170483); EPA 16th Annual P3 Awards Grant Number 83996501, entitled Machine Learning-Calculated Low-Cost Sensing; The Texas National Security Network Excellence Fund Award for Environmental Sensing Security Sentinels; and the SOFWERX Award for Machine Learning for Robotic Team.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The original data used for this study will be made available by the authors upon request.
Acknowledgments
The authors extend their sincere gratitude to Eric Lenington, Andy Slote and Chris Simonds from Object Spectrum, LLC, for their unwavering support in the realm of LoRaWAN technologies. We wish to express our deep appreciation to the esteemed Undergraduate Research Students—Aaron Barbosa, Berkley Shofner (Clark Research Program), John Charles Sadler, Julia Boah Kim, and Giakhanh Huu Hoang—for their exceptional contributions that have played a crucial role in shaping the trajectory of our ongoing research. A special acknowledgment is dedicated to the distinguished Undergraduate Senior Design Students—Kameron Noorbakhsh, Nicholas Steele, Nikhil Nair, Jake Schroder, Benjamin Hogan, Jacob Scheller, Jonah Duncan, Getenet Demsie, Nathan Nguyen, Bryanth Fung, Keigo Ma, Robert Wu, Kangzhi Zhao, Sidney Evans, Kevin Flores, Fawaz Khurram, Veronica Ramirez, Daniel Yustana, Keshav Dhamanwala, Aditya Agrawal, Tommy Symalla, Dien Tran, Michael Villordon, Basil El-Hindi, George Yi, Eric Zhang, Trent Haines, and Noah Barber. Your collective endeavors have significantly enhanced our public sensing portal, sharedairdfw.com. We also express our profound gratitude to the high school students who generously volunteered their time and expertise. Inbar Leibovich and Shrey Joshi—your unwavering enthusiasm and dedication have been integral to the success of our projects, and your invaluable contributions are acknowledged with deep appreciation.
Conflicts of Interest
The authors have declared that no competing interests exist.
References
1. McMurry, P.H. A review of atmospheric aerosol measurements. Atmos. Environ. 2000, 34, 1959–1999.
2. Yang, Y.; Ruan, Z.; Wang, X.; Yang, Y.; Mason, T.G.; Lin, H.; Tian, L. Short-term and long-term exposures to fine particulate matter constituents and health: A systematic review and meta-analysis. Environ. Pollut. 2019, 247, 874–882.
3. Hime, N.J.; Marks, G.B.; Cowie, C.T. A comparison of the health effects of ambient particulate matter air pollution from five emission sources. Int. J. Environ. Res. Public Health 2018, 15, 1206.
4. Wu, W.; Jin, Y.; Carlsten, C. Inflammatory health effects of indoor and outdoor particulate matter. J. Allergy Clin. Immunol. 2018, 141, 833–844.
5. Ali, M.U.; Liu, G.; Yousaf, B.; Ullah, H.; Abbas, Q.; Munir, M.A.M. A systematic review on global pollution status of particulate matter-associated potential toxic elements and health perspectives in urban environment. Environ. Geochem. Health 2019, 41, 1131–1162.
6. Li, R.; Zhou, R.; Zhang, J. Function of PM2.5 in the pathogenesis of lung cancer and chronic airway inflammatory diseases. Oncol. Lett. 2018, 15, 7506–7514.
7. Thangavel, P.; Park, D.; Lee, Y.C. Recent insights into particulate matter (PM2.5)-mediated toxicity in humans: An overview. Int. J. Environ. Res. Public Health 2022, 19, 7511.
8. Xing, Y.F.; Xu, Y.H.; Shi, M.H.; Lian, Y.X. The impact of PM2.5 on the human respiratory system. J. Thorac. Dis. 2016, 8, E69.
9. Loftus, C.; Yost, M.; Sampson, P.; Arias, G.; Torres, E.; Vasquez, V.B.; Bhatti, P.; Karr, C. Regional PM2.5 and asthma morbidity in an agricultural community: A panel study. Environ. Res. 2015, 136, 505–512.
10. Wang, C.; Tu, Y.; Yu, Z.; Lu, R. PM2.5 and cardiovascular diseases in the elderly: An overview. Int. J. Environ. Res. Public Health 2015, 12, 8187–8197.
11. Ni, L.; Chuang, C.C.; Zuo, L. Fine particulate matter in acute exacerbation of COPD. Front. Physiol. 2015, 6, 294.
12. Hamra, G.B.; Guha, N.; Cohen, A.; Laden, F.; Raaschou-Nielsen, O.; Samet, J.M.; Vineis, P.; Forastiere, F.; Saldiva, P.; Yorifuji, T.; et al. Outdoor particulate matter exposure and lung cancer: A systematic review and meta-analysis. Environ. Health Perspect. 2014, 122, 906–911.
13. Du, Y.; Xu, X.; Chu, M.; Guo, Y.; Wang, J. Air particulate matter and cardiovascular disease: The epidemiological, biomedical and clinical evidence. J. Thorac. Dis. 2016, 8, E8.
14. Lippmann, M.; Ito, K.; Nadas, A.; Burnett, R.T. Association of Particulate Matter Components with Daily Mortality and Morbidity in Urban Populations; Research Report; Health Effects Institute: Boston, MA, USA, 2000; pp. 5–72.
15. Fang, Y.; Naik, V.; Horowitz, L.; Mauzerall, D.L. Air pollution and associated human mortality: The role of air pollutant emissions, climate change and methane concentration increases from the preindustrial period to present. Atmos. Chem. Phys. 2013, 13, 1377–1394.
16. Grigg, J. Particulate matter exposure in children: Relevance to chronic obstructive pulmonary disease. Proc. Am. Thorac. Soc. 2009, 6, 564–569.
17. Ranft, U.; Schikowski, T.; Sugiri, D.; Krutmann, J.; Krämer, U. Long-term exposure to traffic-related particulate matter impairs cognitive function in the elderly. Environ. Res. 2009, 109, 1004–1011.
18. Dondi, A.; Carbone, C.; Manieri, E.; Zama, D.; Del Bono, C.; Betti, L.; Biagi, C.; Lanari, M. Outdoor air pollution and childhood respiratory disease: The role of oxidative stress. Int. J. Mol. Sci. 2023, 24, 4345.
19. Aithal, S.S.; Sachdeva, I.; Kurmi, O.P. Air quality and respiratory health in children. Breathe 2023, 19, 230040.
20. Karimi, B.; Samadi, S. Long-term exposure to air pollution on cardio-respiratory, and lung cancer mortality: A systematic review and meta-analysis. J. Environ. Health Sci. Eng. 2024, 22, 75–95.
21. Zhou, C.; Chen, J.; Wang, S. Examining the effects of socioeconomic development on fine particulate matter (PM2.5) in China’s cities using spatial regression and the geographical detector technique. Sci. Total. Environ. 2018, 619, 436–445.
22. Zhang, M.; Song, Y.; Cai, X.; Zhou, J. Economic assessment of the health effects related to particulate matter pollution in 111 Chinese cities by using economic burden of disease analysis. J. Environ. Manag. 2008, 88, 947–954.
23. Bell, M.L.; Ebisu, K. Environmental inequality in exposures to airborne particulate matter components in the United States. Environ. Health Perspect. 2012, 120, 1699–1704.
24. Pérez-Padilla, R.; Schilmann, A.; Riojas-Rodriguez, H. Respiratory health effects of indoor air pollution. Int. J. Tuberc. Lung Dis. 2010, 14, 1079–1086.
25. Hoskins, J.A. Health effects due to indoor air pollution. In Survival and Sustainability: Environmental Concerns in the 21st Century; Springer: Berlin/Heidelberg, Germany, 2011; pp. 665–676.
26. Samet, J.M.; Marbury, M.C.; Spengler, J.D. Health effects and sources of indoor air pollution. Part I. Am. Rev. Respir. Dis. 1987, 136, 1486–1508.
27. Sun, Y. Environmental correlates of mortality: How does air pollution contribute to geographic disparities in cardiovascular disease mortality? Popul. Environ. 2024, 46, 1.
28. Colmer, J.; Hardman, I.; Shimshack, J.; Voorheis, J. Disparities in PM2.5 air pollution in the United States. Science 2020, 369, 575–578.
29. Tessum, C.W.; Apte, J.S.; Goodkind, A.L.; Muller, N.Z.; Mullins, K.A.; Paolella, D.A.; Polasky, S.; Springer, N.P.; Thakrar, S.K.; Marshall, J.D.; et al. Inequity in consumption of goods and services adds to racial–ethnic disparities in air pollution exposure. Proc. Natl. Acad. Sci. USA 2019, 116, 6001–6006.
30. Harrison, W.A.; Lary, D.; Nathan, B.; Moore, A.G. The neighborhood scale variability of airborne particulates. J. Environ. Prot. 2015, 6, 464.
31. Lanza, J.; Sánchez, L.; Muñoz, L.; Galache, J.A.; Sotres, P.; Santana, J.R.; Gutiérrez, V. Large-Scale Mobile Sensing Enabled Internet-of-Things Testbed for Smart City Services. Int. J. Distrib. Sens. Netw. 2015, 11, 785061.
32. Nam, T.; Pardo, T.A. Conceptualizing smart city with dimensions of technology, people, and institutions. In Proceedings of the 12th Annual International Digital Government Research Conference: Digital Government Innovation in Challenging Times, College Park, MD, USA, 12–15 June 2011; pp. 282–291.
33. Bonilla, V.; Campoverde, B.; Yoo, S.G. A Systematic Literature Review of LoRaWAN: Sensors and Applications. Sensors 2023, 23, 8440.
34. Chaudhari, B.S.; Zennaro, M.; Borkar, S. LPWAN technologies: Emerging application characteristics, requirements, and design considerations. Future Internet 2020, 12, 46.
35. Mekki, K.; Bajic, E.; Chaxel, F.; Meyer, F. A comparative study of LPWAN technologies for large-scale IoT deployment. ICT Express 2019, 5, 1–7.
36. Iqbal, M.; Abdullah, A.Y.M.; Shabnam, F. An application based comparative study of LPWAN technologies for IoT environment. In Proceedings of the 2020 IEEE Region 10 Symposium (TENSYMP), Dhaka, Bangladesh, 5–7 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1857–1860.
37. Van den Abeele, F.; Haxhibeqiri, J.; Moerman, I.; Hoebeke, J. Scalability analysis of large-scale LoRaWAN networks in ns-3. IEEE Internet Things J. 2017, 4, 2186–2198.
38. El Chall, R.; Lahoud, S.; El Helou, M. LoRaWAN network: Radio propagation models and performance evaluation in various environments in Lebanon. IEEE Internet Things J. 2019, 6, 2366–2378.
39. Yousuf, A.M.; Rochester, E.M.; Ghaderi, M. A low-cost LoRaWAN testbed for IoT: Implementation and measurements. In Proceedings of the 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), Singapore, 5–8 February 2018; pp. 361–366.
40. Microchip Technology Inc. SAM D21/DA1 Family, Low-Power, 32-bit Cortex-M0+ MCU with Advanced Analog and PWM; DS40001882G; Microchip Technology Inc.: Chandler, AZ, USA, 2021.
41. RisingHF. RHF-DS01500, RHF76-052 LoRaWAN Module; Rev.03 2015-01-25; RisingHF: Shenzhen, China, 2015.
42. Petäjäjärvi, J.; Mikhaylov, K.; Pettissalo, M.; Janhunen, J.; Iinatti, J. Performance of a low-power wide-area network based on LoRa technology: Doppler robustness, scalability, and coverage. Int. J. Distrib. Sens. Netw. 2017, 13, 1550147717699412.
43. Augustin, A.; Yi, J.; Clausen, T.; Townsley, W.M. A study of LoRa: Long range & low power networks for the internet of things. Sensors 2016, 16, 1466.
44. Quectel Wireless Solutions Co., Ltd. L70 GPS, Protocol Specification; Quectel Wireless Solutions Co., Ltd.: Shanghai, China, 2015.
45. Linear Technology Corporation. LT3652, Power Tracking 2A Battery Charger for Solar Power; LT 1215 REV E; Linear Technology Corporation: Milpitas, CA, USA, 2010.
46. Texas Instruments. TPL5110 Nano-Power System Timer for Power Gating; SNAS650–JANUARY 2015–REVISED JANUARY 2015; Texas Instruments: Dallas, TX, USA, 2015.
47. Piera Systems. 305/3100/525/5100/7100, Photon Counting Intelligent Particle Sensor for Accurate Air Quality Monitoring Product Specification; IPS Datasheet V1.1.8; Piera Systems: Mississauga, ON, Canada, 2021.
48. Bosch Sensortec Gmbh. BME280, Combined Humidity and Pressure Sensor; BST-BME280-DS002-15; Bosch Sensortec Gmbh: Reutlingen, Germany, 2018.
49. Node-RED. Available online: https://nodered.org/about/ (accessed on 28 June 2022).
50. InfluxDB. Available online: https://www.influxdata.com/products/influxdb/ (accessed on 28 June 2022).
51. Grafana. Available online: https://grafana.com/grafana/ (accessed on 28 June 2022).
52. Lowther, S.D.; Jones, K.C.; Wang, X.; Whyatt, J.D.; Wild, O.; Booker, D. Particulate matter measurement indoors: A review of metrics, sensors, needs, and applications. Environ. Sci. Technol. 2019, 53, 11644–11656.
53. Di Antonio, A.; Popoola, O.A.; Ouyang, B.; Saffell, J.; Jones, R.L. Developing a relative humidity correction for low-cost sensors measuring ambient particulate matter. Sensors 2018, 18, 2790.
54. Wijeratne, L.O.; Kiv, D.R.; Aker, A.R.; Talebi, S.; Lary, D.J. Using Machine Learning for the Calibration of Airborne Particulate Sensors. Sensors 2020, 20, 99.
55. Kliengchuay, W.; Cooper Meeyai, A.; Worakhunpiset, S.; Tantrakarnapa, K. Relationships between meteorological parameters and particulate matter in Mae Hong Son province, Thailand. Int. J. Environ. Res. Public Health 2018, 15, 2801.
56. Tian, G.; Qiao, Z.; Xu, X. Characteristics of particulate matter (PM10) and its relationship with meteorological factors during 2001–2012 in Beijing. Environ. Pollut. 2014, 192, 266–274.
57. Farooqui, Z.; Biswas, J.; Saha, J. Long-Term Assessment of PurpleAir Low-Cost Sensor for PM2.5 in California, USA. Pollutants 2023, 3, 477–493.
58. Masic, A. City-Scale Air Quality Network of Low-Cost Sensors. Atmosphere 2024, 15, 798.
59. Lary, M.A.; Allsopp, L.; Lary, D.J.; Sterling, D.A. Using machine learning to examine the relationship between asthma and absenteeism. Environ. Monit. Assess. 2019, 191, 332.
60. Thu, M.Y.; Htun, W.; Aung, Y.L.; Shwe, P.E.E.; Tun, N.M. Smart air quality monitoring system with LoRaWAN. In Proceedings of the 2018 IEEE International Conference on Internet of Things and Intelligence System (IOTAIS), Bali, Indonesia, 1–3 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 10–15.
61. Johnston, S.J.; Basford, P.J.; Bulot, F.M.; Apetroaie-Cristea, M.; Easton, N.H.; Davenport, C.; Foster, G.L.; Loxham, M.; Morris, A.K.; Cox, S.J. City scale particulate matter monitoring using LoRaWAN based air quality IoT devices. Sensors 2019, 19, 209.
62. Howerton, J.M.; Schenck, B.L. The deployment of a LoRaWAN-based IoT air quality sensor network for public good. In Proceedings of the 2020 Systems and Information Engineering Design Symposium (SIEDS), Charlottesville, VA, USA, 24–24 April 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6.
63. Jabbar, W.A.; Subramaniam, T.; Ong, A.E.; Shu’Ib, M.I.; Wu, W.; De Oliveira, M.A. LoRaWAN-based IoT system implementation for long-range outdoor air quality monitoring. Internet Things 2022, 19, 100540.
64. Candia, A.; Represa, S.N.; Giuliani, D.; Luengo, M.A.; Porta, A.A.; Marrone, L.A. Solutions for SmartCities: Proposal of a monitoring system of air quality based on a LoRaWAN network with low-cost sensors. In Proceedings of the 2018 Congreso Argentino de Ciencias de la Informática y Desarrollos de Investigación (CACIDI), Buenos Aires, Argentina, 28–30 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6.
Figure 1. External Physical Design: The physical structure of the system comprises five distinct modules, excluding the mounting backplate. These modules are the solar module, main module, battery module, air module, and antenna module. The solar module includes a pair of 3 W solar panels. The battery module houses a 3.7 V, 6600 mAh Lithium-Ion Battery Pack, which is enclosed within a fireproof bag inside a metal box for added safety. The main module contains the Power Control Unit (PCU), the Main Control Unit (MCU), and a low-power timer (TPL5110). The air module features the External Sensing Unit (ESU), which is mounted within a solar radiation shield to protect the sensors from environmental conditions. A solar radiation shield is an advanced Stevenson screen that protects sensors from climatic factors while ensuring accurate measurements by promoting airflow around the sensing elements. Lastly, the antenna module consists solely of the LoRaWAN antenna, positioned in the upper-right corner of the mounting backplate.
Figure 2.
System Design Architecture: Each LoRaWAN sensing unit comprises a Main Control Unit (MCU), a Power Control Unit (PCU), an External Sensing Unit (ESU), two power sensors (INA219s), a low-power timer (TPL5110), a solar panel, and a battery. The figure illustrates the flow of power and data within the system: the gray arrows represent power flow and the blue arrows represent data flow. The sensing system is designed to collect sensor measurements and transmit data efficiently and sustainably. Its operation is divided into discrete life cycles, each lasting 15 min and representing one complete run of the system’s firmware. Although the system operates continuously, it evaluates power availability at the start of each cycle to determine how long it will remain active before entering sleep mode for the rest of the cycle. The dotted purple arrow represents the signal sent by the MCU to transition the system into sleep mode for the remainder of the current life cycle.
![Air 03 00009 g002]()
Figure 3.
Operation: The system is designed with power management in mind and begins each cycle by taking power readings to select a predefined power mode. Depending on the power mode, the system either halts or adjusts its sensing operations and measurement frequency. The overall process is illustrated in this figure, where SPC denotes the “Sensing Period Check”.
Figure 4.
Power Modes: At the beginning of each life cycle, the MCU selects a power mode for the system. The mode is determined solely by the output voltage of the solar panels and the current battery voltage. This figure illustrates how each power mode is set.
Figure 5.
Network Architecture: In addition to LoRaWAN end nodes (a), the network consists of a LoRaWAN gateway embedded within an integrated sensing suite called the ‘Central Node’ (b), a LoRaWAN cloud (c) utilizing the open-source ChirpStack LoRaWAN Network Server, the SharedAirDFW public portal (d) that provides access to up-to-date sensor data, and a comprehensive analytical toolbox (e) using InfluxDB, Grafana, and Node-RED.
Figure 6.
Node-RED Interface: In the Node-RED web interface, a series of processing nodes is defined and connected sequentially from left to right, forming a processing flow. The flow starts with the collection of MQTT packets on the left, which are then converted into JSON dictionaries. These dictionaries are parsed and subsequently injected into InfluxDB.
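The parse-and-inject step of such a flow can be sketched in Python. This is only an illustration, not the Node-RED flow itself: the payload layout (the `sensor`, `node`, and `fields` keys) and the node name are hypothetical, and the output uses InfluxDB line protocol with all fields written as floats and no escaping of special characters.

```python
import json


def to_line_protocol(mqtt_payload: str) -> str:
    """Convert a JSON payload into one InfluxDB line-protocol record.

    Assumed (hypothetical) payload layout:
    {"sensor": ..., "node": ..., "fields": {name: number, ...}}
    """
    record = json.loads(mqtt_payload)      # MQTT payload -> JSON dictionary
    measurement = record["sensor"]         # measurement name, e.g. the sensor ID
    node = record["node"]                  # stored as a tag
    # Sort the field names so the output is deterministic.
    fields = ",".join(
        f"{name}={float(value)}"
        for name, value in sorted(record["fields"].items())
    )
    return f"{measurement},node={node} {fields}"


# Hypothetical example payload; the real packet contents are listed in Table 2.
example = '{"sensor": "IPS7100", "node": "node-01", "fields": {"pm2_5": 12.5, "pm10": 20}}'
print(to_line_protocol(example))
```

Tag and field names containing spaces or commas would need escaping before being written, which this sketch omits.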
Figure 7.
InfluxDB Data Explorer: A sample query is generated to select PM2.5 data from multiple LoRaWAN Nodes.
Figure 8.
Grafana Dashboard: This dashboard presents real-time data collected from a LoRaWAN sensor at UT Dallas, Richardson, for the week of 14 October to 20 October 2024. First Row: The left panel displays particle counts categorized by size, the middle panel shows the most recent particle counts across size bins ranging from 0.1 to 10 µm, and the right panel indicates the sensor’s geographical location on a map. Second Row: The left panel presents a time series of particulate matter concentrations for size fractions ranging from 0.1 to 10 µm, while the middle panel shows the most recent readings for particulate matter concentrations within the same range. The right panel presents the latest measurements of atmospheric temperature, pressure, humidity, dew point, solar voltage, solar power, battery voltage, and battery power. Third Row: A time series of climate data encompassing atmospheric temperature, pressure, humidity, and dew point. Fourth Row: A time series of power data, covering solar voltage, solar power, battery voltage, and battery power.
![Air 03 00009 g008]()
Figure 9.
Shared Air DFW Public Portal: Shared Air DFW is an initiative operating out of a University of Texas at Dallas laboratory, dedicated to deploying monitoring systems designed and built by university students. Particulate matter data collected from these monitors, as well as information from EPA and DFW Purple Air monitors, are displayed in real time on a digital map accessible by anyone at www.sharedairdfw.com (accessed on 18 May 2022). The project is sponsored by the National Science Foundation, Earth Day Texas, the US Army, Downwinders at Risk, the City of Plano, and the US Environmental Protection Agency.
Figure 10.
A sensor deployed at the University of Texas at Dallas: In addition to this sensor, the University of Texas at Dallas operates a diverse array of LoRaWAN-based sensing systems, all connected to a central gateway located on the tallest building on campus.
Figure 11.
Sensor Calibration Process: This figure presents the key steps in the calibration process of the LoRaWAN node. Before the machine learning calibration is applied, a humidity correction is applied to the input data collected from the IPS7100, using the climate sensor (BME280). The machine learning model is trained using target data from a Federal Equivalent Method (FEM) Beta Attenuation Monitor (BAM 1022) located in Fort Worth, Texas. This device is collocated with a MINTS node, and both systems collected data continuously for eight months. After the machine learning model is trained, the raw data from the IPS7100 and BME280 sensors on the LoRaWAN nodes are used to produce more accurate data products. The blue arrows represent particulate matter data from the IPS7100, and the green arrows represent data from the climate sensor (BME280). The red arrow represents the particulate matter data provided by the FEM Beta Attenuation Monitor (BAM 1022). The purple arrow represents the deployment of the trained machine learning model.
The MINTS node is an advanced sensing system developed at the University of Texas at Dallas as part of the MINTS research program, which stands for Multi-Scale Integrated Intelligent Interactive Sensing (https://mints.utdallas.edu/). This node operates with direct internet access via an Ethernet cable, eliminating bandwidth limitations. It is equipped with multiple sensors, including the IPS7100 for particulate matter measurement and the BME280 for environmental monitoring. The LoRaWAN nodes examined in this study are also a product of the same research program.
![Air 03 00009 g011]()
Figure 12.
This figure presents the results of a multivariate, non-linear, non-parametric machine learning regression for PM2.5. In (a), the relationship between PM2.5 measurements from the BAM 1022 Beta Attenuation Mass Monitor (x-axis) and the PM2.5 levels predicted by the machine learning-calibrated IPS7100 + BME280 instruments (y-axis) is shown. Training data is represented by blue circles, while green plus signs indicate independent validation data. The red line represents the ideal response. (b) displays the quantile–quantile plot for the machine learning independent validation data. Here, the x-axis represents percentiles from the PM2.5 distribution of the BAM 1022, and the y-axis shows percentiles of the machine learning-calibrated PM2.5 distribution from the IPS7100 + BME280 sensor combination. The dotted red line indicates the ideal response. (c) illustrates the relative importance of input variables in the machine learning calibration of the low-cost setup, with the top three variables highlighted in green and subsequent variables in blue.
![Air 03 00009 g012]()
Figure 13.
Comparison of LoRaWAN Node PM2.5 Data: This figure presents PM2.5 data collected from a sensor at UT Dallas in Richardson for October 2024. The green time series represents the raw PM2.5 data from the IPS7100 sensor on the LoRaWAN nodes. The blue time series shows humidity-corrected PM2.5 values, while the red time series depicts PM2.5 values corrected using the machine learning model.
Table 1.
Sensing Cycles for each Sensor.
| Power Mode | Sensing Cycles | Sensing Period (min): IPS7100 | Sensing Period (min): BME280 | Sensing Period (min): INA219s | Sensing Period (min): L70 |
|---|---|---|---|---|---|
| 0 | 0 | - | - | - | - |
| 1 | 1 | - | - | - | - |
| 2 | 1 | - | - | - | - |
| 3 | 2 | 2 | 2 | 4 | 4 |
| 4 | 2 | 2 | 2 | 4 | 4 |
| 5 | ∞ | 1 | 1 | 3 | 3 |
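As a rough illustration, the schedule in Table 1 can be encoded as a lookup table that firmware-style code consults once the power mode is known. The names `SensingPlan`, `SENSING_PLANS`, and `plan_for` are hypothetical, and reading "-" as "no per-sensor period defined" is an assumption rather than something the table states.

```python
import math
from typing import Dict, NamedTuple, Optional


class SensingPlan(NamedTuple):
    cycles: float                            # math.inf stands in for the "∞" entry
    periods_min: Optional[Dict[str, int]]    # per-sensor period in minutes; None where Table 1 shows "-"


# Values copied from Table 1; the structure itself is an illustrative assumption.
SENSING_PLANS = {
    0: SensingPlan(0, None),
    1: SensingPlan(1, None),
    2: SensingPlan(1, None),
    3: SensingPlan(2, {"IPS7100": 2, "BME280": 2, "INA219s": 4, "L70": 4}),
    4: SensingPlan(2, {"IPS7100": 2, "BME280": 2, "INA219s": 4, "L70": 4}),
    5: SensingPlan(math.inf, {"IPS7100": 1, "BME280": 1, "INA219s": 3, "L70": 3}),
}


def plan_for(power_mode: int) -> SensingPlan:
    """Return the sensing plan for a power mode, rejecting modes outside Table 1."""
    if power_mode not in SENSING_PLANS:
        raise ValueError(f"unknown power mode: {power_mode}")
    return SENSING_PLANS[power_mode]


print(plan_for(3))
```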
Table 2.
LoRaWAN Data Packets.
| Sensor ID | FPORT | Packet Size (Bytes) | Parameters Measured |
|---|---|---|---|
| IPS7100 | 15 | 56 | Standard particulate matter mass fractions and particle counts for particle diameters of 0.1, 0.3, 0.5, 1.0, 2.5, 5.0, and 10 µm. |
| BME280 | 21 | 12 | Climate readings of atmospheric temperature, pressure, and humidity. |
| INA219s | 3 | 32 | Voltage, current, and power readings for both the battery pack and solar panels. |
| LP70 | 5 | 55 | Longitude, latitude, and altitude, as well as UTC time stamps. |
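Because each sensor in Table 2 uses its own FPORT and a fixed packet size, a receiver can identify and sanity-check an uplink before decoding it. The sketch below shows only that routing step; the names `PACKETS` and `identify_packet` are hypothetical, and the byte-level layout of each packet is not given in the table, so no decoding is attempted.

```python
# FPORT -> (sensor ID, expected packet size in bytes), copied from Table 2.
PACKETS = {
    15: ("IPS7100", 56),
    21: ("BME280", 12),
    3: ("INA219s", 32),
    5: ("LP70", 55),
}


def identify_packet(fport: int, payload: bytes) -> str:
    """Return the sensor ID for an uplink after checking its length against Table 2."""
    if fport not in PACKETS:
        raise ValueError(f"unknown FPORT: {fport}")
    sensor_id, expected_size = PACKETS[fport]
    if len(payload) != expected_size:
        raise ValueError(
            f"{sensor_id} packet should be {expected_size} bytes, got {len(payload)}"
        )
    return sensor_id


# A zero-filled placeholder payload of the right length for the BME280.
print(identify_packet(21, bytes(12)))
```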
Table 3.
Performance Comparison of Machine Learning Models on Coefficient of Determination (R²) and Root Mean Square Error (RMSE) Metrics.
| Machine Learning Model | DoF | Training R² | Training RMSE | Independent Validation R² | Independent Validation RMSE | Combined R² | Combined RMSE | Rank |
|---|---|---|---|---|---|---|---|---|
| HPO Random Forest | ≈4,999,334 | 0.97 | 1.72 | 0.82 | 11.88 | 0.92 | 5.58 | 1 |
| Random Forest | ≈1,246,877 | 0.97 | 1.84 | 0.82 | 12.25 | 0.91 | 5.81 | 2 |
| HPO Neural Network | 9857 | 0.86 | 9.53 | 0.74 | 17.36 | 0.83 | 11.36 | 3 |
| Neural Network | 2881 | 0.61 | 26.41 | 0.54 | 30.36 | 0.59 | 27.26 | 4 |
| HPO Support Vector Machine | ≈25,145 | 0.53 | 31.97 | 0.48 | 34.75 | 0.51 | 32.61 | 5 |
| Support Vector Machine | ≈29,697 | 0.48 | 34.88 | 0.44 | 37.05 | 0.47 | 35.35 | 6 |
| Linear Regression | 12 | 0.28 | 48.68 | 0.23 | 51.11 | 0.27 | 49.03 | 7 |
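For readers unfamiliar with the two metrics in Table 3, the following sketch computes them from paired reference and predicted values using the standard textbook definitions. Whether the study computed R² in exactly this form (rather than, for example, as a squared correlation coefficient) is an assumption, and the sample numbers are made up for illustration.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between reference and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values: reference monitor readings versus calibrated sensor output.
reference = [1.0, 2.0, 3.0, 4.0]
calibrated = [2.0, 3.0, 4.0, 5.0]
print(rmse(reference, calibrated), r_squared(reference, calibrated))
```

A constant offset of one unit, as in this example, gives an RMSE of 1 while sharply reducing R², which is why the two metrics are reported side by side.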
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).