Article

A Reproducible QA/QC, Imputation and Robust-Series Workflow for Air-Quality Monitoring Time Series

by Nuria Fernández Palomares 1, Laura Álvarez de Prado 1,*, Luis Alfonso Menéndez García 2, David Fernández López 3, Sandra Buján 1 and Antonio Bernardo Sánchez 1,*

1 Department of Mining Technology, Topography and Structures, University of Leon, 24071 Leon, Spain
2 Department of Mathematics, University of Oviedo, 33007 Oviedo, Spain
3 INREMIN S.L., 24008 Leon, Spain
* Authors to whom correspondence should be addressed.
Appl. Sci. 2026, 16(7), 3396; https://doi.org/10.3390/app16073396
Submission received: 3 March 2026 / Revised: 26 March 2026 / Accepted: 28 March 2026 / Published: 31 March 2026
(This article belongs to the Section Environmental Sciences)

Abstract

This study develops a reproducible and auditable workflow to prepare regulatory air-quality monitoring time series for subsequent temporal analysis, including observational PRE/POST applications around coal-fired power plant closures in northwestern Spain. The dataset comprises daily concentrations from 28 monitoring stations (2006–2023) for PM10, PM2.5, NO, NO2, NOx, O3, SO2, and CO, affected by missingness, structural inconsistencies, and extreme values. The contribution of this study lies in integrating standardized data ingestion and QA/QC, chained-equation imputation with Bayesian Ridge regression, hold-out validation, physicochemical consistency checks, and robust extreme-value handling within a traceable processing workflow. Missing values are reconstructed per pollutant using plant-level multi-station pooling to improve stability. Performance is evaluated using a 5% masked hold-out and summarized with MAE, RMSE, R², and bias, complemented by an operational fit-quality label. Post-imputation controls enforce NO–NO2–NOx consistency and the physical constraint PM2.5 ≤ PM10, while extreme values are screened through a hierarchical robustness framework combining a Hampel filter, winsorization, and a Tukey IQR criterion. The workflow outputs documented diagnostics and robust daily series while preserving the traceability of observed values, flags, edits, and final decisions.

1. Introduction

Air-quality monitoring networks provide long-term time series that enable the assessment of environmental changes and support observational analyses linked to energy transitions and decarbonization processes under Spain’s National Energy and Climate Plan (PNIEC) [1,2]. In practice, however, regulatory series frequently contain missing values, structural inconsistencies (e.g., incomplete months or irregular calendars), and extreme observations associated with instrument downtime, maintenance, operational incidents, transmission failures, or genuine pollution episodes [3,4,5,6]. If these issues are not handled through transparent and consistent preprocessing, they can reduce temporal comparability, introduce bias, increase uncertainty, and weaken the robustness of downstream temporal analyses, including PRE/POST designs [7].
A broad body of the literature addresses missing-data imputation and outlier detection in environmental and air-quality time series, ranging from classical statistical approaches to more recent time-series and machine-learning methods [7,8,9,10]. However, published studies often address only part of the preprocessing problem—for example, focusing exclusively on imputation or exclusively on extreme values—or rely on procedures that are difficult to reproduce, audit, and maintain when scaled to heterogeneous regulatory networks [6,7,11,12]; recent work has also emphasized the importance of traceable and reproducible data streams in environmental science [13]. In observational settings, this makes three requirements particularly relevant: first, ensuring traceability and data governance through explicit QA/QC procedures and documented validation within the European monitoring framework [14,15,16]; second, quantitatively assessing imputation performance using masking schemes and widely used error/agreement metrics to avoid overinterpretation of fit [17,18,19]; and, third, preserving physicochemical and operational consistency among related variables, including internal consistency within the NO–NO2–NOx system and physical constraints such as PM2.5 ≤ PM10, to avoid implausible values after imputation or extreme-value treatment [20,21].
This article presents and validates a reproducible procedure to transform daily series from regulatory networks into datasets ready for temporal analysis. The contribution of the study lies not in proposing a new imputation algorithm or a new outlier detector, but in integrating the required preprocessing stages within a coherent, reproducible, and auditable workflow. The workflow integrates (i) quality assurance and quality control (QA/QC) checks and structural harmonization through systematic cleaning and verification processes, in line with European frameworks for the validation, comparability, and reporting of air quality data [15,16], complemented by applied evidence on incident detection and anomalous behavior in monitoring networks [6]; (ii) multivariate missing-data reconstruction using an iterative chained-equation imputation scheme with Bayesian Ridge regression, implemented within a MICE-based framework [22,23,24,25,26]; (iii) hold-out validation using MAE, RMSE, R2, and bias to quantify reconstruction performance and reduce biased interpretations of fit [7,19]; (iv) post-imputation controls based on physicochemical consistency among related variables, including the NO–NO2–NOx system and the physical constraint PM2.5 ≤ PM10 [20,21]; and (v) robust handling of extreme values and generation of a final daily series with traceable decisions for retention, adjustment, or removal, supported by criteria commonly used in environmental data processing [12,27,28].
The procedure is applied to a case study focused on the areas of influence for coal-fired power plants in northwest Spain, within a context of energy transition and the progressive cessation of coal-based electricity generation [1,2]. This study adopts a PRE/POST observational design, supported by the ability of long-term monitoring networks to detect environmental change provided that the underlying series are made comparable through reproducible QA/QC, imputation, and extreme-value treatment [29,30]. The analysis considers daily concentrations of the main criteria pollutants of regulatory interest—PM10, PM2.5, NO, NO2, NOx, O3, SO2, and CO—whose monitoring is prioritized in European air-quality assessment frameworks because of their environmental and regulatory relevance [31]. In Spain, these pollutants are covered by national air-quality legislation and by the official instruments used for atmospheric pollution assessment and control [32,33]. This article is organized as follows: Section 2 describes the data source and the proposed methodological workflow (spatial selection, QA/QC, hold-out validation, imputation, and robust cleaning); Section 3 presents validation results and derived products; Section 4 discusses methodological implications, limitations, and transferability; and Section 5 summarizes the main conclusions.

2. Materials and Methods

2.1. Station Selection and Spatial Assignment to Power Plants

Stations were selected using a reproducible spatial criterion based on proximity to each coal-fired thermal power plant. The selection was implemented in a GIS environment by creating 10 km buffers around each plant and performing a spatial intersection with the station layer using QGIS Desktop 3.40.12, thereby identifying monitoring stations located within that radius and generating an initial inventory of candidate stations per plant (MITECO Pollutant Release and Transfer Register, PRTR-Spain) [34]. A 10 km radius was adopted as an operational threshold intended to represent the local (near-field) environment of each point source while maintaining a minimum level of station support for the observational design. This distance was retained as a study-specific compromise between three practical requirements: proximity to the emitting facility, sufficient station availability, and control of excessive spatial overlap between neighboring plant buffers. Shorter distances led to very sparse station coverage, whereas larger distances increased overlap and reduced the spatial specificity of plant-level assignment, particularly in the Asturian sector. This operational choice is broadly compatible with near-source framing used in atmospheric modeling guidance and applied technical documents discussing local-scale source influence [35,36]. The resulting approach operationalizes the PRE/POST observational design with an explicit, reproducible, and verifiable spatial rule, while preserving the traceability of station inclusion and exclusion.
Starting from an initial universe of 12 coal-fired power plants within the study area, only those with at least one station within the 10 km buffer were included in the analysis; plants with no stations within the established radius (N stations within 10 km = 0) were excluded (Table 1). As a result, the analytical set comprised ten plants, while two plants (Narcea and Anllares) were excluded because they did not have stations within the established radius.
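Under illustrative assumptions (hypothetical plant and station coordinates, and a spherical-Earth great-circle distance instead of the projected GIS buffers actually used with QGIS), the 10 km inclusion rule can be sketched as follows; plants whose station list comes back empty correspond to the excluded cases:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two WGS84 points (spherical approximation)."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def assign_stations(plants, stations, radius_km=10.0):
    """Return {plant: [station ids within radius_km]}.
    Plants with no station in the buffer are kept with an empty list,
    so the exclusion decision remains explicit and traceable."""
    out = {}
    for plant, (plat, plon) in plants.items():
        out[plant] = [sid for sid, (slat, slon) in stations.items()
                      if haversine_km(plat, plon, slat, slon) <= radius_km]
    return out

# Hypothetical coordinates, for illustration only.
plants = {"CT_A": (42.90, -6.40), "CT_B": (43.40, -5.80)}
stations = {"ST1": (42.93, -6.38), "ST2": (43.10, -6.00)}
print(assign_stations(plants, stations))  # CT_B has no station within 10 km
```

A station falling inside two overlapping buffers simply appears in both lists, which matches the multiple-membership policy described above.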
After applying the spatial criterion, the final set comprised 28 regulatory stations with daily time series over 2006–2023, whose characterization and spatial assignment are documented in Table 2 (ID_MAPA, station code, typology, distance to the assigned plant, and number of plants within the radius). Due to overlapping areas of influence, two stations were identified as being simultaneously included in more than one 10 km buffer (N_CT_10 km = 2): ID_MAPA 22 (COD_LOCAL 33031029) and ID_MAPA 24 (COD_LOCAL 33037012) (Table 2). In these cases, multiple membership was retained to preserve assignment traceability and to enable both plant-specific and aggregated area-level analyses. The number of supporting stations also varied across plants, including cases with only one station within the selected radius. Figure 1 shows the locations of the plants, the 10 km buffers, and the selected stations, allowing the verification of the spatial assignment used in the study.
All official monitoring stations located within the 10 km buffer were subsequently carried forward into the cleaning, analytical validation, imputation, and robustification workflow developed in this study. Throughout this process, all series were treated uniformly, without applying specific weighting according to station type, area setting, or distance to the thermal power plant. To preserve traceability and support later context-aware interpretation, the final processed dataset retained descriptors such as station type, area setting, and distance to the thermal power plant (Table 2), so that future air-quality analyses may distinguish, where relevant, among industrial, traffic, and background contexts.

2.2. Data Source and Database Structure

This study uses time series from official air-quality monitoring networks, collected under European quality assurance and quality control (QA/QC) frameworks and comparability and reporting requirements [15,16]. The data were obtained upon request from the Environmental Information Office of the Ministry for the Ecological Transition and the Demographic Challenge (MITECO) as official series aggregated at daily resolution (hourly → daily) for the 2006–2023 period [37].
The analytical dataset includes daily concentrations of PM10, PM2.5, NO, NO2, NOx, O3, SO2, and CO. These official series derive from the regulatory monitoring framework, in which gaseous and particulate pollutants are measured using reference methods or methods demonstrated as equivalent under the applicable European and Spanish legislation [15,16]. For the pollutants considered here, the reference measurement principles correspond to ultraviolet fluorescence for SO2, chemiluminescence for NO/NO2/NOx, ultraviolet photometry for O3, non-dispersive infrared spectroscopy for CO, and gravimetric determination for particulate matter. In operational network practice, official PM series may also derive from equivalent automated methods; in this study, technique codes are used explicitly only when assessing PM2.5–PM10 comparability for coherence checks (Section 2.7.4).
The processing unit is defined as the pair (station, pollutant), preserving the original reporting scale and its numerical precision.

2.3. Input Structure and Temporal Harmonization

Records were received in a wide tabular format (year/month with daily columns D01–D31) and were harmonized to ensure consistency with the real calendar and to avoid structural inconsistencies that could propagate to subsequent stages. In particular, we verified (i) the validity of days per month, (ii) the uniqueness of dates, and (iii) the absence of duplicates after transformation into continuous series.
Next, the data were transformed into a long daily format (one row per date), preserving the observed value and the PUNTO_MUESTREO identifier to maintain the traceability of the historical record [10]. Discontinuities associated with instrument failures, maintenance downtime, or transmission problems were encoded as missing values and recorded using control indicators (flags) without overwriting the original data.
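The wide-to-long harmonization and the three calendar checks can be sketched with pandas; the year/month field names (ANNO, MES) are hypothetical placeholders, the D01–D31 day columns follow the format described above, and the data are synthetic:

```python
import pandas as pd

# Hypothetical wide-format excerpt: one row per (year, month) with day columns D01-D31.
wide = pd.DataFrame({
    "ANNO": [2020], "MES": [2],
    **{f"D{d:02d}": [float(d)] for d in range(1, 32)},
})

# Reshape to long daily format: one row per date.
long = wide.melt(id_vars=["ANNO", "MES"], var_name="DIA", value_name="VALOR")
long["DIA"] = long["DIA"].str[1:].astype(int)

# Check (i): validity of days per month -- impossible combinations (e.g., 30 Feb)
# become NaT under errors="coerce" and are dropped.
long["FECHA"] = pd.to_datetime(
    dict(year=long["ANNO"], month=long["MES"], day=long["DIA"]), errors="coerce")
long = long.dropna(subset=["FECHA"]).sort_values("FECHA")

# Checks (ii) and (iii): unique dates, no duplicates after reshaping.
assert long["FECHA"].is_unique
```

February 2020 (a leap year) yields exactly 29 rows, with the invalid D30/D31 cells discarded by the calendar check rather than silently kept.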

2.4. Pre-Imputation QA/QC

The QA/QC approach implemented in this study should be distinguished from the primary QA/QC procedures of the official monitoring networks. Starting from already-compiled official daily series, our pre-imputation QA/QC was designed as a reproducible analytical preprocessing step to reduce the propagation of inconsistencies into subsequent reconstruction and robustification stages. Accordingly, it was applied after the official network validation process established within the regulatory air-quality monitoring framework [38]. Station control logs were therefore not incorporated as an additional analytical input in the present workflow.
Applied uniformly across all station–pollutant series, this phase included (i) integrity and consistency checks of metadata and operational identifiers (including PUNTO_MUESTREO) and (ii) basic numerical plausibility controls, with the flagging of negative or otherwise implausible values prior to imputation [6,15]. As a result, each daily observation was associated with a status flag that distinguishes, at minimum, between (i) valid observed value, (ii) missing value, (iii) value excluded due to structural inconsistency, and (iv) value flagged for plausibility. This coding constitutes the traceability backbone of the full workflow, allowing the reconstruction of the original dataset and auditing of any subsequent change.
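A minimal sketch of the four-state status coding follows; the flag labels are illustrative stand-ins, not the study's actual field values, and the observed value is never modified, only annotated:

```python
import math

# Hypothetical flag vocabulary mirroring the four minimum states described above.
VALID = "valid"
MISSING = "missing"
EXCLUDED_STRUCT = "excluded_structural"
FLAGGED_PLAUS = "flag_plausibility"

def qa_status(value, structurally_valid=True):
    """Assign a pre-imputation status flag to one daily record.
    The value itself is left untouched; only the flag is derived,
    preserving the traceability backbone of the workflow."""
    if not structurally_valid:
        return EXCLUDED_STRUCT      # e.g., impossible calendar day
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return MISSING              # instrument downtime, transmission gap
    if value < 0:
        return FLAGGED_PLAUS        # implausible negative concentration
    return VALID

assert qa_status(12.3) == VALID
assert qa_status(float("nan")) == MISSING
assert qa_status(-0.5) == FLAGGED_PLAUS
```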

2.5. Imputation Validation Design (Hold-Out)

To evaluate imputation performance while preventing information leakage, a validation scheme was implemented by masking (hold-out) a fixed fraction of originally valid observations. For each (plant, pollutant) block, 5% of originally observed entries were randomly selected, replaced with missing values, and reserved as the validation set. Imputation was then run on the masked series, and the imputed values at the reserved dates were compared with the original observations to estimate reconstruction error. This design enables performance quantification under controlled conditions that are comparable across series. Because hold-out dates are selected at random, the evaluation approximates a Missing Completely At Random (MCAR)-like setting and therefore mainly reflects reconstruction performance under relatively simple missing-data conditions [4]. In addition, months with severe missingness (≥50%) were excluded before imputation because, at that level of absence, the monthly segment no longer provides sufficient observational support for robust multivariate reconstruction and post-imputation coherence checks. Retaining such periods would have moved the reconstruction closer to weakly supported extrapolation and reduced the homogeneity of internal validation across analytical blocks. Accordingly, the adopted validation design should be interpreted as an internal and operational benchmark rather than as a complete representation of all missing-data scenarios encountered in practice.
Let $x_t$ be the original observed daily value on day $t$. Let $m_t$ be a binary masking indicator, with $m_t = 1$ if day $t$ belongs to the hold-out subset and $m_t = 0$ otherwise. The input series for imputation is defined as shown in Equation (1):

$$x_t^{\mathrm{mask}} = \begin{cases} x_t, & m_t = 0 \\ \mathrm{NA}, & m_t = 1 \end{cases} \tag{1}$$
After imputation, let $\hat{x}_t$ be the imputed value. Quantitative evaluation was summarized using standard metrics computed exclusively on the masked subset $\{t : m_t = 1\}$, with $n$ reserved observations. Specifically, MAE and RMSE are defined in Equations (2) and (3):

$$\mathrm{MAE} = \frac{1}{n} \sum_{t : m_t = 1} \left| x_t - \hat{x}_t \right| \tag{2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t : m_t = 1} \left( x_t - \hat{x}_t \right)^2} \tag{3}$$

Mean bias (Bias) was computed as the mean signed error, defined in Equation (4):

$$\mathrm{Bias} = \frac{1}{n} \sum_{t : m_t = 1} \left( \hat{x}_t - x_t \right) \tag{4}$$

The coefficient of determination was also computed on the masked subset, according to Equation (5):

$$R^2 = 1 - \frac{\sum_{t : m_t = 1} \left( x_t - \hat{x}_t \right)^2}{\sum_{t : m_t = 1} \left( x_t - \bar{x} \right)^2} \tag{5}$$

where

$$\bar{x} = \frac{1}{n} \sum_{t : m_t = 1} x_t$$
For operational interpretation and homogeneous comparison across pollutants, stations, and plants, these continuous metrics were further summarized into an ordinal fit-quality classification (1–4) based primarily on the $R^2$ criterion (Table 3). MAE, RMSE, $R^2$, and Bias are also reported as continuous results in Section 3.
In addition to the primary $R^2$-based class, complementary diagnostics were inspected to contextualize the results and identify potential issues that may warrant sensitivity analyses: the absolute bias $|\mathrm{Bias}|$, which reflects systematic offset (Equation (6)), and the heavy-tail indicator $H = \mathrm{RMSE}/\mathrm{MAE}$, which helps to identify the presence of occasional large reconstruction errors (Equation (7)). These diagnostics are reported alongside the continuous metrics but were not used to assign the ordinal class.

$$\left| \mathrm{Bias} \right| = \mathrm{abs}\!\left( \mathrm{Bias} \right) \tag{6}$$

$$H = \frac{\mathrm{RMSE}}{\mathrm{MAE}} \tag{7}$$
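The hold-out evaluation in Equations (2)–(7) can be sketched with NumPy on synthetic data; the "imputed" series below is a noisy stand-in for illustration, not the output of the MICE procedure:

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, matching the reproducibility setup

def holdout_metrics(x_obs, x_hat, mask):
    """Compute MAE, RMSE, Bias, R2 and the heavy-tail ratio H = RMSE/MAE
    exclusively on the masked (held-out) subset."""
    x, xh = x_obs[mask], x_hat[mask]
    err = xh - x
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((x - x.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")  # undefined -> class Low (1)
    return {"MAE": mae, "RMSE": rmse, "Bias": np.mean(err), "R2": r2, "H": rmse / mae}

# Synthetic skewed, pollutant-like series; mask 5% of "observed" days at random.
x_obs = rng.gamma(shape=2.0, scale=10.0, size=1000)
mask = np.zeros(1000, dtype=bool)
mask[rng.choice(1000, size=50, replace=False)] = True   # 5% hold-out
x_hat = x_obs + rng.normal(0, 1.0, size=1000)           # stand-in for imputed values
m = holdout_metrics(x_obs, x_hat, mask)
```

Note that $H \ge 1$ by construction, since the RMSE never falls below the MAE; values of $H$ well above 1 indicate occasional large reconstruction errors.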
This ordinal classification is subsequently used in the plant–pollutant comparison, while MAE, RMSE, $R^2$, and Bias values are also reported as continuous results in Section 3.
Table 3. Criteria used to classify imputation performance (“fit quality”) based on the 5% hold-out validation scheme. The ordinal fit-quality score (1–4) is assigned using the primary R 2 criterion.
Fit-Quality Category    R² Criterion (Primary)
Excellent (4)           R² ≥ 0.65
Good (3)                0.55 ≤ R² < 0.65
Acceptable (2)          0.45 ≤ R² < 0.55
Low (1)                 R² < 0.45
$R^2$ is computed on the masked observations (5% hold-out subset; see Figure 2). If $R^2$ is negative or undefined (near-zero variance), the fit is classified as Low (1). Fit-quality classes are intended as operational indicators for network management and are not equivalent to precision metrics in predictive modeling; in particular, $R^2 \ge 0.65$ is labeled Excellent (4) in the context of regulatory data completeness, not model prediction accuracy.
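The Table 3 rule, including the fallback to Low (1) for negative or undefined $R^2$, can be expressed as a small function; a sketch:

```python
import math

def fit_quality_class(r2):
    """Map the hold-out R2 to the ordinal fit-quality label of Table 3.
    Negative or undefined R2 (near-zero variance) falls into Low (1)."""
    if r2 is None or (isinstance(r2, float) and math.isnan(r2)) or r2 < 0.45:
        return 1  # Low
    if r2 < 0.55:
        return 2  # Acceptable
    if r2 < 0.65:
        return 3  # Good
    return 4      # Excellent

assert fit_quality_class(0.70) == 4
assert fit_quality_class(0.50) == 2
assert fit_quality_class(float("nan")) == 1
```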
Figure 2. Phase A of the data-processing workflow: pre-imputation QA/QC, 5% hold-out validation, and chained-equation imputation. The pipeline produces traceable daily time series after hourly-to-daily aggregation, applying pre-imputation checks, masking-based validation, and iterative chained-equation gap filling, followed by basic plausibility constraints prior to Phase B (outlier processing). The main decision paths associated with exclusions, flagging, validation masking, and post-imputation traceability are shown explicitly.

2.6. Imputation of Missing Values Using a Chained-Equation Iterative Scheme (Bayesian Ridge)

Missing values were reconstructed using an iterative chained-equation imputation scheme with Bayesian Ridge regression, implemented in a MICE-based framework [10,23,24,25,26,39]. In the present study, the procedure was used to generate a single completed dataset for each analytical block, rather than a formal multiple-imputation design for uncertainty propagation across several completed datasets.
In a preliminary stage, extending a strictly within-station approach to the full dataset showed limited stability across a part of the network. To improve robustness and reduce between-station variance, we adopted a plant-level multi-station pooling strategy by aggregating observations from all stations located within the 10 km buffer of each power plant. This design was intended to capture the shared temporal signal within each plant-specific spatial context and to stabilize reconstruction in sparse series by borrowing strength from nearby stations. Pooling was implemented within each pollutant, without mixing chemical species, so that each station was treated as a variable (column) in a daily plant-level matrix for the corresponding pollutant. This configuration allows the imputation model to exploit shared temporal covariation among nearby stations while preserving pollutant-specific structure [3,9,40].
No meteorological covariates were included as predictors; reconstruction therefore relies on shared temporal covariation among nearby stations within each plant-specific block. The complete workflow for preprocessing, hold-out validation, and imputation is summarized in Figure 2.

2.6.1. Scope and Imputation Unit: (Plant, Pollutant)

The operational imputation unit was defined as the pair (plant, pollutant). For a plant $c$ and a pollutant $p$, a daily matrix was constructed in which each associated station acts as a variable (column). Denoting by $x_{t,s}^{c,p}$ the daily concentration on day $t$ at station $s$, the imputation block is defined as in Equation (8):

$$X^{c,p} = \left[ x_{t,s}^{c,p} \right]_{t = 1, \ldots, T;\; s = 1, \ldots, S_c} \tag{8}$$
In MICE, each column with missingness is modeled conditionally on the others. For each station $s$, the conditional model is expressed as in Equation (9), using the remaining stations $X_{t,-s}^{c,p}$ as predictors:

$$x_{t,s}^{c,p} = f_s\!\left( X_{t,-s}^{c,p} \right) + \varepsilon_{t,s} \tag{9}$$
In this study, imputation was applied by pollutant and no other pollutants were used as predictors [10,39]. The data were kept within the temporal coverage reported by MITECO for each station, with no extrapolation beyond official coverage. Non-existent calendar days were excluded by construction, and dates without measurements remained coded as missing values.

2.6.2. Preprocessing and Workflow Constraints

Before imputation, months with severe missingness (≥50%) were excluded as an operational constraint to avoid reconstructing extended missing blocks with very limited observational support, a situation that can increase uncertainty and degrade the interpretability of temporal aggregations [7]. This rule was adopted as a conservative preprocessing decision within the workflow. In addition, no auxiliary variables were incorporated as predictors, maintaining a conservative configuration focused on the shared signal across stations.
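The ≥50% monthly-missingness exclusion can be sketched with pandas on a synthetic daily series for one (station, pollutant) unit:

```python
import numpy as np
import pandas as pd

# Synthetic daily series: January complete, February mostly missing.
idx = pd.date_range("2021-01-01", "2021-02-28", freq="D")
s = pd.Series(np.random.default_rng(0).gamma(2.0, 10.0, len(idx)), index=idx)
s.loc["2021-02-01":"2021-02-20"] = np.nan   # February: 20/28 days missing (>50%)

# Fraction of missing days per calendar month.
miss_frac = s.isna().groupby([s.index.year, s.index.month]).mean()

# Months at or above the 50% threshold are excluded before imputation.
bad_months = miss_frac[miss_frac >= 0.5].index
keep = ~pd.MultiIndex.from_arrays([s.index.year, s.index.month]).isin(bad_months)
s_kept = s[keep]
```

In this example only February is dropped, leaving the 31 fully observed January days as usable support for multivariate reconstruction.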

2.6.3. Imputer Specification (Bayesian Ridge Regression)

Bayesian Ridge regression was used as the estimator within the chained-equation iterative imputation scheme due to its stability in the presence of collinearity and its robust behavior when the available information is partial [23,25]. For a day $t$ and a target station $s$, the linear model is expressed as in Equation (10):

$$x_{t,s} = \beta_0 + \boldsymbol{\beta}^{\top} X_{t,-s} + \varepsilon_t, \quad \text{where } \varepsilon_t \sim \mathcal{N}\!\left(0, \sigma^2\right) \tag{10}$$
The chained-equation procedure alternates between fitting and imputing each variable with missingness over several iterations. The iterative update of the chained-equations scheme is represented generically as in Equation (11):

$$x_{t,s}^{(k)} \leftarrow g_s\!\left( X_{t,-s}^{(k)} \right), \quad \text{for } s = 1, \ldots, S_c \text{ and } k = 1, \ldots, K \tag{11}$$

Here, $k$ indicates the iteration within the procedure. In this study, $K = 10$ iterations per block and a fixed global seed (random_state) were set for reproducibility. Imputed values were stored in a derived field valor_mice, without overwriting the original observed value.
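A minimal sketch of this configuration with scikit-learn's IterativeImputer follows; the plant-level block is synthetic, while the estimator, iteration count, and fixed seed mirror the settings described above:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge

# Synthetic plant-level block: days x stations for one pollutant,
# sharing a common temporal signal plus station-specific noise.
rng = np.random.default_rng(0)
signal = rng.gamma(2.0, 10.0, size=(365, 1))
X = signal + rng.normal(0, 2.0, size=(365, 4))      # 4 correlated stations
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.1] = np.nan       # ~10% gaps

imputer = IterativeImputer(
    estimator=BayesianRidge(),   # stable under collinearity
    max_iter=10,                 # K = 10 iterations per block
    random_state=42,             # fixed seed for reproducibility
)
X_completed = imputer.fit_transform(X_missing)
assert not np.isnan(X_completed).any()
```

Observed entries pass through unchanged; only the missing cells are filled, which is what allows the imputed values to be stored in a derived field without overwriting the original record.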

2.6.4. Scale Transformations (Sensitivity Test)

Positive skewness and heteroscedasticity are frequently observed in environmental series, such that a logarithmic transformation can facilitate imputation by approximating multiplicative relationships as additive and stabilizing variability. This behavior has been documented for pollutant distributions and in applications that explicitly use log transformations to correct bias and improve the statistical behavior of variables [41]. In general terms, linear models commonly apply variable transformations to better approximate the assumptions of homoscedasticity and residual normality [42] and, in the context of multiple imputation, it has been noted that treating highly skewed variables as normally distributed can introduce bias and yield implausible values (e.g., negative values). A prior transformation followed by back-transformation has been recommended [43].
Therefore, within the proposed reproducible QA/QC and imputation workflow, an optional sensitivity step based on a logarithmic transformation prior to imputation is included. Here, log denotes the natural logarithm (ln). The transformation and its back-transformation are defined in Equations (12) and (13):
$$z = \log(1 + x) \tag{12}$$

$$x = \exp(z) - 1 \tag{13}$$
In this study, this transformation was not applied in the final configuration nor in the reported results, as imputation on the original scale provided satisfactory performance. In addition, back-transformation may introduce bias on the original scale and smooth extremes; therefore, its use is reserved for cases where it yields a clear improvement in fit (R2) without increasing bias (Bias) [42,43,44].
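The transformation pair in Equations (12) and (13) corresponds to NumPy's log1p/expm1, which are numerically stable for small concentrations. The exact round-trip shown below holds for untouched values; the back-transformation bias discussed above arises only for values imputed on the log scale, not for this identity check:

```python
import numpy as np

x = np.array([0.0, 3.0, 40.0, 250.0])   # skewed, non-negative concentrations
z = np.log1p(x)                          # Equation (12): z = log(1 + x)
x_back = np.expm1(z)                     # Equation (13): x = exp(z) - 1
assert np.allclose(x_back, x)            # exact round-trip on unmodified values
```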

2.6.5. Post-Imputation Controls and Traceability

After imputation (and, when applicable, after back-transformation), fixed postprocessing constraints were applied to prevent physically implausible results. In particular, negative values were discarded according to the rule given in Equation (14):
$$\hat{x}_t < 0 \;\Rightarrow\; \mathrm{DROP\_NAN} \tag{14}$$
Traceability was preserved by keeping the original observed value immutable and explicitly recording the imputed data in dedicated fields, together with indicators (flags) that allow observed, missing, and reconstructed values to be distinguished. This scheme enables the reconstruction of the original dataset and auditing of any modification throughout the processing workflow.

2.7. Detection and Conservative Treatment of Extreme Values (Outliers) and Construction of Robust Series

After completing the imputation (Section 2.6), a specific stage was applied to identify and treat extreme values in order to reduce the influence of spurious records (e.g., instrumental errors, transmission failures, or validation incidents) without removing real high-concentration episodes [45]. The detection of extreme values is particularly relevant because even a small proportion of atypical observations can distort estimates and affect statistical inference; therefore, their identification and documentation should be part of the preliminary screening of the data [46].
This phase was designed with a conservative and fully traceable approach: the original observed value is not overwritten, extreme-value candidates are flagged through control indicators, and derived series intended for robust analyses are generated [47]. This approach is consistent with common QA/QC practices in regulatory networks and with methodological proposals for quality control and anomaly detection in environmental time series [6,28,48].

2.7.1. Unit of Application and Operational Principle

Phase B was applied to daily series by a station–sampling point–pollutant combination. For each day t with observation x t (field VALOR), the procedure generates outlier and coherence diagnostics and assigns an audited decision (decision_final, razon_final) that leads to the analytical product VALOR_robusto, while always keeping the original value immutable.

2.7.2. Robust Global Detection of Candidates (IQR)

Primary detection was based on the interquartile range (IQR) criterion, originally proposed by Tukey [49], which allows outlier detection without assuming normality [50]. As a first screening layer, a conservative operational threshold with $k = 3$ was used to flag severe extremes. For each daily series (station–sampling point–pollutant unit), the first and third quartiles, $Q_1$ (25th percentile) and $Q_3$ (75th percentile), were computed, and the interquartile range was defined as shown in Equation (15):

$$\mathrm{IQR} = Q_3 - Q_1 \tag{15}$$

Based on the IQR, global robust bounds were established as shown in Equations (16) and (17):

$$L = Q_1 - k \cdot \mathrm{IQR} \tag{16}$$

$$U = Q_3 + k \cdot \mathrm{IQR} \tag{17}$$

where $k$ controls the severity of the criterion. In this work, $k = 3$ was set, and any day with observation $x_t$ satisfying Equation (18) was labeled as an IQR outlier candidate:

$$x_t < L \quad \lor \quad x_t > U \tag{18}$$
The choice of k = 3 in Equations (16) and (17) aims for a conservative approach that prioritizes the detection of clearly extreme deviations, compatible with both real pollution episodes and isolated measurement artefacts, while minimizing false positives in series with high intrinsic variability. This type of robust threshold has been used in quality control of comparable environmental series and in monitoring-network studies (e.g., for ozone), where the distribution may be skewed and heavy-tailed [30,51].
Records flagged by IQR were retained as contextual flags (is_outlier_IQR) for subsequent integration into the final decision logic (Figure 3), without modifying the original value at this stage.
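The Tukey fences of Equations (15)–(18) can be sketched as a flagging function that never modifies the values themselves, consistent with the contextual-flag policy above:

```python
import numpy as np

def iqr_flags(x, k=3.0):
    """Flag severe extremes with Tukey fences (IQR criterion).
    Returns a boolean is_outlier_IQR array; values are NOT modified."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (x < lower) | (x > upper)

# Synthetic skewed series with one spurious spike.
rng = np.random.default_rng(1)
series = rng.gamma(2.0, 10.0, 1000)
series[100] = 1e4
flags = iqr_flags(series, k=3.0)
```

With $k = 3$ the fences are wide, so only clearly extreme departures are flagged, which is the conservative behavior intended for series with high intrinsic variability.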

2.7.3. Local Diagnosis and Conservative Treatment (Hampel + Winsorization)

To complement the global IQR screening (Section 2.7.2), a local and robust diagnosis based on the Hampel filter was applied [52]. This approach evaluates each daily observation against its nearby temporal context and is suitable for series with skewness, seasonality, and heavy tails, which are common in atmospheric pollutants. The diagnosis is constructed from a moving temporal reference and a robust scale.
Let $x_t$ be the daily concentration on day $t$. Let $\tilde{x}_t$ be the centered rolling median (31-day window; center = True), and let $\mathrm{MAD}_t$ be the median absolute deviation from $\tilde{x}_t$, defined in Equation (19):

$$\mathrm{MAD}_t = \operatorname{median}\!\left( \left| x_{t+i} - \tilde{x}_t \right| \right), \quad i \text{ in the local window} \tag{19}$$

The local robust scale was defined as in Equation (20):

$$s_t = 1.4826 \cdot \mathrm{MAD}_t \tag{20}$$

The factor 1.4826 is the usual consistency coefficient under normality and expresses $s_t$ on a scale comparable to a robust standard deviation. The Hampel statistic (local robust z-score) was defined as in Equation (21):

$$z_t = \frac{\left| x_t - \tilde{x}_t \right|}{s_t} \tag{21}$$

A value was considered a local extreme candidate when it met the threshold shown in Equation (22):

$$z_t > \lambda \tag{22}$$

In this work, $\lambda = 6$ was used to identify clearly extreme deviations under a conservative approach [52,53].
The rolling median $\tilde{x}_t$ and $\mathrm{MAD}_t$ were computed requiring a minimum of 15 valid observations within the window (min_periods = 15). If this minimum is not reached or if $s_t = 0$, the diagnosis is considered non-informative for that day and winsorization is not applied.
Instead of automatically removing Hampel-detected extremes, a conservative winsorization step was applied [47]. This step limits the influence of isolated peaks without altering the temporal structure of the series.
For each day $t$, a symmetric local acceptance interval around $\tilde{x}_t$ was defined, as indicated in Equations (23) and (24):
$L_{H,t} = \tilde{x}_t - 6 s_t$
$U_{H,t} = \tilde{x}_t + 6 s_t$
When the day meets Equation (22) (with $\lambda = 6$) and passes the applicable coherence barriers (Section 2.7.4), the winsorized value is computed by clipping according to Equation (25):
$x^{*}_{t} = \operatorname{clip}\left( x_t, L_{H,t}, U_{H,t} \right)$, with $\operatorname{clip}(a, L, U) = \min\left( \max(a, L), U \right)$
This clipping is symmetric and avoids directional bias, while limiting only the most extreme departures from the local reference [46,54].
The winsorized value is stored in the auxiliary field valor_winsor and does not overwrite VALOR. The construction of the final analytical product was defined as in Equation (26):
$\mathrm{VALOR\_robusto}_t = \begin{cases} x_t & \text{if decision\_final} = \mathrm{KEEP} \\ x^{*}_{t} & \text{if decision\_final} = \mathrm{KEEP\_EXTREMO} \\ \mathrm{NA} & \text{if decision\_final} = \mathrm{DROP\_NAN} \end{cases}$
This design maintains full traceability between the observed data, the diagnostic flags, and the final decision.
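The local diagnosis and conservative treatment above can be condensed into a short sketch of Equations (19)–(25) (a minimal illustration assuming pandas; the helper names are ours, and the actual implementation may differ in bookkeeping and flag handling):

```python
import numpy as np
import pandas as pd

def _local_mad(window: np.ndarray) -> float:
    """Eq. (19): median absolute deviation from the window's own median."""
    m = np.nanmedian(window)
    return np.nanmedian(np.abs(window - m))

def hampel_winsorize(x: pd.Series, window: int = 31, min_periods: int = 15,
                     lam: float = 6.0) -> pd.DataFrame:
    roll = x.rolling(window, center=True, min_periods=min_periods)
    med = roll.median()                               # centered rolling median
    s = 1.4826 * roll.apply(_local_mad, raw=True)     # Eq. (20): robust scale
    z = (x - med) / s                                 # Eq. (21): Hampel statistic
    extreme = (z.abs() > lam) & (s > 0)               # Eq. (22); s == 0 -> non-informative
    # Eqs. (23)-(25): clip only flagged days to the local acceptance interval
    winsor = x.clip(lower=med - lam * s, upper=med + lam * s).where(extreme, x)
    return pd.DataFrame({"z_hampel": z, "is_extremo_hampel": extreme,
                         "valor_winsor": winsor})
```

Unflagged days pass through unchanged, and days with an insufficient window (min_periods) or a zero robust scale are left untouched, mirroring the non-informative rule above.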

2.7.4. Physicochemical and Hierarchical Consistency Checks (NO–NO2–NOx/PM)

After flagging outlier candidates using robust criteria (IQR/Hampel) and, where applicable, conservative accommodation through winsorization, internal consistency checks between related species were applied to detect physicochemical and hierarchical inconsistencies. These checks were implemented as barriers within the final decision logic of the analytical product VALOR_robusto, using binary indicators (flags) and auditable reasons, without overwriting the original observed value.
1. Non-negativity (physical–metrological control)
Ambient concentrations cannot be negative; however, negative values may appear in instrumental networks due to noise, drift, or baseline corrections. Therefore, the condition in Equation (27) was verified:
$x_t \ge 0$
In the analyzed dataset, no negative values were detected; therefore, this control required no intervention (no truncation, recoding, or removal). It is retained as a standard QA/QC check.
2. Algebraic–operational consistency of the NO–NO2–NOx system
Operationally, NOx represents the set of nitrogen oxides associated with NO and NO2 (with possible traces arising from the measurement technique). At daily aggregation, NOx should be, at minimum, consistent with the NO and NO2 values. The inequality in Equation (28) was evaluated:
$\mathrm{NOx}_t \ge \max\left( \mathrm{NO}_t, \mathrm{NO}_{2,t} \right) - \varepsilon$
where ε is an instrumental tolerance that absorbs measurement uncertainty, temporal-averaging mismatches, and small rounding errors. In this study, ε = 5 µg·m−3 was adopted as a conservative threshold. Days that violate Equation (28) are flagged as incoherencia_NOx = True.
3. Hierarchical consistency of particulate-matter fractions PM2.5–PM10
To reduce false positives, a tolerant rule defined in Equation (29) was applied:
$\mathrm{PM}_{2.5,t} \le \mathrm{PM}_{10,t} + \max\left( 5,\ 0.10 \cdot \mathrm{PM}_{10,t} \right)$
This tolerance combines an absolute margin (5 µg·m−3), which is relevant at low concentrations, and a relative margin (10% of PM10), which is more appropriate at mid–high levels. Days that violate Equation (29) are flagged as incoherencia_PM = True.
4. Applicability of PM consistency and exceptions due to instrumental non-comparability
PM2.5–PM10 consistency is only interpretable when both magnitudes are instrumentally comparable. Therefore, the PM barrier was applied only when (a) PM2.5 and PM10 came from the same PUNTO_MUESTREO, or (b) both measurements belonged to the same instrumental subfamily PM_MASS (TEOM/BAM/GRAV; Table 4). In non-comparable cases, the data were preserved and the exception was recorded as flag_incoherencia_PM_excepcion = True, maintaining traceability and avoiding biases due to instrumental changes. This approach is consistent with conservative QA/QC in real networks, where isolated discrepancies may reflect instrumental changes, different techniques, or measurement uncertainty, and therefore it is preferable to label rather than impose automatic corrections [55,56,57,58,59].
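The three barriers of Equations (27)–(29) can be sketched as daily flags (an illustrative sketch assuming one row per day; the column names are stand-ins of ours, not the original field layout, and the instrumental-comparability exception is omitted for brevity):

```python
import numpy as np
import pandas as pd

def coherence_flags(df: pd.DataFrame, eps: float = 5.0) -> pd.DataFrame:
    """Daily coherence barriers of Eqs. (27)-(29) as boolean flags."""
    out = pd.DataFrame(index=df.index)
    species = ["NO", "NO2", "NOx", "PM25", "PM10"]
    # Eq. (27): non-negativity (kept as a standard QA/QC check)
    out["flag_negativo"] = (df[species] < 0).any(axis=1)
    # Eq. (28): NOx must reach max(NO, NO2) minus the instrumental tolerance
    out["incoherencia_NOx"] = df["NOx"] < df[["NO", "NO2"]].max(axis=1) - eps
    # Eq. (29): tolerant PM hierarchy (absolute margin at low levels,
    # relative 10% margin at mid-high levels)
    tol = np.maximum(5.0, 0.10 * df["PM10"])
    out["incoherencia_PM"] = df["PM25"] > df["PM10"] + tol
    return out
```

Note that, in this sketch, days with a missing species are simply not flagged (comparisons with NaN evaluate to False), which is consistent with labeling rather than correcting.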

2.7.5. External Plausibility and Regulatory Tagging (Contextual Flags)

In addition to the robust detectors (IQR/Hampel) and internal consistency checks, contextual flags were incorporated to (1) place high values within ranges observed in European monitoring networks and (2) label potential exceedances with respect to regulatory thresholds. These indicators do not modify the daily value and do not act as automatic exclusion rules; their purpose is to support interpretation and traceability.
1. External plausibility (EU reports/annual thresholds)
Annual “plausible high” thresholds were defined based on European air-quality reports (Table 5) [60,61]. Evaluation was performed by station–pollutant–year, provided that annual coverage was sufficient.
In particular, note the following:
  • Minimum coverage: ≥75% of valid days in the year. If not reached, the metric is coded as NA and the supporting information (N_valid_days and coverage) is retained.
  • PM10: the annual high regime was computed as p90.4 of daily means, implemented conservatively as the 36th-highest value (no interpolation). FLAG_PLAUS_PM10_P90_4_GT_75 = True is triggered if p90.4 > 75 µg·m−3.
  • NO2: the annual mean of daily values was computed. FLAG_PLAUS_NO2_MEAN_GT_100 = True is triggered if the annual mean > 100 µg·m−3.
  • PM2.5: the annual mean of daily values was computed. FLAG_PLAUS_PM25_MEAN_GT_30 = True is triggered if the annual mean > 30 µg·m−3.
  • O3: the usual reference threshold is defined on the daily maximum of 8 h running means; when only daily means are available, this control is considered non-operational and is documented as NA.
External plausibility results were stored as contextual flags (FLAG_PLAUS_*) together with supporting fields (N_DIAS_VALIDOS, COBERTURA_ANUAL, and annual metrics). In the analyzed dataset, the annual plausible-high tags were not triggered; therefore, they were retained only as contextual information and did not alter the outlier-handling logic.
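The PM10 rule above (p90.4 taken as the 36th-highest daily mean, subject to ≥75% annual coverage) can be sketched as follows (a minimal illustration; the function name and return layout are ours, and the real workflow also stores the supporting fields):

```python
import numpy as np
import pandas as pd

def pm10_annual_high(daily: pd.Series, min_coverage: float = 0.75):
    """Annual PM10 high regime: p90.4 implemented conservatively as the
    36th-highest daily mean (no interpolation); NA if coverage < 75%.
    `daily` is one calendar year of daily means with a DatetimeIndex."""
    valid = daily.dropna()
    n_days = 366 if daily.index[0].is_leap_year else 365
    coverage = len(valid) / n_days
    if coverage < min_coverage or len(valid) < 36:
        return np.nan, coverage, False
    p90_4 = float(valid.nlargest(36).iloc[-1])   # 36th-highest value
    return p90_4, coverage, p90_4 > 75.0         # FLAG_PLAUS_PM10_P90_4_GT_75
```

Taking the 36th-highest value directly (rather than interpolating the 90.4th percentile) matches the conservative, interpolation-free convention described above.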
2. Daily regulatory tagging (context): traceability and output products
Regulatory tagging fields were incorporated to place elevated daily values against legislated thresholds (Directive 2008/50/EC and its national transposition, Royal Decree 102/2011 and subsequent amendments) [31,33,38]. The reference values used and their interpretability with 24 h daily means are summarized in Table 6. Specifically, note the following:
  • exceso_normativo_diario: indicates that the daily value exceeds a reference threshold applied at daily scale (when interpretable).
  • normativa_no_evaluable_diario: indicates that evaluation is not applicable at daily scale (e.g., criteria defined on percentiles or annual averages).
These flags are stored together with decision_final and razon_final as contextual and traceability information, but they do not act as automatic rules in the construction of VALOR_robusto.

2.8. Coherence Checks and Final Decision Logic to Construct VALOR_robusto

After imputation (Section 2.6) and the conservative diagnosis/treatment of extremes (Section 2.7), a final coherence-and-decision stage was applied to construct the daily analytical series VALOR_robusto. This stage was designed to meet two objectives: physicochemical plausibility and full traceability. In no case is the original record overwritten: VALOR is kept immutable, and any diagnosis or modification is recorded in separate fields.
To ensure auditability and reproducibility, each daily record incorporates four information blocks: (i) outlier diagnosis, via indicators derived from robust methods (e.g., flag_IQR, z_Hampel, is_extremo_Hampel); (ii) internal coherences, to identify physicochemical or hierarchical incompatibilities (incoherencia_NOx, incoherencia_PM) and, when instrumental comparability is not guaranteed, the corresponding non-applicability tag (flag_incoherencia_PM_excepcion); (iii) contextual tagging, which labels conditions relevant to interpreting high values without acting as an automatic exclusion rule (exceso_normativo_diario, normativa_no_evaluable_diario, and FLAG_PLAUS_* when sufficient annual coverage exists); (iv) final decision and justification, explicitly recorded via decision_final and razon_final, together with the resulting value stored in VALOR_robusto. In all cases, the original value VALOR remains immutable and the analytical product is obtained exclusively through derived fields. Figure 3 summarizes the logic of this phase and its relationship to the output fields used in subsequent analyses.

2.8.1. Final Decision Rule and Operational Definition of VALOR_robusto

This section defines the final daily rule that transforms the observed value VALOR $= x_t$ into the analytical value VALOR_robusto by combining (i) robust outlier diagnostics (global IQR and local Hampel diagnosis), (ii) physicochemical coherence checks (NO–NO2–NOx and PM2.5–PM10), and (iii) an audited final decision recorded in the output fields. The semantics, minimum activation conditions, and the effect on VALOR, valor_winsor, and VALOR_robusto are summarized in Table 7, which serves as the single decision dictionary (KEEP, KEEP_EXTREMO, DROP_NAN).
The daily rule is applied following the priority order in Table 7. Preliminary checks come first: if the day is structurally non-evaluable (calendar/coverage constraints) or if VALOR is missing (NaN), the record is preserved as missing (KEEP as “missing”) and VALOR_robusto is set to NaN.
Next, physical barriers and internal coherences are applied. DROP_NAN is assigned when any of the following conditions are detected (and the corresponding rule is applicable): (a) violated non-negativity (Equation (27)); (b) inconsistency in the NO–NO2–NOx system, evaluated with tolerance ε (Equation (28)); and (c) hierarchical PM2.5–PM10 inconsistency (only if applicable), evaluated using the tolerant rule (Equation (29)). To avoid false positives, the PM2.5–PM10 coherence check is applied only when instrumental comparability is adequate, given method-dependent uncertainty and potential systematic offsets (e.g., TEOM/BAM/gravimetry).
The local Hampel diagnosis is then evaluated and, when applicable, conservative accommodation via winsorization. The local statistic is defined as in Equation (21), with robust scale $s_t$ defined in Equation (20) and the local-extreme threshold in Equation (22) with $\lambda = 6$. When a local extreme is triggered and no incoherences are present, the winsorized value is computed according to Equation (25) (equivalent to clipping between Equations (23) and (24)) and decision_final = KEEP_EXTREMO is assigned. In this case, VALOR_robusto takes the winsorized value according to Equation (26).
Finally, if the record is IQR-only (outside the global bounds, Equation (18), but with no local extreme signal, $\left| z_t \right| \le 6$, and no incoherences), the observed value is retained (KEEP) and VALOR_robusto = VALOR (Equation (26)). In the remaining non-extreme cases, the record is likewise kept (KEEP) and VALOR_robusto = VALOR.
Contextual flags (annual external plausibility and daily regulatory tagging) are used only as interpretative and traceability context: they do not trigger automatic exclusions nor modify daily thresholds. The overall process logic is summarized in Figure 3 and allows a reproducible reconstruction of which observations were kept, which were winsorized, and which were discarded, without overwriting the original data [28,47,51].
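In simplified form, the priority order of Table 7 can be sketched per daily record (an illustrative sketch only: the field names mirror the paper's flags, but the real rule also handles calendar non-evaluability, the PM applicability exception, and the stored razon_final justification):

```python
import numpy as np
import pandas as pd

def decide_day(row: pd.Series) -> tuple:
    """Simplified priority-ordered daily rule (sketch of Table 7 / Figure 3).
    Returns (decision_final, VALOR_robusto); VALOR itself is never modified."""
    if pd.isna(row["VALOR"]):
        return "KEEP", np.nan                       # missing stays missing
    if row["VALOR"] < 0 or row["incoherencia_NOx"] or row["incoherencia_PM"]:
        return "DROP_NAN", np.nan                   # physical/coherence barriers
    if row["is_extremo_hampel"]:
        return "KEEP_EXTREMO", row["valor_winsor"]  # conservative accommodation
    return "KEEP", row["VALOR"]                     # includes IQR-only candidates
```

Because coherence barriers are checked before the extreme branch, a day that is both incoherent and locally extreme is discarded rather than winsorized, matching the ordering described above.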

2.8.2. Output Products and QA/QC Control Plots

After applying the daily decision rule and constructing VALOR_robusto (Section 2.8.1), traceable output products are generated to support (i) subsequent statistical analyses, (ii) auditing of postprocessing, and (iii) visual change control (QA/QC). In all cases, the original value VALOR is kept immutable and results are stored in derived fields and/or output files. Operationally, the information is structured into four auditable blocks (Table 8): outlier diagnostics, internal coherences, contextual tagging, and final decision, which enables reconstruction of the reasoning applied to each observation without overwriting the original data.
In addition, as a visual quality-control step, VALOR vs. VALOR_robusto comparison plots are generated to verify that changes are concentrated in isolated episodes, that winsorization does not introduce artefacts, and that discards respond to physical incoherences. This approach is consistent with conservative QA/QC in real monitoring networks [55,56,57,58,59].

3. Results

The results presented below assess the proposed workflow as an integrated preprocessing framework, focusing on effective coverage after QA/QC, reconstruction performance under hold-out validation, and the extent to which the resulting daily series remain sufficiently traceable, coherent, and robust for subsequent long-term temporal analyses in regulatory air-quality contexts.

3.1. Effective Coverage and Preprocessing Outcome (QA/QC) Prior to Imputation

After converting hourly observations to daily resolution (2006–2023) and applying the QA/QC workflow prior to imputation (Phase A; Section 2.6), the final dataset retained high overall completeness. Months with severe missingness were excluded (threshold: ≥50% missing data; Section 2.6.2), so that subsequent imputation operates mainly as gap-filling rather than extensive signal reconstruction.
Pre-imputation completeness is summarized using the weighted percentage of missingness, defined in Equation (30).
$\%\mathrm{Missing}_w = 100 \cdot \dfrac{\sum \text{Missing days}}{\sum \text{Total days}}$
where the sums were computed from monthly missingness summaries at daily resolution. To avoid double counting, overall completeness by pollutant (Table 9) was calculated on unique station–measurement-configuration series (station × measurement configuration/parameterization). In addition, completeness by coal-fired power plant (Figure 4) was estimated through a controlled spatial expansion, in which stations located ≤10 km from more than one plant contribute to each relevant plant, consistent with the study’s spatial design.
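Equation (30) reduces to pooling the monthly counts before dividing (a trivial but illustrative sketch; the column names are ours):

```python
import pandas as pd

def weighted_missing_pct(monthly: pd.DataFrame) -> float:
    """Eq. (30): pooled weighted missingness over monthly summaries with
    (illustrative) columns 'missing_days' and 'total_days'."""
    return 100.0 * monthly["missing_days"].sum() / monthly["total_days"].sum()

monthly = pd.DataFrame({"missing_days": [3, 0, 6], "total_days": [31, 28, 31]})
print(weighted_missing_pct(monthly))  # → 10.0
```

Summing before dividing weights each month by its number of days, which is why the result differs from a plain average of monthly percentages.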
At the plant level, the weighted percentage of missingness was moderate and heterogeneous (Figure 4), with values typically in the single-digit range (approximately 3–9%). This pattern indicates that the final dataset is mostly observed and that imputation acts on localized discontinuities.
At the pollutant level, incompleteness was generally moderate for gases and PM10, whereas PM2.5 concentrated the largest relative missingness (Table 9). Specifically, PM2.5 showed the highest weighted percentage (18.43%) and the largest between-series variability (P75 = 32.24%, maximum = 70.55%), consistent with more limited historical availability and/or the later implementation of PM2.5 measurements across part of the regulatory network. PM10 showed an intermediate level (weighted percentage 6.35%, maximum 53.77%). In contrast, gaseous pollutants remained in a low and relatively narrow range (weighted percentage: CO 5.70%, O3 5.13%, NO2 4.81%, SO2 4.71%, NO 4.56%, NOx 4.21%), reinforcing that the dataset entering the imputation stage is predominantly complete.

3.2. Hold-Out Validation: Imputation Performance (Phase A)

At the pollutant level (pooled across all power plants), hold-out validation indicates overall stable imputation performance (Table 10). Using the primary R2-based criterion in Table 3, CO, O3, NO2 and NOx achieve an Excellent (4) rating, while PM10, PM2.5 and NO fall into Good (3). SO2 exhibits comparatively lower performance (Acceptable (2)), consistent with its intermittency and episodic behavior and the typically weaker inter-station coherence of locally driven primary pollutants. Across most pollutants, Bias remains small in absolute terms, indicating that the reconstruction step does not introduce strong systematic offsets into the completed series. This is relevant for downstream long-term temporal analyses, including observational PRE/POST comparisons, where minimizing artificial level shifts is as important as achieving acceptable pointwise fit.
This pollutant-level summary is spatially disaggregated in the plant–pollutant comparison (Figure 5), where score 3 predominates and quality degradations are bounded and readily localizable to specific plant–pollutant combinations. These localized drops provide a practical basis to identify candidate series for sensitivity analyses and to contextualize subsequent robustification steps.
The R2 pattern by pollutant in Table 10 is consistent with differences in the spatio-temporal structure and the primary/secondary character of each species. Ozone (O3) exhibits a regional component and relatively smooth, predictable variability, modulated by meteorology and seasonality [62]. An O3 lifetime in the free troposphere on the order of weeks has been described, enabling hemispheric-scale transport and favoring a spatially correlated signal [62]; in graph-based approaches, this spatio-temporal correlation is explicitly assumed to propagate information across neighboring nodes [18]. By contrast, short-lived primary pollutants dominated by local sources—particularly NO and SO2—tend to exhibit steep near-source gradients and brief peak episodes, reducing inter-station coherence and penalizing reconstruction when emission peaks are randomly masked. In the case of NO2, despite its generally local character, the pooled performance remains high, which is consistent with the fact that a substantial fraction of the daily variability is still structured (e.g., by weekday–weekend patterns and seasonality), while the most localized extremes remain harder to recover. Nevertheless, NO2 has been reported to vary over very short distances governed by traffic density, which accentuates spatial heterogeneity and can complicate the reconstruction of sharp episodes at individual stations [28]. Carbon monoxide (CO), with a comparatively long atmospheric lifetime (weeks to months), reflects more persistent, shared signals associated with transport processes, which favors higher explained variance at the daily scale [63]. Overall, these results align with the imputation literature: fit improves under stronger autocorrelation and inter-station covariation, and deteriorates in the presence of spikes, heavy tails, or nonlinear processes, where linear models tend to underestimate extreme values [10,18].
As a qualitative check complementing the aggregated summary (Table 10; Figure 5), a station-level example is included (Figure 6) to verify that imputed values remain within plausible ranges and preserve the dominant temporal structure. For CT_VELILLA, given that it is a single-station plant, the example is directly representative of the aggregated performance: despite the low rating for NO2 and PM10 (score = 1) (Figure 5), imputations remain within reasonable ranges and follow the general dynamics, suggesting that the degraded fit is driven mainly by isolated discrepancies (e.g., episodes) rather than by out-of-domain imputations. In this sense, the example illustrates that even under degraded fit, the workflow preserves the overall representativeness of the daily series and reduces the risk that isolated discrepancies disproportionately affect subsequent long-term temporal analyses.

3.3. Outlier Screening and Construction of the Robust Series (Phase B)

After imputation and its validation (Phase A; Table 10; Figure 5), Phase B was applied to (i) identify IQR candidates, (ii) characterize Hampel extremes, and (iii) construct the daily series VALOR_robusto with per-observation traceability (Figure 3).
In this first layer, based on a global robust threshold (IQR), 11,051 observations were flagged out of 748,367 observed values (1.48%). This low proportion indicates a targeted intervention concentrated in extreme episodes. The distribution of IQR candidates by plant and pollutant is summarized in Table 11. The CT_VELILLA example (Figure 7) illustrates contrasts across pollutants under the same global criterion, with residual detection in O3 and higher incidence in species with heavier tails.
The incidence of IQR candidates was clearly pollutant-dependent: it was concentrated in NO (3.46%), CO (2.57%), and SO2 (2.32%), whereas PM10 (0.70%), PM2.5 (0.54%), and NO2 (0.37%) showed lower proportions, and detection for O3 was negligible (Table 11).
As a second layer, the local Hampel diagnosis identified Hampel extremes based on the temporal context. Results by plant and pollutant are reported in Table 12, with an example for CT_VELILLA in Figure 8.
Because IQR (global) and Hampel (local) rely on different criteria, they are not necessarily nested and may yield pollutant-specific discrepancies even for the same number of observations. This pattern is evident for CT_VELILLA, where Hampel (Table 12) detects relatively more episodes in PM10 (sensitivity to isolated deviations from the local level), whereas IQR (Table 11) identifies more candidates in NO (more pronounced global tails); detection for O3 remains residual.
Next, we quantify how these candidates translated into final decisions (KEEP/KEEP_EXTREMO/DROP_NAN) and, in particular, how many observations were accommodated through winsorization versus discarded, following the priority decision rule in Table 7 and as summarized by pollutant in Table 13.

3.4. Final Decision and Generation of VALOR_robusto (Phase B)

After identifying extreme candidates using the global IQR criterion (Table 11; Figure 7) and characterizing local anomalies with the Hampel diagnosis (Table 12; Figure 8), a deterministic and traceable decision rule was applied to construct the final analytical series VALOR_robusto (Figure 3). This stage integrates the statistical evidence of extremes, conservative accommodation when applicable (winsorization), and physicochemical coherence checks across species, so that each observation is associated with auditable “decision_final/razon_final” fields, without overwriting the original record.
In aggregate terms, Phase B affected only a small fraction of the observation set: KEEP = 98.83%, KEEP_EXTREMO = 0.84% (winsorized values), and DROP_NAN = 0.32% (discarded records) of the total daily records analyzed. Percentages are computed over the full set of analyzed daily records across pollutants and may not sum to 100% due to rounding. This limited intervention rate indicates that Phase B acts as a targeted robustness layer rather than as a broad transformation of the reconstructed series. In practical terms, this helps to reduce the influence of high-leverage anomalies on downstream temporal summaries while preserving the overall structure of the daily series.
At the pollutant level, final outcomes are summarized in Table 13 (KEEP (%) can be obtained by complement). Three patterns are useful to interpret the effect of Phase B:
  • Winsorization dominates in species with a higher extreme burden: NO (KEEP_EXTREMO = 2.0386%) and SO2 (2.022%), with DROP_NAN ~0.33% in both cases (Table 13).
  • A closer balance between winsorization and discarding is observed for PM: PM10 (KEEP_EXTREMO = 0.4112% vs. DROP_NAN = 0.3096%) and PM2.5 (0.4128% vs. 0.2838%) (Table 13).
  • Discarding exceeds winsorization in species where winsorization is residual: O3 (KEEP_EXTREMO = 0.0263% vs. DROP_NAN = 0.4051%) and, to a lesser extent, NO2 (0.1943% vs. 0.3566%) (Table 13).
Internal coherence barriers were integrated as constraints in the final decision to avoid statistically admissible but physically inconsistent adjustments (Table 8; Section 2.8). This includes coherence within the NO/NO2/NOx family and PM2.5–PM10 coherence. For PM, exception handling is explicitly incorporated when instrumental comparability is not guaranteed, preventing spurious flags driven by method-related offsets rather than true physical inconsistency. Contextual flags (annual plausibility and daily regulatory tagging) are retained as interpretive layers and do not act as automatic exclusion rules (Section 2.8).

3.5. Station-Level Example: Traceability and Contextual Plausibility (34080004_CT_VELILLA)

All Phase B products and outputs were generated systematically for the 28 study stations; however, to maintain coherence and comparability with the rest of the article, only the illustrative example of 34080004_CT_VELILLA is presented here, following the same approach applied in the previous sections.
To assess the local effect of Phase B, station 34080004_CT_VELILLA is examined. The comparison between VALOR and VALOR_robusto (showing only days with changes) indicates that the intervention was sporadic and concentrated in isolated episodes, without altering the overall temporal structure (seasonality and background level) (Figure 9).
For the 2015–2023 period, 19,360 daily records (six pollutants) were evaluated. At this station, Phase B produced no discards (DROP_NAN = 0). All modifications were resolved exclusively through winsorization (KEEP_EXTREMO = 57; 0.29%), while the remaining records were kept as KEEP (19,303; 99.71%). By pollutant, changes were concentrated in PM10 (42/3318; 1.27%), followed by SO2 (6/3259; 0.18%) and NO (5/3287; 0.15%). For NOx, changes were residual (3/2922; 0.10%). For NO2, a single adjustment was observed (1/3287; 0.03%). For O3, no changes were recorded (0/3287; 0.00%). This summary is reported in Table 14.
In PM10, the changes correspond to high-magnitude peaks. The original daily maximum reached 404 µg·m−3, whereas the robust value was capped at 74.71 µg·m−3 in the most extreme episode. These adjustments are consistent with a conservative intervention that limits the influence of isolated peaks without modifying the background level (Figure 9).
As a contextual layer, flag_PM10_gt_75 was triggered on five days during the study period. In all cases, Phase B applied KEEP_EXTREMO. The corresponding dates and (VALOR, VALOR_robusto) pairs are reported in Table 15.
Regarding internal coherence, no violations of the NOx barrier were detected on days with simultaneous measurements (2922 days with NOx, NO, and NO2 available). PM2.5–PM10 coherence was not evaluable at this station due to the absence of PM2.5 during the analyzed period.
Overall, 34080004_CT_VELILLA confirms the operational goal of Phase B: to avoid data loss, concentrate intervention on a small number of days, and reduce the influence of extreme episodes on subsequent metrics, while maintaining traceability and a strict separation between VALOR and VALOR_robusto. In this sense, the CT_VELILLA example illustrates that the robustification stage is intended not to suppress genuine temporal variability, but to preserve series representativeness by reducing the disproportionate influence of isolated anomalies on subsequent long-term temporal interpretation.
While the CT_VELILLA example provides station-level qualitative evidence of traceability and conservative intervention, broader applicability is constrained by data resolution and modeling choices; these aspects are discussed in Section Limitations and Transferability.

4. Discussion

This study proposes a conservative, audit-oriented preprocessing layer for regulatory daily air-quality time series, designed to support PRE/POST observational analyses around coal power plant closures. Rather than optimizing for marginal gains in single-metric predictive accuracy, the workflow prioritizes data governance: preservation of the original record (VALOR) as immutable, explicit and reproducible rules for each intervention, and the storage of flags and justifications that allow every modification to be traced and audited. This design choice is particularly relevant for long-term environmental monitoring networks, where comparability across stations, periods, and pollutants is as critical as pointwise reconstruction accuracy.
At the pollutant level (Table 10), the 5% hold-out validation shows generally stable performance under daily aggregation, with differences across species consistent with their spatio-temporal structure and primary/secondary character. Figure 5 further disaggregates this summary by plant–pollutant combinations, where degradations remain bounded and readily localizable, providing a practical basis to identify candidate series for targeted sensitivity analyses.
Across several pollutants, MICE shows a modest negative Bias (Table 10 and Table S1), consistent with a conservative smoothing of episodic peaks under pooled reconstruction; this should be considered when interpreting peak-driven PRE/POST indicators.
Method selection should therefore be interpreted in relation to the overall workflow architecture, and not only to marginal differences in single-metric performance under the adopted hold-out benchmark. As an additional sanity check, Supplementary Table S1 and Figure S1 benchmark the proposed imputation step against a univariate linear-interpolation baseline on the same 5% random hold-out mask. As expected under MCAR-like short-gap masking, linear interpolation can match or outperform multivariate models for several pollutants, particularly under short and relatively simple missing-data conditions, as also reported in previous comparative studies [10]. MICE is retained primarily for its role within the broader audit-ready workflow (QA/QC, physicochemical consistency checks, and traceable robustification) and because the adopted benchmark does not fully represent the longer and more complex missingness patterns that may occur in regulatory monitoring networks; method choice must also satisfy the traceability and reproducibility requirements of environmental data workflows [13].
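The mechanics of such a hold-out comparison can be sketched as follows (an illustrative approximation of the supplementary baseline only; the paper's actual masks, pooling, and MICE configuration are described in Section 2.6, and the names here are ours):

```python
import numpy as np
import pandas as pd

def holdout_linear_baseline(x: pd.Series, frac: float = 0.05, seed: int = 0) -> dict:
    """Mask a random fraction of observed days, refill them by univariate
    linear interpolation, and score MAE / RMSE / R2 / Bias on the mask."""
    rng = np.random.default_rng(seed)
    observed = x.dropna().index
    masked = rng.choice(observed, size=max(1, int(frac * len(observed))),
                        replace=False)
    holed = x.copy()
    holed.loc[masked] = np.nan                        # simulate the hold-out gaps
    filled = holed.interpolate(method="linear", limit_direction="both")
    y, yhat = x.loc[masked], filled.loc[masked]
    err = yhat - y
    return {"MAE": float(err.abs().mean()),
            "RMSE": float(np.sqrt((err ** 2).mean())),
            "R2": float(1 - (err ** 2).sum() / ((y - y.mean()) ** 2).sum()),
            "Bias": float(err.mean())}
```

On a smooth, strongly autocorrelated series this baseline scores highly, which illustrates why random short-gap masking tends to flatter interpolation relative to multivariate reconstruction.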

Limitations and Transferability

Several limitations should be noted. First, the analysis is conducted at daily resolution, which is appropriate for long-term PRE/POST designs but does not allow a direct evaluation of regulatory indicators that depend on hourly data or moving averages (e.g., 8 h metrics). However, daily aggregation provides a temporally homogeneous basis for long-term comparisons, which is consistent with the intended downstream use of the processed series in extended observational analyses. Second, the imputation models do not include meteorological covariates; while this choice improves portability and reduces external data dependencies, it may limit reconstruction during meteorology-driven episodes (e.g., secondary formation or stagnation events). Although meteorological covariates were not included, the plant-level pooling strategy is intended to partially mitigate this limitation by capturing shared regional covariation across stations within each buffer. Third, the 5% random hold-out validation approximates a Missing Completely At Random (MCAR) masking mechanism, which does not explicitly model the Missing Not At Random (MNAR) missingness that may occur in real networks (e.g., during extreme events or instrument downtime). Consequently, the benchmark should be interpreted primarily as a controlled internal validation for short random gaps, rather than as a full characterization of performance under all missingness mechanisms.
Despite these limitations, the workflow is highly transferable because interventions are encoded through explicit rules, flags, and stored justifications, enabling reproducible application to other monitoring networks. Key parameters (e.g., Hampel threshold λ, IQR factor k, plausibility/coherence rules, and decision logic) can be tuned to local network characteristics without changing the overall structure of the pipeline. This supports method reuse across regions while preserving auditability and comparability.

5. Conclusions

We present a reproducible, audit-ready workflow to preprocess regulatory daily air-quality time series for subsequent long-term temporal analyses, including observational PRE/POST applications around coal power plant closures. The core contribution lies not in the individual preprocessing algorithms, but in a traceability-first processing architecture that preserves the original record as immutable (VALOR) while generating fully documented derived products—including a robust daily series (VALOR_robusto)—so that each edit is associated with an explicit decision label and stored justification.
Methodologically, the framework integrates multivariate imputation (MICE with Bayesian Ridge, implemented per pollutant with plant-level multi-station pooling) with physicochemical consistency checks (e.g., NO–NO2–NOx and PM2.5–PM10, when applicable) and a traceable robustification stage (Phase B). Empirically, robustification operates as a targeted intervention rather than a systematic transformation of the signal: across the full analyzed daily dataset (all plants and pollutants), 98.83% of daily records remain unchanged, 0.84% are conservatively accommodated via winsorization (KEEP_EXTREMO), and 0.32% are discarded (DROP_NAN) due to physical implausibility and/or coherence violations. A station-level illustration (CT_VELILLA) further supports this behavior, showing that modifications concentrate on isolated episodes while preserving background levels and seasonal structure. Overall, the proposed approach provides a transferable preprocessing layer for regulatory networks affected by missingness, structural inconsistencies, and extreme values, reducing the risk that data gaps, inconsistencies, or high-leverage anomalies condition the interpretation of the processed series in later temporal applications.
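For concreteness, the chained-equation step can be sketched with a numpy-only stand-in, using a closed-form ridge regression in place of the Bayesian Ridge estimator; `mice_like` and `ridge_fit_predict` are illustrative names, and the actual pipeline additionally pools stations per plant before imputing:

```python
import numpy as np

def ridge_fit_predict(X, y, Xq, alpha=1.0):
    """Closed-form ridge regression (a simple stand-in for Bayesian Ridge)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    Xqb = np.column_stack([np.ones(len(Xq)), Xq])
    w = np.linalg.solve(Xb.T @ Xb + alpha * np.eye(Xb.shape[1]), Xb.T @ y)
    return Xqb @ w

def mice_like(X, n_iter=5, alpha=1.0):
    """Chained-equation sketch: cycle through columns, regressing each
    incomplete column on all the others; gaps are mean-initialized."""
    X = X.astype(float).copy()
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):              # mean-initialize the gaps
        X[miss[:, j], j] = col_means[j]
    for _ in range(n_iter):                  # chained-equation sweeps
        for j in range(X.shape[1]):
            m = miss[:, j]
            if not m.any():
                continue
            others = np.delete(X, j, axis=1)
            X[m, j] = ridge_fit_predict(others[~m], X[~m, j], others[m], alpha)
    return X
```

With strongly covarying columns (the multi-station pooling case), even this minimal sketch reconstructs a masked value close to its true level, which is the behavior the hold-out benchmark quantifies.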
Finally, this audit-ready preprocessing layer provides the traceability, coherence, and robustness required for subsequent long-term air-quality analyses in the context of energy transition and decarbonization. Future work will extend the framework toward (i) higher temporal resolution (hourly and 8 h metrics), (ii) the inclusion of meteorological covariates and alternative missingness scenarios beyond MCAR masking (e.g., prolonged outages), and (iii) sensitivity analyses of key thresholds and baseline comparisons to further quantify robustness across diverse regulatory networks.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app16073396/s1, Table S1: Minimal benchmark of the imputation step (Phase A) on the 5% hold-out subset (pooled across all power plants): MICE (Bayesian Ridge) versus a univariate linear-interpolation baseline. Figure S1: RMSE and R2 by pollutant for the same benchmark.

Author Contributions

Conceptualization, N.F.P., L.Á.d.P. and A.B.S.; methodology, N.F.P., L.Á.d.P., L.A.M.G. and A.B.S.; software, N.F.P., L.Á.d.P., S.B. and A.B.S.; validation, N.F.P., L.Á.d.P. and A.B.S.; formal analysis, L.A.M.G., D.F.L. and S.B.; investigation, N.F.P., L.Á.d.P., S.B. and A.B.S.; resources, N.F.P., L.Á.d.P. and A.B.S.; data curation, N.F.P., L.Á.d.P., D.F.L. and A.B.S.; writing—original draft preparation, N.F.P.; writing—review and editing, L.Á.d.P. and A.B.S.; visualization, L.A.M.G., D.F.L. and S.B.; supervision, L.Á.d.P., L.A.M.G. and A.B.S.; project administration, L.Á.d.P., D.F.L. and A.B.S.; funding acquisition, A.B.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study were obtained from the Spanish Ministry for the Ecological Transition and the Demographic Challenge (MITECO) upon request. Derived data supporting the findings of this study are available from the corresponding authors upon reasonable request.

Conflicts of Interest

Author David Fernández López is employed by the company INREMIN S.L. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Gobierno de España. Plan Nacional Integrado de Energía y Clima (PNIEC) 2021–2030. Available online: https://www.miteco.gob.es/content/dam/miteco/images/es/pnieccompleto_tcm30-508410.pdf (accessed on 27 January 2026).
  2. Gobierno de España. Plan Nacional Integrado de Energía y Clima (PNIEC): Actualización 2023–2030. Available online: https://www.miteco.gob.es/content/dam/miteco/es/energia/files-1/pniec-2023-2030/PNIEC_2024_240924.pdf (accessed on 27 January 2026).
  3. Gómez-Carracedo, M.P.; Andrade, J.M.; López-Mahía, P.; Muniategui, S.; Prada, D. A Practical Comparison of Single and Multiple Imputation Methods to Handle Complex Missing Data in Air Quality Datasets. Chemom. Intell. Lab. Syst. 2014, 134, 23–33. [Google Scholar] [CrossRef]
  4. Junger, W.L.; Ponce de Leon, A. Imputation of Missing Data in Time Series for Air Pollutants. Atmos. Environ. 2015, 102, 96–104. [Google Scholar] [CrossRef]
  5. Rodríguez, S.; López-Darias, J. Extreme Saharan Dust Events Expand Northward over the Atlantic and Europe, Prompting Record-Breaking PM10 and PM2.5 Episodes. Atmos. Chem. Phys. 2024, 24, 12031–12053. [Google Scholar] [CrossRef]
  6. Wu, H.; Tang, X.; Wang, Z.; Wu, L.; Lu, M.; Wei, L.; Zhu, J. Probabilistic Automatic Outlier Detection for Surface Air Quality Measurements from the China National Environmental Monitoring Network. Adv. Atmos. Sci. 2018, 35, 1522–1532. [Google Scholar] [CrossRef]
  7. Hadeed, S.J.; O’Rourke, M.K.; Burgess, J.L.; Harris, R.B.; Canales, R.A. Imputation Methods for Addressing Missing Data in Short-Term Monitoring of Air Pollutants. Sci. Total Environ. 2020, 730, 139140. [Google Scholar] [CrossRef] [PubMed]
  8. Chen, M.; Zhu, H.; Chen, Y.; Wang, Y. A Novel Missing Data Imputation Approach for Time Series Air Quality Data Based on Logistic Regression. Atmosphere 2022, 13, 1044. [Google Scholar] [CrossRef]
  9. Hua, V.; Nguyen, T.; Dao, M.-S.; Nguyen, H.D.; Nguyen, B.T. The Impact of Data Imputation on Air Quality Prediction Problem. PLoS ONE 2024, 19, e0306303. [Google Scholar] [CrossRef]
  10. Junninen, H.; Niska, H.; Tuppurainen, K.; Ruuskanen, J.; Kolehmainen, M. Methods for Imputation of Missing Values in Air Quality Data Sets. Atmos. Environ. 2004, 38, 2895–2907. [Google Scholar] [CrossRef]
  11. Menéndez García, L.A.; Menéndez Fernández, M.; Sokoła-Szewioła, V.; Álvarez de Prado, L.; Ortiz Marqués, A.; Fernández López, D.; Bernardo Sánchez, A. A Method of Pruning and Random Replacing of Known Values for Comparing Missing Data Imputation Models for Incomplete Air Quality Time Series. Appl. Sci. 2022, 12, 6465. [Google Scholar] [CrossRef]
  12. Zimek, A.; Filzmoser, P. There and Back Again: Outlier Detection between Statistical Reasoning and Data Mining Algorithms. WIREs Data Min. Knowl. Discov. 2018, 8, e1280. [Google Scholar] [CrossRef]
  13. Schmidt, L.; Schäfer, D.; Geller, J.; Lünenschloss, P.; Palm, B.; Rinke, K.; Rebmann, C.; Rode, M.; Bumberger, J. System for Automated Quality Control (SaQC) to Enable Traceable and Reproducible Data Streams in Environmental Science. Environ. Model. Softw. 2023, 169, 105809. [Google Scholar] [CrossRef]
  14. European Environment Agency. Air Quality E-Reporting Submission Procedures for Reporting to Eionet CDR. Available online: https://www.eionet.europa.eu/aqportal/doc/AQ_IPR_submission_procedure_2018.pdf (accessed on 27 January 2026).
  15. European Commission. 2011/850/EU: Commission Implementing Decision of 12 December 2011 laying down rules for Directives 2004/107/EC and 2008/50/EC of the European Parliament and of the Council as regards the reciprocal exchange of information and reporting on ambient air quality (notified under document C(2011) 9068). Off. J. Eur. Union 2011, L 335, 86–106. [Google Scholar]
  16. European Commission. Directive (EU) 2015/1480 of 28 August 2015 amending several annexes to Directives 2004/107/EC and 2008/50/EC of the European Parliament and of the Council laying down the rules concerning reference methods, data validation and location of sampling points for the assessment of ambient air quality. Off. J. Eur. Union 2015, L 226, 4–11. [Google Scholar]
  17. Liu, X.; Wang, X.; Zou, L.; Xia, J.; Pang, W. Spatial Imputation for Air Pollutants Data Sets via Low Rank Matrix Completion Algorithm. Environ. Int. 2020, 139, 105713. [Google Scholar] [CrossRef]
  18. Betancourt, C.; Li, C.W.Y.; Kleinert, F.; Schultz, M.G. Graph Machine Learning for Improved Imputation of Missing Tropospheric Ozone Data. Environ. Sci. Technol. 2023, 57, 18246–18258. [Google Scholar] [CrossRef]
  19. Willmott, C.J.; Matsuura, K. Advantages of the Mean Absolute Error (MAE) over the Root Mean Square Error (RMSE) in Assessing Average Model Performance. Clim. Res. 2005, 30, 79–82. [Google Scholar] [CrossRef]
  20. Reche, C.; Querol, X.; Alastuey, A.; Viana, M.; Pey, J.; Moreno, T.; Rodríguez, S.; González, Y.; Fernández-Camacho, R.; de la Rosa, J.; et al. New Considerations for PM, Black Carbon and Particle Number Concentration for Air Quality Monitoring across Different European Cities. Atmos. Chem. Phys. 2011, 11, 6207–6227. [Google Scholar] [CrossRef]
  21. World Health Organization. WHO Global Air Quality Guidelines: Particulate Matter (PM2.5 and PM10), Ozone, Nitrogen Dioxide, Sulfur Dioxide and Carbon Monoxide: Executive Summary; World Health Organization: Geneva, Switzerland, 2021; ISBN 978-92-4-003443-3. [Google Scholar]
  22. Little, R.J.A.; Rubin, D.B. Statistical Analysis with Missing Data, 3rd ed.; Wiley: Hoboken, NJ, USA, 2019; ISBN 978-1-118-59569-5. [Google Scholar] [CrossRef]
  23. van Buuren, S. Flexible Imputation of Missing Data, 2nd ed.; Chapman and Hall/CRC: New York, NY, USA, 2018. [Google Scholar] [CrossRef]
  24. Azur, M.J.; Stuart, E.A.; Frangakis, C.; Leaf, P.J. Multiple Imputation by Chained Equations: What Is It and How Does It Work? Int. J. Methods Psychiatr. Res. 2011, 20, 40–49. [Google Scholar] [CrossRef]
  25. Raghunathan, T.E.; Lepkowski, J.M.; van Hoewyk, J.; Solenberger, P. A Multivariate Technique for Multiply Imputing Missing Values Using a Sequence of Regression Models. Surv. Methodol. 2001, 27, 85–95. [Google Scholar]
  26. White, I.R.; Royston, P.; Wood, A.M. Multiple Imputation Using Chained Equations: Issues and Guidance for Practice. Stat. Med. 2011, 30, 377–399. [Google Scholar] [CrossRef]
  27. Dai, X.; Jin, L.; Shi, A.; Shi, L. Outlier Detection and Accommodation in General Spatial Models. Stat. Methods Appl. 2016, 25, 453–475. [Google Scholar] [CrossRef]
  28. van Zoest, V.M.; Stein, A.; Hoek, G. Outlier Detection in Urban Air Quality Sensor Networks. Water Air Soil Pollut. 2018, 229, 111. [Google Scholar] [CrossRef]
  29. European Environment Agency. Air Quality Data Validation: Guidance for Monitoring Networks; EEA: Copenhagen, Denmark, 2020. [Google Scholar]
  30. O’Leary, B.; Reiners, J.J.; Xu, X.; Lemke, L.D. Identification and Influence of Spatio-Temporal Outliers in Urban Air Quality Measurements. Sci. Total Environ. 2016, 573, 55–65. [Google Scholar] [CrossRef] [PubMed]
  31. The European Parliament and the Council of the European Union. Directive 2008/50/EC of the European Parliament and of the Council of 21 May 2008 on ambient air quality and cleaner air for Europe. Off. J. Eur. Union 2008, L 152, 1–44. [Google Scholar]
  32. Gobierno de España. Plan Nacional de Calidad del Aire 2017–2019 (Plan Aire II); Ministerio de Agricultura, Pesca, Alimentación y Medioambiente: Madrid, España, 2017. [Google Scholar]
  33. Gobierno de España. Real Decreto 102/2011, de 28 de enero, relativo a la mejora de la calidad del aire. Boletín Of. Estado 2011, 25, 9574–9626. [Google Scholar]
  34. Ministerio para la Transición Ecológica y el Reto Demográfico. Inventario de Instalaciones—Inventario Completo | PRTR España. Available online: https://prtr-es.miteco.gob.es/Informes/InventarioInstalacionesIPPC.aspx (accessed on 26 January 2026).
  35. Golder Associates. Dispersion Modelling Guidance: Determining the Need for Industrial PM10 Offsets Under the National Environmental Standards for Air Quality. Available online: https://www.envirolink.govt.nz/assets/Envirolink/1285-HBRC184-Practical-Guidance-on-Dispersion-Modelling-Determining-the-need-for-PM10-offsets-under-the-NES.pdf (accessed on 1 February 2026).
  36. Environmental Protection Agency. Air Dispersion Modelling from Industrial Installations Guidance Note (AG4). Available online: https://www.epa.ie/publications/compliance--enforcement/air/air-guidance-notes/EPA-Air-Dispersion-Modelling-Guidance-Note-(AG4)-2020.pdf (accessed on 27 January 2026).
  37. Datos Horarios de Calidad del Aire—Datos Abiertos MITECO. Available online: https://catalogo.datosabiertos.miteco.gob.es/catalogo/dataset/19458583-9953-4fe7-a494-e2cc26e89e58 (accessed on 26 January 2026).
  38. Gobierno de España. Real Decreto 34/2023, de 24 de Enero, por el que se modifican el Real Decreto 102/2011, de 28 de Enero, relativo a la mejora de la calidad del aire; el Reglamento de emisiones industriales y de desarrollo de la Ley 16/2002, de 1 de Julio, de prevención y control integrados de la contaminación, aprobado mediante el Real Decreto 815/2013, de 18 de Octubre; y el Real Decreto 208/2022, de 22 de Marzo, sobre las garantías financieras en materia de residuos. Boletín Of. Estado 2023, 21, 10326–10348. [Google Scholar]
  39. Quinteros, M.E.; Lu, S.; Blazquez, C.; Cárdenas-R, J.P.; Ossa, X.; Delgado-Saborit, J.-M.; Harrison, R.M.; Ruiz-Rudolph, P. Use of Data Imputation Tools to Reconstruct Incomplete Air Quality Datasets: A Case-Study in Temuco, Chile. Atmos. Environ. 2019, 200, 40–49. [Google Scholar] [CrossRef]
  40. Lu, P.; Deng, S.; Li, G.; Tuheti, A.; Liu, J. Regional Transport of PM2.5 from Coal-Fired Power Plants in the Fenwei Plain, China. Int. J. Environ. Res. Public Health 2023, 20, 2170. [Google Scholar] [CrossRef]
  41. Alsaber, A.R.; Pan, J.; Al-Hurban, A. Handling Complex Missing Data Using Random Forest Approach for an Air Quality Monitoring Dataset: A Case Study of Kuwait Environmental Data (2012 to 2018). Int. J. Environ. Res. Public Health 2021, 18, 1333. [Google Scholar] [CrossRef]
  42. Rodríguez-Barranco, M.; Tobías, A.; Redondo, D.; Molina-Portillo, E.; Sánchez, M.J. Standardizing Effect Size from Linear Regression Models with Log-Transformed Variables for Meta-Analysis. BMC Med. Res. Methodol. 2017, 17, 44. [Google Scholar] [CrossRef]
  43. Sterne, J.A.C.; White, I.R.; Carlin, J.B.; Spratt, M.; Royston, P.; Kenward, M.G.; Wood, A.M.; Carpenter, J.R. Multiple Imputation for Missing Data in Epidemiological and Clinical Research: Potential and Pitfalls. BMJ 2009, 338, b2393. [Google Scholar] [CrossRef] [PubMed]
  44. Duan, N. Smearing Estimate: A Nonparametric Retransformation Method. J. Am. Stat. Assoc. 1983, 78, 605–610. [Google Scholar] [CrossRef]
  45. Pearson, R.K. Outliers in Process Modeling and Identification. IEEE Trans. Control Syst. Technol. 2002, 10, 55–63. [Google Scholar] [CrossRef]
  46. Osborne, J.W.; Overbay, A. The power of outliers (and why researchers should ALWAYS check for them). Pract. Assess. Res. Eval. 2004, 9, 6. [Google Scholar] [CrossRef]
  47. Agathokleous, E.; Xu, T.; Yu, L. Outlier Management in Data Analysis: A Checklist for Authors and Reviewers. J. For. Res. 2025, 37, 28. [Google Scholar] [CrossRef]
  48. Čampulová, M.; Čampula, R.; Holešovský, J. An R Package for Identification of Outliers in Environmental Time Series Data. Environ. Model. Softw. 2022, 155, 105435. [Google Scholar] [CrossRef]
  49. Tukey, J.W. Exploratory Data Analysis; Addison-Wesley: Reading, MA, USA, 1977. [Google Scholar]
  50. Sancho Val, J.; Hernando, C.C.; de Baños, L.M. Functional Data Analysis of Air Quality Time Series in Madrid Using FPCA and Splines. Atmos. Environ. 2026, 367, 121741. [Google Scholar] [CrossRef]
  51. Zuur, A.F.; Ieno, E.N.; Elphick, C.S. A Protocol for Data Exploration to Avoid Common Statistical Problems. Methods Ecol. Evol. 2010, 1, 3–14. [Google Scholar] [CrossRef]
  52. Hampel, F.R.; Ronchetti, E.M.; Rousseeuw, P.J.; Stahel, W.A. Robust Statistics: The Approach Based on Influence Functions; Wiley: Hoboken, NJ, USA, 1986. [Google Scholar] [CrossRef]
  53. Roos-Hoefgeest Toribio, M.; Garnung Menéndez, A.; Roos-Hoefgeest Toribio, S.; Álvarez García, I. A Novel Approach to Speed Up Hampel Filter for Outlier Detection. Sensors 2025, 25, 3319. [Google Scholar] [CrossRef]
  54. Arat, M.M. Detection of Anomalous Nitrogen Dioxide (NO2) Concentration of A District in Ankara: A Reconstruction-Based Approach. J. Polytech. 2025, 28, 101. [Google Scholar] [CrossRef]
  55. Lagler, F.; Belis, C.; Borowiak, A. A Quality Assurance and Control Program for PM2.5 and PM10 Measurements in European Air Quality Monitoring Networks; Publications Office of the European Union: Luxembourg, 2011; JRC65176, EUR 24851 EN. [Google Scholar] [CrossRef]
  56. Alastuey, A.; Minguillón, M.C.; Pérez, N.; Querol, X.; Viana, M.; Leeuw, F. PM10 Measurement Methods and Correction Factors: 2009 Status Report; European Topic Centre on Air Pollution and Climate Change Mitigation (ETC/ACM): Bilthoven, The Netherlands, 2011. [Google Scholar]
  57. Aggarwal, S.G.; Kumar, S.; Mandal, P.; Sarangi, B.; Singh, K.; Pokhariyal, J.; Mishra, S.K.; Agarwal, S.; Sinha, D.; Singh, S.; et al. Traceability Issue in PM2.5 and PM10 Measurements. MAPAN 2013, 28, 153–166. [Google Scholar] [CrossRef]
  58. Kuhlbusch, T.A.J.; Quincey, P.; Fuller, G.W.; Kelly, F.; Mudway, I.; Viana, M.; Querol, X.; Alastuey, A.; Katsouyanni, K.; Weijers, E.; et al. New Directions: The Future of European Urban Air Quality Monitoring. Atmos. Environ. 2014, 87, 258–260. [Google Scholar] [CrossRef]
  59. Benschop, N.D.; Zewotir, T.; Naidoo, R.N.; North, D. A New Data-Standardization Procedure for Comprehensive Outlier Detection in Correlated Meteorological Sensor Data. Adv. Stat. Climatol. Meteorol. Oceanogr. 2025, 11, 133–158. [Google Scholar] [CrossRef]
  60. European Environment Agency. Air Quality in Europe: 2020 Report. Available online: https://data.europa.eu/doi/10.2800/786656 (accessed on 6 January 2026).
  61. European Environment Agency. Air Quality in Europe—2019 Report; EEA Report No. 10/2019; Publications Office of the European Union: Luxembourg, 2019; ISBN 978-92-9480-088-6. [Google Scholar] [CrossRef]
  62. Monks, P.S.; Archibald, A.T.; Colette, A.; Cooper, O.; Coyle, M.; Derwent, R.; Fowler, D.; Granier, C.; Law, K.S.; Mills, G.E.; et al. Tropospheric Ozone and Its Precursors from the Urban to the Global Scale from Air Quality to Short-Lived Climate Forcer. Atmos. Chem. Phys. 2015, 15, 8889–8973. [Google Scholar] [CrossRef]
  63. Chen, Y.; Ma, Q.; Lin, W.; Xu, X.; Yao, J.; Gao, W. Measurement Report: Long-Term Variations in Carbon Monoxide at a Background Station in China’s Yangtze River Delta Region. Atmos. Chem. Phys. 2020, 20, 15969–15982. [Google Scholar] [CrossRef]
Figure 1. Spatial selection of the study network in NW Spain: 10 coal-fired power plants (CT) with 10 km buffers and 28 regulatory monitoring stations (daily data, 2006–2023). Stations are labeled by ID_MAPA and cross-referenced in Table 2; overlapping buffers imply N_CT_10 km > 1 for some stations. Map produced in QGIS Desktop 3.40.12. Basemap: Bing Maps.
Figure 3. Phase B of the data-processing workflow: robust outlier screening, consistency checks, contextual regulatory tagging, and final decision logic for the construction of VALOR_robusto. The workflow integrates global IQR-based screening, local Hampel diagnosis, conservative winsorization, NO–NO2–NOx and PM2.5–PM10 consistency checks, external plausibility indicators, and contextual regulatory flags, while preserving full traceability of observed values, flags, exceptions, and final decisions.
Figure 4. Pre-imputation data completeness by power plant. Bars show weighted missingness (%Missing_w; Equation (30)).
Figure 5. Overall imputation quality score by power plant and pollutant (5% hold-out validation). Cells show the four-level score (1 = Low, 4 = Excellent) assigned using the primary R2 criterion in Table 3.
Figure 6. Station-level visual plausibility checks of MICE imputations (example). Colored lines represent the observed daily values for the pollutant shown in each panel, whereas red crosses denote the corresponding MICE-imputed values. The figure shows the CT_VELILLA single-station example, including low-scoring pollutants (NO2 and PM10), over the available station record (2015–2023). Plant-level validation is reported in Table 10 and summarized in Figure 5.
Figure 7. IQR screening example for CT_VELILLA shown over the available station record (daily, 2015–2023). Panels show O3, PM10 and NO; red crosses indicate IQR outliers, and horizontal lines show Q1/Q3 and the IQR bounds.
Figure 8. Hampel screening example for CT_VELILLA (daily, 2015–2023). Red crosses indicate Hampel extremes (|z| > 6); the black line is the 31-day rolling median and dashed lines show ±6·1.4826·MAD thresholds.
Figure 9. CT_VELILLA (34080004)—Original vs. robust daily series (changes only) for PM10, NO, and O3 (2015–2023). Colored lines show the original daily series; black dots mark original values on days modified in Phase B, and red crosses show the corresponding robust values after winsorization.
Table 1. Coal-related facilities considered in the study and number of monitoring stations within 10 km.
CT_ID | Region (CCAA) | N Stations ≤ 10 km | Included
CT_AS_PONTES | Galicia | 2 | Yes
CT_SABON | Galicia | 7 | Yes
CT_MEIRAMA | Galicia | 2 | Yes
CT_COMPOSTILLA | Castilla y León | 1 | Yes
CT_LA_ROBLA | Castilla y León | 2 | Yes
CT_VELILLA | Castilla y León | 1 | Yes
CT_SOTO_RIBERA | Asturias | 4 | Yes
CT_LA_PEREDA | Asturias | 1 | Yes
CT_LADA | Asturias | 4 | Yes
CT_ABONO | Asturias | 4 | Yes
CT_ANLLARES | Castilla y León | 0 * | No
CT_NARCEA | Asturias | 0 * | No
* Facilities with N stations ≤ 10 km = 0 were excluded from the analytical dataset.
Table 2. Regulatory monitoring stations selected within 10 km of the included coal-related facilities (daily data, 2006–2023).
ID_MAPA | COD_LOCAL | Station Type | Area Type | CT_ID | DIST_CT_km | N_CT_10 km *
1 | 15005011 | Industrial | Rural | CT_SABON | 2.31 | 1
2 | 15005012 | Industrial | Suburban | CT_SABON | 3.34 | 1
3 | 15041001 | Industrial | Rural | CT_SABON | 9.20 | 1
4 | 15030021 | Industrial | Urban | CT_SABON | 6.59 | 1
5 | 15030027 | Background | Suburban | CT_SABON | 9.30 | 1
6 | 15030028 | Industrial | Suburban | CT_SABON | 7.20 | 1
7 | 15030001 | Traffic | Urban | CT_SABON | 7.57 | 1
8 | 15059004 | Industrial | Rural | CT_MEIRAMA | 7.57 | 1
9 | 15024001 | Industrial | Suburban | CT_MEIRAMA | 5.34 | 1
10 | 15070010 | Industrial | Rural | CT_AS_PONTES | 4.54 | 1
11 | 15070002 | Industrial | Suburban | CT_AS_PONTES | 1.90 | 1
12 | 24115015 | Industrial | Suburban | CT_COMPOSTILLA | 7.75 | 1
13 | 24134007 | Industrial | Rural | CT_LA_ROBLA | 1.58 | 1
14 | 24134006 | Industrial | Suburban | CT_LA_ROBLA | 1.47 | 1
15 | 34080004 | Industrial | Urban | CT_VELILLA | 2.76 | 1
16 | 33044033 | Industrial | Suburban | CT_SOTO_RIBERA | 8.56 | 1
17 | 33044029 | Traffic | Urban | CT_SOTO_RIBERA | 5.13 | 1
18 | 33044030 | Traffic | Urban | CT_SOTO_RIBERA | 6.88 | 1
19 | 33044032 | Background | Urban | CT_SOTO_RIBERA | 6.69 | 1
20 | 33031032 | Background | Urban | CT_LADA | 2.20 | 1
21 | 33031030 | Industrial | Urban | CT_LADA | 0.71 | 1
22 | 33031029 | Industrial | Suburban | CT_LADA | 0.65 | 2 *
23 | 33060003 | Background | Suburban | CT_LADA | 9.05 | 1
24 | 33037012 | Traffic | Urban | CT_LA_PEREDA | 3.29 | 2 *
25 | 33024032 | Background | Suburban | CT_ABONO | 4.31 | 1
26 | 33024031 | Background | Urban | CT_ABONO | 5.86 | 1
27 | 33024027 | Traffic | Urban | CT_ABONO | 6.45 | 1
28 | 33024025 | Traffic | Urban | CT_ABONO | 4.76 | 1
* N_CT_10 km indicates the number of facilities whose 10 km buffer includes the station; N_CT_10 km = 2 reflects buffer overlap.
Table 4. Measurement technique codes used to assess PM2.5–PM10 comparability, grouped into PM_MASS and SCATTERING families for coherence checks.
ID | Technique | Proposed Family | Use
46 | Differential Optical/Optical Scattering | SCATTERING | PM (surrogate, not mass)
47 | Oscillating Microbalance (TEOM) | TEOM → PM_MASS | PM mass
49 | Beta Attenuation Monitor (BAM) | BAM → PM_MASS | PM mass
50 | Gravimetry (filter) | GRAV → PM_MASS | PM mass
54 | Nephelometry | SCATTERING | PM (surrogate, not mass)
M | Manual (gravim.) | GRAV → PM_MASS | PM mass
Table 5. External plausibility reference thresholds (EU reports) used as contextual flags (not as exclusion criteria).
Pollutant | Averaging Period/Statistic | Plausible High Reference | Unit
PM10 | Annual p90.4 of daily mean (36th highest) | >75 | µg·m−3
NO2 | Annual mean | >100 | µg·m−3
PM2.5 | Annual mean | >30 | µg·m−3
O3 | p93.2 of daily maximum 8 h mean | >160 | µg·m−3
SO2 | Alert threshold (3 consecutive hours) | 500 | µg·m−3
CO | Daily maximum 8 h running mean | >15 | mg·m−3
Values from EEA Air Quality in Europe reports [60,61].
Table 6. Regulatory reference values used for daily contextual flags (not formal compliance).
Pollutant | Regulatory Reference (Statistic) | Threshold | Unit | Evaluable | Contextual Flag
PM10 | Daily limit value (24 h mean) | 50 | µg·m−3 | Yes | exceso_normativo_diario
SO2 | Daily limit value (24 h mean) | 125 | µg·m−3 | Yes | exceso_normativo_diario
NO2 | Annual limit value (annual mean) | 40 | µg·m−3 | No | normativa_no_evaluable_diario
NO2 | Hourly limit value (1 h) | 200 | µg·m−3 | No | normativa_no_evaluable_diario
O3 | Target value (daily maximum of 8 h running mean) | 120 | µg·m−3 | No * | normativa_no_evaluable_diario
O3 | Information threshold (1 h) | 180 | µg·m−3 | No | normativa_no_evaluable_diario
O3 | Alert threshold (1 h) | 240 | µg·m−3 | No | normativa_no_evaluable_diario
CO | Limit value (daily maximum of 8 h running mean) | 10 | mg·m−3 | No | normativa_no_evaluable_diario
SO2 | Alert threshold (3 h) | 500 | µg·m−3 | No * | normativa_no_evaluable_diario
NO2 | Alert threshold (3 h) | 400 | µg·m−3 | No | normativa_no_evaluable_diario
* Not evaluable when only 24 h daily means are available (the legal criterion requires hourly data and/or 8 h running means). These fields are used as contextual daily tags and do not constitute a formal compliance assessment.
Table 7. Priority decision rules to derive VALOR_robusto from daily records (KEEP/KEEP_EXTREMO/DROP_NAN).
Priority | Minimum Trigger (Condition) | DECISION | Final Value | Robust Value
1 | Negative or physically impossible value (x_t < 0) | DROP_NAN | NaN | NaN
2 | NOx inconsistency (Equation (28)) | DROP_NAN | NaN | NaN
3 | PM inconsistency (if applicable *) (Equation (29)) | DROP_NAN | NaN | NaN
4 | Hampel extreme with |z_t| > 6 and any inconsistency (NOx or PM) | DROP_NAN | NaN | NaN
5 | Hampel with |z_t| > 6 and no inconsistencies (NOx or PM) | KEEP_EXTREMO | valor_winsor | valor_winsor
6 | IQR-only outlier: flag_IQR = True and |z_t| ≤ 6 and no applicable inconsistencies (NOx or PM) | KEEP | VALOR | VALOR
7 | Missing observation: VALOR is NaN | KEEP (absence preserved) | NaN | NaN
8 | All other cases (non-extreme, no inconsistencies) | KEEP | VALOR | VALOR
* PM applicability: the PM2.5–PM10 coherence check is evaluated only when PM10 and PM2.5 share the same PUNTO_MUESTREO or belong to the same PM_MASS subfamily (TEOM/BAM/GRAV). Otherwise, the PM rule is not applied, and an exception flag is stored (e.g., flag_incoherencia_PM_excepcion/excepcion_PM_por_cambio_punto).
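The priority logic of Table 7 can be expressed compactly as a single function. The sketch below uses illustrative names (`decide`, `lam`) and collapses rules 2–4, which share the DROP_NAN outcome; it is not the pipeline's actual code:

```python
import math

def decide(valor, z_hampel=0.0, flag_iqr=False,
           incoherencia_nox=False, incoherencia_pm=False,
           valor_winsor=float("nan"), lam=6.0):
    """Priority rules of Table 7: returns (decision, VALOR_robusto)."""
    if valor is None or math.isnan(valor):
        return "KEEP", float("nan")          # rule 7: absence preserved
    if valor < 0:
        return "DROP_NAN", float("nan")      # rule 1: physically impossible
    if incoherencia_nox or incoherencia_pm:
        return "DROP_NAN", float("nan")      # rules 2-4: inconsistency dominates
    if abs(z_hampel) > lam:
        return "KEEP_EXTREMO", valor_winsor  # rule 5: winsorize the extreme
    # rules 6 and 8: an IQR-only flag (flag_iqr) is informative but never
    # changes the value, so both cases keep the observation as-is.
    return "KEEP", valor
```

Because the rules are strictly ordered, the first matching condition determines both the stored decision label and the robust value.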
Table 8. Auditable field blocks recorded per daily observation: diagnostics, coherence checks, contextual tags, and final decision.
Block | Fields (Examples) | Operational Purpose
Outlier diagnostics | flag_IQR, z_Hampel, is_extremo_Hampel | Identify outlier candidates/extremes using robust criteria (global and local).
Internal coherence checks | incoherencia_NOx, incoherencia_PM, flag_incoherencia_PM_excepcion | Detect physico-chemical/hierarchical inconsistencies and document non-applicability due to instrumental comparability constraints.
Contextual tagging | exceso_normativo_diario, normativa_no_evaluable_diario, FLAG_PLAUS_ * | Tag regulatory context and external plausibility; not used as an automatic exclusion rule.
Final decision and output | decision_final, razon_final, valor_winsor, VALOR_robusto | Record the audited decision and the resulting value used in the analysis.
* FLAG_PLAUS_ denotes pollutant-specific annual plausibility flags.
Table 9. Global pre-imputation missingness by pollutant (daily data, 2006–2023). Percentiles (P25, P75) are computed across station–parameter series.
Pollutant | N_Series | Total_Days | Missing_Days | %_Median_Missing | %_P25_Missing | %_P75_Missing | %_Max_Missing | %_Weighted_Missing
PM2.5 | 17 | 51,137 | 9423 | 5.45 | 4.23 | 32.24 | 70.55 | 18.43
PM10 | 30 | 96,091 | 6105 | 4.72 | 3.68 | 8.46 | 53.77 | 6.35
CO | 15 | 57,489 | 3275 | 4.27 | 3.32 | 5.71 | 17.79 | 5.70
O3 | 21 | 84,590 | 4342 | 3.93 | 3.11 | 4.54 | 17.44 | 5.13
NO2 | 28 | 103,843 | 4990 | 3.51 | 3.16 | 4.20 | 17.90 | 4.81
SO2 | 27 | 101,318 | 4772 | 3.52 | 3.17 | 4.29 | 17.53 | 4.71
NO | 28 | 100,803 | 4601 | 3.60 | 3.28 | 4.35 | 13.69 | 4.56
NOx | 30 | 115,563 | 4870 | 3.52 | 2.95 | 4.19 | 14.20 | 4.21
Table 10. Hold-out validation performance by pollutant… MAE, RMSE, R2, and Bias are computed on masked observations; fit-quality classes follow the primary R2-based criterion in Table 3.
Pollutant | N Validation Pairs | MAE | RMSE | R2 | Bias | Overall Fit Quality
NOx | 5357 | 7.16 | 13.8 | 0.691 | −1.79 | Excellent (4)
SO2 | 5445 | 1.82 | 4.09 | 0.533 | −0.48 | Acceptable (2)
NO | 5410 | 2.62 | 6.38 | 0.614 | −0.59 | Good (3)
NO2 | 5074 | 3.68 | 5.56 | 0.691 | −0.69 | Excellent (4)
CO | 2806 | 0.05 | 0.09 | 0.756 | 0.00 | Excellent (4)
O3 | 4417 | 8.00 | 10.58 | 0.656 | −0.67 | Excellent (4)
PM10 | 4904 | 4.25 | 6.46 | 0.601 | −0.76 | Good (3)
PM2.5 | 1760 | 2.36 | 3.48 | 0.605 | −0.37 | Good (3)
Table 11. IQR-flagged outlier rates by pollutant and coal power plant (10 km buffer). For each pollutant and plant, the outlier rate is reported as the percentage of observed daily records (%). The table also reports N_out (number of IQR-flagged outliers) and N_obs (number of observed daily records); percentages are computed as 100·N_out/N_obs. Plant codes: ABN = CT_ABONO; ASP = CT_AS_PONTES; COM = CT_COMPOSTILLA; LAD = CT_LADA; MEI = CT_MEIRAMA; PER = CT_LA_PEREDA; ROB = CT_LA_ROBLA; SAB = CT_SABON; SOR = CT_SOTO_RIBERA; VEL = CT_VELILLA. “—” indicates that the pollutant was not monitored for that plant.
Table 11. IQR-flagged outlier rates by pollutant and coal power plant (10 km buffer). For each pollutant and plant, the outlier rate is reported as the percentage of observed daily records (%). The table also reports N o u t (number of IQR-flagged outliers) and N o b s (number of observed daily records); percentages are computed as 100 N o u t / N o b s . Plant codes: ABN = CT_ABONO; ASP = CT_AS_PONTES; COM = CT_COMPOSTILLA; LAD = CT_LADA; MEI = CT_MEIRAMA; PER = CT_LA_PEREDA; ROB = CT_LA_ROBLA; SAB = CT_SABON; SOR = CT_SOTO_RIBERA; VEL = CT_VELILLA. “—” indicates that the pollutant was not monitored for that plant.
PollutantABNASPCOMLADMEIPERROBSABSORVEL
CO0.660.020.300.188.570.17
N_out521332143930
N_obs7942602810,9571096167,9917,864
NO3.881.205.613.912.343.791.494.243.141.40
N_out49212316464213724949147856146
N_obs12,69010,225292216,43558446574328734,84617,8953287
NO20.061.101.100.260.290.030.550.390.130.37
N_out61163242172221452412
N_obs949510,5912922164,3558446574401737,28317,8953287
NOx1.681.311.861.430.601.261.191.271.651.13
N_out2131335223535833943829633
N_obs12,69010,169280216,43558446574328734,54017,8952922
O30.000.020.000.000.000.000.000.000.000.00
N_out0100000000
N_obs97696574292216,40532876544328723,28117,8953287
PM100.552.571.400.501.330.341.520.730.270.78
N_out709341763422632414926
N_obs12,6903622292215,28025576574413932,84617,8953318
PM2.50.121.540.540.500.28
N_out1191294717
N_obs94955905541993436150
SO20.825.513.630.252.760.370.273.412.070.40
N_out80584106411612411127637113
N_obs976910,591292216,43558446574401737,46317,8953259
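The outlier rates in Table 11 follow the Tukey IQR criterion applied per plant–pollutant series. A minimal sketch, assuming the conventional fence multiplier k = 1.5 (the study's exact multiplier is defined in its methods, not here):

```python
import numpy as np

def iqr_outlier_rate(x, k=1.5):
    """Flag values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]
    and return (N_out, N_obs, rate_pct) over observed (non-NaN) records."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]              # rates are relative to observed days only
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    flags = (x < q1 - k * iqr) | (x > q3 + k * iqr)
    n_out, n_obs = int(flags.sum()), int(x.size)
    return n_out, n_obs, 100.0 * n_out / n_obs
```

Because the fences adapt to each series' own quartiles, rates remain comparable across plants with very different concentration levels.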
Table 12. Hampel extreme-value rates by pollutant and coal power plant (10 km buffer). Hampel extremes are defined as days with |z_t| > 6 (Section 2.7.3) and are reported as a percentage of observed daily records (%). For each plant–pollutant combination, the table also reports N_out (number of Hampel extremes) and N_obs (number of observed daily records); percentages are computed as 100 × N_out/N_obs. Plant codes (see Table 11 for full names): ABN, ASP, COM, LAD, MEI, PER, ROB, SAB, SOR, VEL. “—” indicates that the pollutant was not monitored for that plant.

| Pollutant | Metric | ABN | ASP | COM | LAD | MEI | PER | ROB | SAB | SOR | VEL |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CO † | % | 0.31 | 0.50 | 0.52 | 0.09 | 0.99 | 0.37 | — | — | — | — |
| CO † | N_out | 27 | 30 | 67 | 1 | 185 | 68 | — | — | — | — |
| CO † | N_obs | 7942 | 6028 | 10,957 | 1096 | 16,799 | 17,864 | — | — | — | — |
| NO | % | 1.82 | 1.43 | 1.03 | 1.05 | 1.14 | 0.94 | 1.73 | 3.75 | 1.17 | 0.24 |
| NO | N_out | 233 | 143 | 30 | 206 | 68 | 62 | 57 | 1328 | 224 | 8 |
| NO | N_obs | 12,690 | 10,225 | 2922 | 19,599 | 5844 | 6574 | 3287 | 34,846 | 17,895 | 3287 |
| NO2 | % | 0.09 | 1.12 | 0.10 | 0.18 | 0.15 | 0.03 | 0.24 | 0.20 | 0.09 | 0.03 |
| NO2 | N_out | 9 | 45 | 3 | 34 | 9 | 2 | 9 | 82 | 14 | 1 |
| NO2 | N_obs | 9495 | 4017 | 2922 | 19,599 | 5844 | 6574 | 4017 | 37,283 | 17,895 | 3287 |
| NOx | % | 0.21 | 0.47 | 0.14 | 0.39 | 0.13 | 0.30 | 0.12 | 0.54 | 0.26 | 0.07 |
| NOx | N_out | 27 | 42 | 4 | 75 | 8 | 20 | 4 | 211 | 49 | 2 |
| NOx | N_obs | 12,690 | 10,169 | 2802 | 19,568 | 5844 | 6574 | 3287 | 34,540 | 17,895 | 2922 |
| O3 | % | 0.00 | 0.00 | 0.07 | 0.05 | 0.03 | 0.00 | 0.03 | 0.04 | 0.05 | 0.03 |
| O3 | N_out | 0 | 0 | 2 | 10 | 1 | 0 | 1 | 10 | 7 | 1 |
| O3 | N_obs | 9769 | 6574 | 2922 | 19,569 | 3287 | 6544 | 3287 | 23,281 | 17,895 | 3287 |
| PM10 | % | 0.39 | 0.72 | 1.06 | 0.46 | 0.74 | 0.59 | 0.44 | 0.47 | 0.28 | 1.39 |
| PM10 | N_out | 49 | 26 | 31 | 87 | 19 | 39 | 26 | 152 | 46 | 46 |
| PM10 | N_obs | 12,690 | 3622 | 2922 | 18,291 | 2557 | 6574 | 4139 | 32,846 | 17,895 | 3318 |
| PM2.5 † | % | 0.18 | 1.35 | 0.59 | 0.78 | 0.16 | — | — | — | — | — |
| PM2.5 † | N_out | 17 | 80 | 51 | 49 | 11 | — | — | — | — | — |
| PM2.5 † | N_obs | 9495 | 5905 | 8430 | 9343 | 6150 | — | — | — | — | — |
| SO2 | % | 1.01 | 2.11 | 1.20 | 0.83 | 3.28 | 0.70 | 0.87 | 4.13 | 0.82 | 0.15 |
| SO2 | N_out | 99 | 243 | 35 | 162 | 186 | 46 | 40 | 1530 | 159 | 5 |
| SO2 | N_obs | 9769 | 10,591 | 2922 | 19,599 | 5844 | 6574 | 4017 | 37,463 | 17,895 | 3259 |

† CO and PM2.5 are monitored at only six and five of the ten plants, respectively; their values appear in source order, so the column placement of the “—” cells for these two rows may differ from the original layout.
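The Hampel criterion in Table 12 compares each day with a robust local estimate: the window median and the MAD-based robust sigma. A minimal sketch using a 7-day centered window (the window length is our assumption; the study's definition is in Section 2.7.3, with the |z_t| > 6 threshold):

```python
import numpy as np

def hampel_flags(x, half_window=3, z_max=6.0):
    """Flag day t as a Hampel extreme when
    |x_t - median(window)| / (1.4826 * MAD(window)) > z_max."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    flags = np.zeros(n, dtype=bool)
    for t in range(n):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        w = x[lo:hi]
        w = w[~np.isnan(w)]
        if w.size == 0 or np.isnan(x[t]):
            continue
        med = np.median(w)
        sigma = 1.4826 * np.median(np.abs(w - med))  # MAD -> robust sigma
        if sigma > 0 and abs(x[t] - med) / sigma > z_max:
            flags[t] = True
    return flags
```

Unlike the global IQR fences of Table 11, the rolling Hampel test reacts to values that are extreme relative to their local context, so the two screens flag partly different sets of days.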
Table 13. Phase B final decision outcomes by pollutant (daily records). Counts and percentages of KEEP, KEEP_EXTREMO (winsorized), and DROP_NAN after applying the priority decision rule.

| Pollutant | N (Daily Records) | KEEP_EXTREMO (%) | DROP_NAN (%) |
|---|---|---|---|
| CO | 23,955 | 0.6178 | 0.0918 |
| NO | 46,258 | 2.0386 | 0.3394 |
| NO2 | 43,742 | 0.1943 | 0.3566 |
| NOx | 45,878 | 0.3945 | 0.3422 |
| O3 | 38,016 | 0.0263 | 0.4051 |
| PM10 | 41,343 | 0.4112 | 0.3096 |
| PM2.5 | 15,504 | 0.4128 | 0.2838 |
| SO2 | 46,588 | 2.022 | 0.3349 |

Note: KEEP (%) = 100 − KEEP_EXTREMO (%) − DROP_NAN (%).
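Once each daily record carries a final decision label, the summaries of Table 13 reduce to counting. A minimal sketch (the function name is ours; the priority rule that assigns the labels is the one described in the methods, not reimplemented here):

```python
from collections import Counter

DECISIONS = ("KEEP", "KEEP_EXTREMO", "DROP_NAN")

def decision_summary(decisions):
    """Counts and percentages of KEEP / KEEP_EXTREMO / DROP_NAN
    over a pollutant's daily records, as reported in Table 13."""
    n = len(decisions)
    counts = Counter(decisions)
    pct = {k: 100.0 * counts.get(k, 0) / n for k in DECISIONS}
    return n, counts, pct
```

By construction the three percentages sum to 100, which is the identity stated in the note to Table 13.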
Table 14. Phase B decision summary for 34080004_CT_VELILLA (daily records, 2015–2023).

| Pollutant | N | KEEP_EXTREMO_N | KEEP_EXTREMO_pct | DROP_NAN_N | DROP_NAN_pct | KEEP_N | KEEP_pct |
|---|---|---|---|---|---|---|---|
| PM10 | 3318 | 42 | 1.27 | 0 | 0.00 | 3276 | 98.73 |
| NOx | 2922 | 3 | 0.10 | 0 | 0.00 | 2919 | 99.90 |
| NO2 | 3287 | 1 | 0.03 | 0 | 0.00 | 3286 | 99.97 |
| NO | 3287 | 5 | 0.15 | 0 | 0.00 | 3282 | 99.85 |
| O3 | 3287 | 0 | 0.00 | 0 | 0.00 | 3287 | 100.00 |
| SO2 | 3259 | 6 | 0.18 | 0 | 0.00 | 3253 | 99.82 |
Table 15. Positive context flag for 34080004_CT_VELILLA: flag_PM10_gt_75 (days with PM10 > 75 only).

| PUNTO_MUESTREO | Date | VALOR | VALOR_robusto | flag_PM10_gt_75 | Decision |
|---|---|---|---|---|---|
| 34080004_10_49 | 23 February 2017 | 121 | 44.6868 | TRUE | KEEP_EXTREMO |
| 34080004_10_49 | 27 February 2020 | 85 | 28.7912 | TRUE | KEEP_EXTREMO |
| 34080004_10_49 | 15 March 2022 | 404 | 74.711 | TRUE | KEEP_EXTREMO |
| 34080004_10_49 | 16 March 2022 | 156 | 58.478 | TRUE | KEEP_EXTREMO |
| 34080004_10_49 | 5 October 2022 | 81 | 50.5824 | TRUE | KEEP_EXTREMO |
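Table 15 illustrates the KEEP_EXTREMO outcome: the observed VALOR is preserved for traceability while VALOR_robusto stores a winsorized replacement used in the robust series. A minimal sketch of percentile winsorization; the 1st/99th-percentile bounds here are illustrative assumptions, not the study's exact limits:

```python
import numpy as np

def winsorize_to_bounds(x, lower_pct=1, upper_pct=99):
    """Clip extreme values to percentile bounds (winsorization).
    Records clipped this way correspond to KEEP_EXTREMO: the raw
    value is kept in the audit trail, the clipped one in the robust series."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanpercentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)
```

Winsorization retains the day in the series (unlike DROP_NAN), which is why the flagged high-PM10 days above keep a finite, bounded VALOR_robusto.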