Article

Resampling Multi-Resolution Signals Using the Bag of Functions Framework: Addressing Variable Sampling Rates in Time Series Data

by David Orlando Salazar Torres *, Diyar Altinses and Andreas Schwung
Department of Automation Technology and Learning Systems, South Westphalia University of Applied Sciences, 59494 Soest, Germany
* Author to whom correspondence should be addressed.
Sensors 2025, 25(15), 4759; https://doi.org/10.3390/s25154759
Submission received: 20 June 2025 / Revised: 29 July 2025 / Accepted: 31 July 2025 / Published: 1 August 2025
(This article belongs to the Section Intelligent Sensors)

Abstract

In time series analysis, the ability to effectively handle data with varying sampling rates is crucial for accurate modeling and analysis. This paper presents the MR-BoF (Multi-Resolution Bag of Functions) framework, which leverages sampling-rate-independent techniques to decompose time series data while accommodating signals with differing resolutions. Unlike traditional methods that require uniform sampling frequencies, the BoF framework employs a flexible encoding approach, allowing for the integration of multi-resolution time series. Through a series of experiments, we demonstrate that the BoF framework ensures the precise reconstruction of the original data while enhancing resampling capabilities by utilizing decomposed components. The results show that this method offers significant advantages in scenarios involving irregular sampling rates and heterogeneous acquisition systems, making it a valuable tool for applications in fields such as finance, healthcare, industrial monitoring, IoT networks, and sensor networks.

1. Introduction

The processing and analysis of signals acquired at varying sampling rates present fundamental challenges across multiple scientific and engineering disciplines, including biomedical monitoring, environmental sensing, industrial diagnostics, IoT networks, and sensor networks. The variability in sampling rates arises due to differences in acquisition hardware, adaptive sensing strategies, and application-specific requirements. For instance, mobile health monitoring devices often adjust acquisition frequencies to optimize battery life, while environmental sensors might increase acquisition frequency during critical events [1]. Standardizing these multi-resolution signals via resampling techniques facilitates cross-comparison, integration, and coherent analysis of heterogeneous datasets [2]. However, resampling introduces challenges, including spectral distortion, aliasing, and phase misalignment, which can significantly affect signal integrity and downstream processing tasks [3,4].
A promising approach for signal modeling and reconstruction is the Bag of Functions (BoF) framework. BoF represents time series as a linear combination of continuous basis functions, such as sinusoids, linear polynomials, step functions, Gaussians, or exponentials, transforming discrete samples into a continuous time signal representation [5,6]. This enables seamless resampling at any resolution, mitigates interpolation artifacts like aliasing, and provides interpretable signal components through a compact parametric form. Unlike classical methods, BoF is data-driven and adaptive, capturing the signal’s underlying structure without rigid sampling assumptions. However, existing BoF approaches assume uniform sampling across the dataset, which limits their applicability in common real-world scenarios involving heterogeneous sampling rates.
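To make this idea concrete, the following minimal sketch (our own illustration, not the authors' implementation) builds a signal from three hand-picked basis functions and evaluates the resulting continuous form on grids of different resolutions:

```python
import numpy as np

# A hypothetical continuous representation: one sinusoid, one linear
# trend, and one Gaussian event, with illustrative parameters.
def x_hat(t):
    return (2.0 * np.sin(2 * np.pi * 3.0 * t + 0.5)            # seasonal term
            + 0.8 * t + 0.1                                    # linear trend
            + 0.5 * np.exp(-(t - 0.4) ** 2 / (2 * 0.1 ** 2)))  # event

t_coarse = np.arange(0, 1, 1 / 10)   # 10 Hz grid
t_fine = np.arange(0, 1, 1 / 100)    # 100 Hz grid
x10, x100 = x_hat(t_coarse), x_hat(t_fine)  # same function, any resolution
```

Because the representation is a function of continuous time, the 100 Hz version is obtained by evaluation rather than by interpolating between the 10 Hz samples.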
In this paper, we propose an adaptive decomposition framework that represents and reconstructs signals without direct interpolation. Our approach decomposes signals into interpretable continuous components while preserving both spectral and temporal characteristics across arbitrary sampling schemes. Instead of interpolating to a fixed grid, we build a functional representation that allows for direct resampling and reconstruction at any desired resolution. This enables seamless integration of signals with different sampling rates and facilitates joint analysis across heterogeneous sources. Our key contributions can be summarized as follows:
1. We develop an extension of the Bag of Functions approach to process data collected at varying sampling rates.
2. We propose a scalable decomposition framework based on the Bag of Functions that enables the representation of discrete signals as continuous functions, ensuring the preservation of spectral and temporal characteristics.
3. We introduce an unsupervised resampling mechanism that standardizes time series with different sampling rates, allowing for a unified representation that facilitates joint analysis and data fusion.
4. We validate our approach on one synthetic and three real-world datasets, demonstrating its effectiveness in reconstructing signals while preserving statistical and spectral properties across different sampling rates.
The rest of this paper is organized as follows. Section 2 provides a comprehensive review of existing literature, highlighting the theoretical and practical gaps that motivate our contribution. Section 3 details the theoretical basis of the BoF framework, including its machine learning architecture and our proposed technique for handling multi-resolution time series data. Section 4 presents the experimental setup and discusses its performance on synthetic and real-world datasets. Finally, Section 5 summarizes the main findings and proposes possible future research.

2. Related Work

This section examines two research areas fundamental to our work: time series decomposition techniques and multi-resolution signal processing methods. Time series decomposition approaches extract constituent components, revealing essential patterns for analysis. Multi-resolution signal processing addresses the challenges of heterogeneous sampling rates, enabling the unification of signals acquired under varying temporal conditions. These research fields form the base of our proposed Bag of Functions framework.

2.1. Time Series Decomposition

Time series decomposition separates a signal into components such as trend, seasonality, and noise to facilitate analysis. Classical methods, including Seasonal–Trend decomposition using LOESS [7] and ARIMA-based approaches [8,9], are widely used for regularly sampled data due to their effectiveness in structured time series. However, they assume a fixed sampling rate and rely on rigid assumptions about periodicity and stationarity, limiting their applicability to irregularly sampled data.
Frequency-domain methods, such as the Fourier and wavelet transforms, introduced multi-scale analysis and are fundamental tools for signal decomposition [10]. However, their application to non-uniformly sampled data requires specialized adaptations. Methods such as non-uniform discrete Fourier transform [11] and adaptive wavelet approaches [12] attempt to address these limitations, yet challenges remain in preserving spectral properties under varying sampling densities.
Empirical Mode Decomposition (EMD) [13] and its variants (e.g., Ensemble EMD, CEEMDAN) iteratively decompose complex, non-linear, and non-stationary signals into a set of intrinsic mode functions, enabling localized time–frequency analysis without requiring a predefined basis [14]. Numerous studies have demonstrated their effectiveness in practical applications, including biomedical signal analysis, fault diagnosis, and environmental data processing [15]. However, EMD can be sensitive to sampling irregularities, often requiring ensemble-based extensions to improve robustness [14].
Recent machine learning innovations have transformed decomposition. Supervised methods, particularly Long Short-Term Memory networks, excel at capturing long-term dependencies, making them ideal for modeling trends and seasonal patterns [16,17]. Nevertheless, these models operate under the assumption of uniform sampling rates. Hybrid models have also emerged, combining machine learning techniques with traditional statistical approaches to improve precision [18,19,20], while residual decomposition architectures further enhance robustness by iteratively refining each component to produce a stable representation of the original signal [6].
Machine learning methods offer an alternative to traditional decomposition techniques, enabling automated feature extraction and capturing non-linear patterns. However, processing time series data with multiple sampling rates remains a critical challenge. To tackle this, we extend the Bag of Functions framework for time series decomposition, incorporating an adaptive approach that allows for the representation and analysis of signals with heterogeneous sampling frequencies.

2.2. Multi-Resolution Signal Processing Methods

Multi-resolution signal processing provides a theoretical foundation for analyzing time series data collected at different sampling rates, a critical aspect of integrating heterogeneous signals for decomposition [21]. Classical resampling and interpolation techniques remain central to this field due to their straightforward conceptual basis and ability to standardize signals effectively in many practical scenarios [22].
Traditional methods rely on simple interpolation strategies, such as connecting data points with straight lines or fitting smooth polynomial curves to approximate signal behavior between samples. These techniques are valued for their ability to align regularly sampled data with minimal complexity, making them enduring tools for basic multi-resolution tasks [12]. Frequency-domain resampling, which adjusts signal representations using spectral transformations, offers another classical approach by preserving key frequency components during rate conversion [23]. More advanced filtering methods, based on finite impulse response principles, further refine this process by reducing aliasing and enhancing signal fidelity [24].
These classical techniques persist because they provide a reliable framework for handling uniform or mildly irregular signals, balancing theoretical simplicity with practical utility in fields like audio analysis and temporal data alignment. However, these approaches face significant limitations when applied to time series with highly variable sampling rates [25]. Linear or polynomial interpolation often assumes signal continuity that may not hold, introducing distortions, while frequency-based methods struggle to adapt to non-uniform sampling densities without compromising spectral integrity [26]. Recent theoretical developments have explored adaptive multi-resolution frameworks to address these challenges, focusing on capturing temporal patterns across diverse scales without relying on rigid pre-processing [27]. Such approaches enhance flexibility by modeling signals at multiple resolutions simultaneously, yet they often prioritize specific use cases, like prediction, over general resampling and decomposition needs, leaving gaps in handling arbitrary sampling frequencies.
To address these shortcomings, we propose to transform signals into a functional space where they are represented as continuous functions, rather than discrete points or frequency bands. This enables seamless resampling to any target frequency while preserving the signal’s intrinsic properties, overcoming the artifacts and assumptions inherent in classical and modern discrete methods.

3. Bag of Functions for Resampling Multi-Resolution Signals

Following the analysis of related work in Section 2, we now present our proposed method for multi-resolution signal resampling, designed to address the limitations identified in prior approaches. We begin by defining the problem of resampling multi-resolution signals within the Bag of Functions framework. We then introduce our methodology for reconstructing signals sampled at different frequencies and aligning them to a common target rate, first describing the neural network architecture and then our proposed approach for ensuring a coherent signal representation across different sampling rates.

3.1. Problem Definition

Let $\mathcal{D} = \{(\mathbf{x}_i, f_i)\}_{i=1}^{N}$ denote a dataset of $N$ multivariate time series signals, where each signal $\mathbf{x}_i \in \mathbb{R}^{L_i \times q}$ is sampled at frequency $f_i$, resulting in $L_i$ discrete samples of $q$ sensors. Each discrete signal $\mathbf{x}_i$ represents samples from an underlying continuous function $x_i(t)$ such that $x_i[n] = x_i(n T_i)$, where $T_i = 1/f_i$ is the sampling period. The objective is to obtain from the discrete samples $x_i[n]$ a continuous representation $\hat{x}_i(t)$ that allows resampling at a common target frequency $f^{*}$. This requires finding a transformation $H$ that can adapt to different sampling frequencies $f_i$ and generate a consistent continuous representation $\hat{x}_i(t) = H(\mathbf{x}_i)$. The transformation $H$ must process signals with diverse temporal resolutions, correctly aligning information in the time domain while preserving their fundamental characteristics across different time scales. This continuous signal $\hat{x}_i(t)$ enables obtaining the desired discrete signal $x_i^{*}[m] = \hat{x}_i(m T^{*})$ with the target sampling period $T^{*} = 1/f^{*}$, represented as a vector $\mathbf{x}_i^{*} \in \mathbb{R}^{L_i^{*} \times q}$. To ensure that the transformation and resampling minimize distortion, the optimization objective is to minimize the loss $\mathcal{L}$ that evaluates the reconstruction quality and information preservation:
$$\min_{H} \sum_{i=1}^{N} \mathcal{L}\big(\mathbf{x}_i, H(\mathbf{x}_i)[n]\big).$$
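For instance, assuming a duration of $D = 1$ s, a signal sampled at $f_i = 20$ Hz provides $L_i = 20$ samples; resampling to a target frequency $f^{*} = 100$ Hz then amounts to evaluating $\hat{x}_i(t)$ at the $L_i^{*} = 100$ instants $t = m T^{*}$ with $T^{*} = 0.01$ s.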

3.2. Neural Network Architecture

In this study, we investigate a model architecture designed for the structured decomposition of time series data into its key components while intentionally omitting the stochastic noise. Noise introduces irregular, non-structured fluctuations in the signal, which can mislead the model during resampling across different sampling rates.
The architecture follows a residual design, which incrementally extracts and reconstructs these interpretable signal components using specialized neural submodules. This hierarchical decomposition is achieved through the use of multiple Bag of Functions blocks, as illustrated in Figure 1.
The signal reconstruction proceeds through three BoF submodels, each of which employs the residual information left by the previous ones:
$$\hat{\mathbf{x}}_{s_i} = \mathrm{BoF}_s(\mathbf{x}_i)$$
$$\hat{\mathbf{x}}_{e_i} = \mathrm{BoF}_e(\mathbf{x}_i - \hat{\mathbf{x}}_{s_i})$$
$$\hat{\mathbf{x}}_{t_i} = \mathrm{BoF}_t(\mathbf{x}_i - \hat{\mathbf{x}}_{s_i} - \hat{\mathbf{x}}_{e_i})$$
This Bag of Functions approach [5] forms the foundation of the model. It enables the neural network to learn interpretable representations by parameterizing basis functions that are then used to synthesize the signal. The base BoF architecture is presented in Figure 2.
A fundamental BoF block operates by first extracting features from an input vector $\mathbf{x}_i$ using a feature extractor $f_\theta$. This process yields a latent parameter vector $\mathbf{z}_i = f_\theta(\mathbf{x}_i)$, which then parameterizes a predefined collection of $A$ basis functions, denoted as $\{\phi_a\}_{a=1}^{A}$. Finally, the estimated component $\hat{\mathbf{x}}_i$ is reconstructed by summing these parameterized functions over a given time interval vector $\mathbf{t}$, as follows:
$$\hat{\mathbf{x}}_i = \sum_{a=1}^{A} \phi_a(\mathbf{t}; z_{i,a}).$$
Here, the feature extractors $E_{\theta_s}$, $E_{\theta_e}$, and $E_{\theta_t}$ generate latent representations $\mathbf{z}_{s_i}$, $\mathbf{z}_{e_i}$, and $\mathbf{z}_{t_i}$, respectively, each of which parameterizes a distinct set of basis functions. These basis functions $\phi_j$, $\phi_k$, and $\phi_l$ span the spaces of seasonality, events, and trends:
$$\mathbf{z}_{s_i} = E_{\theta_s}(\mathbf{x}_i) \quad \text{and} \quad \hat{\mathbf{x}}_{s_i} = \sum_{j=1}^{J} \phi_j(\mathbf{t}; z_{s_i,j})$$
$$\mathbf{z}_{e_i} = E_{\theta_e}(\mathbf{x}_i - \hat{\mathbf{x}}_{s_i}) \quad \text{and} \quad \hat{\mathbf{x}}_{e_i} = \sum_{k=1}^{K} \phi_k(\mathbf{t}; z_{e_i,k})$$
$$\mathbf{z}_{t_i} = E_{\theta_t}(\mathbf{x}_i - \hat{\mathbf{x}}_{s_i} - \hat{\mathbf{x}}_{e_i}) \quad \text{and} \quad \hat{\mathbf{x}}_{t_i} = \sum_{l=1}^{L} \phi_l(\mathbf{t}; z_{t_i,l})$$
The final reconstructed signal is obtained by summing the outputs of the three BoF blocks:
$$\hat{\mathbf{x}}_i = \sum_{j=1}^{J} \phi_j(\mathbf{t}; z_{s_i,j}) + \sum_{k=1}^{K} \phi_k(\mathbf{t}; z_{e_i,k}) + \sum_{l=1}^{L} \phi_l(\mathbf{t}; z_{t_i,l}).$$
The Bag of Functions approach models signals by parameterizing a predefined set of basis functions, allowing the network to synthesize complex temporal patterns from interpretable components. The number of basis functions directly governs the model’s representational capacity. Increasing this number enhances flexibility and expressive power, but also leads to greater computational cost during both training and inference. Consequently, the selection of basis functions becomes a critical hyperparameter in the BoF framework, analogous to architectural choices in traditional neural networks. Importantly, this selection need not be arbitrary; incorporating domain knowledge can guide the design or restriction of the basis function set, yielding more efficient models that retain high performance while reducing redundancy and computational overhead.
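As a concrete illustration of a single block, the following PyTorch sketch uses the encoder structure reported in Section 4.3 ($L_{\max} \to 50 \to 25 \to 10 \to \dim(\mathbf{z}_i)$); the restriction to a sine basis with three parameters per function is our simplifying assumption, not the authors' full configuration:

```python
import torch
import torch.nn as nn

class BoFBlock(nn.Module):
    """Minimal Bag of Functions block: an encoder predicts the parameters
    of a fixed basis function family, whose sum reconstructs one component."""

    def __init__(self, in_len: int, n_basis: int = 3, params_per_basis: int = 3):
        super().__init__()
        self.n_basis, self.p = n_basis, params_per_basis
        # Encoder E_theta: maps the (padded) input signal to the latent z_i.
        self.encoder = nn.Sequential(
            nn.Linear(in_len, 50), nn.ReLU(),
            nn.Linear(50, 25), nn.ReLU(),
            nn.Linear(25, 10), nn.ReLU(),
            nn.Linear(10, n_basis * params_per_basis),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # z_i = E_theta(x_i), reshaped to one parameter triple per function.
        z = self.encoder(x).view(-1, self.n_basis, self.p)
        a, f, phi = z[..., 0:1], z[..., 1:2], z[..., 2:3]
        # x_hat_i = sum_a phi_a(t; z_{i,a}) with a sine basis (assumption).
        basis = a * torch.sin(2 * torch.pi * f * t.view(1, 1, -1) + phi)
        return basis.sum(dim=1)
```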
This formulation allows the model to synthesize time series data via a composition of smooth, interpretable basis functions. The explicit exclusion of the noise component ensures that the reconstruction focuses solely on the structured patterns within the signal. Algorithm 1 presents the optimization procedure for the Residual Bag of Functions model with function parameterization.
Algorithm 1 Optimization of the Residual BoF model with function parameterization.
Require: Dataset $\mathcal{D} = \{(\mathbf{x}_i, f_i)\}_{i=1}^{N}$, basis function families $\{\phi_j\}_{j=1}^{J}$, $\{\phi_k\}_{k=1}^{K}$, $\{\phi_l\}_{l=1}^{L}$, learning rate $\eta$, number of epochs $M$
1: Initialize parameters $\theta_s$, $\theta_e$, $\theta_t$
2: Repeat until convergence:
3: for $t = 1$ to $M$ do
4:   for each sample $(\mathbf{x}_i, \mathbf{t}_i) \in \mathcal{D}$ do ▹ 1. Seasonality component
5:     $\mathbf{z}_{s_i} \leftarrow E_{\theta_s}(\mathbf{x}_i)$
6:     $\hat{\mathbf{x}}_{s_i} \leftarrow \sum_{j=1}^{J} \phi_j(\mathbf{t}_i; z_{s_i,j})$ ▹ 2. Event component (residual)
7:     $\mathbf{r}_{e_i} \leftarrow \mathbf{x}_i - \hat{\mathbf{x}}_{s_i}$
8:     $\mathbf{z}_{e_i} \leftarrow E_{\theta_e}(\mathbf{r}_{e_i})$
9:     $\hat{\mathbf{x}}_{e_i} \leftarrow \sum_{k=1}^{K} \phi_k(\mathbf{t}_i; z_{e_i,k})$ ▹ 3. Trend component (second residual)
10:    $\mathbf{r}_{t_i} \leftarrow \mathbf{x}_i - \hat{\mathbf{x}}_{s_i} - \hat{\mathbf{x}}_{e_i}$
11:    $\mathbf{z}_{t_i} \leftarrow E_{\theta_t}(\mathbf{r}_{t_i})$
12:    $\hat{\mathbf{x}}_{t_i} \leftarrow \sum_{l=1}^{L} \phi_l(\mathbf{t}_i; z_{t_i,l})$ ▹ 4. Signal reconstruction and loss computation
13:    $\hat{\mathbf{x}}_i \leftarrow \hat{\mathbf{x}}_{s_i} + \hat{\mathbf{x}}_{e_i} + \hat{\mathbf{x}}_{t_i}$
14:    $\mathcal{L}_i \leftarrow \| \hat{\mathbf{x}}_i - \mathbf{x}_i \|_2^2$
15:   end for ▹ 5. Parameter update
16:   $\theta_s \leftarrow \theta_s - \eta \cdot \nabla_{\theta_s} \sum_{i=1}^{N} \mathcal{L}_i$
17:   $\theta_e \leftarrow \theta_e - \eta \cdot \nabla_{\theta_e} \sum_{i=1}^{N} \mathcal{L}_i$
18:   $\theta_t \leftarrow \theta_t - \eta \cdot \nabla_{\theta_t} \sum_{i=1}^{N} \mathcal{L}_i$
19: end for
20: return trained parameters $\theta_s$, $\theta_e$, $\theta_t$
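For readers who prefer code, the following sketch mirrors Algorithm 1 using three instances of the BoFBlock sketched above; the Adam optimizer follows the setup in Section 4.3, while the learning rate, data loader, and time grid are assumed placeholders:

```python
import torch

def train_residual_bof(bof_s, bof_e, bof_t, loader, t, epochs=100, lr=1e-3):
    params = (list(bof_s.parameters()) + list(bof_e.parameters())
              + list(bof_t.parameters()))
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for x in loader:                    # x: (batch, L_target)
            x_s = bof_s(x, t)               # 1. seasonality component
            x_e = bof_e(x - x_s, t)         # 2. event component (residual)
            x_t = bof_t(x - x_s - x_e, t)   # 3. trend component (second residual)
            x_hat = x_s + x_e + x_t         # 4. signal reconstruction
            loss = ((x_hat - x) ** 2).sum(dim=1).mean()  # squared error, batch mean
            opt.zero_grad()
            loss.backward()
            opt.step()                      # 5. parameter update
    return bof_s, bof_e, bof_t
```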
In the subsequent section, we address the adaptation of this model to input signals with varying sampling rates.

3.3. Resampling Strategy Within the Bag of Functions Framework

To effectively process time series with varying sampling frequencies within the Bag of Functions framework, we implement a strategy that combines interleaved padding and masking. This approach ensures that all input sequences are consistently represented at a target length, thereby facilitating uniform processing by the neural network.
To accommodate signals of different lengths and frequencies, we transform each input time series $\mathbf{x}_i \in \mathbb{R}^{L_i}$ into a tensor $\mathbf{x}_i^{\mathrm{pad}} \in \mathbb{R}^{L_{\mathrm{target}}}$ of a predefined target length $L_{\mathrm{target}} = D \cdot f^{*} \in \mathbb{N}$, where $D$ is the maximum duration of the signals in the dataset and $f^{*}$ is the desired target sampling frequency. This is achieved through an interleaved padding process, where the original samples are placed in the tensor at positions corresponding to their original sampling times, and the remaining positions are filled with zeros.
Specifically, for each signal $\mathbf{x}_i$, we calculate the position $p_n$ of each sample $x_i[n]$ in the padded tensor using the following formula:
$$p_n = \left\lfloor \frac{n}{f_i} \cdot \frac{L_{\mathrm{target}}}{D} + \frac{1}{2} \right\rfloor,$$
where $n \in \{0, \dots, L_i - 1\}$ is the sample index, $f_i$ is the sampling frequency, and $D$ is the maximum duration of the signal. The padded sequence $\mathbf{x}_i^{\mathrm{pad}} \in \mathbb{R}^{L_{\mathrm{target}}}$ is then constructed such that
$$x_i^{\mathrm{pad}}[m] = \begin{cases} x_i[n] & \text{if } m = p_n \text{ for some } n \in \{0, \dots, L_i - 1\}, \\ 0 & \text{otherwise.} \end{cases}$$
To prevent the network from learning the padded elements, we generate a binary mask $\boldsymbol{\sigma}_i \in \{0, 1\}^{L_{\mathrm{target}}}$ that indicates the positions of the original samples. This mask is defined as follows:
$$\sigma_i[m] = \begin{cases} 1 & \text{if } m \in \{p_0, \dots, p_{L_i - 1}\}, \\ 0 & \text{otherwise.} \end{cases}$$
During the computation of the loss function $\mathcal{L}$, we apply this mask $\boldsymbol{\sigma}_i$ to the reconstructed output $\hat{\mathbf{x}}_i \in \mathbb{R}^{L_{\mathrm{target}}}$ to ensure that the optimization process focuses only on the original data points:
$$\mathcal{L}(\mathbf{x}_i, \hat{\mathbf{x}}_i) = \left\| \mathbf{x}_i^{\mathrm{pad}} - \boldsymbol{\sigma}_i \odot \hat{\mathbf{x}}_i \right\|_2^2,$$
where $\odot$ denotes element-wise multiplication and $\|\cdot\|_2$ is the Euclidean norm. This masking strategy ensures that the network learns to reconstruct the original signal without being influenced by the padded zeros.
In cases where $L_i > L_{\mathrm{target}}$, we apply downsampling by a factor $d = \lfloor L_i / L_{\mathrm{target}} \rfloor$. The downsampled signal $\mathbf{x}_i^{\mathrm{down}} \in \mathbb{R}^{L_{\mathrm{target}}}$ is defined as follows:
$$x_i^{\mathrm{down}}[m] = x_i[m \cdot d], \quad m \in \{0, \dots, L_{\mathrm{target}} - 1\}.$$
The mask $\boldsymbol{\sigma}_i$ for downsampled signals is a vector of ones:
$$\sigma_i[m] = 1, \quad m \in \{0, \dots, L_{\mathrm{target}} - 1\}.$$
This combined approach of interleaved padding, downsampling, and masking allows the network to learn continuous representations of time series signals with diverse sampling rates, ensuring consistent input dimensions and focusing learning on the original signal characteristics.
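A NumPy sketch of this procedure under the definitions above is given below; the rounding and index-clipping details are our assumptions:

```python
import numpy as np

def pad_and_mask(x, f_i, D, f_target):
    """Interleaved zero-padding and binary mask for a signal sampled at
    f_i Hz over a maximum duration of D seconds, aligned to a grid of
    length L_target = D * f_target (assumed integer)."""
    L_target = int(round(D * f_target))
    if len(x) > L_target:                       # downsampling branch
        d = len(x) // L_target
        return x[::d][:L_target], np.ones(L_target)
    padded, mask = np.zeros(L_target), np.zeros(L_target)
    n = np.arange(len(x))
    p = np.floor(n / f_i * L_target / D + 0.5).astype(int)  # positions p_n
    p = np.clip(p, 0, L_target - 1)             # guard against edge overflow
    padded[p], mask[p] = x, 1.0
    return padded, mask

# Masked loss from this section, given a reconstruction x_hat:
# loss = np.sum((padded - mask * x_hat) ** 2)
```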

4. Evaluation

Having detailed our multi-resolution resampling approach in Section 3, we now evaluate its performance against conventional methods on diverse datasets to assess the effectiveness of the proposed reconstruction and resampling method. We first provide a detailed description of the datasets used in our experiments, emphasizing their key characteristics. Next, we outline the experimental setup, specifying the training and evaluation protocols. Finally, we present the results and conduct a thorough analysis of the method's performance.

4.1. Synthetic Data Generation

To generate synthetic time series, we follow the methodology proposed in [28], where each signal x i is constructed as the sum of four fundamental components: trend, seasonality, event, and noise. The additive model is defined as follows:
$$x_i(t) = s_i(t) + e_i(t) + t_i(t) + n_i(t).$$
The parameters of each component are randomly sampled from uniform distributions:
1. Seasonality: two sets of sinusoidal functions are used:
$$s_i(t) = \sum_{j=1}^{2} a_1^{(j)} \sin\!\left(2 \pi a_2^{(j)} t + a_3^{(j)}\right),$$
with parameters sampled as $a_1^{(1)} \sim \mathcal{U}(2, 3)$, $a_2^{(1)} \sim \mathcal{U}(3, 5)$, $a_3^{(1)} \sim \mathcal{U}(0, 2\pi)$ and $a_1^{(2)} \sim \mathcal{U}(3, 4)$, $a_2^{(2)} \sim \mathcal{U}(1, 2)$, $a_3^{(2)} \sim \mathcal{U}(0, 2\pi)$.
2. Trend: the trend component is modeled as a linear function
$$t_i(t) = b_1 t + b_2,$$
where the parameters are sampled as $b_1 \sim \mathcal{U}(-2, 2)$ and $b_2 \sim \mathcal{U}(-2, 2)$.
3. Event: this is defined as a Gaussian function
$$e_i(t) = e_1 \exp\!\left(-\frac{(t - e_2)^2}{2 e_3^2}\right),$$
where the parameters are sampled as $e_1 \sim \mathcal{U}(-1, 1)$, $e_2 \sim \mathcal{U}(0, 1)$, $e_3 \sim \mathcal{U}(0.5, 1)$.
4. Noise: the noise component is modeled as a uniform distribution
$$n_i(t) \sim \mathcal{U}(-0.5, 0.5).$$
The dataset is constructed by sampling the previously defined continuous signals $x_i(t)$ at different frequencies. Each discrete signal $\mathbf{x}_i = \{x_i[n]\}_{n=0}^{L_i - 1}$ consists of $L_i$ samples, where the sampling period is $T_i = 1/f_i$. A total of 1000 time series were generated per sampling frequency, with 80% used for training and 20% for validation. The selected sampling frequencies are 5, 10, 20, 50, 100, and 500 Hz. Since all signals have a fixed duration of one second, the number of samples per signal varies accordingly. The resulting dataset consists of time series with different resolutions, as illustrated in Figure 3.
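A minimal generation sketch following this specification (with the negative lower bounds of the uniform ranges as reconstructed above) could look as follows:

```python
import numpy as np

def make_signal(rng):
    """Return a continuous function x(t) with randomly drawn component
    parameters, following the additive model above."""
    a = [(rng.uniform(2, 3), rng.uniform(3, 5), rng.uniform(0, 2 * np.pi)),
         (rng.uniform(3, 4), rng.uniform(1, 2), rng.uniform(0, 2 * np.pi))]
    b1, b2 = rng.uniform(-2, 2), rng.uniform(-2, 2)
    e1, e2, e3 = rng.uniform(-1, 1), rng.uniform(0, 1), rng.uniform(0.5, 1)
    def x(t):
        s = sum(a1 * np.sin(2 * np.pi * a2 * t + a3) for a1, a2, a3 in a)
        event = e1 * np.exp(-(t - e2) ** 2 / (2 * e3 ** 2))
        noise = rng.uniform(-0.5, 0.5, size=t.shape)
        return s + b1 * t + b2 + event + noise
    return x

# Sample one signal at each of the six rates over a one-second duration.
x = make_signal(np.random.default_rng(0))
samples = {f: x(np.arange(0, 1.0, 1 / f)) for f in (5, 10, 20, 50, 100, 500)}
```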
The samples exhibit a smooth, periodic oscillation with consistent peaks and troughs. This regularity suggests a well-defined, predictable pattern, making it ideal for testing resampling methods on structured signals. Together, these samples create a balanced dataset that challenges algorithms to preserve both the precision of periodic trends and the integrity of irregular dynamics, ensuring comprehensive validation for resampling tasks.

4.2. Real World Data

In addition to the synthetic dataset, we evaluate our approach on three real-world datasets to validate its practical performance. This ensures a complete assessment across both controlled and realistic scenarios.

4.2.1. PJM Hourly Energy Consumption

This dataset originates from PJM Interconnection, a regional transmission organization responsible for coordinating wholesale electricity transmission across 13 U.S. states. The dataset encompasses hourly electricity consumption values, expressed in MW, spanning the period from 31 December 1998 to 31 December 2001. The dataset is partitioned into $N$ non-overlapping weekly segments, each represented as a column vector in the matrix $\mathbf{W} \in \mathbb{R}^{168 \times N}$, such that $\mathbf{W} = [\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_N]$, where each $\mathbf{w}_i \in \mathbb{R}^{168}$ contains the hourly consumption values for the $i$-th week. Downsampling is applied to each $\mathbf{w}_i$ at different sampling rates, producing reduced-resolution versions. Let $\mathbf{D}_k \in \mathbb{R}^{m_k \times N}$ with $k = 1, \dots, 4$ denote the downsampled data for the $k$-th sampling frequency, where $m_k \in \{4, 8, 12, 42\}$ is the number of samples retained per week. Figure 4 shows two randomly selected samples with different sampling rates.
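The weekly segmentation and downsampling translate into a few lines of code; the paper specifies only the retained sample counts $m_k$, so the equally spaced index scheme below is our assumption:

```python
import numpy as np

def weekly_matrix(hourly_load):
    """Partition an hourly load series into non-overlapping weekly columns,
    yielding W in R^{168 x N}."""
    n_weeks = len(hourly_load) // 168
    return hourly_load[: n_weeks * 168].reshape(n_weeks, 168).T

def downsample_weeks(W, m_k):
    """Keep m_k (roughly equally spaced) samples per week: D_k in R^{m_k x N}."""
    idx = np.linspace(0, W.shape[0] - 1, m_k).round().astype(int)
    return W[idx, :]
```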
Both samples exhibit characteristic energy consumption patterns with daily peaks and nightly troughs, reflecting typical grid behavior. This dataset demonstrates how different sampling rates capture varying levels of temporal detail in power systems, from gradual base load changes to rapid demand fluctuations.

4.2.2. Electricity Transformer Temperature

For a broader evaluation, we employed the ETTh1 dataset, a multivariate time series represented as $\mathbf{X} \in \mathbb{R}^{T \times 7}$, where each row $\mathbf{x}_t \in \mathbb{R}^{7}$ corresponds to an hourly observation of electricity transformer temperature and six power load features at time $t$, and $T = 17{,}420$ spans the two-year period from July 2016 to July 2018. We focused on the Low UseFul Load (LUFL) variable, denoted as the scalar time series $\mathbf{y} = \mathbf{X}[:, j] \in \mathbb{R}^{T}$, where $j$ is the column index of LUFL in $\mathbf{X}$. Following the same downsampling procedure as applied to the PJM dataset, we partitioned $\mathbf{y}$ into weekly segments $\{\mathbf{w}_i\}_{i=1}^{N}$, each of length 168 (hours per week). For each downsampling frequency $m_k \in \{4, 8, 12, 42\}$, the low-resolution version results in $\mathbf{W}^{(k)} = [\mathbf{w}_1^{(k)}, \dots, \mathbf{w}_N^{(k)}] \in \mathbb{R}^{m_k \times N}$. Figure 5 shows two randomly selected samples with different sampling rates.
Both samples reveal characteristic daily patterns of electricity demand, with the first showing relatively stable daily peaks and the second exhibiting more volatile load fluctuations. The higher sampling rates capture crucial rapid transients, sudden demand spikes, and quick ramps, which lower frequencies might miss or distort. This combination of regular patterns and irregular disturbances makes the ETT dataset particularly valuable for developing robust resampling techniques that maintain fidelity across temporal scales.

4.2.3. Thermal Power Prediction

Finally, we incorporated a third real-world dataset to evaluate our methodology on thermal power prediction. The dataset consists of multivariate time series observations recorded at a 15 min sampling frequency from 1 January 2016 to 15 September 2020. Let $\mathbf{Z} \in \mathbb{R}^{T \times d}$ represent the full dataset, where each row $\mathbf{z}_t \in \mathbb{R}^{d}$ contains measurements at time $t$ for $d$ variables (including outdoor temperature, flow temperature, return temperature, thermal power output, and water quantity). Here, $T$ denotes the total number of 15 min samples over the 4.7-year period. Given the dominant influence of outdoor temperature and thermal power on heat consumption, we extracted the thermal power time series $\mathbf{p} = \mathbf{Z}[:, i] \in \mathbb{R}^{T}$, where $i$ is the column index corresponding to thermal power. We further refined $\mathbf{p}$ to focus on winter months, yielding a subset $\mathbf{p}_{\mathrm{winter}} \in \mathbb{R}^{T'}$ with $T' < T$. Consistent with the prior datasets, we partitioned $\mathbf{p}_{\mathrm{winter}}$ into weekly segments $\{\mathbf{w}_i\}_{i=1}^{N}$, each comprising 672 samples (7 days at 15 min intervals). For each downsampling frequency $m_k \in \{4, 8, 12, 42\}$, the low-resolution version results in $\mathbf{W}^{(k)} = [\mathbf{w}_1^{(k)}, \dots, \mathbf{w}_N^{(k)}] \in \mathbb{R}^{m_k \times N}$. Figure 6 shows one randomly selected sample at different sampling rates.
This dataset’s value lies in its authentic representation of both gradual thermal processes and sudden operational changes, from slow fuel-based heat accumulation to fast turbine adjustments. The presence of these multi-timescale phenomena makes it particularly suitable for evaluating resampling algorithms’ ability to reconstruct the full spectrum of thermal plant dynamics.

4.3. Experimental Setup

We evaluate our proposed MR-BoF model against traditional resampling algorithms, including filter-based methods, and a Feed-Forward Neural Network (FFNN) baseline across one synthetic and three real-world datasets. Our MR-BoF architecture employs a total of nine basis functions: three for seasonality (sine, three parameters each), three for trend (linear, two parameters each), and three for events (Gaussian, three parameters each). These functions are distributed among three encoders, each with an $L_{\max} \to 50 \to 25 \to 10 \to \dim(\mathbf{z}_i)$ structure and ReLU activations between all layers. We test MR-BoF models with one and two stages on the synthetic dataset and a three-stage model on the real-world datasets. As a key baseline, we utilize the MR-BoF's multi-resolution resampling mechanism but replace the Bag of Functions (BoF) component with an FFNN. This FFNN is designed with a comparable number of trainable parameters for a fair architectural comparison. Specifically, its architecture for the synthetic dataset maps an input of 100 dimensions to an output of 100 dimensions via three hidden layers of 100 neurons each, while for the real-world datasets, it comprises two hidden layers of 168 neurons each between an input and output of 168 dimensions. Both FFNN architectures use ReLU activation functions between all layers. All neural models were trained for $1 \times 10^{2}$ epochs using the ADAM optimizer with a batch size of 32, and performance was evaluated after each epoch using the Mean Squared Error (MSE) loss function $\mathcal{L}$. To ensure a robust comparison and account for variability in training dynamics, this training procedure was repeated 10 times for each model on each dataset.
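For reference, the two FFNN baselines described above translate directly into PyTorch; whether an activation follows the output layer is not stated, so we assume a linear output:

```python
import torch.nn as nn

# Synthetic dataset: 100 inputs -> three hidden layers of 100 -> 100 outputs.
ffnn_synth = nn.Sequential(
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 100),
)

# Real-world datasets: 168 inputs -> two hidden layers of 168 -> 168 outputs.
ffnn_real = nn.Sequential(
    nn.Linear(168, 168), nn.ReLU(),
    nn.Linear(168, 168), nn.ReLU(),
    nn.Linear(168, 168),
)
```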

4.4. Results

In this section, we examine the effectiveness of time series decomposition and reconstruction using the MR-BoF framework with multi-resolution input sequences and compare it to the performance obtained using traditional resampling algorithms and an FFNN baseline. Therefore, to systematically evaluate our approach under varying sampling conditions, we design four distinct resampling scenarios: three upsampling cases (from 10 Hz, 20 Hz, and 50 Hz to a target rate of 100 Hz) and one downsampling case (from 500 Hz to 100 Hz). This framework allows us to assess performance across a wide spectrum of temporal resolutions, ranging from highly sparse to oversampled data, ensuring robustness in both interpolation and decimation tasks. We begin by analyzing the stability of the neural network models on the synthetic dataset. The distribution of the final MSE loss on the test set, obtained over 10 independent training runs for each model, is visualized using boxplots in Figure 7.
The boxplots in Figure 7 show that the MR-BoF models significantly outperform the FFNN baseline in the high-ratio upsampling scenarios (a, b). Notably, the single-stage MR-BoF 1 achieves this with approximately half the parameters of the FFNN, while the two-stage MR-BoF 2, with a comparable parameter count, demonstrates the strongest performance. In the less demanding scenarios with higher sampling frequencies (c, d), the performance of the models converges, with both the FFNN and the two-stage MR-BoF achieving low and comparable reconstruction errors. This indicates that the MR-BoF architecture provides a more robust solution overall, especially for challenging reconstruction tasks. To complement this error-based analysis, we next evaluate the models using the Pearson correlation coefficient, shown in Figure 8.
The Pearson correlation results in Figure 8 corroborate the findings from the MSE analysis. In the challenging upsampling cases (a, b), the MR-BoF models achieve median coefficients closer to 1 and with significantly less variance than the FFNN. For scenarios (c) and (d), all models perform quite well, with coefficients near unity. Taken together, the MSE and Pearson results indicate that the MR-BoF framework not only minimizes reconstruction error but also preserves the structural correlation of the time series more effectively, especially in demanding, low-resolution scenarios.
To facilitate a direct numerical comparison against the deterministic traditional algorithms, we selected the best-performing trial from the 10 independent runs for each neural network. The results of this head-to-head comparison are detailed in Table 1, which presents both Mean Squared Error and Pearson Correlation for all methods across the resampling scenarios.
The results in Table 1 confirm the superiority of the two-stage MR-BoF model in upsampling scenarios. For the most challenging 10 Hz case, it achieves an MSE of 0.682, surpassing the best traditional method (Polyphase FIR, 0.882). This trend continues for the 20 Hz and 50 Hz cases. In the downsampling scenario (500 Hz), however, the classic filter-based methods exhibit a marginally lower MSE. This suggests that while MR-BoF provides a dominant solution for resampling from sparse data, conventional filters remain highly effective for decimation tasks where sufficient data is available. To validate this further, we present four resampled signals in Figure 9.
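The traditional baselines in Table 1 correspond to standard SciPy routines (the paper reports SciPy 1.13.1 for these methods); a sketch for the 10 Hz to 100 Hz scenario, with a placeholder input signal, could look as follows:

```python
import numpy as np
from scipy import signal, interpolate

t10 = np.arange(0, 1, 1 / 10)        # 10 Hz input grid
t100 = np.arange(0, 1, 1 / 100)      # 100 Hz target grid
x10 = np.sin(2 * np.pi * 3 * t10)    # placeholder input signal

x_fft = signal.resample(x10, 100)            # FFT-based resampling
x_fir = signal.resample_poly(x10, 10, 1)     # polyphase FIR resampling
x_lin = np.interp(t100, t10, x10)            # linear interpolation
x_cub = interpolate.interp1d(t10, x10, kind="cubic",
                             fill_value="extrapolate")(t100)  # cubic
```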
The visual results in Figure 9 highlight the distinct behaviors of each resampling method. Linear interpolation inherently produces sharp, angular reconstructions, while cubic interpolation, although smoother, introduces oscillatory artifacts leading to significant overshoots and undershoots. The proposed MR-BoF model successfully avoids both of these issues, tracking the ground truth signal with more fidelity. This synthetic dataset is particularly insightful for such a comparison, as its well-defined composition allows for a targeted evaluation of how each model handles specific, known frequency components, such as seasonality. To quantify these observations, we perform a detailed error analysis on the first representative sample, shown in panel (a) of Figure 9. The results, presented in Figure 10, confirm the superiority of the proposed model by revealing its significantly lower error variance and spectral power in critical frequency bands.
The quantitative error analysis presented in Figure 10 confirms the superiority of the MR-BoF model. While a time-domain analysis shows that all methods are effectively unbiased, with a mean error close to zero, MR-BoF exhibits a significantly lower standard deviation than its counterparts, indicating a more consistently precise reconstruction.
However, the primary advantage of the MR-BoF approach is most evident in the frequency-domain analysis (Figure 10d). Within the critical seasonal frequency bands, MR-BoF achieves a drastic reduction in error power. In the 1–2 Hz band, its average spectral error is −14.0 dB, outperforming linear and cubic interpolation by more than 10 dB. This substantial advantage is mirrored in the 3–5 Hz band, where our model is again over 10.5 dB better than the baseline methods. It is crucial to note that a 10 dB improvement corresponds to a tenfold reduction in error power, underscoring the model’s advanced capability to faithfully reconstruct fundamental signal components where traditional methods fail.
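The band-wise spectral error reported above can be computed along the following lines; this is a sketch of the kind of analysis behind Figure 10d, with the periodogram normalization as our assumption:

```python
import numpy as np

def band_error_db(x_true, x_hat, fs, f_lo, f_hi):
    """Average spectral power of the reconstruction error within the
    band [f_lo, f_hi] Hz, expressed in dB."""
    err = x_hat - x_true
    freqs = np.fft.rfftfreq(len(err), d=1 / fs)
    power = np.abs(np.fft.rfft(err)) ** 2 / len(err)   # error periodogram
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return 10 * np.log10(power[band].mean())
```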
Similarly, we evaluate the performance of our approach on the real-world PJM Hourly Energy Consumption dataset. Unlike synthetic or controlled datasets, real-world data introduces challenges such as irregular sampling rates, diverse sensor noise, and environmental disturbances, making it a crucial benchmark for assessing the robustness of our method.
Since downsampling poses no significant challenge compared to upsampling, we focus our evaluation on the more demanding upsampling tasks using real-world datasets. Specifically, we examine four progressively difficult scenarios: from extremely sparse signals ($\mathbb{R}^{4}$) through intermediate sampling rates ($\mathbb{R}^{8}$, $\mathbb{R}^{12}$) to relatively dense signals ($\mathbb{R}^{42}$), all targeting a final sampling rate of $\mathbb{R}^{168}$. This range allows us to test our method's performance across varying degrees of signal sparsity, from near-critical sampling conditions to more favorable cases. Following the methodology of 10 independent training runs per model, the resulting distributions for MSE are presented in Figure 11.
The results from the PJM dataset confirm the effectiveness of the MR-BoF architecture for real-world applications. The MSE boxplots in Figure 11 show that the three-stage MR-BoF model (MRBF3) achieves a consistently lower median loss than the FFNN baseline across all four sparsity levels. The performance advantage is most pronounced in the extremely low-sampled scenario (a), underscoring the model’s strength in low-sampling-rate regimes. To complement this error-based analysis, we next examine the Pearson correlation coefficient results, shown in Figure 12.
The previous conclusion is reinforced by the Pearson correlation results in Figure 12, where the MR-BoF model again demonstrates superior performance. Across all four scenarios, the MR-BoF achieves a higher median coefficient than the FFNN, indicating a better reconstruction of the signal’s structural properties. A key observation relates to stability: while the MR-BoF exhibits high variance in the most challenging sparse scenarios (a, b), its performance becomes progressively more stable as the input signal density increases, achieving low variance in the densest case (d). Taken together, these results demonstrate the robustness of the MR-BoF framework for demanding, real-world upsampling tasks, confirming its ability to both minimize error and preserve signal correlation more effectively than a standard neural network baseline. To provide a direct numerical comparison for the PJM dataset, we selected the best-performing neural models from the 10 independent runs. The results of this comparison against the deterministic traditional algorithms are detailed in Table 2.
The results presented in Table 2 demonstrate how each approach adapts to real-world conditions. Consistent with our findings on the synthetic data, the MR-BoF model shows superior performance in the most critical scenarios involving very sparse inputs ($\mathbb{R}^{4}$ to $\mathbb{R}^{12}$), achieving both a significantly lower MSE and a higher Pearson coefficient than all baselines. However, in the high-density scenario ($\mathbb{R}^{42}$), where signal information is abundant, traditional frequency-domain methods like FFT-based resampling achieve the best reconstruction accuracy. This is likely due to their ability to effectively exploit the signal's spectral properties when a sufficient number of samples is available. Nevertheless, our approach remains competitive, yielding substantially better MSE and Pearson scores than cubic and linear interpolation.
To provide a qualitative illustration of these performance differences, Figure 13 visually compares the resampling results of three methods for the challenging $\mathbb{R}^{12} \to \mathbb{R}^{168}$ scenario.
The BoF approach demonstrates clear advantages in handling this complex, real-world dataset. While linear interpolation produces abrupt, piecewise transitions that distort consumption patterns, and cubic interpolation creates unrealistic fluctuations in energy trends, the BoF method maintains smoother, more physically plausible trajectories. It better captures the underlying consumption behavior without introducing artificial peaks or troughs.
To test the generalization of our method on time series with different characteristics, we next evaluate it on the Electricity Transformer Temperature dataset. Following the same experimental protocol of 10 independent runs, the performance distributions for MSE and Pearson correlation are presented in Figure 14 and Figure 15, respectively.
The results on the Electricity Transformer Temperature dataset highlight the adaptability of the MR-BoF framework. The MSE analysis in Figure 14 confirms that our model is the superior choice for minimizing reconstruction error in the critical scenarios (a, b, c). While the simpler FFNN baseline is highly competitive in the highest-resolution case (d) and across the Pearson correlation results (Figure 15), this dynamic provides a key insight. It demonstrates that the MR-BoF’s performance remains robust and comparable to standard deep learning baselines even on datasets where its specialized inductive biases may be less critical.
Building on the insights from the boxplots, Table 3 provides a direct numerical comparison, benchmarking the best-performing trial from each neural network against the deterministic traditional algorithms.
Mirroring our previous results, the MR-BoF approach again shows dominant performance on the Electricity Transformer Temperature dataset for sparse sampling regimes ($\mathbb{R}^{4}$–$\mathbb{R}^{12}$), overcoming the signal reconstruction challenges that limit conventional methods in low-data conditions. Conversely, in energy systems operating at high sampling densities ($\mathbb{R}^{42}$), spectral methods maintain their advantage. This consistent picture of our core findings across domains confirms the BoF method's generalizability for sparse time series reconstruction. Figure 16 visually compares the reconstruction quality of our MR-BoF method against two baseline approaches (linear interpolation and cubic spline) for upsampling from $\mathbb{R}^{12}$ to $\mathbb{R}^{168}$.
The visual results in Figure 16 confirm the superiority of the BoF approach in handling this complex, real-world dataset. While linear interpolation produces abrupt, piecewise transitions that distort consumption patterns, and cubic interpolation creates unrealistic fluctuations, the BoF method maintains smoother, more physically plausible trajectories. It better captures the underlying consumption behavior without introducing artificial peaks or troughs, which is crucial for downstream tasks such as thermal monitoring and predictive maintenance.
As a final validation, we assess the framework’s performance on the Thermal Power Prediction dataset, which contains heat demand data from district heating systems in Germany. This evaluation introduces distinct real-world complexities, including strong daily variability and abrupt demand shifts caused by weather fronts, providing a challenging test for the model’s adaptability. First, we examine the stability and performance of the neural networks using the MSE metric, with the results from 10 independent runs shown in Figure 17.
The MSE results in Figure 17 show a clear performance advantage for the MR-BoF architecture on this dataset. In all four scenarios, the MR-BoF achieves a significantly lower median loss than the FFNN. Notably, the entire distribution of the MR-BoF’s performance is superior; its 75th percentile for loss is consistently lower than the FFNN’s 25th percentile. This indicates that even the worst-performing MR-BoF trials yielded more accurate results than the best-performing FFNN trials, confirming the framework’s robust advantage for this task. These findings are corroborated by the Pearson correlation analysis shown in Figure 18.
The Pearson correlation analysis in Figure 18 reinforces the superiority of the MR-BoF model. Across all four scenarios, the MR-BoF consistently achieves a substantially higher median Pearson coefficient, indicating a far better reconstruction of the signal’s structural properties. Based on the preceding analysis, we selected the best trial for the MR-BoF model to benchmark it against the traditional algorithms. These results are detailed in Table 4.
The results demonstrate MR-BoF's superior performance in resampling thermal power data, particularly for low-resolution inputs ($\mathbb{R}^{4}$–$\mathbb{R}^{12}$), where it achieves significantly lower MSE (0.802–0.488) and higher correlation (up to 0.614) than conventional methods. While filter-based approaches perform well at high resolution ($\mathbb{R}^{42}$), MR-BoF maintains competitive accuracy while excelling where it matters most: the challenging low-resolution cases. Traditional linear and cubic interpolation show poor correlation and higher errors, especially their extended variants. Fourier methods, though reasonable at high resolution, fail to match MR-BoF's performance on sparse data. This advantage stems from MR-BoF's multi-scale feature learning capability, which captures the complex temporal dynamics better than fixed mathematical interpolations or frequency-domain approaches. Figure 19 visually compares the upsampling ($\mathbb{R}^{12} \to \mathbb{R}^{168}$) results of our method against two baseline approaches.
These visualizations confirm the quantitative results, showing how the MR-BoF method maintains smoother, more plausible trajectories that better capture the underlying demand behavior without the jagged segments or unrealistic fluctuations produced by traditional methods. This analysis highlights how the optimal resampling strategy depends on the signal’s sampling density, with our MR-BoF method proving particularly valuable for challenging, low-resolution reconstruction tasks.

4.5. Discussion and Limitations

The proposed Multi-Resolution Bag of Functions framework provides a novel and effective approach to modeling time series with heterogeneous sampling rates. To comprehensively evaluate the approach, we discuss challenges and limitations, which in turn delineate directions for future research.
A foundational challenge lies in the selection of an appropriate basis function set. In scenarios where domain-specific knowledge is scarce, the choice of functions may become arbitrary, potentially leading to a sub-optimal representation that compromises model efficiency. However, we also note that the choice of seasonality, trend, and event functions is quite straightforward and typically requires little domain-specific knowledge.
Further, we discuss the framework’s computational cost. Using the synthetic dataset detailed in Section 4.1, we benchmarked the execution time of the same models whose performance was previously evaluated in Table 1. The results of this computational analysis are summarized in Table 5.
All performance benchmarks were conducted on a workstation equipped with an Intel Core i7-11700 CPU, 64 GB of RAM, and an NVIDIA RTX 5000 GPU. The software stack included PyTorch 1.12.1+cu113 for neural models and SciPy 1.13.1 for traditional methods. As noted in Table 5, neural models utilized GPU acceleration, while DSP algorithms were executed on the CPU. Each reported value is the average of 10 independent runs on fixed-length signals with a batch size of one.
As expected, the benchmark results highlight the significant computational cost of the MR-BoF architecture compared to conventional methods. Specifically, the single-stage MR-BoF 1 is much slower than the highly efficient FFT-based resampling (approx. 940 ms vs. 20 ms). This contrasts sharply with traditional DSP algorithms, whose performance is either consistently fast (cubic interpolation) or predictably dependent on signal properties such as the resampling ratio (FIR filter). However, the higher computational cost does not restrict the framework's practical usability.
A particularly instructive comparison can be made with the FFNN, which has a comparable number of trainable parameters. The FFNN is faster than our models due to the highly optimized matrix multiplications on modern deep learning hardware. In contrast, the MR-BoF's processing time has not yet been optimized and is dominated by the evaluation of basis functions. Hence, while the challenge of basis function selection and the associated computational costs are important considerations, they also define clear pathways for optimization. Potential strategies include developing methods for sparse basis function evaluation, model quantization, or knowledge distillation.

5. Conclusions

In this work, we presented a novel Bag of Functions framework for the adaptive decomposition and reconstruction of time series data with heterogeneous sampling rates. By transforming signals into a continuous functional representation, our approach overcomes key limitations of traditional interpolation and filtering methods, eliminating spectral distortions while enabling flexible handling of sparse and dense sampling regimes. Our evaluation demonstrates three key advances: (1) the BoF model achieves lower MSE and higher Pearson correlation than conventional methods in critical sparse-sampling scenarios, (2) validation across diverse datasets confirms consistent performance under diverse real-world conditions, and (3) the staged learning architecture yields lower output variance, proving especially valuable for industrial applications requiring reliable sparse-data recovery. Overall, this work introduces a sampling-rate-agnostic framework for time series modeling, offering a new paradigm with broad applicability in smart infrastructure, IoT systems, and beyond.

Author Contributions

Conceptualization, D.O.S.T. and D.A.; Data curation, D.O.S.T.; Formal analysis, D.O.S.T. and D.A.; Investigation, D.O.S.T. and D.A.; Methodology, D.O.S.T. and D.A.; Software, D.O.S.T. and D.A.; Supervision, A.S.; Validation, D.O.S.T. and D.A.; Visualization, D.O.S.T. and D.A.; Writing—original draft, D.O.S.T.; Writing—review and editing, D.O.S.T., D.A., and A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This article was funded by the Open Access Publication Fund of South Westphalia University of Applied Sciences.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in PJM Data Miner 2 at https://dataminer2.pjm.com/feed/hrl_load_metered/definition (accessed on 10 January 2025) (PJM Hourly Energy Consumption) and in the ETDataset GitHub Repository at https://github.com/zhouhaoyi/ETDataset (accessed on 10 January 2025) (Electricity Transformer Temperature).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Tayarani-Najaran, M.H.; Schmuker, M. Event-based sensing and signal processing in the visual, auditory, and olfactory domain: A review. Front. Neural Circuits 2021, 15, 610446.
2. Altinses, D.; Schwung, A. Performance benchmarking of multimodal data-driven approaches in industrial settings. Mach. Learn. Appl. 2025, 21, 100691.
3. Yu, C.H. Resampling methods: Concepts, applications, and justification. Pract. Assess. Res. Eval. 2019, 8, 19.
4. Li, T.; Bolic, M.; Djuric, P.M. Resampling methods for particle filtering: Classification, implementation, and strategies. IEEE Signal Process. Mag. 2015, 32, 70–86.
5. Klopries, H.; Schwung, A. ITF-GAN: Synthetic time series dataset generation and manipulation by interpretable features. Knowl.-Based Syst. 2024, 283, 111131.
6. Klopries, H.; Schwung, A. ITF-VAE: Variational Auto-Encoder using interpretable continuous time series features. IEEE Trans. Artif. Intell. 2025, 6, 2314–2326.
7. Li, W.; Jiang, X. Prediction of air pollutant concentrations based on TCN-BiLSTM-DMAttention with STL decomposition. Sci. Rep. 2023, 13, 4665.
8. Su, Y.; Feng, L.; Li, J.; Zhang, X.; Yang, Y. Analysis of watershed terrestrial water storage anomalies by Bi-LSTM with X-11 time series prediction combined model. Geosci. J. 2024, 28, 941–958.
9. Mélard, G. On some remarks about SEATS signal extraction. SERIEs 2016, 7, 53–98.
10. Srivardhan, V. Stratigraphic correlation of wells using discrete wavelet transform with Fourier transform and multi-scale analysis. Geomech. Geophys. Geo-Energy Geo-Resour. 2016, 2, 137–150.
11. Keiner, J.; Waterhouse, B.J. Fast principal components analysis method for finance problems with unequal time steps. In Monte Carlo and Quasi-Monte Carlo Methods 2008; Springer: Berlin/Heidelberg, Germany, 2009; pp. 455–465.
12. Rhif, M.; Ben Abbes, A.; Farah, I.R.; Martínez, B.; Sang, Y. Wavelet transform application for/in non-stationary time-series analysis: A review. Appl. Sci. 2019, 9, 1345.
13. Quinn, A.J.; Lopes-dos Santos, V.; Dupret, D.; Nobre, A.C.; Woolrich, M.W. EMD: Empirical mode decomposition and Hilbert-Huang spectral analyses in Python. J. Open Source Softw. 2021, 6, 2977.
14. Fang, Y.; Guan, B.; Wu, S.; Heravi, S. Optimal forecast combination based on ensemble empirical mode decomposition for agricultural commodity futures prices. J. Forecast. 2020, 39, 877–886.
15. Li, Y.; Xu, C.; Yi, L.; Fang, R. A data-driven approach for denoising GNSS position time series. J. Geod. 2018, 92, 905–922.
16. Dong, M.; Sun, J. Partial discharge detection on aerial covered conductors using time-series decomposition and long short-term memory network. Electr. Power Syst. Res. 2020, 184, 106318.
17. Amalou, I.; Mouhni, N.; Abdali, A. CNN-LSTM architectures for non-stationary time series: Decomposition approach. In Proceedings of the 2024 International Conference on Global Aeronautical Engineering and Satellite Technology (GAST), Marrakesh, Morocco, 24–26 April 2024; pp. 1–5.
18. Cao, J.; Li, Z.; Li, J. Financial time series forecasting model based on CEEMDAN and LSTM. Phys. A Stat. Mech. Its Appl. 2019, 519, 127–139.
19. Niu, H.; Xu, K.; Wang, W. A hybrid stock price index forecasting model based on variational mode decomposition and LSTM network. Appl. Intell. 2020, 50, 4296–4309.
20. Arslan, S. A hybrid forecasting model using LSTM and Prophet for energy consumption with decomposition of time series data. PeerJ Comput. Sci. 2022, 8, e1001.
21. Tüske, Z.; Schlüter, R.; Ney, H. Acoustic modeling of speech waveform based on multi-resolution, neural network signal processing. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 4859–4863.
22. Mangalraj, P.; Sivakumar, V.; Karthick, S.; Haribaabu, V.; Ramraj, S.; Samuel, D.J. A review of multi-resolution analysis (MRA) and multi-geometric analysis (MGA) tools used in the fusion of remote sensing images. Circuits Syst. Signal Process. 2020, 39, 3145–3172.
23. Yan, K.; Long, C.; Wu, H.; Wen, Z. Multi-resolution expansion of analysis in time-frequency domain for time series forecasting. IEEE Trans. Knowl. Data Eng. 2024, 36, 6667–6680.
24. Agrawal, N.; Kumar, A.; Bajaj, V.; Singh, G.K. Design of bandpass and bandstop infinite impulse response filters using fractional derivative. IEEE Trans. Ind. Electron. 2018, 66, 1285–1295.
25. Oh, C.; Han, S.; Jeong, J. Time-series data augmentation based on interpolation. Procedia Comput. Sci. 2020, 175, 64–71.
26. Svilainis, L. Review on time delay estimate subsample interpolation in frequency domain. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 2019, 66, 1691–1698.
27. Luo, W.; Li, Y.; Yao, F.; Wang, S.; Li, Z.; Zhan, P.; Li, X. Multi-resolution representation for streaming time series retrieval. Int. J. Pattern Recognit. Artif. Intell. 2021, 35, 2150019.
28. Torres, D.O.S.; Altinses, D.; Schwung, A. Data imputation techniques using the Bag of Functions: Addressing variable input lengths and missing data in time series decomposition. In Proceedings of the 2025 IEEE International Conference on Industrial Technology (ICIT), Wuhan, China, 26–28 March 2025; pp. 1–7.
Figure 1. The residual Bag of Functions architecture [28].
Figure 2. The Bag of Functions core block [28].
Figure 3. Two samples from the time series dataset with different sampling rates. The continuous light-colored (blue and yellow) signals show the original time series sampled at 2 kHz. The markers, connected by dashed lines, represent the signal downsampled to different frequencies: red (5 Hz), blue (10 Hz), green (20 Hz), and black (50 Hz).
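The multi-rate views in Figure 3 can, in principle, be reproduced by sampling a dense reference signal on coarser time grids. The following minimal Python sketch illustrates this; the two-tone test signal and the helper name sample_at are illustrative assumptions, not taken from the experiment code.

```python
# Illustrative sketch: building multi-rate views of a dense 2 kHz reference
# signal by picking the nearest reference sample on each coarser time grid.
import numpy as np

fs_ref = 2000                                    # reference rate (Hz)
t_ref = np.arange(0, 2.0, 1.0 / fs_ref)          # 2 s time axis
x_ref = np.sin(2 * np.pi * 1.5 * t_ref) + 0.3 * np.sin(2 * np.pi * 6.0 * t_ref)

def sample_at(fs_target, t_ref, x_ref, fs_ref):
    """Return the target-rate time grid and the nearest reference samples."""
    t_target = np.arange(t_ref[0], t_ref[-1], 1.0 / fs_target)
    idx = np.round(t_target * fs_ref).astype(int)
    return t_target, x_ref[idx]

multi_rate = {fs: sample_at(fs, t_ref, x_ref, fs_ref) for fs in (5, 10, 20, 50)}
for fs, (t, x) in multi_rate.items():
    print(f"{fs:>3} Hz -> {len(x)} samples")
```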
Figure 4. Two samples from the PJM Hourly Energy Consumption dataset with different sampling rates. The time axes represent one week. The continuous black line is the original high-resolution signal (w_i ∈ R^168). The red, green, and blue markers represent the signal downsampled to resolutions of 4, 8, and 12 points per week, respectively.
Figure 5. Two samples from the Electricity Transformer Temperature dataset with different sampling rates. The time axes represent one week. The continuous black line is the original high-resolution signal (w_i ∈ R^168). The red, green, and blue markers represent the signal downsampled to resolutions of 4, 8, and 12 points per week, respectively.
Figure 6. One sample from the Thermal Power Prediction dataset with different sampling rates. The time axis represents one week. The continuous black line is the original high-resolution signal (w_i ∈ R^672). The red, green, and blue markers represent the signal downsampled to resolutions of 4, 8, and 12 points per week, respectively.
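Figures 4–6 rely on reducing a weekly window to m evenly spaced points. A minimal sketch of such a reduction follows; the uniform-index selection and the random stand-in series are assumptions for illustration only.

```python
# Hedged sketch: reduce a length-168 (hourly, one-week) window to m evenly
# spaced points, as in the downsampled markers of Figures 4-6.
import numpy as np

def downsample_week(w, m):
    """Keep m evenly spaced samples from a weekly window w."""
    idx = np.linspace(0, len(w) - 1, num=m).round().astype(int)
    return idx, w[idx]

rng = np.random.default_rng(0)
w = np.cumsum(rng.standard_normal(168))   # stand-in for one weekly series
for m in (4, 8, 12, 42):
    idx, w_m = downsample_week(w, m)
    print(f"m={m:>2}: {len(w_m)} points, first indices {idx[:4]}")
```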
Figure 7. Distribution of the final MSE loss on the test set over 10 independent runs for the neural network models. Within each box, the solid orange line and dashed green line denote the median and the mean, respectively. The boxplots compare the performance of the FFNN, single-stage MR-BoF (MRBF1), and two-stage MR-BoF (MRBF2) across four scenarios: (a) 10 Hz to 100 Hz. (b) 20 Hz to 100 Hz. (c) 50 Hz to 100 Hz. (d) 500 Hz to 100 Hz.
Figure 8. Distribution of the Pearson coefficient on the test set over 10 independent runs. Within each box, the solid orange and dashed green lines denote the median and the mean, respectively. The results are shown across four scenarios: (a) 10 Hz to 100 Hz. (b) 20 Hz to 100 Hz. (c) 50 Hz to 100 Hz. (d) 500 Hz to 100 Hz.
Figure 9. Resampling performance comparison for the synthetic dataset: our proposed BoF approach (red) versus linear (blue) and cubic (yellow) interpolation methods for upsampling an input signal with a sampling rate of 10 Hz (green points) to a target sampling rate of 100 Hz (dashed black line). Subfigures (a–d) show four different samples of this process.
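The linear and cubic baselines in Figure 9 correspond to standard one-dimensional interpolation onto the denser target grid. A minimal SciPy sketch is given below; the test signal is illustrative, while interp1d with kind="linear" and kind="cubic" is the usual SciPy interface for these baselines.

```python
# Sketch of the classical baselines (not the MR-BoF model itself): linear and
# cubic interpolation of a 10 Hz input onto a 100 Hz target grid.
import numpy as np
from scipy.interpolate import interp1d

t_in = np.arange(0, 1.0, 1 / 10)                 # 10 Hz observations
x_in = np.sin(2 * np.pi * 1.5 * t_in)
t_out = np.arange(0, t_in[-1], 1 / 100)          # 100 Hz target grid (no extrapolation)

x_lin = interp1d(t_in, x_in, kind="linear")(t_out)
x_cub = interp1d(t_in, x_in, kind="cubic")(t_out)
```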
Figure 10. Comparative analysis of the reconstruction error. (a–c) Time-domain error distributions for the MR-BoF model (μ = 0.12, σ = 1.03), linear interpolation (μ = 0.13, σ = 1.98), and cubic interpolation (μ = 0.07, σ = 2.02). The red dashed line indicates the mean error. (d) Power Spectral Density (PSD) comparison of the error for the resampling methods MR-BoF (red), linear (blue), and cubic (yellow). Shaded regions highlight the primary seasonal frequency bands (0–2 Hz and 5–8 Hz).
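The error statistics and the PSD panel in Figure 10 can be derived from the pointwise reconstruction error. The sketch below shows one way to compute them with Welch's method; the sampling rate and segment length are illustrative choices, not parameters reported here.

```python
# Sketch of the error analysis behind Figure 10: per-sample error statistics
# and a Welch PSD of the error signal.
import numpy as np
from scipy.signal import welch

def error_report(x_hat, x_ref, fs=100.0):
    """Return mean, std, and Welch PSD (f, Pxx) of the reconstruction error."""
    err = x_hat - x_ref
    f, pxx = welch(err, fs=fs, nperseg=min(256, len(err)))
    return err.mean(), err.std(), f, pxx
```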
Figure 11. Distribution of the final MSE loss on the PJM Hourly Energy Consumption dataset over 10 independent runs of the neural network models. Within each box, the solid orange line and dashed green line denote the median and the mean, respectively. The boxplots compare the performance of the FFNN and the three-stage MR-BoF (MRBF3) across four scenarios: (a) R^4 → R^168. (b) R^8 → R^168. (c) R^12 → R^168. (d) R^42 → R^168.
Figure 12. Distribution of the Pearson coefficient on the PJM Hourly Energy Consumption dataset over 10 independent runs of the neural network models. Within each box, the solid orange line and dashed green line denote the median and the mean, respectively. The boxplots compare the performance of the FFNN and the three-stage MR-BoF (MRBF3) across four scenarios: (a) R^4 → R^168. (b) R^8 → R^168. (c) R^12 → R^168. (d) R^42 → R^168.
Figure 13. Resampling performance comparison for the PJM Hourly Energy Consumption dataset: our proposed BoF approach (red) versus linear (blue) and cubic (yellow) interpolation methods for upsampling an R^12 input signal (green points) to the target R^168 resolution (dashed black line). Subfigures (a–d) show four different samples of this process.
Figure 14. Distribution of the final MSE loss on the Electricity Transformer Temperature dataset over 10 independent runs of the neural network models. Within each box, the solid orange line and dashed green line denote the median and the mean, respectively. The boxplots compare the performance of the FFNN and the three-stage MR-BoF (MRBF3) across four scenarios: (a) R^4 → R^168. (b) R^8 → R^168. (c) R^12 → R^168. (d) R^42 → R^168.
Figure 15. Distribution of the Pearson coefficient on the Electricity Transformer Temperature dataset over 10 independent runs of the neural network models. Within each box, the solid orange line and dashed green line denote the median and the mean, respectively. The boxplots compare the performance of the FFNN and the three-stage MR-BoF (MRBF3) across four scenarios: (a) R^4 → R^168. (b) R^8 → R^168. (c) R^12 → R^168. (d) R^42 → R^168.
Figure 16. Resampling performance comparison for the Electricity Transformer Temperature dataset: our proposed BoF approach (red) versus linear (blue) and cubic (yellow) interpolation methods for upsampling an R^12 input signal (green points) to the target R^168 resolution (dashed black line). Subfigures (a–d) show four different samples of this process.
Figure 17. Distribution of the final MSE loss on the Thermal Power Prediction dataset over 10 independent runs of the neural network models. Within each box, the solid orange line and dashed green line denote the median and the mean, respectively. The boxplots compare the performance of the FFNN and the three-stage MR-BoF (MRBF3) across four scenarios: (a) R^4 → R^168. (b) R^8 → R^168. (c) R^12 → R^168. (d) R^42 → R^168.
Figure 18. Distribution of the Pearson coefficient on the Thermal Power Prediction dataset over 10 independent runs of the neural network models. Within each box, the solid orange line and dashed green line denote the median and the mean, respectively. The boxplots compare the performance of the FFNN and the three-stage MR-BoF (MRBF3) across four scenarios: (a) R^4 → R^168. (b) R^8 → R^168. (c) R^12 → R^168. (d) R^42 → R^168.
Figure 19. Resampling performance comparison for the Thermal Power Prediction dataset: our proposed BoF approach (red) versus linear (blue) and cubic (yellow) interpolation methods for upsampling an R^12 input signal (green points) to the target R^168 resolution (dashed black line). Subfigures (a–d) show four different samples of this process.
Table 1. Comparison of the performance of resampling methods from different sampling frequencies up to the target sampling frequency of 100 Hz using MSE ↓ and Pearson Correlation ↑ on the synthetic dataset. The best results for each metric are shown in bold.

Method             | MSE ↓                           | Pearson ↑
                   | 10 Hz   20 Hz   50 Hz   500 Hz  | 10 Hz   20 Hz   50 Hz   500 Hz
Linear Interpol.   | 4.450   1.336   0.224   0.189   | 0.731   0.925   0.987   0.989
Cubic Interpol.    | 5.553   1.440   0.242   0.206   | 0.692   0.922   0.986   0.988
FFT-based          | 1.531   0.526   0.244   0.142   | 0.918   0.972   0.986   0.992
Polyphase FIR      | 0.882   0.357   0.205   0.118   | 0.952   0.980   0.988   0.993
FIR Filter         | 0.933   0.349   0.200   0.117   | 0.949   0.981   0.989   0.993
Sinc Filter        | 1.086   0.348   0.201   0.117   | 0.940   0.981   0.989   0.993
FFNN               | 2.371   1.207   0.188   0.157   | 0.913   0.956   0.990   0.991
MR-BoF 1 Stage     | 1.180   0.638   0.442   0.423   | 0.939   0.967   0.977   0.978
MR-BoF 2 Stage     | 0.682   0.327   0.175   0.183   | 0.965   0.984   0.991   0.990
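For reference, the two metrics reported in Tables 1–4 follow their standard definitions. The sketch below is a plain NumPy rendering of those definitions, not the evaluation script itself.

```python
# Evaluation metrics: mean squared error (lower is better) and Pearson
# correlation (higher is better) between a resampled signal and the reference.
import numpy as np

def mse(x_hat, x_ref):
    return float(np.mean((x_hat - x_ref) ** 2))

def pearson(x_hat, x_ref):
    return float(np.corrcoef(x_hat, x_ref)[0, 1])
```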
Table 2. Comparison of the performance of resampling methods from different input resolutions up to the target, R^{m_k} → R^168 with m_k ∈ {4, 8, 12, 42}, using MSE ↓ and Pearson Correlation ↑ on the PJM Hourly Energy Consumption dataset. The best results for each metric are shown in bold.

Method                 | MSE ↓                         | Pearson ↑
                       | R^4    R^8    R^12   R^42     | R^4    R^8    R^12   R^42
Linear Interpol.       | 1.352  1.345  0.939  0.186    | 0.070  0.116  0.308  0.879
Linear Interpol. ext.  | 1.569  1.299  0.895  0.056    | 0.153  0.216  0.432  0.963
Cubic Interpol.        | 1.598  1.434  1.114  0.201    | 0.077  0.122  0.292  0.874
Cubic Interpol. ext.   | 3.753  1.809  5.537  0.131    | 0.191  0.197  0.121  0.919
FFT-based              | 1.365  1.345  1.233  0.026    | 0.160  0.186  0.297  0.983
Polyphase FIR          | 1.263  1.295  1.129  0.033    | 0.195  0.199  0.325  0.978
FIR Filter             | 1.255  1.289  0.984  0.032    | 0.193  0.199  0.376  0.978
Sinc Filter            | 1.247  1.290  1.112  0.033    | 0.192  0.196  0.312  0.978
FFNN                   | 0.979  0.816  0.611  0.131    | 0.471  0.587  0.736  0.941
MR-BoF 3 Stages        | 0.451  0.370  0.160  0.072    | 0.724  0.781  0.911  0.957
Table 3. Comparison of the performance of resampling methods from different input resolutions up to the target, R^{m_k} → R^168 with m_k ∈ {4, 8, 12, 42}, using MSE ↓ and Pearson Correlation ↑ on the Electricity Transformer Temperature dataset. The best results for each metric are shown in bold.

Method                 | MSE ↓                         | Pearson ↑
                       | R^4    R^8    R^12   R^42     | R^4    R^8    R^12   R^42
Linear Interpol.       | 0.736  0.760  0.680  0.237    | 0.076  0.069  0.181  0.736
Linear Interpol. ext.  | 1.372  0.936  0.540  0.235    | 0.039  0.159  0.349  0.754
Cubic Interpol.        | 0.729  0.872  0.803  0.264    | 0.108  0.058  0.166  0.727
Cubic Interpol. ext.   | 4.046  3.304  2.710  0.305    | 0.032  0.110  0.169  0.723
FFT-based              | 0.924  0.801  0.695  0.223    | 0.061  0.179  0.274  0.786
Polyphase FIR          | 0.885  0.745  0.643  0.217    | 0.084  0.211  0.305  0.789
FIR Filter             | 0.867  0.738  0.617  0.214    | 0.085  0.208  0.318  0.790
Sinc Filter            | 0.847  0.737  0.639  0.217    | 0.079  0.193  0.289  0.786
FFNN                   | 0.702  0.639  0.547  0.270    | 0.345  0.415  0.515  0.760
MR-BoF 3 Stages        | 0.606  0.532  0.424  0.317    | 0.320  0.378  0.534  0.667
Table 4. Comparison of the performance of resampling methods from different input resolutions up to the target, R^{m_k} → R^168 with m_k ∈ {4, 8, 12, 42}, using MSE ↓ and Pearson Correlation ↑ on the Thermal Power Prediction dataset. The best results for each metric are shown in bold.

Method                 | MSE ↓                         | Pearson ↑
                       | R^4    R^8    R^12   R^42     | R^4    R^8    R^12   R^42
Linear Interpol.       | 1.021  1.021  0.939  0.497    | 0.145  0.205  0.255  0.623
Linear Interpol. ext.  | 1.352  1.068  0.909  0.285    | 0.182  0.247  0.340  0.796
Cubic Interpol.        | 1.221  1.170  1.075  0.567    | 0.127  0.192  0.243  0.607
Cubic Interpol. ext.   | 5.484  2.107  2.578  0.570    | 0.115  0.201  0.165  0.721
FFT-based              | 1.105  1.128  1.094  0.295    | 0.186  0.222  0.258  0.804
Polyphase FIR          | 1.055  1.087  1.038  0.287    | 0.212  0.235  0.278  0.807
FIR Filter             | 1.036  1.076  0.991  0.280    | 0.214  0.236  0.297  0.811
Sinc Filter            | 1.001  1.072  1.020  0.290    | 0.217  0.235  0.273  0.803
FFNN                   | 0.965  0.937  0.878  0.600    | 0.168  0.239  0.300  0.572
MR-BoF 3 Stages        | 0.802  0.667  0.488  0.282    | 0.285  0.422  0.614  0.797
Table 5. Mean execution time (ms) and standard deviation for resampling signals to 100 Hz. The proposed MR-BoF models are compared against traditional DSP algorithms and a feed-forward neural network-based baseline. GPU-accelerated methods are marked with an asterisk (*).

Input f (Hz) | Cubic    | FFT     | FIR       | FFNN *     | MR-BoF 1 * | MR-BoF 2 *
10 Hz        | 60 ± 4   | 20 ± 1  | 290 ± 20  | 170 ± 220  | 940 ± 227  | 1780 ± 216
20 Hz        | 50 ± 2   | 20 ± 1  | 170 ± 13  | 160 ± 225  | 950 ± 216  | 1790 ± 224
50 Hz        | 50 ± 1   | 20 ± 1  | 100 ± 11  | 160 ± 214  | 940 ± 227  | 1770 ± 227
500 Hz       | 50 ± 4   | 20 ± 1  | 70 ± 9    | 160 ± 224  | 950 ± 219  | 1770 ± 225
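Timings of this kind are typically obtained from repeated wall-clock measurements around each resampling call. The sketch below outlines one such procedure; the repeat count and the note on GPU synchronization are assumptions about the measurement setup, not details reported above.

```python
# Hedged sketch of a Table 5-style timing loop for any resampling callable.
import time
import numpy as np

def time_resampler(resample_fn, x, n_runs=100):
    """Return mean and std execution time in milliseconds over n_runs calls."""
    times = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        resample_fn(x)
        # For GPU-backed models one would also synchronize here (e.g.
        # torch.cuda.synchronize()) before reading the clock.
        times.append((time.perf_counter() - t0) * 1e3)
    return float(np.mean(times)), float(np.std(times))
```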