# Decompose and Conquer: Time Series Forecasting with Multiseasonal Trend Decomposition Using Loess


## Abstract


## 1. Introduction

**Iterative Multi-Step (IMS)**: Forecasts are made incrementally, one time step at a time, across the forecast horizon. A single forecast is generated, after which the look-back window is shifted forward by one step to incorporate the newly created prediction; this process repeats until the horizon is covered.

**Direct Multi-Step (DMS)**: In contrast to the IMS method, this approach generates forecast values for the entire horizon in a single computational pass, producing all required forecasts simultaneously.
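The contrast between the two strategies can be sketched in a few lines of Python (a toy illustration, not the paper's implementation; `model` and `models` stand in for any learned functions mapping a look-back window to a one-step forecast):

```python
def forecast_ims(model, window, horizon):
    """Iterative multi-step: predict one step, slide the look-back
    window over the new prediction, and repeat. A single function
    f(x) is reused for every step of the horizon."""
    window = list(window)
    preds = []
    for _ in range(horizon):
        yhat = model(window)
        preds.append(yhat)
        window = window[1:] + [yhat]  # feed the prediction back in
    return preds


def forecast_dms(models, window, horizon):
    """Direct multi-step: a dedicated function f_h(x) per horizon
    step, all applied to the same look-back window at once."""
    return [models[h](window) for h in range(horizon)]
```

With an averaging "model", `forecast_ims(lambda w: sum(w) / len(w), [1.0, 2.0, 3.0], 2)` first predicts 2.0 and then forecasts from the shifted window `[2.0, 3.0, 2.0]`, illustrating how IMS errors can compound while DMS avoids feedback at the cost of learning one function per step.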

- We propose a novel forecasting approach that decomposes time series data into their fundamental components and addresses each component separately.
- We assess the model's performance on real-world time series datasets from various fields, demonstrating the enhanced accuracy of the proposed method, and further show that the improvement over the state of the art is statistically significant.
- We design and implement a synthetic-data-generation process to further evaluate the effectiveness of the proposed model in the presence of different seasonal components.
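As a rough illustration of the kind of synthetic-data-generation process referred to above (the actual procedure is described in Section 4.1; the periods, amplitudes, and noise level below are arbitrary choices of ours), a series with a linear trend and two sinusoidal seasonal components might be produced as follows:

```python
import numpy as np

def make_synthetic_series(n=1000, periods=(24, 168), amplitudes=(1.0, 0.5),
                          trend_slope=0.01, noise_std=0.1, seed=0):
    """Linear trend + sum of sinusoidal seasonal components + Gaussian
    noise. Periods 24 and 168 mimic daily and weekly cycles in hourly
    data; any number of seasonal components can be superimposed."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    y = trend_slope * t
    for period, amp in zip(periods, amplitudes):
        y = y + amp * np.sin(2 * np.pi * t / period)
    return y + rng.normal(0.0, noise_std, n)
```

Varying the number of seasonal terms and their periods is what allows a controlled study of how a model copes with multiple seasonalities.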

## 2. Related Work

#### 2.1. Classical Time Series Models

#### 2.2. Machine Learning Models

#### 2.3. Deep Neural Networks

#### 2.4. Transformer-Based Architectures

#### 2.5. Time Series Decomposition
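Since decomposition is central to the approach, it may help to sketch the classical additive variant that STL, MSTL, and the other methods surveyed here refine (a minimal numpy sketch under an assumed single period `period`, not any paper's reference implementation):

```python
import numpy as np

def classical_additive_decompose(y, period):
    """Split y into trend, seasonal, and residual components.

    Trend: centered moving average of window `period` (with half
    weights at the ends for even periods); seasonal: average
    detrended value per position in the cycle, forced to sum to
    zero; residual: what remains. Edges where the moving average
    is undefined are left as NaN."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    if period % 2 == 0:
        w = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        w = np.ones(period) / period
    trend = np.convolve(y, w, mode="same")
    half = len(w) // 2
    trend[:half] = np.nan
    trend[n - half:] = np.nan
    detrended = y - trend
    seasonal = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    seasonal -= seasonal.mean()            # zero-mean seasonal component
    seasonal = np.resize(seasonal, n)      # tile the pattern across the series
    resid = y - trend - seasonal
    return trend, seasonal, resid
```

Loess-based methods replace the moving average and seasonal averages with locally weighted regressions, which is what makes STL robust to outliers and evolving seasonality; MSTL applies the same idea iteratively for multiple seasonal periods.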

## 3. Problem Formulation

## 4. Methodology

#### 4.1. Synthetic Data Generation

#### 4.2. Implementation Details

## 5. Results

#### 5.1. Datasets

**Electricity** (https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014 (accessed on 9 December 2023)): Records the hourly electricity usage of 321 customers over a span of three years.

**ETT** [21]: Four datasets that include the target value "oil temperature" along with six power load attributes. ETTm1 and ETTm2 were recorded every 15 min, while ETTh1 and ETTh2 were logged hourly, from 2016 to 2018.

**Exchange** [34]: Features daily exchange rate data across eight countries from 1990 to 2010.

**ILI** (https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html (accessed on 9 December 2023)): Documents the weekly ratio of patients with influenza-like symptoms to the overall patient count, as reported by the U.S. Centers for Disease Control and Prevention between 2002 and 2020.

**Weather** (https://www.bgc-jena.mpg.de/wetter/ (accessed on 9 December 2023)): A meteorological series of 21 weather metrics collected every ten minutes in 2020 at the weather station of the Max Planck Institute for Biogeochemistry.

**Traffic** (http://pems.dot.ca.gov (accessed on 9 December 2023)): Contains hourly road occupancy rates from various sensors on freeways in the San Francisco Bay Area, provided by the California Department of Transportation. The dataset has 862 attributes and spans 2016 to 2018. The main characteristics of all the datasets are summarized in Table 1.

#### 5.2. Evaluation Metrics
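Results throughout this section are reported as mean squared error (MSE) and mean absolute error (MAE). A minimal numpy sketch of both metrics, together with the relative error reduction reported in the IMP column of Table 2 (the helper names are ours):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))

def improvement(best, second_best):
    """Relative error reduction of the best model over the runner-up."""
    return (second_best - best) / second_best
```

For the ETTh1/96 row of Table 2, `improvement(0.190, 0.443)` gives roughly 0.571, matching the 57.1% MSE entry.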

#### 5.3. Baselines

#### 5.4. Experimental Results on Real Data

#### 5.5. Experimental Results on Synthetic Data

#### 5.6. Statistical Tests

## 6. Conclusions and Future Work

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References

1. Lim, B.; Zohren, S. Time-series forecasting with deep learning: A survey. *Philos. Trans. R. Soc. A* **2021**, *379*, 20200209.
2. Mahalakshmi, G.; Sridevi, S.; Rajaram, S. A survey on forecasting of time series data. In Proceedings of the 2016 International Conference on Computing Technologies and Intelligent Data Engineering (ICCTIDE'16), Kovilpatti, India, 7–9 January 2016; pp. 1–8.
3. Deb, C.; Zhang, F.; Yang, J.; Lee, S.E.; Shah, K.W. A review on time series forecasting techniques for building energy consumption. *Renew. Sustain. Energy Rev.* **2017**, *74*, 902–924.
4. Zhou, T.; Ma, Z.; Wen, Q.; Sun, L.; Yao, T.; Yin, W.; Jin, R. FiLM: Frequency improved Legendre memory model for long-term time series forecasting. *Adv. Neural Inf. Process. Syst.* **2022**, *35*, 12677–12690.
5. Zhou, T.; Ma, Z.; Wen, Q.; Wang, X.; Sun, L.; Jin, R. FEDformer: Frequency enhanced decomposed Transformer for long-term series forecasting. In Proceedings of the International Conference on Machine Learning, PMLR, Baltimore, MD, USA, 25–27 July 2022; pp. 27268–27286.
6. Zeng, A.; Chen, M.; Zhang, L.; Xu, Q. Are Transformers effective for time series forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 11121–11128.
7. Hewamalage, H.; Ackermann, K.; Bergmeir, C. Forecast evaluation for data scientists: Common pitfalls and best practices. *Data Min. Knowl. Discov.* **2023**, *37*, 788–832.
8. Taieb, S.B.; Hyndman, R.J. *Recursive and Direct Multi-Step Forecasting: The Best of Both Worlds*; Department of Econometrics and Business Statistics, Monash University: Clayton, VIC, Australia, 2012; Volume 19.
9. Bandara, K.; Hyndman, R.; Bergmeir, C. MSTL: A seasonal-trend decomposition algorithm for time series with multiple seasonal patterns. *Int. J. Oper. Res.* **2022**, *1*, 4842.
10. Box, G.E.; Jenkins, G.M. Some recent advances in forecasting and control. *J. R. Stat. Soc. Ser. C (Appl. Stat.)* **1968**, *17*, 91–109.
11. Box, G.E.; Pierce, D.A. Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. *J. Am. Stat. Assoc.* **1970**, *65*, 1509–1526.
12. Holt, C.C. Forecasting seasonals and trends by exponentially weighted moving averages. *Int. J. Forecast.* **2004**, *20*, 5–10.
13. Winters, P.R. Forecasting sales by exponentially weighted moving averages. *Manag. Sci.* **1960**, *6*, 324–342.
14. Breiman, L. Random forests. *Mach. Learn.* **2001**, *45*, 5–32.
15. Boser, B.E.; Guyon, I.M.; Vapnik, V.N. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, PA, USA, 27–29 July 1992; pp. 144–152.
16. Hochreiter, S.; Schmidhuber, J. Long short-term memory. *Neural Comput.* **1997**, *9*, 1735–1780.
17. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. In Proceedings of the NIPS 2014 Workshop on Deep Learning, Montreal, QC, Canada, 13 December 2014.
18. Qin, Y.; Song, D.; Cheng, H.; Cheng, W.; Jiang, G.; Cottrell, G.W. A dual-stage attention-based recurrent neural network for time series prediction. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, VIC, Australia, 19–25 August 2017; pp. 2627–2633.
19. Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. *arXiv* **2018**, arXiv:1803.01271.
20. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. *Adv. Neural Inf. Process. Syst.* **2017**, *30*, 5998–6008.
21. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond efficient Transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 11106–11115.
22. Wu, H.; Xu, J.; Wang, J.; Long, M. Autoformer: Decomposition Transformers with auto-correlation for long-term series forecasting. *Adv. Neural Inf. Process. Syst.* **2021**, *34*, 22419–22430.
23. Hyndman, R.J.; Athanasopoulos, G. Components of Time Series Data. 2021. Available online: https://otexts.com/fpp3/components.html (accessed on 27 August 2023).
24. Hyndman, R.J.; Athanasopoulos, G. Classical Decomposition. 2021. Available online: https://otexts.com/fpp3/classical-decomposition.html (accessed on 27 August 2023).
25. Bell, W.R.; Hillmer, S.C. Issues involved with the seasonal adjustment of economic time series. *J. Bus. Econ. Stat.* **1984**, *2*, 291–320.
26. Findley, D.F.; Monsell, B.C.; Bell, W.R.; Otto, M.C.; Chen, B.C. New capabilities and methods of the X-12-ARIMA seasonal-adjustment program. *J. Bus. Econ. Stat.* **1998**, *16*, 127–152.
27. Cleveland, R.B.; Cleveland, W.S.; McRae, J.E.; Terpenning, I. STL: A seasonal-trend decomposition procedure based on loess. *J. Off. Stat.* **1990**, *6*, 3–73.
28. Dokumentov, A.; Hyndman, R.J. *STR: A Seasonal-Trend Decomposition Procedure Based on Regression*; Working Paper 13/15; Monash University: Melbourne, VIC, Australia, 2015; Volume 13, pp. 1–32.
29. Wen, Q.; Zhang, Z.; Li, Y.; Sun, L. Fast RobustSTL: Efficient and robust seasonal-trend decomposition for time series with complex patterns. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual, 6–10 June 2020; pp. 2203–2213.
30. Bandara, K.; Bergmeir, C.; Hewamalage, H. LSTM-MSNet: Leveraging forecasts on sets of related time series with multiple seasonal patterns. *IEEE Trans. Neural Netw. Learn. Syst.* **2020**, *32*, 1586–1599.
31. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015.
32. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An imperative style, high-performance deep learning library. *Adv. Neural Inf. Process. Syst.* **2019**, *32*, 1–10.
33. PyTorch Lightning. 2023. Available online: https://lightning.ai/ (accessed on 19 September 2023).
34. Lai, G.; Chang, W.C.; Yang, Y.; Liu, H. Modeling long- and short-term temporal patterns with deep neural networks. In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, MI, USA, 8–12 July 2018; pp. 95–104.
35. Diebold, F.X.; Mariano, R.S. Comparing predictive accuracy. *J. Bus. Econ. Stat.* **2002**, *20*, 134–144.

**Figure 1.** A visual comparison of the IMS and DMS methods: The IMS method (**left**) predicts future values step by step, moving the retrospective window forward one step each time; this process is repeated until the complete prediction horizon is covered, and only one function, $f\left(x\right)$, is learned. Conversely, the DMS method (**right**) generates all prediction values at once; a distinct function is required for each prediction value, such as ${f}_{1}\left(x\right)$, ${f}_{2}\left(x\right)$, etc.

**Figure 5.** All forecasts were made for a prediction length of 96 time steps. In every case, Decompose & Conquer is compared with the second-best-performing model, DLinear. (**a**) **ETTh1**: Oil temperature forecast of Decompose & Conquer vs. DLinear. (**b**) **Electricity**: Electricity consumption forecast of Decompose & Conquer vs. DLinear. (**c**) **Exchange-rate**: Exchange rate forecast of Decompose & Conquer vs. DLinear. (**d**) **Weather**: CO${}_{2}$ forecast of Decompose & Conquer vs. DLinear.

**Table 1.** Main characteristics of the datasets.

| Dataset | Electricity | ETTh1 & ETTh2 | ETTm1 & ETTm2 | Exchange | ILI | Weather | Traffic |
|---|---|---|---|---|---|---|---|
| Features | 321 | 7 | 7 | 8 | 7 | 21 | 862 |
| Length | 26,304 | 17,420 | 69,680 | 7588 | 966 | 52,696 | 17,544 |
| Granularity | 1 h | 1 h | 15 min | 1 d | 1 wk | 10 min | 1 h |

**Table 2.** Experimental results on real data (the "Transformers" column reports the best result from FEDformer [5], Autoformer [22], and Informer [21]). The results were averaged over five runs to eliminate the effect of randomness. The IMP column displays the error reduction of the best model compared to the second-best one. The best results are highlighted in **bold**.

| Dataset | Length | D&C MSE | D&C MAE | DLinear MSE | DLinear MAE | NLinear MSE | NLinear MAE | Transformers MSE | Transformers MAE | Repeat Last MSE | Repeat Last MAE | Repeat Avg. MSE | Repeat Avg. MAE | IMP (MSE) | IMP (MAE) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ETTh1 | 96 | **0.190** | **0.292** | 0.443 | 0.453 | 0.452 | 0.458 | 0.494 | 0.515 | 1.679 | 0.878 | 0.941 | 0.709 | 57.1% | 35.5% |
| | 192 | **0.294** | **0.369** | 0.500 | 0.494 | 0.513 | 0.498 | 0.553 | 0.552 | 1.757 | 0.913 | 0.989 | 0.732 | 41.2% | 25.3% |
| | 336 | **0.383** | **0.435** | 0.550 | 0.532 | 0.563 | 0.531 | 0.613 | 0.582 | 1.806 | 0.934 | 1.033 | 0.750 | 30.4% | 18.2% |
| | 720 | **0.556** | **0.549** | 0.675 | 0.611 | 0.719 | 0.625 | 0.752 | 0.666 | 1.958 | 0.992 | 1.158 | 0.812 | 17.6% | 10.1% |
| ETTh2 | 96 | **0.067** | **0.183** | 0.164 | 0.286 | 0.159 | 0.280 | 0.185 | 0.309 | 0.259 | 0.362 | 0.199 | 0.320 | 57.86% | 34.6% |
| | 192 | **0.104** | **0.223** | 0.189 | 0.304 | 0.189 | 0.306 | 0.205 | 0.325 | 0.306 | 0.393 | 0.221 | 0.336 | 45.0% | 26.6% |
| | 336 | **0.148** | **0.268** | 0.202 | 0.316 | 0.207 | 0.322 | 0.221 | 0.344 | 0.328 | 0.407 | 0.232 | 0.344 | 26.7% | 15.2% |
| | 720 | **0.227** | **0.341** | 0.315 | 0.402 | 0.263 | 0.365 | 0.248 | 0.360 | 0.376 | 0.436 | 0.268 | 0.367 | 8.46% | 5.3% |
| ETTm1 | 96 | **0.205** | **0.296** | 0.379 | 0.412 | 0.392 | 0.418 | 0.415 | 0.452 | 1.534 | 0.808 | 0.862 | 0.660 | 45.9% | 28.2% |
| | 192 | **0.215** | **0.305** | 0.432 | 0.440 | 0.454 | 0.452 | 0.483 | 0.494 | 1.589 | 0.837 | 0.900 | 0.678 | 50.2% | 30.7% |
| | 336 | **0.232** | **0.321** | 0.502 | 0.477 | 0.535 | 0.493 | 0.537 | 0.527 | 1.654 | 0.867 | 0.947 | 0.702 | 53.8% | 32.7% |
| | 720 | **0.343** | **0.400** | 0.570 | 0.524 | 0.615 | 0.543 | 0.655 | 0.597 | 1.740 | 0.906 | 1.011 | 0.735 | 39.8% | 23.7% |
| ETTm2 | 96 | **0.061** | **0.172** | 0.111 | 0.230 | 0.108 | 0.226 | 0.114 | 0.236 | 0.192 | 0.308 | 0.138 | 0.267 | 43.5% | 24.0% |
| | 192 | **0.065** | **0.178** | 0.146 | 0.265 | 0.133 | 0.253 | 0.143 | 0.266 | 0.220 | 0.332 | 0.158 | 0.283 | 51.1% | 29.6% |
| | 336 | **0.072** | **0.188** | 0.176 | 0.291 | 0.162 | 0.280 | 0.162 | 0.281 | 0.251 | 0.355 | 0.184 | 0.304 | 55.6% | 32.9% |
| | 720 | **0.121** | **0.242** | 0.210 | 0.323 | 0.209 | 0.317 | 0.216 | 0.326 | 0.302 | 0.390 | 0.228 | 0.336 | 42.1% | 23.7% |
| Electricity | 96 | **0.073** | **0.170** | 0.189 | 0.273 | 0.190 | 0.268 | 0.195 | 0.307 | 1.621 | 0.954 | 0.862 | 0.768 | 61.4% | 36.6% |
| | 192 | **0.096** | **0.195** | 0.188 | 0.275 | 0.189 | 0.270 | 0.203 | 0.314 | 1.627 | 0.959 | 0.864 | 0.767 | 48.9% | 27.8% |
| | 336 | **0.123** | **0.224** | 0.200 | 0.290 | 0.203 | 0.284 | 0.216 | 0.327 | 1.644 | 0.968 | 0.872 | 0.769 | 38.5% | 21.1% |
| | 720 | **0.170** | **0.271** | 0.234 | 0.323 | 0.244 | 0.318 | 0.252 | 0.355 | 1.666 | 0.980 | 0.896 | 0.774 | 27.4% | 14.7% |
| Exchange | 96 | **0.043** | **0.152** | 0.092 | 0.218 | 0.091 | 0.214 | 0.133 | 0.272 | 0.083 | 0.203 | 0.135 | 0.275 | 48.2% | 25.1% |
| | 192 | **0.081** | **0.210** | 0.204 | 0.338 | 0.181 | 0.308 | 0.249 | 0.380 | 0.173 | 0.300 | 0.235 | 0.364 | 53.2% | 30.0% |
| | 336 | **0.138** | **0.270** | 0.303 | 0.422 | 0.349 | 0.431 | 0.484 | 0.505 | 0.334 | 0.421 | 0.402 | 0.476 | 54.5% | 35.8% |
| | 720 | **0.465** | **0.507** | 0.825 | 0.692 | 1.161 | 0.832 | 1.279 | 0.905 | 0.985 | 0.769 | 1.079 | 0.822 | 43.6% | 26.7% |
| Weather | 96 | **0.060** | **0.109** | 0.152 | 0.229 | 0.157 | 0.202 | 0.217 | 0.310 | 0.241 | 0.245 | 0.201 | 0.262 | 60.5% | 46.0% |
| | 192 | **0.100** | **0.164** | 0.195 | 0.274 | 0.199 | 0.244 | 0.297 | 0.376 | 0.290 | 0.282 | 0.250 | 0.297 | 48.7% | 32.7% |
| | 336 | **0.165** | **0.233** | 0.251 | 0.317 | 0.257 | 0.285 | 0.310 | 0.360 | 0.358 | 0.328 | 0.301 | 0.327 | 34.3% | 18.2% |
| | 720 | **0.276** | **0.331** | 0.331 | 0.373 | 0.336 | 0.334 | 0.387 | 0.400 | 0.444 | 0.381 | 0.366 | 0.365 | 16.6% | 0.9% |
| Traffic | 96 | **0.257** | **0.243** | 0.695 | 0.420 | 0.678 | 0.402 | 0.608 | 0.373 | 2.781 | 1.085 | 1.448 | 0.813 | 57.7% | 34.9% |
| | 192 | **0.313** | **0.265** | 0.641 | 0.391 | 0.633 | 0.380 | 0.647 | 0.398 | 2.824 | 1.097 | 1.454 | 0.816 | 50.6% | 30.2% |
| | 336 | **0.364** | **0.287** | 0.646 | 0.393 | 0.640 | 0.382 | 0.683 | 0.420 | 2.871 | 1.107 | 1.471 | 0.820 | 43.1% | 24.9% |
| | 720 | **0.430** | **0.321** | 0.682 | 0.415 | 0.674 | 0.401 | 0.677 | 0.417 | 2.885 | 1.108 | 1.482 | 0.821 | 36.2% | 20.0% |
| Illness | 24 | **0.622** | **0.540** | 2.589 | 1.044 | 2.689 | 1.096 | 3.402 | 1.324 | 5.547 | 1.499 | 5.076 | 1.758 | 76.0% | 48.3% |
| | 36 | **0.889** | **0.650** | 2.840 | 1.106 | 2.572 | 1.069 | 2.813 | 1.132 | 7.312 | 1.833 | 4.788 | 1.674 | 65.4% | 39.2% |
| | 48 | **1.023** | **0.700** | 3.134 | 1.172 | 2.654 | 1.093 | 2.868 | 1.166 | 7.806 | 1.943 | 4.491 | 1.601 | 61.5% | 36.0% |
| | 60 | **1.299** | **0.819** | 3.286 | 1.193 | 2.767 | 1.111 | 3.175 | 1.248 | 6.917 | 1.788 | 4.485 | 1.575 | 53.1% | 26.3% |

**Table 3.**DM statistics and p-values from a Diebold–Mariano test comparing the results of the Decompose & Conquer model with various baselines.

| Baseline | DM | p-Value |
|---|---|---|
| DLinear | −7.78 | 4.9 × 10${}^{-15}$ |
| NLinear | −5.89 | 2.1 × 10${}^{-9}$ |
| FEDformer | −8.72 | 2.2 × 10${}^{-18}$ |
| Autoformer | −17.93 | 8.0 × 10${}^{-69}$ |
| Informer | −21.5 | 6.2 × 10${}^{-96}$ |
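DM statistics of this kind can be computed with a simplified one-step version of the Diebold–Mariano test (a sketch under squared-error loss; the paper's exact variance estimator is not specified here, and a full implementation would use a horizon-aware long-run variance estimate such as Newey–West):

```python
import math
import numpy as np

def diebold_mariano(e1, e2):
    """Simplified two-sided Diebold-Mariano test on squared errors.

    A negative DM statistic means the first model (errors e1) is the
    more accurate one. The long-run variance of the loss differential
    is approximated by its sample variance, which is adequate for
    one-step-ahead forecasts."""
    d = np.asarray(e1) ** 2 - np.asarray(e2) ** 2        # loss differential
    dm = d.mean() / math.sqrt(d.var(ddof=1) / len(d))    # asymptotically N(0, 1)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))        # 2 * (1 - Phi(|DM|))
    return dm, p_value
```

Large negative DM values with vanishing p-values, as in the table above, indicate that the accuracy gap is very unlikely to be due to chance.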


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Sohrabbeig, A.; Ardakanian, O.; Musilek, P.
Decompose and Conquer: Time Series Forecasting with Multiseasonal Trend Decomposition Using Loess. *Forecasting* **2023**, *5*, 684-696.
https://doi.org/10.3390/forecast5040037
