A systematic review of Python packages for time series analysis

This paper presents a systematic review of Python packages with a focus on time series analysis. The objective is to provide (1) an overview of the different time series analysis tasks and preprocessing methods implemented, and (2) an overview of the development characteristics of the packages (e.g., documentation, dependencies, and community size). This review is based on a search of literature databases as well as GitHub repositories. Following the filtering process, 40 packages were analyzed. We classified the packages according to the analysis tasks implemented, the methods related to data preparation, and the means for evaluating the results produced (methods and access to evaluation data). We also reviewed documentation aspects, the licenses, the size of the packages' community, and the dependencies used. Among other things, our results show that forecasting is by far the most frequently implemented task, that half of the packages provide access to real datasets or allow generating synthetic data, and that many packages depend on a few libraries (the most used ones being numpy, scipy and pandas). We hope that this review can help practitioners and researchers navigate the space of Python packages dedicated to time series analysis. We will provide an updated list of the reviewed packages online at https://siebert-julien.github.io/time-series-analysis-python/.


Introduction
A time series is a set of data points generated from successive measurements over time. The analysis of this type of data has found application in many fields, from finance to health, including the monitoring of computer networks or the environment. The current trend of reducing the cost of sensors and data storage, the increasing performance of Big Data and data analysis technologies such as machine learning or data mining, are opening up more and more possibilities to acquire and analyze temporal data. Moreover, as the number of time series analysis application cases rises, more and more data scientists, data engineers, analysts, and software engineers have to use dedicated time series analysis libraries.
In this article, we systematically review Python packages dedicated to time series analysis. Python is one of the programming languages of choice for data scientists 1 . Data scientists are not only responsible for analyzing data; their task is also to ensure that services based on these analyses reach a sufficient level of maturity to be deployed and maintained in production. In this context, we review not only the analysis tasks implemented in the packages, but also several factors external to the tasks themselves, such as which dependencies are used or how big the community behind the development of the package in question is. Our goal is not to evaluate the quality of the implementations themselves but to provide a structured overview that is useful for data scientists confronted with time series analysis (and faced with having to choose which packages to rely on), the scientific community, and the community of Python developers working in this field. This paper is structured as follows: Related work is introduced in section 2; the search methodology and the search results are described in sections 3 and 4, respectively; threats to validity are discussed in section 5; and section 6 concludes the paper.
However, existing implementations (software packages or libraries) are often listed, usually in a non-systematic way, in textbooks (like [10,34] for R, or [27] for Python) or gray literature (for example, Towards Data Science 2 , KDnuggets 3 or Machine Learning Mastery 4 ), and few papers actually systematically review packages or libraries in a specific language. For example, [19] reviewed packages for analyzing animal movement data in R, and [36] surveyed R packages for hydrology. With respect to Python, we found several reviews of packages for different domains: social media content scraping [42], topological data analysis [29], or data mining [38]. For time series analysis in Python, the only related work we could find is [18], where the authors review packages focusing on forecasting.
There is, to the best of our knowledge, no systematic review of Python packages for generic time series analysis.

Methodology
We conducted a systematic literature review according to [21]. However, these guidelines focus on printed literature, not on software packages. Hence, we adjusted these methods. Our search process is illustrated in Figure 1. We conducted a search in both literature databases and code repositories (GitHub). The following sections provide more details on the different steps of the search itself.

Research questions
We already stated our goal and the context we set for this review in the introduction. We formalize this context as follows: We want to analyze Python packages dedicated to time series analysis for the purpose of structuring the available implementations (we explicitly exclude the purpose of evaluating them) with respect to the implemented time series analysis tasks from the viewpoint of practitioners in the context of building data-driven services on top of these implementations. Our research questions are:
- RQ1: Which time series analysis tasks exist? And which of these are implemented in maintained Python packages?
- RQ2: How do the packages support the evaluation of the produced results?
- RQ3: How do the packages support their usage, and what insights can we gain to estimate the durability of a given package and make an informed choice about its long-term use?

Inclusion criteria
To guide our review and filter relevant packages, we defined the following inclusion criteria (IC): The package should be open source, written in Python, and available on GitHub (IC1). The package should be actively maintained (last commit within less than 6 months) (IC2.1); it should have more than 100 GitHub stars (IC2.2); and it should be listed in PyPI 5 and be installable via pip 6 or conda 7 (IC2.3). The package should explicitly target time series analysis (IC3). We excluded packages that can be used for time series analysis (as building blocks) but whose main purpose is not time series analysis per se (for example, generic scientific computing packages such as scipy or numpy, packages dedicated to data manipulation or storage such as pandas, or generic machine learning or data mining packages such as scikit-learn). Finally, we focused our search on packages offering methods that tend to be domain-agnostic (IC4) and excluded domain-specific packages. Domain-specific packages are packages aiming to solve time series analysis in a specific domain (for example, audio, finance, geoscience, etc.). They usually focus on specific types and formats of time series and domain-related analysis tasks.
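The criteria IC1 to IC3 can be read as a simple predicate over repository metadata. The following is only a minimal sketch under stated assumptions: the `Candidate` record and its field names are hypothetical, "less than 6 months" is approximated as 183 days, and the star threshold is taken as "at least 100" (the wording used for the GitHub search filter). IC3 remains a manual judgment that is merely recorded as a flag here, and IC4 is applied in a later step.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Candidate:
    """Hypothetical record holding the metadata needed for IC1 to IC3."""
    name: str
    language: str
    stars: int
    last_commit: date
    on_pypi: bool
    targets_time_series: bool  # IC3, decided manually in the review


def meets_criteria(c: Candidate, search_date: date) -> bool:
    # Assumption: "less than 6 months" is approximated as 183 days.
    recent = (search_date - c.last_commit) < timedelta(days=183)
    return (
        c.language == "Python"      # IC1
        and recent                  # IC2.1
        and c.stars >= 100          # IC2.2 (assumed "at least 100")
        and c.on_pypi               # IC2.3
        and c.targets_time_series   # IC3
    )


# Invented example values, not taken from the review data.
example = Candidate("example-ts", "Python", 250, date(2020, 11, 1), True, True)
print(meets_criteria(example, date(2021, 1, 15)))
```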

Searching open-source repositories in GitHub
In order to filter GitHub repositories, we selected a list of topics 8 , filtered the results by language (Python, IC1), by number of stars (at least 100, IC2.2), and considered only repositories that were updated after July 2020 (IC2.1).
In order to select a list of relevant topics, we first manually selected a list of eight Python packages known to be used in time series analysis (i.e., a seed set): pandas, numpy, scipy, statsmodels, ruptures, tsfresh, tslearn, and sktime; as well as a sample of the packages using the topic "time-series". We examined the topics used by these packages and then extended this list of topics with different spellings while manually double-checking their existence in GitHub. We considered a total of 16 different topics (see Table 1). The first search led to a total of 115 repositories.

time-series                    time-series-regression
signal-processing              time-series-classification
time-series-analysis           time-series-forecast
time-series-visualization      time-series-decomposition
time-series-forecasting        time-series-data-mining
timeseries                     timeseries-forecasting
time-series-prediction         time-series-segmentation
timeseries-analysis            time-series-clustering

Table 1: List of topics used to conduct the search on GitHub

Removing duplicates. We found 24 unique repositories that were duplicated (i.e., listed in more than one topic). After duplicate removal, 81 unique repositories remained.
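The topic search and the duplicate removal can be sketched as follows. This is an illustration, not the procedure actually used: it assumes the GitHub REST search endpoint with the `topic:`, `language:`, `stars:` and `pushed:` qualifiers, and it assumes that "updated after July 2020" maps to `pushed:>2020-07-31`. The repository names in the usage example are placeholders.

```python
from urllib.parse import urlencode

# The 16 topics listed in Table 1.
TOPICS = [
    "time-series", "time-series-regression", "signal-processing",
    "time-series-classification", "time-series-analysis", "time-series-forecast",
    "time-series-visualization", "time-series-decomposition",
    "time-series-forecasting", "time-series-data-mining", "timeseries",
    "timeseries-forecasting", "time-series-prediction",
    "time-series-segmentation", "timeseries-analysis", "time-series-clustering",
]


def build_search_url(topic, min_stars=100, pushed_after="2020-07-31"):
    """Build one repository search URL per topic (cut-off date is an assumption)."""
    query = (
        f"topic:{topic} language:python "
        f"stars:>={min_stars} pushed:>{pushed_after}"
    )
    return "https://api.github.com/search/repositories?" + urlencode(
        {"q": query, "per_page": 100}
    )


def merge_unique(results_by_topic):
    """Map each repository to the topics it was found under, removing duplicates."""
    topics_of = {}
    for topic, repos in results_by_topic.items():
        for full_name in repos:
            topics_of.setdefault(full_name, []).append(topic)
    return topics_of


# Placeholder search results, for illustration only.
found = {
    "time-series": ["owner-a/repo-x", "owner-b/repo-y"],
    "timeseries": ["owner-b/repo-y", "owner-c/repo-z"],
}
unique = merge_unique(found)
duplicated = [name for name, topics in unique.items() if len(topics) > 1]
print(len(unique), duplicated)
```

Keeping the list of topics per repository, rather than a plain set, makes it possible to report both the number of unique repositories and the number of repositories that appeared under more than one topic.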
Checking if the repository contains the code of a Python package. We restricted our search to packages that are referenced by PyPI and can be installed with pip or conda (IC2.3). Note that the repository name might not reflect the package name (if one exists). For example, the repository https://github.com/PyWavelets/pywt contains the source code for the package named pywavelets. The repository https://github.com/angus924/rocket does not contain the source code for the Python package rocket. We therefore checked each of the 81 repositories manually and excluded 22 repositories, which yielded a total of 59 remaining repositories that contain the source code of a Python package.
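A first automated pass for IC2.3 could look like the sketch below. It assumes the public PyPI JSON endpoint (`https://pypi.org/pypi/<name>/json`), which answers with HTTP 404 for unknown project names; whether this exact endpoint was used in the review is not stated. As the pywt and rocket examples show, a positive answer only means that some package of that name exists, so a manual check is still needed.

```python
import urllib.error
import urllib.request


def pypi_json_url(package_name):
    """URL of the PyPI JSON metadata for a project name."""
    return f"https://pypi.org/pypi/{package_name}/json"


def is_on_pypi(package_name, timeout=10.0):
    """Return True if PyPI knows a project of that name (requires network access)."""
    try:
        with urllib.request.urlopen(pypi_json_url(package_name), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors are not evidence either way


# The package name differs from the repository name (PyWavelets/pywt).
print(pypi_json_url("pywavelets"))
```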
Including only packages focused on time series analysis. Finally, we manually checked whether the focus of the package is time series analysis (IC3). After exclusion, 47 packages were kept for further analysis.

Searching scientific bibliographic databases
The search for packages only in a repository might not be sufficient to cover all existing packages. For example, one of our seed packages (namely tsfresh) was not uncovered by the search. Hence, we extended our search to existing literature and software databases. We used the bibliographic databases IEEE Xplore 9 , ACM Digital Library 10 , Web of Science 11 , and Scopus 12 , as well as the Journal of Open Source Software (JOSS) 13 , and Zenodo 14 . For IEEE Xplore, ACM Digital Library, Web of Science, and Scopus, we limited ourselves to the search string "Python" AND "time series" in the document title. For the Journal of Open Source Software (JOSS), we first used the key words "time series" and then filtered the results by language (the query used is: https://joss.theoj.org/papers/search?q=time+series). For Zenodo, we also used the search string "Python" AND "time series", limited the search to the software category and removed the duplicates (e.g., different versions of the same software). The full query for Zenodo is: https://zenodo.org/search?page=1&size=200&q=%22time%20series%22%20AND%20%22python%22&sort=mostrecent&type=software. We only included references that matched our inclusion criteria IC1, IC2.*, and IC3.

Snowballing
In order to extend our search, we used a snowballing approach. We first manually reviewed the package documentations in order to find links to other similar packages. Only two packages, tsfresh 15 and sktime 16, actually document related packages. Second, we manually reviewed the documentation and the GitHub repositories of all packages to find related publications. We then reviewed the papers to find new packages (i.e., we performed a single backward snowballing pass). Out of a total of 79 packages, 15 new packages were included after the snowballing phase, for a total of 67 packages.

Generic vs. domain-specific packages (IC4)
Finally, we classified the packages in two categories: domain-specific and generic. As previously defined, we consider domain-specific packages to be packages aiming to solve time series analysis in a specific domain (for example, audio, finance, geoscience, etc.) and generic packages as those offering methods that tend to be domain-agnostic. Out of the 67 packages, 27 packages were categorized as domain-specific and 40 packages as generic.

Data extraction and categorization
We manually extracted relevant information about the packages from their documentation pages and code. For the categorization, we used an iterative, bottom-up approach. Two researchers first proposed category definitions and then categorized the packages. A third researcher was responsible for resolving disagreements. Iterations were performed until the categories and results were consolidated.

RQ1: Implementation of the time series analysis tasks
To answer our research question RQ1, we first reviewed the task definitions present in the literature and then analyzed the 40 packages classified as generic to extract information about which tasks have been implemented in the packages.
Task definitions. Time series analysis tasks are formally defined in the literature. Reviews like [12,13,16,20] define the following tasks:
- Indexing (query by content): given a time series and some similarity measure, find the nearest matching time series [12,13,20].
- Clustering: find groups (clusters) of similar time series [12,13,16,20].
- Classification: assign a time series to a predefined class [12,13,16,20].
- Segmentation (Summarization): create an accurate approximation of a time series by reducing its dimensionality while retaining its essential features [12,13,16,20].
- Forecasting (Prediction): given a time series dataset up to a given time t_n, forecast the next values [12,13].
- Anomaly Detection: find abnormal data points or subsequences (also called discords) [12,13].
- Motif Discovery: find every subsequence (called motif) that appears recurrently in a time series [12,13,16].
- Rules Discovery (Rule Mining): find the rules that may govern associations between sets of time series or subsequences [13,16].
Esling and Agon also define implementation components [12]: preprocessing (e.g., filtering noise, removing outliers, or imputing missing values), representation (e.g., dimensionality reduction, finding fundamental shape characteristics), similarity measures, and indexing schemes.
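To make one of these definitions concrete, the indexing task (query by content) can be illustrated with a brute-force nearest-neighbor search. The sketch below assumes equal-length series and the Euclidean distance as similarity measure; it is not taken from any of the reviewed packages, and the series and their names are invented.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, collection):
    """Return the name of the stored series closest to the query."""
    return min(collection, key=lambda name: euclidean(query, collection[name]))


# Invented toy collection.
stored = {
    "flat": [0, 0, 0, 0],
    "ramp": [0, 1, 2, 3],
    "step": [0, 0, 3, 3],
}
print(nearest([0, 1, 2, 4], stored))  # prints "ramp"
```

A linear scan like this is only practical for small collections, which is why the literature treats indexing schemes as a separate implementation component.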
Implemented tasks. While analyzing the packages, we found packages explicitly mentioning the tasks corresponding to our literature review. We found 20 packages explicitly providing forecasting methods (T1), 6 packages providing classification methods (T2), 6 packages providing clustering methods (T3), 6 packages providing anomaly detection methods (T4), and 4 packages providing segmentation methods (T5). We classified 4 packages under the category pattern recognition (T6), encompassing both indexing and motif discovery tasks. We also classified 5 packages under the category change point detection (T7), which was not in our literature review. Finally, we could not find any package explicitly mentioning the rules discovery task.
Considering the implementation components, we found 4 packages explicitly providing dimensionality reduction methods (DP1), 17 packages explicitly providing missing values imputation methods (DP2), 16 packages explicitly providing decomposition methods (e.g., decomposing time series into trends, seasonal components, or frequency components) (DP3), 24 packages explicitly providing generic transformation and features generation methods (DP4), and 7 packages explicitly providing methods for computing similarity measures (DP5). Table 4 gives an overview of our categorization of the packages.
Forecasting is by far the most frequently implemented task. There is no significant difference, in terms of number of packages, between the other tasks. However, we need to be cautious when interpreting these numbers. First, the tasks as formally defined in the literature might not be explicitly mentioned in the packages' documentation or code. Second, the delineation between a task and the methods used to implement it is sometimes blurry and context dependent. For example, one can perform change point detection for the sake of finding time points where some time series properties change and, as a consequence, raising alarms in a production system, or use it as a preprocessing step for segmenting a time series into different phases. Another example is forecasting models, which can also be applied for outlier detection.
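The last point, a forecasting model reused for outlier detection, can be illustrated with a deliberately simple baseline. The sketch below assumes a rolling-median one-step forecast and a fixed residual threshold; both choices, as well as the function name and the data, are ours and serve only as an illustration.

```python
from statistics import median


def forecast_residual_outliers(series, window=3, threshold=5.0):
    """Flag indices whose value deviates strongly from a one-step forecast.

    The forecast for position i is the median of the previous `window` values.
    """
    flagged = []
    for i in range(window, len(series)):
        forecast = median(series[i - window:i])
        if abs(series[i] - forecast) > threshold:
            flagged.append(i)
    return flagged


# Invented readings with a single spike at index 5.
readings = [10, 10, 11, 10, 10, 30, 10, 11, 10]
print(forecast_residual_outliers(readings))  # prints [5]
```

The same function answers a forecasting question (what is the expected next value?) and an anomaly detection question (does the observed value deviate from it?), which is why assigning a package to exactly one task can be ambiguous.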

RQ2: Evaluation of the produced results
To answer our research question RQ2, we extracted information about the evaluation of the outcomes produced by the packages. We came up with two main clusters: functions that facilitate the evaluation itself (E1, E2, E3) and functions for either generating synthetic data or downloading existing datasets (D1, D2). We found 13 packages explicitly providing methods for model selection, hyperparameter search, or feature selection (E1), 20 packages explicitly providing evaluation metrics and statistical tests (E2), and 25 packages providing visualization methods (E3).
Concerning the data, we found 16 packages explicitly providing functions for generating synthetic time series data (D1), and 19 packages providing access to time series datasets (D2). A large majority of the packages provide a way to evaluate the results produced. Only 4 packages have not been classified in any of the E or D classes.

RQ3: Package usage and community
To answer our research question RQ3, we extracted information about the documentation, the dependencies, and the community supporting the packages. For instance, GitHub provides many statistics about a repository (e.g., the number of stars, forks, issues) that can be used to get a first idea of the liveliness of the different packages. We used the number of GitHub stars and forks to estimate the community behind each package. Figure 2a shows the distribution of stars and forks for all 40 packages. Another piece of information that is relevant to practitioners is the license under which each implementation is available. Figure 2b shows the distribution of the licenses used among the 40 repositories. We also investigated the dependencies used by each of the selected 40 packages. We used the Python program johnnydep 17 to automatically collect the dependencies without installing the packages directly. We only looked at direct dependencies required for the installation of the package. We did not consider specific installation options such as dev or test. We did not search for all dependencies recursively. Here is an example of how we called the program: johnnydep PACKAGENAME --fields=ALL --no-deps --output-format=json. The dependencies of two packages could not be retrieved automatically (cesium and deeptime). We also manually cross-checked the dependencies and filled in the missing ones. Table 3 shows which dependencies are used the most by the packages.
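The johnnydep call quoted above can be wrapped in a small script. The following sketch makes one assumption that is not stated in the text: that the JSON output is a list with one record per package, with the requirement strings stored under a `requires` key. The helper names and the sample record are hypothetical.

```python
import json
import re
import subprocess


def johnnydep_command(package):
    """Build the command line quoted in the text for one package."""
    return ["johnnydep", package, "--fields=ALL", "--no-deps", "--output-format=json"]


def direct_dependencies(json_text):
    """Extract bare dependency names from the JSON output.

    Assumption: a list of records, requirement strings under "requires".
    """
    records = json.loads(json_text)
    names = set()
    for requirement in records[0].get("requires") or []:
        # Keep only the project name, dropping version specifiers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
        if match:
            names.add(match.group(0).lower())
    return sorted(names)


def collect(package):
    """Run johnnydep (must be installed; requires network access)."""
    result = subprocess.run(
        johnnydep_command(package), capture_output=True, text=True, check=True
    )
    return direct_dependencies(result.stdout)


# Hypothetical record, used instead of a live call.
sample = '[{"name": "demo", "requires": ["numpy>=1.15", "pandas", "scikit-learn (>=0.22)"]}]'
print(direct_dependencies(sample))
```

Because automated retrieval can fail for individual packages, the output of such a script would still need the manual cross-check described above.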
Finally, we investigated five documentation aspects (Do1-Do5). We found that 30 packages provide a separate documentation page (Do1). The other ten packages use the README file of the repository as documentation. 18 packages provide notebooks directly executable without installation via a link to either mybinder.org 18 or Google Colab 19 (Do2 +), 12 packages provide stand-alone notebook files to be downloaded (Do2 *), and 10 packages do not provide any notebook file at all. 28 packages provide an API reference (Do3). All packages provide an installation page (Do4) and almost all packages (38) provide user guides in the form of static examples or tutorials (Do5).

Package Name    Used  Rank    Package Name    Used  Rank
numpy             37     1    torch              6     8
scipy             30     2    numba              6     8
pandas            23     3    cython             6     8
scikit-learn      21     4    tensorflow         5     9
matplotlib        16     5    seaborn            4    10
statsmodels        8     6    future             4    10
tqdm               7     7    joblib             4    10

Table 3: Ranking of the most frequently used dependencies
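The ranks in Table 3 are consistent with a dense ranking, in which packages with the same usage count share a rank and the next rank is not skipped. A minimal sketch that reproduces the rank column from the usage counts in the table:

```python
# Usage counts as listed in Table 3.
USED_BY = {
    "numpy": 37, "scipy": 30, "pandas": 23, "scikit-learn": 21,
    "matplotlib": 16, "statsmodels": 8, "tqdm": 7, "torch": 6,
    "numba": 6, "cython": 6, "tensorflow": 5, "seaborn": 4,
    "future": 4, "joblib": 4,
}


def dense_rank(counts):
    """Assign rank 1 to the highest count; equal counts share a rank."""
    distinct = sorted(set(counts.values()), reverse=True)
    rank_of = {value: position + 1 for position, value in enumerate(distinct)}
    return {name: rank_of[value] for name, value in counts.items()}


ranks = dense_rank(USED_BY)
print(ranks["torch"], ranks["tensorflow"], ranks["joblib"])  # prints 8 9 10
```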

Discussion and threats to validity
In this section, we discuss the choices we made that may affect the validity of this review. This review focused on GitHub. GitLab and SourceForge were checked manually, but we decided not to include them as sources due to the insufficient number of results.
We limited ourselves to packages with at least 100 stars. This somewhat arbitrary limit led us to exclude packages with a number of stars close to 100 (e.g., the stingray package with 93 stars at the time of the search). We excluded packages that were not maintained but might have been relevant for practitioners. An example is the pyflux package (forecasting). We also excluded repositories that are not Python packages. This led us to discard interesting repositories like ad_examples (which provides state-of-the-art anomaly detection methods) and many repositories containing code scripts associated with scientific papers.
Concerning the search process, we used a mix of literature databases and GitHub topics together with a snowballing approach to find relevant packages. The reason for this was that several known packages could not be found automatically. For example, the package cesium does not list any topic and therefore was not found in our first GitHub search. It was found after snowballing. Another example is tsfresh, which was missing in the first GitHub search and was found in the literature search. The problem may be the language filter (strictly Python), as tsfresh lists some of the topics we searched for ("time-series").
We tried to automate some of the tasks (e.g., filtering repositories that contain Python packages or finding the dependencies), using both the PyPI and GitHub APIs, or the johnnydep tool. There were false positives and false negatives. This led us to manually cross-check the results obtained from our automated search.
Whether a package focuses on time series analysis or not can sometimes be fuzzy. For example, we decided to leave the topic of survival analysis out of this review. We initially found two packages: lifelines and scikit-survival. The same applies to the boundary between generic and domain-specific packages. We took a conservative approach to keep our survey sufficiently focused.
As already mentioned above, the definition of what should be regarded as a task vs. an "implementation component" is difficult, as a strict boundary may not even exist. Moreover, it is sometimes not clear what methods the packages provide without actually installing them and testing them. Indeed, the documentation might not be complete or the vocabulary used may differ from one package to another. One solution was to check the code itself. Here again, the search strings used play an important role in avoiding false negatives.

Conclusion
This paper presented a systematic review of Python packages dedicated to time series analysis. The search process led to a total of 40 packages that were analyzed further. We proposed a categorization of the packages based on the analysis tasks implemented, the methods related to data preparation, the means for evaluating the results produced, and the kind of documentation present, and also looked at some development aspects (licenses, stars, dependencies). We also discussed the search process with its possible bias and the challenges we encountered while searching for and reviewing the relevant packages. The scope of this survey does not, however, include any evaluation of the implementations or the results they would produce, for example, on benchmark datasets. We hope that this review can help practitioners and researchers navigate the space of Python packages dedicated to time series analysis. Since the packages will evolve, we plan to maintain an updated list of the reviewed packages online at https://siebert-julien.github.io/time-series-analysis-python/.