Open Access Article

Mining Sequential Patterns with VC-Dimension and Rademacher Complexity

Department of Information Engineering, University of Padova, 35131 Padova, Italy
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Algorithms 2020, 13(5), 123; https://doi.org/10.3390/a13050123
Received: 10 April 2020 / Revised: 13 May 2020 / Accepted: 14 May 2020 / Published: 18 May 2020
(This article belongs to the Special Issue Big Data Algorithmics)
Sequential pattern mining is a fundamental data mining task with applications in several domains. We study two variants of this task: the first is the extraction of frequent sequential patterns, whose frequency in a dataset of sequential transactions is higher than a user-provided threshold; the second is the mining of true frequent sequential patterns, which appear with probability above a user-defined threshold in transactions drawn from the generative process underlying the data. We present the first sampling-based algorithm to mine, with high confidence, a rigorous approximation of the frequent sequential patterns from massive datasets. We also present the first algorithms to mine approximations of the true frequent sequential patterns with rigorous guarantees on the quality of the output. Our algorithms are based on novel applications of Vapnik-Chervonenkis dimension and Rademacher complexity, advanced tools from statistical learning theory, to sequential pattern mining. Our extensive experimental evaluation shows that our algorithms provide high-quality approximations for both problems we consider.
Keywords: data mining; sequential patterns; sampling; VC-dimension; Rademacher complexity; statistical learning
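
To make the notion of a frequent sequential pattern in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes the standard formulation in which a transaction is an ordered list of itemsets and a pattern occurs in a transaction if its itemsets can be matched, in order, to supersets among the transaction's itemsets. The function names (`contains`, `frequency`, `sample_frequency`), the toy dataset, and the `sample_size` parameter are all hypothetical; in particular, the sample size here is chosen arbitrarily, whereas the article derives it from VC-dimension bounds that this sketch does not reproduce.

```python
import random
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]
Transaction = Sequence[Itemset]


def contains(transaction: Transaction, pattern: Sequence[Itemset]) -> bool:
    """Return True if `pattern` occurs in `transaction` as a subsequence.

    Each pattern itemset must be a subset of a distinct transaction itemset,
    and the matched itemsets must appear in the same order. A greedy
    left-to-right scan is sufficient for this test.
    """
    pos = 0  # index of the next pattern itemset still to be matched
    for itemset in transaction:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def frequency(dataset: List[Transaction], pattern: Sequence[Itemset]) -> float:
    """Fraction of transactions in `dataset` that contain `pattern`."""
    return sum(contains(t, pattern) for t in dataset) / len(dataset)


def sample_frequency(dataset: List[Transaction], pattern: Sequence[Itemset],
                     sample_size: int, seed: int = 0) -> float:
    """Estimate the frequency on a uniform random sample drawn with replacement.

    `sample_size` is an illustrative parameter, not a value from the article.
    """
    rng = random.Random(seed)
    sample = [rng.choice(dataset) for _ in range(sample_size)]
    return frequency(sample, pattern)


# Toy dataset of four sequential transactions (assumed data, for illustration).
dataset: List[Transaction] = [
    [frozenset("a"), frozenset("bc"), frozenset("d")],
    [frozenset("ab"), frozenset("c")],
    [frozenset("b"), frozenset("a"), frozenset("c")],
    [frozenset("d")],
]
pattern = [frozenset("a"), frozenset("c")]

exact = frequency(dataset, pattern)          # 3 of 4 transactions match
estimate = sample_frequency(dataset, pattern, sample_size=100)
print(f"exact frequency: {exact}, sampled estimate: {estimate}")
```

With a user-provided threshold of, say, 0.5, the pattern above would be reported as frequent on the full toy dataset. A sampling-based approach replaces the full scan with a scan of the sample and must account for the gap between the estimate and the exact frequency, which is the role the statistical learning bounds play in the article.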

MDPI and ACS Style

Santoro, D.; Tonon, A.; Vandin, F. Mining Sequential Patterns with VC-Dimension and Rademacher Complexity. Algorithms 2020, 13, 123.

