Special Issue "Big Data Algorithmics"

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Algorithms for Databases, and Data Structures".

Deadline for manuscript submissions: 31 August 2020.

Special Issue Editors

Prof. Andrea Pietracaprina
Guest Editor
Department of Information Engineering, University of Padova, I-35131 Padova, Italy
Interests: big data algorithmics and data mining; algorithms and data structures for parallel and hierarchical platforms; models of computation; wireless interconnection networks and routing
Prof. Geppino Pucci
Guest Editor
Department of Information Engineering, University of Padova, I-35131 Padova, Italy
Interests: algorithms and data structures for big data; models of computation; wireless interconnection networks and routing; graph analytics; high-performance numerical kernels; data mining
Prof. Francesco Silvestri
Guest Editor
Department of Information Engineering, University of Padova, I-35131 Padova, Italy
Interests: algorithms and data structures for big data; parallel and memory-efficient algorithms; algorithms for similarity search; models of computation

Special Issue Information

Dear Colleagues,

The last two decades have witnessed an impressive growth in the amount of digital data produced across a wide spectrum of application domains. The effective exploitation of these data is of great value to science, business, government, and society, and it has been regarded as one of the most important scientific challenges of the 21st century. Due to their volume, velocity, heterogeneity, and noise, big data require a sharp paradigm shift with respect to traditional computing. Novel computational frameworks such as MapReduce and Streaming have recently established themselves as standards for big data processing, both in academia and in industry. The effective use of these frameworks requires new approaches to algorithm design. For instance, algorithmic solutions must take into account that the data are typically distributed or streamed and, in any case, cannot fit in a single machine's local memory. Moreover, since exact solutions are often computationally infeasible for very large and possibly noisy inputs, suitable tradeoffs must be sought between the accuracy of the returned solution and computational efficiency.
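To give a flavor of the streaming constraint mentioned above, the following Python sketch maintains a uniform random sample of an arbitrarily long stream in constant memory (Vitter's classic Algorithm R). This is an illustrative example using only the standard library; it is not tied to any of the works in this issue.

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of k items from a stream of unknown
    length, using O(k) memory (Vitter's Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)
        else:
            # Item i replaces a reservoir slot with probability k / (i + 1),
            # which keeps every prefix item equally likely to survive.
            j = rng.randrange(i + 1)
            if j < k:
                sample[j] = item
    return sample

# A 10-element uniform sample of a million-item stream, in constant memory.
sample = reservoir_sample(range(1_000_000), 10)
```

The key point is that the algorithm never stores more than k items, no matter how long the stream is, which is exactly the regime in which the input "cannot fit in a single machine's local memory".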

Despite the considerable effort that the research community has been devoting to big data algorithmics, there is a dire need for new contributions, as well as for efforts to clean up and systematize the most recent advances in the field. This Special Issue aims to take a step in this direction by soliciting original and unpublished contributions in the area. Topics of interest include, but are not limited to, those listed below.

TOPICS OF INTEREST

  • Computational Models for Big Data Processing
  • MapReduce Algorithms
  • Streaming Algorithms
  • Parallel and Distributed Algorithms for Data Intensive Computing
  • Approximation Algorithms for Massive Instances of Optimization Problems
  • Algorithms for Large Graphs
  • Models and Algorithms for Large Evolving Data and Sequences
  • Models and Algorithms for Large Uncertain Data
  • Dimensionality Reduction Techniques
  • Compact Data Structures and Sketches
  • Sampling Methodologies
  • Impossibility Results
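As a small illustration of the "Compact Data Structures and Sketches" topic, the following Python example implements a basic Count-Min sketch: approximate frequency counting in space sublinear in the number of distinct items. The parameters and the class itself are illustrative choices, not taken from any paper in this issue.

```python
import random

class CountMinSketch:
    """Approximate frequency counts in small space. Estimates never
    undercount; overcounts are bounded with high probability for
    width ~ e/eps and depth ~ ln(1/delta)."""

    def __init__(self, width=2000, depth=5, seed=0):
        rng = random.Random(seed)
        self.width, self.depth = width, depth
        # One independent salt per row stands in for a hash function.
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _cols(self, item):
        for salt in self.salts:
            yield hash((salt, item)) % self.width

    def add(self, item, count=1):
        for row, col in enumerate(self._cols(item)):
            self.table[row][col] += count

    def estimate(self, item):
        # Every row overestimates due to collisions; the minimum over
        # rows is the tightest available bound.
        return min(self.table[row][col]
                   for row, col in enumerate(self._cols(item)))

cms = CountMinSketch()
for word in ["big"] * 100 + ["data"] * 40 + ["algo"] * 7:
    cms.add(word)
# cms.estimate("big") is at least 100 and at most the total count, 147.
```

The sketch uses O(width × depth) counters regardless of how many distinct items the stream contains, which is what makes it attractive for the massive-data settings targeted by this Special Issue.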

Prof. Andrea Pietracaprina
Prof. Geppino Pucci
Prof. Francesco Silvestri
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1000 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (2 papers)


Research

Open Access Article
Mining Sequential Patterns with VC-Dimension and Rademacher Complexity
Algorithms 2020, 13(5), 123; https://doi.org/10.3390/a13050123 - 18 May 2020
Abstract
Sequential pattern mining is a fundamental data mining task with applications in several domains. We study two variants of this task: the first is the extraction of frequent sequential patterns, whose frequency in a dataset of sequential transactions is higher than a user-provided threshold; the second is the mining of true frequent sequential patterns, which appear with probability above a user-defined threshold in transactions drawn from the generative process underlying the data. We present the first sampling-based algorithm to mine, with high confidence, a rigorous approximation of the frequent sequential patterns from massive datasets. We also present the first algorithms to mine approximations of the true frequent sequential patterns with rigorous guarantees on the quality of the output. Our algorithms are based on novel applications of Vapnik-Chervonenkis dimension and Rademacher complexity, advanced tools from statistical learning theory, to sequential pattern mining. Our extensive experimental evaluation shows that our algorithms provide high-quality approximations for both problems we consider.
(This article belongs to the Special Issue Big Data Algorithmics)
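The sampling idea behind this line of work can be caricatured in a few lines of Python. The toy sketch below estimates pattern frequencies from a uniform sample of transactions rather than a full scan; it is emphatically not the authors' algorithm, which derives its sample-size guarantees from VC-dimension and Rademacher complexity rather than from naive uniform sampling, and the dataset and patterns are invented for illustration.

```python
import random

def occurs(pattern, txn):
    """True if `pattern` appears as a subsequence of transaction `txn`.
    (`item in it` advances the iterator, so order is respected.)"""
    it = iter(txn)
    return all(item in it for item in pattern)

def sample_frequencies(dataset, patterns, sample_size, seed=0):
    """Estimate each pattern's frequency from a uniform sample of
    transactions instead of scanning the whole dataset."""
    rng = random.Random(seed)
    sample = rng.sample(dataset, sample_size)
    return {pat: sum(occurs(pat, t) for t in sample) / sample_size
            for pat in patterns}

# A toy dataset of 1000 sequential transactions; both patterns below
# occur in exactly half of them.
dataset = [("a", "b"), ("a", "b", "c"), ("a",), ("b", "c")] * 250
freqs = sample_frequencies(dataset, [("a", "b"), ("c",)], sample_size=200)
# freqs[("a", "b")] and freqs[("c",)] are both close to 0.5, up to sampling error.
```

The point of the rigorous machinery in the paper is precisely to bound, in advance, how large the sample must be for all such estimates to be simultaneously accurate.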

Open Access Article
Deterministic Coresets for k-Means of Big Sparse Data
Algorithms 2020, 13(4), 92; https://doi.org/10.3390/a13040092 - 14 Apr 2020
Abstract
Let P be a set of n points in R^d, let k ≥ 1 be an integer, and let ε ∈ (0, 1) be a constant. An ε-coreset is a subset C ⊆ P with appropriate non-negative weights (scalars) that approximates any given set Q ⊂ R^d of k centers. That is, the sum of squared distances from every point in P to its closest point in Q equals, up to a factor of 1 ± ε, the weighted sum of squared distances from C to the same k centers. If the coreset is small, we can solve problems such as k-means clustering or its variants (e.g., discrete k-means, where the centers are restricted to lie in P or in other restricted zones) on the small coreset to obtain faster provable approximations. Moreover, it is known that such coresets support streaming, dynamic, and distributed data using the classic merge-reduce trees. The fact that the coreset is a subset implies that it preserves the sparsity of the data. However, existing coresets of this kind are randomized, and their size has at least a linear dependency on the dimension d. We suggest the first such coreset of size independent of d. This is also the first deterministic coreset construction whose resulting size is not exponential in d. Extensive experimental results and benchmarks are provided on public datasets, including the first coreset of the English Wikipedia, using Amazon's cloud.
(This article belongs to the Special Issue Big Data Algorithmics)
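To make the coreset definition concrete, the short Python sketch below queries a weighted subset in place of the full point set. It uses naive uniform sampling with uniform weights, which is much weaker than the deterministic construction of the paper; it only illustrates how a weighted coreset is evaluated against a candidate set of centers, and all data and centers here are invented for the example.

```python
import random

def kmeans_cost(points, centers, weights=None):
    """Sum of (weighted) squared distances from each point to its
    nearest center, i.e., the k-means objective."""
    weights = weights if weights is not None else [1.0] * len(points)
    return sum(w * min(sum((p - c) ** 2 for p, c in zip(pt, ctr))
                       for ctr in centers)
               for pt, w in zip(points, weights))

rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5000)]

# Naive uniform-sampling "coreset": each of the m sampled points stands
# in for n/m original points via its weight.
m = 200
coreset = rng.sample(points, m)
weights = [len(points) / m] * m

centers = [(0.0, 0.0), (2.0, 2.0)]
full = kmeans_cost(points, centers)
approx = kmeans_cost(coreset, centers, weights)
# `approx` estimates `full` up to sampling error.
```

An ε-coreset in the sense of the abstract makes this approximation guarantee hold simultaneously for every choice of k centers, which is what allows clustering to be solved on the subset alone.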
