Special Issue "Real-time Stream Analytics, Stream Mining, CER/CEP and Stream Data Management in Big Data"

A special issue of Data (ISSN 2306-5729). This special issue belongs to the section "Information Systems and Data Management".

Deadline for manuscript submissions: closed (1 March 2021).

Special Issue Editors

Dr. Alessandro Margara
Guest Editor
Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, 20133 Milano, Italy
Interests: distributed computing; event-based systems; CEP/CER; stream processing
Dr. Albert Bifet
Guest Editor
Dr. Sabri Skhiri
Guest Editor
EURA NOVA Research, Belgium
Interests: stream processing; machine learning; distributed and high performance computing; data management

Special Issue Information

Dear Colleagues,

After the success of the first four editions of this workshop, co-located with IEEE Big Data 2016, 2017, 2018, and 2019, we are pleased to organize the 5th Workshop on Real-time Stream Analytics, Stream Mining, CER/CEP, and Stream Data Management in Big Data. The authors of accepted papers will have the opportunity to publish an extended version in this Special Issue of Data.

Stream processing and real-time analytics have become some of the most important topics in Big Data. Notably, industry is developing increasingly robust, powerful, and intelligent stream processing applications. Fraud detection for instant payments, scoring of consumers on websites and in shops, claims analysis and cost estimation, and image processing for surveillance, food, and agriculture are only a few potential applications of real-time stream processing and analytics.

The recent introduction of stateful stream processing [9,14,16] has enabled the development of a new kind of real-time application. Indeed, hot and cold data can now be combined into a single real-time data flow using the concept of Stream Tables [15,16]. The concept of duality between Streams and Tables is not new. It was first introduced in 2003 as a “Relation to Stream” transformation in the STREAM system [20]. However, it is only with the emergence of state management [14] that Stream Tables can be used in real time and in a completely distributed manner.

Furthermore, stateful stream processing has been applied in data management using stream and complex event processing (CEP) or composite event recognition (CER) [20]. New architecture patterns have been proposed to address data pipelines and data management within the enterprise. For instance, the authors in [11,12] proposed new designs for the extract, transform, and load (ETL) steps based on stream processing. By breaking down silos between enterprise data warehouses (EDW) and big data lakes [13], this work has opened the door to completely redesigning the way data are transported, stored, and used within the big data environment. More recently, Friedman et al. described in [21] how a data hub can be implemented to store and distribute data within an enterprise context.

In the past few years, researchers and practitioners in the area of data stream management and CEP/CER [1–5] have developed systems to process unbounded streams of data and quickly detect situations of interest. Today, big data technologies provide a new ecosystem to foster research in this area [6]. Highly scalable distributed stream processors (such as Apache Spark [9], Apache Flink [10], Kafka Streams [18,19], and Google Dataflow [17]), the convergence of batch and stream engines, and the emergence of state management and stateful stream processing have opened up new opportunities for highly scalable and distributed real-time analytics.

Going further, these technologies also provide a solid foundation for algorithms that complement CEP/CER in the use cases required by industry. As a result of the stateful nature of stream processors [14], stream SQL statements can be applied directly in the streaming engine and dynamic tables can be created [12,15,18]. In addition, formalisms for reasoning about durative events have been introduced to improve CER [22–24]. This led to the introduction of stream reasoning for improving stream mining tasks, autonomous cars or drones, and many other use cases.

We invite researchers in this field to submit papers studying scalable online learning, incremental learning on stream processing infrastructures, complex event processing, and composite event recognition. We also encourage submissions on data stream management, data architecture using stream processing, and Internet of Things (IoT) data streaming. Additionally, we welcome submissions on the use of stream processing in innovative architectures.

References:

[1] E. Alevizos, A. Skarlatidis, A. Artikis, and G. Paliouras. “Probabilistic complex event recognition: A survey”. ACM Comput. Surv., 50(5):71:1–71:31, 2017.

[2] G. Cugola and A. Margara. “Complex event processing with T-REX”. Journal of Systems and Software, 85(8):1709–1728, 2012.

[3] I. Kolchinsky, I. Sharfman, and A. Schuster. “Lazy evaluation methods for detecting complex events”. In Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems, DEBS '15, pages 34–45. ACM, 2015.

[4] D. J. Abadi et al. “The Design of the Borealis Stream Processing Engine”. In CIDR, pages 277–289, 2005.

[5] J. Agrawal et al. “Efficient pattern matching over event streams”. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pages 147–160, 2008.

[6] N. Giatrakos, E. Alevizos, A. Artikis, A. Deligiannakis, and M. Garofalakis. “Complex event recognition in the big data era: A survey.” VLDB Journal, 2019. 

[7] Confluent blog post: Event Sourcing, CQRS, Stream Processing and Apache Kafka: What’s the Connection? https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/

[8] Confluent blog post: A practical guide to build a stream data platform, https://www.confluent.io/blog/stream-data-platform-1/ 

[9] Matei Zaharia et al., “Discretized Streams: Fault-Tolerant Streaming Computation at Scale”. In the Proceedings of the SOSP Conference, 2013.

[10] Paris Carbone et al., “Apache Flink™: Stream and Batch Processing in a Single Engine”. In the Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2015.

[11] Neha Narkhede, ETL Is Dead, Long Live Streams, https://www.infoq.com/presentations/etl-streams, December 2016.

[12] Tathagata Das, Real-time Streaming ETL with Structured Streaming in Apache Spark 2.1, https://databricks.com/blog/2017/01/19/real-time-streaming-etl-structured-streaming-apache-spark-2-1.html, January 2017.

[13] Michael Armbrust, Databricks Delta: A Unified Data Management System for Real-time Big Data, https://databricks.com/blog/2017/10/25/databricks-delta-a-unified-management-system-for-real-time-big-data.html, October 2017.

[14] Paris Carbone et al., “State Management in Apache Flink™: Consistent Stateful Distributed Stream Processing”. In the Proceedings of VLDB 2017.

[15] Fabian Hueske, Continuous Queries on Dynamic Tables, https://flink.apache.org/news/2017/04/04/dynamic-tables.html, April 2017.

[16] Nico Kruber, A Journey to Beating Flink's SQL Performance, https://www.ververica.com/blog/a-journey-to-beating-flinks-sql-performance, February 2020.

[17] Tyler Akidau et al., “The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing”. In the Proceedings of the VLDB Endowment, vol. 8, pp. 1792–1803, 2015.

[18] KStream Concepts, KTables, https://kafka.apache.org/documentation/streams/, consulted in March 2020.

[19] Abhishek Gupta, Learn stream processing with Kafka Streams: Stateless operations, https://dev.to/itnext/learn-stream-processing-with-kafka-streams-stateless-operations-1k4h, March 2020.

[20] Arasu et al., “STREAM: The Stanford Data Stream Management System”. In the Proceedings of SIGMOD 2003.

[21] Ted Friedman et al., “Implementing the Data Hub: Architecture and Technology Choices”. Gartner Report, August 2018.

[22] Foundations of Composite Event Recognition, Dagstuhl Seminar, February 2020, https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=20071

[23] A. Artikis, M. J. Sergot, and G. Paliouras. “An event calculus for event recognition”. IEEE Trans. Knowl. Data Eng., 27(4):895–908, 2015.

[24] Daniele Dell'Aglio, Emanuele Della Valle, Frank van Harmelen, and Abraham Bernstein. “Stream reasoning: A survey and outlook”. Data Sci., 1(1-2):59–83, 2017.

[25] Harald Beck, Minh Dao-Tran, and Thomas Eiter. “LARS: A Logic-based framework for Analytic Reasoning over Streams”. Artif. Intell., 261:16–70, 2018.

Dr. Alessandro Margara
Dr. Albert Bifet
Dr. Sabri Skhiri
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Data is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Stream processing
  • Complex event processing (CEP)
  • Composite event recognition (CER)
  • Stream data management
  • Stream mining
  • Online mining
  • Big data

Published Papers

There are no accepted submissions to this special issue at this moment.