Article
Peer-Review Record

Quickening Data-Aware Conformance Checking through Temporal Algebras

Information 2023, 14(3), 173; https://doi.org/10.3390/info14030173
by Giacomo Bergami *, Samuel Appleby and Graham Morgan
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 14 November 2022 / Revised: 3 March 2023 / Accepted: 5 March 2023 / Published: 8 March 2023
(This article belongs to the Special Issue International Database Engineered Applications)

Round 1

Reviewer 1 Report

The paper describes an approach to conformance checking that exploits relational databases. The authors define a temporal algebra, xtLTL_f, that is as expressive as LTL_f, and show how xtLTL_f can exploit a relational database to obtain performance gains over other approaches. I group my comments into major, intermediate, and minor comments as follows:
* Major comments: The work is interesting, and the authors show that their approach outperforms state-of-the-art tools. However, the paper suffers from presentation flaws that make it hard to follow. The paper structure should be improved so that it is also accessible to readers who are not experts in the field. The authors should have started with a formalization of the problem. Some definitions are missing, and it is hard to follow even the case studies at the beginning of the paper (e.g., it is not clear how to read Fig. 2b). The underlying assumptions are not explicitly declared (e.g., are the events totally ordered? Are all the events observable?).
My suggestion is to: a) first informally introduce the use cases that motivate the work; then b) formalize the problem and introduce the definitions (for example, the concepts of activation, target, and trace are only informally mentioned in the introduction, and other concepts such as atomization and elementary intervals are not explicitly introduced); and finally c) return to the case studies with their formalization.
Also, Section 2 (Related Work) in fact contains some important background notions and not just comparisons with other approaches, as its title suggests. I suggest changing the title of this section to "Background" or "Preliminaries", and I would also add here the semantics of LTL_f from Supplement I. Table 4 contains the notation, but it cannot replace the formalization because it is hard to read (the rows are hard to follow), and one cannot grasp the meaning of the symbols (e.g., what is \omega?).
It is also not clear what the relationship is between this work and the relational model/traditional relational DBMS architectures. The discussion of query optimization and temporal algebras leads the reader to think of an approach that uses a custom database architecture (this is also briefly mentioned in l. 1083), but this is not made clear. I suggest adding a picture representing the architecture of the system and discussing the relationship with temporal relational algebras (refer, for example, to Anselma, Bottrighi, Montani, Terenziani (2011). Extending BCDM to cope with proposals and evaluations of updates. IEEE Transactions on Knowledge and Data Engineering).
* Intermediate comments: I do not think that the abstract should start by stating that this paper is an extension of a conference paper. It is enough to report this information in the introduction.
I did not catch what the color blue represents in the text.
Table 1: "Exists" is recursively defined, but the base case is missing.
Table 1: "Choice" is described in natural language as a mutual exclusion; however, the formula uses an inclusive disjunction. This also happens in other parts of the paper (see lines 922-925). DECLARE, to the best of my knowledge, has two distinct constructs: Choice and Exclusive Choice.
At the end of Section 2, the authors introduce a notation that exploits the physical representation of a relational table; however, I suggest not relying on a physical representation when introducing the logical description of the database in Section 3. This could be avoided by exploiting a primary key.
l. 398: The relational tables CountingTable and ActivityTable are introduced, but the table attributes are not explicitly described here. Also, the primary keys and the referential integrity constraints are not explicitly stated, as one would expect at the logical level.
l. 411-412: In the definition of ActivityTable: a) there are some parentheses missing; b) please check whether the first occurrence of \phi' and the second occurrence of \pi' should refer to j (I would expect that the event following j-1 is j and that the event preceding j+1 is j).
l. 444: what is k?
Section 3.2.1: I suggest not relying on Table 1 to introduce the triplets returned by intermediate operators, but introducing them explicitly.
Sections 3.2.1 and 3.2.2: I do not understand the categorization of Base operators and Unary operators. The authors state that unary operators are a generalization of base operators; however, it is not possible to reduce the generalized definitions to the special case of the base operators, because it seems that the base operators are in fact both a special case and a description of an implementation of the generalized definitions.
The definition of Absence in Section 3.2.1 exploits the relational table CountingTable; however, it is not clear how this definition handles the case where an event has not occurred at all. I guess that the relational table does not contain the events that occurred 0 times. Notice that the natural language description of Absence does not rule out 0-occurrences.
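
For illustration only, two of the semantic points raised above (Choice versus Exclusive Choice, and Absence over a count table that stores no zero-occurrence rows) can be sketched in Python. This is not taken from the manuscript: the function names, the list-of-labels trace encoding, and the dictionary-based count table are all hypothetical, and Absence(label, n) is assumed to mean "label occurs fewer than n times".

```python
from collections import Counter

def choice(trace, a, b):
    """Choice(a, b): at least one of a, b occurs (inclusive disjunction)."""
    return (a in trace) or (b in trace)

def exclusive_choice(trace, a, b):
    """Exclusive Choice(a, b): exactly one of a, b occurs in the trace."""
    return (a in trace) != (b in trace)

def build_count_table(trace):
    """Hypothetical sparse count table: only labels that occur get a row."""
    return dict(Counter(trace))

def absence(count_table, label, n):
    """Assumed reading: label occurs fewer than n times.

    A missing row is treated as a count of 0, so a label that never
    occurs satisfies Absence for every n >= 1.
    """
    return count_table.get(label, 0) < n

# A trace in which both a and b occur separates the two templates.
trace = ["a", "b", "c"]
both_inclusive = choice(trace, "a", "b")            # a and b both present
both_exclusive = exclusive_choice(trace, "a", "b")  # violates mutual exclusion

# Label d never occurs, so the sparse table has no row for it.
table = build_count_table(trace)
d_absent = absence(table, "d", 1)
```

If missing rows were instead skipped (for example, by selecting only the rows whose stored count is below n), a label with no occurrences would not be reported as satisfying Absence, which is the ambiguity the comment points at.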

* Minor comments: Fig. 2 is without caption.
l. 272: "it semantics" -> "its semantics"
l. 358: "to the that" -> ?
l. 406: "i-the" -> "i-th"
l. 590: "??"
l. 1986: "to can" -> ?
l. 1990: reference 14 seems not consistent with the text. Please check.

Author Response

We (as the authors) thank the reviewer for the constructive comments that, we hope, helped to improve the manuscript’s quality. Please refer to the attached PDF for a full reply to your kind observations.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper is well organized and well presented. I have several suggestions:

1. Background and motivation should be added in the abstract.

2. For the query optimizer, the individual steps should be described. Fig. 5 does not illustrate the steps well. An example could also be added.

3. Temporal algebras lack formal definitions and explanations.

4. Differences between this paper and the previous one should also be explained in detail.

Overall, the paper studies the topic comprehensively and can be considered for publication after these revisions.

Author Response

We (as the authors) thank the reviewer for the constructive comments that, we hope, helped to improve the manuscript’s quality. Please refer to the attached PDF for a full response to your points.

Author Response File: Author Response.pdf
