Next Article in Journal
Study on TLS Point Cloud Registration Algorithm for Large-Scale Outdoor Weak Geometric Features
Next Article in Special Issue
A Computationally Efficient Online/Offline Signature Scheme for Underwater Wireless Sensor Networks
Previous Article in Journal
Power Autonomy Estimation of Low-Power Sensor for Long-Term ECG Monitoring
Previous Article in Special Issue
Cloud of Things in Crowd Engineering: A Tile-Map-Based Method for Intelligent Monitoring of Outdoor Crowd Density
 
 
Article
Peer-Review Record

SAX and Random Projection Algorithms for the Motif Discovery of Orbital Asteroid Resonance Using Big Data Platforms

Sensors 2022, 22(14), 5071; https://doi.org/10.3390/s22145071
by Lala Septem Riza 1,*, Muhammad Naufal Fazanadi 1, Judhistira Aria Utama 2, Khyrina Airin Fariza Abu Samah 3, Taufiq Hidayat 4 and Shah Nazir 5
Reviewer 1:
Reviewer 2:
Sensors 2022, 22(14), 5071; https://doi.org/10.3390/s22145071
Submission received: 29 March 2022 / Revised: 25 May 2022 / Accepted: 30 May 2022 / Published: 6 July 2022
(This article belongs to the Special Issue Big Data Analytics in Internet of Things Environment)

Round 1

Reviewer 1 Report

The proposed algorithms show cases that Spark and Hadoop computing models can be easily adopted in many scientific fields including astronomy. Presented experiments lack appropriate description of system specification and configuration which makes comparing the efficiency of the algorithm impossible. Authors have not provided good reference points to compare Spark with other available solutions. Minor revision of the article is suggested.

It is not clear if computations were performed on one VM or multiple VMs. Figure 2. suggests that authors used only one machine. If more than one node was used: Are worker nodes homogenous or heterogenous? How is the Spark configured? Does it use Spark Standalone FIFO scheduler or one of the Hadoop YARN schedulers?

What is the configuration of storage on the virtual machines?

  • Are the disks mounted from the server hosting the VMs? Is there a redundancy technique used that may influence computation speed (i.e., reading HDFS data from LVM logical volumes is notoriously slow)?
  • If disks are mounted from external storage, then what is the point of using HDFS?

It is not clear if SwiftVis should be used as benchmark software in the field of Astronomy, are there any other similar solutions?

Authors use three different acronyms of User Defined Function: 1) UDP (line 23 and 25), 2) EDF (Figure 1.), 3) UDF (line 169, 172 etc.)

Line 201: DataFrame and RDD are different APIs.

Line 201: this is not how HDFS works. Did you mean: data is copied from HDFS to local storage?

Line 332: this is Amdahl’s law in action.

Line 372: What is the point of comparing SwiftVis (a GUI application that can run local tasks) with Spark (distributed processing model)? Can SwiftVis be run in parallel? Is it possible to run Spark tasks that will launch SwiftVis instances in batch mode (https://www.cs.trinity.edu/~mlewis/SwiftVis/BatchProcessing.html)? This way, authors could compare the speed of the proposed solution with competing software.

Authors use three different acronyms of User Defined Function: 1) UDP (line 23 and 25), 2) EDF (Figure 1.), 3) UDF (line 169, 172 etc.

Line 201: DataFrame and RDD are different APIs.

Line 201: this is not how HDFS works. Did you mean: data is copied from HDFS to local storage?

Line 332: this is Amdahl’s law in action.

Line 372: What is the point of comparing SwiftVis (a GUI application that can run local tasks) with Spark (distributed processing model)? Can SwiftVis be run in parallel? Is it possible to run Spark tasks that will launch SwiftVis instances in batch mode (https://www.cs.trinity.edu/~mlewis/SwiftVis/BatchProcessing.html)? This way, authors could compare the speed of the proposed solution with competing software.

Authors use three different acronyms of User Defined Function: 1) UDP (line 23 and 25), 2) EDF (Figure 1.), 3) UDF (line 169, 172 etc.)

Line 201: DataFrame and RDD are different APIs.

Line 201: this is not how HDFS works. Did you mean: data is copied from HDFS to local storage?

Line 332: this is Amdahl’s law in action.

Line 372: What is the point of comparing SwiftVis (a GUI application that can run local tasks) with Spark (distributed processing model)? Can SwiftVis be run in parallel? Is it possible to run Spark tasks that will launch SwiftVis instances in batch mode (https://www.cs.trinity.edu/~mlewis/SwiftVis/BatchProcessing.html)? This way, authors could compare the speed of the proposed solution with competing software.

The  red underline can be removed from text in some figures. Source code should be provided as text, not as a picture?

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

  1. The rationale for using Apache Hadoop and spark should be clearly provided. 
  2. The contribution and motivation of the paper are not clear in the introduction. It should be written in the itemized form.
  3.  Actually, the start of the introduction is fine with simple sentences. However, it has become very technical in the later paragraphs of the introduction. This is then difficult to understand.
  4. The rationale of the computation model given in Figure 1 should be explained clearly.
  5. Pseudocode given in Figure 2 should be simplified or clearly explained.
  6. Explain the steps of the computation model in a simple way and then go into details later. At this stage, it is difficult to understand.
  7.  Table 2 provides the example of data. It is really making the paper difficult to read. Further, all the terms are not explained here. 
  8. Figure 6 also gives the example of data. Is the figure drawn by yourself? If it is taken from other sources, please refer to it or draw it yourself.
  9. Tables 7 to 9 should be explained in a Figure format.  Also, the explanation of the results should be improved.
  10. Part of the conclusion can be moved to the introduction and the whole conclusion can be re-written.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

no

Reviewer 2 Report

All my comments are addressed. I would suggest accepting the paper.

Back to TopTop