Article
Peer-Review Record

Intelligent Indexing—Boosting Performance in Database Applications by Recognizing Index Patterns

Electronics 2020, 9(9), 1348; https://doi.org/10.3390/electronics9091348
by Alberto Arteta Albert 1,*,†,‡, Nuria Gómez Blas 2,† and Luis Fernando de Mingo López 2,†
Reviewer 2: Anonymous
Submission received: 16 July 2020 / Revised: 16 August 2020 / Accepted: 18 August 2020 / Published: 20 August 2020
(This article belongs to the Special Issue Pattern Recognition and Applications)

Round 1

Reviewer 1 Report

The manuscript presents a novel algorithm for boosting the performance of database indexing. Despite the presented description and the given results, I have many doubts about the quality of this work:

  • The introduction and the state-of-the-art part of the work consist of only a basic description of the problem and have poor reference literature (only 15 entries, of which 4 are from 2017, 2 from 2016, 8 older, and 1 a webpage). There are no current works on this topic, whereas intelligent indexing is still actively developed.
  • Essentially, the presented work compares only two different approaches, a real DBMS and a DBMS learning algorithm; there is a slight difference in the presented method, but no other comparison was given, such as different classification methods or different ANN architectures.
  • The presented manuscript is at an early editorial stage; there are many formatting problems (lines 27-28, 57-63, 90-107, 115-142). Proofreading should be performed before submitting a version ready for review.
  • There is no explanation of the DBMS term. The quality of Figures 1, 3, and 10 is very poor, the figure numbering is inconsistent, and the tables have no captions or numbering.
  • Figures 1 and 2 (this time on page 10) present the same results and can be merged into one plot.
  • There is no conclusion section at the end.

Author Response

Dear reviewer,

Thank you for spending time reviewing and improving our manuscript.

Please find our revised version with the new updates based on your comments.

"The introduction and the state-of-the-art part of the work consist of only a basic description of the problem and have poor reference literature (only 15 entries, of which 4 are from 2017, 2 from 2016, 8 older, and 1 a webpage). There are no current works on this topic, whereas intelligent indexing is still actively developed." => New references have been added, including recent ones; you can now find 9 references from 2017-2020.

"Essentially, the presented work compares only two different approaches, a real DBMS and a DBMS learning algorithm; there is a slight difference in the presented method, but no other comparison was given, such as different classification methods or different ANN architectures." => The standard DBMS has been compared with two different ANN architectures, as per your suggestion.

"The presented manuscript is at an early editorial stage; there are many formatting problems (lines 27-28, 57-63, 90-107, 115-142). Proofreading should be performed before submitting a version ready for review." => Those lines have been reformatted, and the manuscript has been rebuilt in LaTeX to resolve these issues.

"There is no explanation of the DBMS term." => An explanation has been added (line 18).

"The quality of Figures 1, 3, and 10 is very poor, the figure numbering is inconsistent, and the tables have no captions or numbering." => The figures have been recreated at higher quality, with captions.

"Figures 1 and 2 (this time on page 10) present the same results and can be merged into one plot." => The figures have been merged into Figure 8, although they will appear as sub-figures. Please note that one shows the trend while the other is a scatter plot.

"There is no conclusion section at the end." => The conclusions are included in the Discussion section. That section has been renamed, and more conclusions have been added for clarity.

Thank you again,

Dr. Arteta

Author Response File: Author Response.pdf

Reviewer 2 Report

The article presents a machine learning approach to deciding when to change indexes that are causing performance issues on database access. The article is written in a reasonably clear way, but there are many small issues with the writing that must be improved before publication. These include not only typical typos, but also problems with verb tenses, some strangely constructed sentences, and sentences that are fragments, do not end with a full stop, or have citations after the full stop.

The article presents the new approach in an understandable way, but there is no discussion of whether there is previous work using other dynamic approaches to this problem, or other approaches based on machine learning. Even stating that there is no such work would be important to show the relevance of this contribution to the field.

I am also not sure that describing the case study before proposing the new approach is really the best way to go. Typically, a general new approach should be presented first, and then the chosen case study described. At least some explanation should be given for this organization of the paper.

My main concern with the paper is the results section. While the results seem to support the validity of the proposed approach, I think the section should be extended and better explained. I am not sure how the tests were chosen or performed, but the paper would benefit from an expanded results section where, for example, different sets of hits could be sent in a randomized way to the system over several runs, so that some variance measure could be taken for the performance of each approach. This way we could have some idea of whether the results obtained for each data set are really statistically significant.

Overall I think this is an interesting approach to a relevant problem, but the article needs some more work before it can be published.

Author Response

Dear reviewer,

Thank you so much for taking the time to review our manuscript and improve its quality.

Please note that we have made the changes you suggested. All the suggested changes from the reviewers have been highlighted.

Please see below.

"The article presents a machine learning approach to deciding when to change indexes that are causing performance issues on database access. The article is written in a reasonably clear way, but there are many small issues with the writing that must be improved before publication. These include not only typical typos, but also problems with verb tenses, some strangely constructed sentences, and sentences that are fragments, do not end with a full stop, or have citations after the full stop." => The manuscript has been reviewed again and the typos have been corrected. The editor has been changed to LaTeX, so many of the formatting issues have been fixed automatically.

"The article presents the new approach in an understandable way, but there is no discussion of whether there is previous work using other dynamic approaches to this problem, or other approaches based on machine learning. Even stating that there is no such work would be important to show the relevance of this contribution to the field." => A sentence explaining previous related work has been added to the State of the Art (line 151).

"I am also not sure that describing the case study before proposing the new approach is really the best way to go. Typically, a general new approach should be presented first, and then the chosen case study described. At least some explanation should be given for this organization of the paper." => An explanation has been added at the beginning of the 'case study' section (lines 155-156) and in the results section (lines 252-272).

"My main concern with the paper is the results section. While the results seem to support the validity of the proposed approach, I think the section should be extended and better explained. I am not sure how the tests were chosen or performed, but the paper would benefit from an expanded results section where, for example, different sets of hits could be sent in a randomized way to the system over several runs, so that some variance measure could be taken for the performance of each approach. This way we could have some idea of whether the results obtained for each data set are really statistically significant." => The results section has been modified by adding more information about the testing. Sentences describing the randomized technique have been added, as well as more details of the simulated environment (lines 252-272).

Thank you again for your feedback.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Thank you for including my suggestions. In my opinion, the soundness of the introduction is much better in this form. I am afraid that the research pattern and the results presentation are still not finished.

1) "Essentially, the presented work compares only two different approaches, a real DBMS and a DBMS learning algorithm; there is a slight difference in the presented method, but no other comparison was given, such as different classification methods or different ANN architectures." => "The standard DBMS has been compared with two different ANN architectures, as per your suggestion."

  • The authors used an ANN in the classification stage. The main question is why that kind of classifier and why that architecture? In the results section the authors should prove (by means of numerical verification) that they checked different ANN architectures and different classifiers (SVM, random forest, kNN); maybe this will improve the indexing time presented in Table 4, or at least give an explanation for the choice of method.
  • What is the accuracy of the given classification? Some statistics should be included. Does it always give the correct result?

2) Short minor remarks:

  • Figure 8 has no caption ('kkk') and should be joined into one figure; there is no need to show the same result (scatter and trend) in two different figures, as both are the same and can be marked in one figure.
  • A section of abbreviations is required; for example, DML has not been introduced. This will make the manuscript easier to read.

Author Response

Dear reviewer,

Thank you again for helping us improve our manuscript.

Please note that we included the changes you suggested. See below.

  • "The authors used an ANN in the classification stage. The main question is why that kind of classifier and why that architecture? In the results section the authors should prove (by means of numerical verification) that they checked different ANN architectures and different classifiers (SVM, random forest, kNN); maybe this will improve the indexing time presented in Table 4, or at least give an explanation for the choice of method."
  • "What is the accuracy of the given classification? Some statistics should be included. Does it always give the correct result?" => Two paragraphs have been added with more information about the models we used and the reason this ANN was chosen (lines 247-254, 263-267).

2) Short minor remarks:

  • "Figure 8 has no caption ('kkk') and should be joined into one figure; there is no need to show the same result (scatter and trend) in two different figures." => The caption has been fixed and the plots have been merged.
  • "A section of abbreviations is required; for example, DML has not been introduced." => A section of abbreviations has been added at the end of the manuscript.

Thank you again.

Author Response File: Author Response.pdf

Reviewer 2 Report

The main concerns expressed in the previous review were reasonably addressed by the authors' answers and the improvements made to the manuscript.

Author Response

"The main concerns expressed in the previous review were reasonably addressed by the authors' answers and the improvements made to the manuscript." => Thank you for helping us improve the quality of our work.

Round 3

Reviewer 1 Report

Dear authors, 

Thank you for including my suggestions. In the part on supervised classification you have to be much more precise. I hope that this will be the last iteration of corrections:

1) You wrote that you used 3 different classification approaches in the first part of your method: a) an ANN with a chosen topology, b) an ANN with a different topology, c) kNN.

2) What do you mean by the average difference? As far as I understood, you deal with 2-class classification, so it would be easy to apply a ROC curve or the AUC parameter. Why can you not show that?

3) Please include a table showing the basic classification parameters (TP, FN, AUC, etc.), with the 3 different classification approaches in rows and those parameters in columns.

Author Response

Dear reviewer,

Thank you again for helping to improve the quality of our manuscript. Please note that we made the changes you suggested. See below.

1) "You wrote that you used 3 different classification approaches in the first part of your method: a) an ANN with a chosen topology, b) an ANN with a different topology, c) kNN." => Yes, that is correct. We tried these 3 models.

2) "What do you mean by the average difference? As far as I understood, you deal with 2-class classification, so it would be easy to apply a ROC curve or the AUC parameter. Why can you not show that?" => We do not calculate the "average difference". Cross-validation returns the mean accuracy over all combinations of the K-fold data sets; from that accuracy, we can calculate the mean error of each model. We then subtract the error of ANN-1 from that of ANN-2 (the difference of the average errors). We realize this was confusing and have changed the paragraph to make it clearer (lines 247-255).
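For concreteness, the error-difference computation described in the answer above can be sketched as follows. The per-fold accuracies are made-up placeholders for illustration only, not values from the manuscript:

```python
# Illustrative sketch of comparing two models by cross-validation error.
# Fold accuracies are hypothetical placeholders, not the paper's results.

def mean_cv_error(fold_accuracies):
    """Mean error across K folds: 1 minus the average accuracy."""
    return 1.0 - sum(fold_accuracies) / len(fold_accuracies)

# Hypothetical per-fold accuracies for two ANN architectures (K = 5)
ann1_folds = [0.91, 0.89, 0.93, 0.90, 0.92]
ann2_folds = [0.88, 0.87, 0.90, 0.86, 0.89]

err1 = mean_cv_error(ann1_folds)  # mean accuracy 0.91 -> error 0.09
err2 = mean_cv_error(ann2_folds)  # mean accuracy 0.88 -> error 0.12

# The "difference of the average errors" between the two models
diff = err2 - err1
print(round(err1, 4), round(err2, 4), round(diff, 4))
```

With K-fold cross-validation, each fold's accuracy comes from training on K-1 folds and testing on the held-out one; averaging those accuracies (and hence errors) gives a single comparable figure per model.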

3) "Please include a table showing the basic classification parameters (TP, FN, AUC, etc.), with the 3 different classification approaches in rows and those parameters in columns." => The table has been included, along with a detailed explanation (lines 294-331).
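For reference, a minimal sketch of how the requested parameters are computed for a 2-class problem; the labels, predictions, and scores below are hypothetical examples, not data from the manuscript:

```python
# Sketch of basic binary-classification parameters (TP, FP, TN, FN, AUC).
# All inputs here are hypothetical illustrations.

def confusion_counts(y_true, y_pred):
    """Counts of TP, FP, TN, FN for binary (0/1) labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative
    (Mann-Whitney formulation; ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical true labels, hard predictions, and classifier scores
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn, auc(y_true, scores))
```

Running one classifier per row through these two helpers yields exactly the table the reviewer asks for: TP/FN (and TN/FP) from the hard predictions, AUC from the continuous scores.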

Thank you again

Author Response File: Author Response.pdf
