Article
Peer-Review Record

Supervised Dimensionality Reduction of Proportional Data Using Exponential Family Distributions

Electronics 2023, 12(15), 3355; https://doi.org/10.3390/electronics12153355
by Walid Masoudimansour 1,* and Nizar Bouguila 2
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4: Anonymous
Reviewer 5: Anonymous
Submission received: 25 June 2023 / Revised: 25 July 2023 / Accepted: 1 August 2023 / Published: 5 August 2023
(This article belongs to the Special Issue Data Push and Data Mining in the Age of Artificial Intelligence)

Round 1

Reviewer 1 Report

#--- Reviewer Comments to Authors ----#

Dear Authors

In this manuscript, the authors introduce a novel method to address the limitations of existing supervised dimensionality reduction algorithms when dealing with high-dimensional sparse data and multi-modal data. Existing algorithms struggle with the curse of dimensionality and do not handle multi-modal classes properly. The manuscript is straightforward and comprehensive. Many experiments are performed, and many tables are presented. Before this manuscript is accepted, I have some suggestions for modification:

Start section numbering at 1. Some acronyms are used without being defined.

Several experiments are carried out. However, the manuscript must explain how the classes are defined or chosen. One recommendation: plots would be better than tables, don't you agree?

Some minor corrections should be made:

- LFDA instead of LDAL1 in all tables

- Q in eq. 15?

The manuscript is straightforward and comprehensive.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The article could be a very interesting contribution to the state of the art, given the objectives that the authors aim to achieve. The introduction and related work are very comprehensive, and I think they cover the area in a balanced way.

The proposed method is promising, and the results suggest that its use brings improvements.

Regarding the experiments, I feel that the authors present the results of a set of tests but do not interpret them in as much detail as these experiments deserve. It would be important that, at the end of each experiment or table, the authors summarize how the results relate to their objectives.

There is also an important issue. On page 10, line 225, the authors state that each test was repeated 5 times using cross-validation with 5 folds. The current view calls for a standard of 10 repetitions with cross-validation with 10 folds. Why did the authors choose this setup when the research standard is different?
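
For illustration, the two protocols contrasted above differ only in the number of folds and the number of repetitions. The following is a minimal Python sketch, not the authors' code: the helper `repeated_kfold` is a hypothetical name, the sample count of 100 is arbitrary, and the splits are not stratified by class.

```python
import random


def repeated_kfold(n_samples, n_splits, n_repeats, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper for illustration only; plain (unstratified) splits.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh shuffle for every repetition
        # Fold sizes differ by at most one sample.
        base, extra = divmod(n_samples, n_splits)
        start = 0
        for fold in range(n_splits):
            size = base + (1 if fold < extra else 0)
            test_idx = indices[start:start + size]
            train_idx = indices[:start] + indices[start + size:]
            start += size
            yield repeat, fold, train_idx, test_idx


# 5 repetitions of 5-fold CV versus 10 repetitions of 10-fold CV
runs_5x5 = sum(1 for _ in repeated_kfold(100, n_splits=5, n_repeats=5))
runs_10x10 = sum(1 for _ in repeated_kfold(100, n_splits=10, n_repeats=10))
print(runs_5x5, runs_10x10)  # 25 100
```

The 5 x 5 setup trains and evaluates 25 models on 80% of the data each, while the 10 x 10 setup trains 100 models on 90% each, so the second protocol costs roughly four times as many fits.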

In terms of formatting, on page 13 there is extra space between tables.

In summary, the article is interesting, and the proposed method could be important for the state of the art. I think that the experimentation section can be improved; if the authors do so carefully and answer the other questions raised, the article may be suitable for publication.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Major remarks:

- Recently popular techniques such as t-SNE and UMAP are completely omitted. In my opinion, they should at least be mentioned.

- I think that you should explain in detail what "second order statistics of data" means.

- Why did you use the DT classification algorithm? It is well known to be a rather weak method in terms of accuracy. I propose using an XGBoost or LightGBM classifier.

- How did you split the datasets into parts?

- It seems possible to extend the proposed approach to more than 2 classes. Could you discuss this?

 

Minor remarks:

- Sections in scientific papers are usually numbered from 1, not from 0.

- In mathematical notation, matrices and vectors are set in bold.

- Some symbols in equations are not explained, e.g., N, M, x in Equation (1).

- Only equations that are cited in the text should be numbered.

- Most formulas should end with a comma or full stop.

- In Tables 1-7, I propose bolding the best results to make it easier to draw conclusions.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

I would recommend accepting the paper for publication only if the authors make the following changes. Overall, this is an interesting paper.

a) It is important that the authors place some bullet points at the end of the introduction to highlight the key novelties.

b) At the beginning of Section 2.1, the authors should include a few sentences restating the novelty and the purpose of the proposed method.

c) Page 12: can the authors write a few more sentences for experiments 5 and 6?

d) In the conclusions section, the authors can add bullet points again marking the key novelties.

e) In the conclusions section, the authors have included very few points on future work. The references section of the paper must also be strengthened. To facilitate this, I strongly recommend that the authors add relevant research. I have recommended below some papers and some sentences to write.

As future work, such dimensionality analysis could be applied in the following two papers. The first, published in Energies, involves massive amounts of data from a study conducted for India; the second, also in Energies, presents a method that could be used for the same purpose. These papers would fit well in the future work section, as would those listed further below.

1. Giannelos, S.; Jain, A.; Borozan, S.; Falugi, P.; Moreira, A.; Bhakar, R.; Mathur, J.; Strbac, G. Long-Term Expansion Planning of the Transmission Network in India under Multi-Dimensional Uncertainty. Energies 2021, 14, 7813. https://doi.org/10.3390/en14227813

2. Giannelos, S.; Borozan, S.; Strbac, G. A Backwards Induction Framework for Quantifying the Option Value of Smart Charging of Electric Vehicles and the Risk of Stranded Assets under Uncertainty. Energies 2022, 15, 3334. https://doi.org/10.3390/en15093334


3. The paper below can be included in the literature review section, as its authors propose a kernel-based within-class collaborative preserving discriminant projection method to reduce data dimensionality:


H. Hu, D. Feng and F. Yang, "A Promising Nonlinear Dimensionality Reduction Method: Kernel-Based Within Class Collaborative Preserving Discriminant Projection," in IEEE Signal Processing Letters, vol. 27, pp. 2034-2038, 2020, doi: 10.1109/LSP.2020.3037460.


4. A very interesting method for dimensionality reduction that can be added in future work is:

C.-H. Yang, H.-C. Huang, M.-F. Hou, L.-Y. Chuang and Y.-D. Lin, "Fuzzy-Based Multiobjective Multifactor Dimensionality Reduction for Epistasis Analysis," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 20, no. 1, pp. 378-387, 1 Jan.-Feb. 2023, doi: 10.1109/TCBB.2022.3144303.


5. And this below:

T. Zhang, F. Shen, T. Zhu and J. Zhao, "An Evolutionary Orthogonal Component Analysis Method for Incremental Dimensionality Reduction," in IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 1, pp. 392-405, Jan. 2022, doi: 10.1109/TNNLS.2020.3027852.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 5 Report

This paper proposes a novel method for supervised dimensionality reduction of proportional data using exponential family distributions. The proposed method addresses common problems faced by traditional supervised dimensionality reduction algorithms, such as the curse of dimensionality and multi-modal data. The method involves projecting the data into a low-dimensional space using a combination of a linear transformation and a non-linear function. The proposed method is shown to outperform existing methods in terms of accuracy and computational efficiency.

 

Strengths:

1. The paper proposes a novel method for supervised dimensionality reduction of proportional data using exponential family distributions, which addresses common problems faced by traditional supervised dimensionality reduction algorithms.

2. The proposed method addresses the curse of dimensionality and multi-modal data, which are common problems faced by traditional supervised dimensionality reduction algorithms.

3. The proposed method is shown to outperform existing methods in terms of accuracy and computational efficiency.

4. The proposed method is significantly lighter in terms of resource usage compared to existing methods, which is important when dealing with large datasets or low power, low resource devices.

5. The paper is well-written and clearly explains the proposed method and its advantages over existing methods.

 

Weaknesses:

1. Although the scalability of the proposed method is mentioned as an advantage, a more thorough analysis of computational complexity and time complexity compared to other algorithms would provide a better understanding of its efficiency. Either theoretical analysis or experimental analysis would be welcome.

2. In the related work, it would be beneficial to mention other dimensionality reduction methods, e.g., Liu et al. Automated feature selection: A reinforcement learning perspective. IEEE Transactions on Knowledge and Data Engineering. 

The writing is friendly to readers. It will be easily accessible to researchers interested in the dimensionality reduction community.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

The authors made a huge effort to correct and improve their paper. They responded to all the issues raised, and I appreciate this. I can now recommend this paper for publication.
