Active Learning: Encoder-Decoder-Outlayer and Vector Space Diversification Sampling
Round 1
Reviewer 1 Report
The authors proposed an encoder-decoder-outlayer (EDO) framework and vector space diversification (VSD) active learning sampling algorithm. The method uses a small output layer and explores the diversity of the encoded feature. Some comments are provided to improve the quality of this manuscript.
1. Page 2 line 44, when VSD is mentioned for the first time, it is recommended to give its full name.
2. In Fig. 1, the feature maps after dimensionality reduction by different methods do not seem to show the difference. Please elaborate on the differences between the methods.
3. Also, for Fig. 4, please describe in detail the differences between the sub-figures.
4. In recent years, several active learning methods (e.g., “Latent-enhanced variational adversarial active learning assisted soft sensor. IEEE Sensors Journal, DOI: 10.1109/JSEN.2023.3279203.”, “Actively exploring informative data for smart modeling of industrial multiphase flow processes. IEEE Transactions on Industrial Informatics, 2021, Vol. 17(12): 8357-8366.”) have been proposed in the literature. For the benefit of readers, the authors need to introduce recent active learning methods and discuss the main differences between them.
5. In Table 6, please detail the comparison methods, such as F1 (VSD ave), F1 (VSD max).
6. For active learning methods, it is recommended to give stopping criteria to determine when it is no longer necessary to continue annotating unlabeled samples.
Author Response
Reviewer 1
Comments and Suggestions for Authors
The authors proposed an encoder-decoder-outlayer (EDO) framework and vector space diversification (VSD) active learning sampling algorithm. The method uses a small output layer and explores the diversity of the encoded feature. Some comments are provided to improve the quality of this manuscript.
- Page 2 line 44, when VSD is mentioned for the first time, it is recommended to give its full name.
Thanks for the suggestion. The full name of VSD (Vector Space Diversification) is now provided on Page 2, line 44.
- In Fig. 1, the feature maps after dimensionality reduction by different methods do not seem to show the difference. Please elaborate on the differences between the methods.
Thanks for the comment. We have added the following elaboration in the Introduction: PCA is an algorithm that finds the principal components of a dataset, reducing the number of dimensions while preserving the most important information. t-SNE is a technique that maps high-dimensional data points onto a low-dimensional space, producing a map that preserves the similarities between points in the high-dimensional space. The differences between the two methods are that PCA is a linear technique while t-SNE is non-linear, and that t-SNE is better at preserving local structure and can be used to create more visually appealing maps.
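As a rough illustration of this comparison, the following minimal sketch (our own, not taken from the paper's code; scikit-learn and the placeholder variable `sentence_vectors` are assumptions) projects hypothetical sentence embeddings to 2D with both methods:

```python
# Minimal sketch, assuming scikit-learn; `sentence_vectors` is a placeholder
# for embeddings such as those produced by BERT-style encoders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

sentence_vectors = np.random.rand(500, 768)  # hypothetical 768-d sentence embeddings

# PCA: linear projection onto the directions of maximum variance.
pca_2d = PCA(n_components=2).fit_transform(sentence_vectors)

# t-SNE: non-linear embedding that preserves local neighborhood structure.
tsne_2d = TSNE(n_components=2, perplexity=30, init="pca",
               random_state=0).fit_transform(sentence_vectors)

print(pca_2d.shape, tsne_2d.shape)  # (500, 2) (500, 2)
```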
- Also, for Fig. 4, please describe in detail the differences between the sub-figures.
Many thanks for the valuable comment. Figure 3 shows the sampling executed on a 2D unit circle using a Gaussian distribution of theta values. It consists of a number of sub-figures, each of which illustrates the effect of a different sampling method (e.g., rand, mean, and median) on the 2D unit circle.
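For intuition only, a toy sketch along these lines (our assumption of how such points and pivots could be generated; not the authors' code) is:

```python
# Toy sketch: points on the unit circle with Gaussian-distributed angles and
# three illustrative ways to pick a pivot ("rand", "mean", "median").
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(loc=0.0, scale=0.5, size=200)            # Gaussian angles
points = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # points on the 2D unit circle

pivot_rand = points[rng.integers(len(points))]                      # random point
pivot_mean = points[np.argmin(np.abs(theta - theta.mean()))]        # point closest to the mean angle
pivot_median = points[np.argmin(np.abs(theta - np.median(theta)))]  # point closest to the median angle

print(pivot_rand, pivot_mean, pivot_median)
```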
- In recent years, several active learning methods (e.g., “Latent-enhanced variational adversarial active learning assisted soft sensor. IEEE Sensors Journal, DOI: 10.1109/JSEN.2023.3279203.”, “Actively exploring informative data for smart modeling of industrial multiphase flow processes. IEEE Transactions on Industrial Informatics, 2021, Vol. 17(12): 8357-8366.”) have been proposed in the literature. For the benefit of readers, the authors need to introduce recent active learning methods and discuss the main differences between them.
Many thanks for the suggestion. The following descriptions have been added in Introduction:
(Dai et al., 2023) proposes a sample selection strategy for active learning to enhance quality prediction performance with limited labeled data. It uses a minimax game and a latent-enhanced variational autoencoder to deceive an adversarial network and Gaussian process regression to incrementally select informative unlabeled samples. (Deng et al., 2020) develops an active learning method to explore information from multiphase flow process data, facilitating smart process modeling and prediction. An index is proposed to describe the process dynamics and nonlinearity, and a criterion to judge the learning termination is designed.
- In Table 6, please detail the comparison methods, such as F1 (VSD ave), F1 (VSD max).
Thanks for the comment. The following description has been added in Section 8:
F1 (trivial) is the F1 score of randomly selecting items from each class. F1 (rand) is the F1 score of the random sampling method. F1 (VSD min), F1 (VSD ave) and F1 (VSD max) are the F1 scores of the VSD sampling algorithm when selecting items with the minimum, average and maximum understanding, respectively. F1 (VSD min-rand), F1 (VSD ave-rand) and F1 (VSD max-rand) are the differences between the F1 scores of the VSD sampling algorithm and the random sampling method when selecting items with the minimum, average and maximum understanding, respectively.
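To make the difference columns concrete, here is an illustrative calculation (the numbers are made up, not taken from Table 6):

```python
# Illustrative only: the "-rand" columns are differences between VSD and
# random-sampling F1 scores. All values below are hypothetical.
f1 = {"rand": 0.71, "VSD min": 0.69, "VSD ave": 0.74, "VSD max": 0.78}

for key in ("VSD min", "VSD ave", "VSD max"):
    print(f"F1 ({key}-rand) = {f1[key] - f1['rand']:+.2f}")
# F1 (VSD min-rand) = -0.02
# F1 (VSD ave-rand) = +0.03
# F1 (VSD max-rand) = +0.07
```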
- For active learning methods, it is recommended to give stopping criteria to determine when it is no longer necessary to continue annotating unlabeled samples.
Thanks for the comment. The stopping criterion is when the model performance has reached its peak or when the marginal improvement from additional annotations is negligible.
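A minimal sketch of how such a criterion could be implemented (a patience-based rule; the threshold values are our assumptions, not values from the paper):

```python
# Hedged sketch: stop annotating when the last `patience` rounds each improved
# F1 by less than `min_delta`. Thresholds are illustrative assumptions.
def should_stop(f1_history, patience=3, min_delta=0.002):
    """Return True when the marginal F1 improvement has become negligible."""
    if len(f1_history) <= patience:
        return False
    recent = f1_history[-(patience + 1):]
    gains = [b - a for a, b in zip(recent, recent[1:])]
    return all(g < min_delta for g in gains)

# Example: improvement has flattened, so annotation can stop.
print(should_stop([0.60, 0.68, 0.72, 0.721, 0.722, 0.7225]))  # True
```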
Author Response File: Author Response.docx
Reviewer 2 Report
The authors propose a training pipeline that separates the pre-training and fine-tuning phases. At the same time, the sampling method uses the pivot node to divide the sub-vector space and select only the necessary unlabeled data, reducing the need for manual labeling.
Here are my concerns:
1. The description of data sets is generally placed in the experimental chapter.
2. The quality of illustrations in this paper needs to be further improved, and the description of figures and tables is lacking.
3. The format and layout of the article need to be improved.
4. The formulas in the article do not have labels.
5. The article lacks theoretical depth and innovation, so it is suggested to add more content.
Author Response
Reviewer 2
Comments and Suggestions for Authors
The authors propose a training pipeline that separates the pre-training and fine-tuning phases. At the same time, the sampling method uses the pivot node to divide the sub-vector space and select only the necessary unlabeled data, reducing the need for manual labeling.
Here are my concerns:
- The description of data sets is generally placed in the experimental chapter.
Thanks for the suggestion. We have reorganized Section 5, which now includes the description of the data sets in the revised version.
- The quality of illustrations in this paper needs to be further improved, and the description of figures and tables is lacking.
Thanks for the suggestion. Descriptions of figures and tables have been added in the revised version:
Figure 1 shows the Encoder-Decoder-Outlayer framework. It consists of an Encoder, a Decoder, and an Outlayer. Weighted cross-entropy was used as the loss function, and F1-score guidance was used as a trigger when the F1-score decreased below the previous score.
Figure 2 illustrates the architecture of the Outlayer, which consists of a three-layer ResNet-style framework with PReLU activation, batch normalization, linear layers, and a hidden-layer size set to twice the cluster number.
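A minimal sketch of how such an Outlayer could look (assuming PyTorch; the exact residual wiring and any details beyond those stated above are our assumptions, not the authors' implementation):

```python
# Sketch of a three-layer residual output head: Linear -> BatchNorm1d -> PReLU
# blocks with skip connections, hidden size = 2 * cluster_num.
import torch
import torch.nn as nn

class Outlayer(nn.Module):
    def __init__(self, in_dim, cluster_num):
        super().__init__()
        hidden = 2 * cluster_num
        self.proj = nn.Linear(in_dim, hidden)        # map encoder output to hidden size
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, hidden), nn.BatchNorm1d(hidden), nn.PReLU())
            for _ in range(3)
        )
        self.head = nn.Linear(hidden, cluster_num)   # class logits

    def forward(self, x):
        h = self.proj(x)
        for block in self.blocks:
            h = h + block(h)                         # residual connection
        return self.head(h)

# Example: 768-d encoder outputs, 14 classes; weighted cross-entropy as in Figure 1.
model = Outlayer(in_dim=768, cluster_num=14)
logits = model(torch.randn(8, 768))
loss = nn.CrossEntropyLoss(weight=torch.ones(14))(logits, torch.randint(0, 14, (8,)))
print(logits.shape, loss.item())
```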
- The format and layout of the article need to be improved.
Thanks for the suggestion. The format and layout were improved in the revised version.
- The formulas in the article do not have labels.
Thanks for pointing these out. We have added the labels for the formulas.
- The article lacks theoretical depth and innovation, so it is suggested to add more content.
Thanks for the suggestion. More content was added and highlighted in yellow.
Author Response File: Author Response.docx
Reviewer 3 Report
This paper needs major modification to be accepted for publication. The presentation of the figures, tables, and the format of the research is very poor; for example, Figures 1, 2, 3, 4, and 5 are not clear. The captions of all the figures and tables do not follow the common format in journal papers. Page numbers are missing. Parts of the future work should be included in this paper rather than merely proposed as future work, since the contribution of this research can be extended to better fit a research paper. The citation format is incorrect and varies from one part to another.
The English language is good. Only minor modification may be required.
Author Response
Reviewer 3
Comments and Suggestions for Authors
This paper needs major modification to be accepted for publication. The presentation of the figures, tables, and the format of the research is very poor; for example, Figures 1, 2, 3, 4, and 5 are not clear. The captions of all the figures and tables do not follow the common format in journal papers. Page numbers are missing. Parts of the future work should be included in this paper rather than merely proposed as future work, since the contribution of this research can be extended to better fit a research paper. The citation format is incorrect and varies from one part to another.
Many thanks for the valuable comments. We have updated the manuscript thoroughly and the modifications are highlighted in yellow.
Author Response File: Author Response.docx
Reviewer 4 Report
This paper develops a training pipeline consisting of two parts: Encoder-Decoder-Outlayer framework and Vector Space Diversification Sampling method. The proposed framework consists of two stages: pre-training and fine-tuning. The proposed pipeline provides fast training, parallelization, buffer ability, flexibility, low GPU memory consumption, and a nearly linear-time-complexity sample method.
I have some concerns about this submitted manuscript. Here are my suggestions for improvement:
[1] In the abstract, the authors should provide clear results and findings of this paper.
[2] The literature review should be updated with more recent references that represent the current development of the field. Consider adding relevant literature such as "https://doi.org/10.1609/aaai.v36i8.20850" and "10.1109/TCSVT.2022.3214430" in the introduction and related work section.
[3] Improve the clarity of certain figures (Fig. 3 and Fig. 5). Ensure that the details and contents remain distinguishable when the figures are enlarged.
[4] Expand the methodology section to provide more detailed explanations and ensure sufficient coverage of the proposed approach.
[5 ] Mark the best data in Table 6 using bold font or any other appropriate formatting.
[6] Clearly state the main contributions of this manuscript in bullet-point form to highlight the novelty and significance of the research.
The quality of the writing style is satisfactory, but there are a few instances of informal expressions that should be reviewed and corrected.
Author Response
Reviewer 4
Comments and Suggestions for Authors
This paper develops a training pipeline consisting of two parts: Encoder-Decoder-Outlayer framework and Vector Space Diversification Sampling method. The proposed framework consists of two stages: pre-training and fine-tuning. The proposed pipeline provides fast training, parallelization, buffer ability, flexibility, low GPU memory consumption, and a nearly linear-time-complexity sample method.
I have some concerns about this submitted manuscript. Here are my suggestions for improvement:
[1] In the abstract, the authors should provide clear results and findings of this paper.
Thanks for the comment. We have revised the abstract as follows:
The study introduces a training pipeline comprising two components: the Encoder-Decoder-Outlayer framework and the Vector Space Diversification Sampling method. This framework efficiently separates the pre-training and fine-tuning stages, while the sampling method employs pivot nodes to divide the sub-vector space and selectively choose unlabeled data, thereby reducing the reliance on human labeling. The pipeline offers numerous advantages, including rapid training, parallelization, buffer capability, flexibility, low GPU memory usage, and a sample method with nearly linear-time complexity. Experimental results demonstrate that models trained with the proposed sampling algorithm generally outperform those trained with random sampling on small datasets. These characteristics make it a highly efficient and effective training approach for machine learning models. Further details can be found in the project repository on GitHub.
[2] The literature review should be updated with more recent references that represent the current development of the field. Consider adding relevant literature such as "https://doi.org/10.1609/aaai.v36i8.20850" and "10.1109/TCSVT.2022.3214430" in the introduction and related work section.
Many thanks for the suggestions. We have reviewed and included the suggested papers. The following descriptions have been added in the Introduction:
(Xie et al., 2022) proposes Energy-based Active Domain Adaptation (EADA), which queries groups of target data that incorporate both domain characteristics and instance uncertainty. Experiments show that EADA surpasses state-of-the-art methods on challenging benchmarks with substantial improvements. (Liu et al., 2022) develops a multi-purpose haze removal framework for nighttime hazy images. It uses a nonlinear model based on Retinex theory and a variational Retinex model to estimate a smoothed illumination component and predict the noise map. Experiments show that the proposed framework performs better than well-known nighttime image dehazing methods. It can also be applied to other types of degraded images.
[3] Improve the clarity of certain figures (Fig. 3 and Fig. 5). Ensure that the details and contents remain distinguishable when the figures are enlarged.
Many thanks for the suggestion. We have improved the figures.
[4] Expand the methodology section to provide more detailed explanations and ensure sufficient coverage of the proposed approach.
Thank you. We have added more explanations to the methodology part.
[5 ] Mark the best data in Table 6 using bold font or any other appropriate formatting.
The best data in Table 6 is the DBpedia data, as it has the highest F1 score, accuracy, precision, and recall.
[6] Clearly state the main contributions of this manuscript in bullet-point form to highlight the novelty and significance of the research.
Thanks for the comment. The following points were added in the last paragraph of the Introduction:
- Proposes an encoder-decoder-outlayer (EDO) active learning method for text classification;
- Explores the applicability of EDO, demonstrating its effectiveness in addressing issues of limited labeled data;
- Explores the utilization of different models and techniques, such as BERTbase, S-BERT, Universal Sentence Encoder, Word2Vec, and Document2Vec, to optimize datasets for deep learning;
- Proposes the use of t-SNE for dimension reduction and comparison of sentence vectors.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
The quality of this version has been improved. Before publication, please improve the quality of figures.
Author Response
R1
Comments and Suggestions for Authors
The quality of this version has been improved. Before publication, please improve the quality of figures.
Response: Thanks for the comment. The quality of the figures has been improved.
Reviewer 2 Report
The authors addressed some of my previous concerns, but I still felt that the structure and content of the article needed improvement.
1. The core method section contains too little content and does not adequately describe the proposed method.
2. Illustration 2 is unclear, and many illustrations in the experiment look like screenshots, which are also unclear.
3. The title of Table 1 should be directly above the table.
4. Illustration 4 is not appropriate directly under the heading.
5. There are many blank places in the paper; it is suggested to adjust the layout of the paper.
It is suggested that the authors include more detailed descriptions of the methods and experiments to make the paper more complete.
Author Response
R2
Comments and Suggestions for Authors
The authors addressed some of my previous concerns, but I still felt that the structure and content of the article needed improvement.
- The core method section contains too little content and does not adequately describe the proposed method.
Response: Thanks a lot for the comment. A detailed description was added to the revised manuscript. The updated paragraphs are highlighted in yellow.
- Illustration 2 is unclear, and many illustrations in the experiment look like screenshots, which are also unclear.
Response: Thanks for pointing that out. We have improved the quality of the figures.
- The title of Table 1 should be directly above the table.
Response: Thanks for pointing that out. The title is directly above the table in the revised manuscript.
- Illustration 4 is not appropriate directly under the heading.
Response: Thanks for pointing that out. Figures 4-6 are placed after the paragraphs in Section 5.4.
- There are many blank places in the paper; it is suggested to adjust the layout of the paper.
Response: Thanks for pointing that out. We have improved the layout of the paper.
It is suggested that the authors include more detailed descriptions of the methods and experiments to make the paper more complete.
Response: Thanks a lot for the comment. A detailed description was added to the revised manuscript. The updated paragraphs are highlighted in yellow.
Reviewer 3 Report
All comments have been considered in the revised manuscript
English language is good
Author Response
R3
Comments and Suggestions for Authors
All comments have been considered in the revised manuscript
Comments on the Quality of English Language
English language is good
Response: Thanks a lot for the positive comment!