Article
Peer-Review Record

FTIR-Based Microplastic Classification: A Comprehensive Study on Normalization and ML Techniques

by Octavio Villegas-Camacho 1, Iván Francisco-Valencia 2, Roberto Alejo-Eleuterio 1,*, Everardo Efrén Granda-Gutiérrez 3, Sonia Martínez-Gallegos 1 and Daniel Villanueva-Vásquez 1
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 25 January 2025 / Revised: 1 March 2025 / Accepted: 13 March 2025 / Published: 18 March 2025
(This article belongs to the Special Issue Challenges and Opportunities in Plastic Waste Management)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper titled “FTIR-Based Microplastic Classification: A Comprehensive Study on Normalization and ML Techniques” used machine learning to classify plastic using FTIR spectra. The application of this study is to microplastic in the environment. However, the used FTIR spectra were selected from the FTIR-Plastic-c4 dataset. I believe the FTIR-Plastic-c4 dataset was generated from typical plastic, not microplastic. If I am wrong, please introduce the dataset. If the dataset was based on typical plastic, please justify how the spectra of microplastic are different from typical plastic and how your results can be applied to microplastic. In addition, you need to justify how microplastic content/concentration impacts the spectra and the application of this research.

Author Response

Comment 1:

The paper titled “FTIR-Based Microplastic Classification: A Comprehensive Study on Normalization and ML Techniques” used machine learning to classify plastic using FTIR spectra. The application of this study is to microplastic in the environment. However, the used FTIR spectra were selected from the FTIR-Plastic-c4 dataset. I believe the FTIR-Plastic-c4 dataset was generated from typical plastic, not microplastic. If I am wrong, please introduce the dataset. If the dataset was based on typical plastic, please justify how the spectra of microplastic are different from typical plastic and how your results can be applied to microplastic. In addition, you need to justify how microplastic content/concentration impacts the spectra and the application of this research.

Answer:

Thanks to the reviewer for giving us the opportunity to improve this contribution. We agree that more clarification should be provided to justify why this research could apply to microplastics. We present the following arguments, and some modifications have been made in the revised version to address the reviewers' observations:

According to the classification of plastics fragmented by wind, solar, and mechanical energy, microplastics are particles ranging from 1 μm to 0.5 cm in size [https://doi.org/10.1016/j.marpolbul.2025.117639]. The FTIR-Plastic-c4 dataset used in this study has been previously published [https://doi.org/10.1016/j.dib.2024.110612] and was specifically developed for microplastic classification. It includes FTIR spectra of plastic fragments ranging from 2 mm to 5 mm, which fall within the commonly observed size range of environmental microplastics. In addition, experimental findings indicate that microplastics found in water typically range between 0.131 mm and 4.098 mm. While larger microplastics provide strong and well-defined FTIR signals, smaller particles often exhibit weak vibrational bands, shoulder peaks, or secondary vibrations, which makes spectral characterization challenging. To overcome this limitation, the authors built the dataset using larger microplastic fractions (~5 mm) to create a robust spectral reference library. These spectra can then be used to identify smaller fractions (~0.1 mm) based on their characteristic spectral features.

Regarding the impact of microplastic concentration on the FTIR spectra, it is well established that higher concentrations can increase absorbance, which may influence spectral intensity but not peak positions. The spectra in the dataset were collected under standardized FTIR measurement conditions to minimize concentration-related distortions and ensure consistency across samples.
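As a toy illustration of this point (not taken from the manuscript or the dataset), the sketch below scales a synthetic absorbance spectrum by a constant factor, as a higher concentration might under a purely multiplicative assumption. It then checks that the strongest peak stays at the same index and that Z-Score normalization gives the same normalized spectrum. The spectrum values and the helper names `z_score` and `peak_index` are assumptions made for the example.

```python
from statistics import mean, pstdev

def z_score(spectrum):
    """Standardize a spectrum to zero mean and unit (population) variance."""
    mu = mean(spectrum)
    sigma = pstdev(spectrum)
    if sigma == 0:
        # A flat spectrum carries no shape information.
        return [0.0 for _ in spectrum]
    return [(value - mu) / sigma for value in spectrum]

def peak_index(spectrum):
    """Index of the strongest absorbance value."""
    return max(range(len(spectrum)), key=lambda i: spectrum[i])

# Synthetic absorbance values (hypothetical, for illustration only).
base = [0.10, 0.15, 0.80, 0.30, 0.12, 0.45, 0.11]
# Assumed effect of a higher concentration: uniform intensity scaling.
concentrated = [2.5 * value for value in base]

print(peak_index(base), peak_index(concentrated))
print(z_score(base))
print(z_score(concentrated))
```

Real concentration effects need not be strictly multiplicative (for example, when absorbance saturates), so this only illustrates the idealized case.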

Finally, while our approach is suitable for identifying microplastics down to ~0.1 mm, it is important to note that FTIR spectroscopy becomes less effective for much smaller particles (~1 μm) due to signal intensity limitations. Alternative techniques like scanning electron microscopy (SEM) or Raman spectroscopy are preferred in such cases. However, the spectral trends identified in our study provide a good reference for detecting and classifying microplastics in environmental samples.

Reviewer 2 Report

Comments and Suggestions for Authors

Review of the Study: FTIR-Based Microplastic Classification: A Comprehensive Study on Normalization and ML Techniques

 

This study makes a significant contribution to the classification of microplastics using FTIR spectroscopy and advanced ML and DL techniques. The authors analyzed the impact of different normalization methods, with Z-Score emerging as the most stable. By utilizing broad spectral ranges and combining classical and deep learning models, the research provides valuable insights for improving the accuracy of microplastic analysis.

However, certain improvements are necessary for the paper to be accepted.

All the models analyzed in this study fall under the category of machine learning, including neural networks and deep learning models. The authors distinguish between these categories in the methodology section and later in the results section. It would be beneficial to discuss these models in an integrated manner rather than as separate groups.

In Figure 1, the last graphical representation contains the label RB—could this be a mistake? Should it perhaps be NB (Naïve Bayes)?

Lines 397–405 and Table 2 present the hyperparameter values of the models. More explanation is needed regarding the obtained parameters and their significance. These parameters are not explained in the methodology section, and their values are listed only quantitatively in this part. For the Random Forest (RF) model, some parameters are mentioned, but it is unclear how many attributes were used for branching when generating ensemble trees. Additionally, the phrase "Growth control: 5 smaller datasets" requires further clarification. In Table 2, the number of neurons is specified, but it is not clear how this particular number was determined.

Tables 3, 4, and 5: It is advisable to verify whether the reported standard deviation values (Accuracy Avg-SD, Precision Avg-SD, Recall Avg-SD, F1-score Avg-SD) are indeed so low.

Although it is possible to achieve high classification performance using the FTIR-PLASTIC-c4 dataset and advanced ML/DL techniques, results such as 100% accuracy for models like CNN_1, MLP_1, and Random Forest may indicate potential overfitting.

In the Results and Discussion section, there are no graphical representations of the predicted and actual target values. It would be beneficial to include a Confusion Matrix for some of the analyzed models in microplastic classification.

 

 

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Only minor technical errors have been observed, which should be corrected.

Author Response

Comments to Author:

This study makes a significant contribution to the classification of microplastics using FTIR spectroscopy and advanced ML and DL techniques. The authors analyzed the impact of different normalization methods, with Z-Score emerging as the most stable. By utilizing broad spectral ranges and combining classical and deep learning models, the research provides valuable insights for improving the accuracy of microplastic analysis. However, certain improvements are necessary for the paper to be accepted.

Response:

We sincerely appreciate your positive evaluation of our study and your accurate comments. Your feedback has been invaluable in refining our manuscript. We have carefully addressed each of your suggestions in the revised version, enhancing the clarity and coherence of our methodology and discussion, as well as other sections of the manuscript. We believe these revisions have strengthened the manuscript and made our findings more accessible to the research community. Also, we have revised the English writing as suggested. Thank you for your time and constructive review.

We addressed your concerns as follows: 

Comment 1: 

All the models analyzed in this study fall under the category of machine learning, including neural networks and deep learning models. The authors distinguish between these categories in the methodology section and later in the results section. It would be beneficial to discuss these models in an integrated manner rather than as separate groups.

Answer:

Thank you for your valuable suggestion. In our revised manuscript, we have restructured the methodology and results sections to present all studied models in a more integrated manner rather than treating them as separate groups. We acknowledge that all models analyzed in this study fall within the broader category of machine learning. However, we also emphasize the fundamental distinctions between classical machine learning algorithms (such as Random Forest, K-NN, and SVM) and deep learning models, particularly in terms of feature extraction, model complexity, and data processing.

This revision enhances the coherence of our discussion and ensures that readers can better appreciate the comparative performance of different approaches within the same analytical framework. We believe this adjustment strengthens methodological clarity and provides a more holistic understanding of our findings.

 

Comment 2: 

In Figure 1, the last graphical representation contains the label RB—could this be a mistake? Should it perhaps be NB (Naïve Bayes)?

Answer:

We appreciate the Reviewer’s careful observation. Indeed, the correct acronym in Figure 1 should be NB (Naïve Bayes) instead of RB. We have corrected this mistake in the revised version of our manuscript. Thank you for bringing this to our attention.

 

Comment 3: 

Lines 397–405 and Table 2 present the hyperparameter values of the models. More explanation is needed regarding the obtained parameters and their significance. These parameters are not explained in the methodology section, and their values are listed only quantitatively in this part.

Answer:

Thank you for your valuable suggestion. We have expanded the explanation of the hyperparameters used in our machine learning models, providing more details on their significance and role in model performance. In the revised manuscript, we have incorporated a more comprehensive description in the methodology section, ensuring that readers understand the rationale behind the selection of these parameters. Additionally, we have enhanced the discussion accompanying Table 2 to clarify the impact of each hyperparameter on the classification results.

 

Comment 4:

For the Random Forest (RF) model, some parameters are mentioned, but it is unclear how many attributes were used for branching when generating ensemble trees.

Answer:

In the revised manuscript, we have included a more detailed explanation of the RF parameters. Additionally, we now provide information on the number of features used in the final trees. RF performs strongly in spectral data classification, achieving accuracy close to 100% in almost all scenarios. Moreover, its complexity is reduced, as the final trees use, on average, only about 40% to 60% of the features. This information was added to the final part of the Results section.

 

Comment 5: Additionally, the phrase "Growth control: 5 smaller datasets" requires further clarification.

Answer:

Thank you for this suggestion. We agree that this expression should be refined. This aspect has been clarified in the revised manuscript as follows: “In the growth control configuration, subsets with fewer than five samples were not further split. In this sense, growth control establishes a constraint to prevent overfitting. Also, no depth limit was imposed on the trees.”
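The stopping rule described in this answer can be sketched with a toy tree builder. This is not the authors' Random Forest implementation. It is a minimal single-tree illustration that splits at the median of one feature, refuses to split nodes holding fewer than `min_split` samples, and imposes no depth limit. The function names, the data, and the median split are all assumptions made for the example.

```python
def grow(samples, min_split=5):
    """Grow a toy tree on (feature, label) pairs.

    A node is split only if it is impure AND holds at least `min_split`
    samples. There is no depth limit.
    """
    labels = [label for _, label in samples]
    if len(set(labels)) == 1 or len(samples) < min_split:
        # Leaf: majority label (ties broken alphabetically for determinism).
        majority = max(sorted(set(labels)), key=labels.count)
        return {"label": majority, "n": len(samples)}
    ordered = sorted(samples)
    mid = len(ordered) // 2
    # Toy split: threshold halfway between the two middle feature values.
    threshold = (ordered[mid - 1][0] + ordered[mid][0]) / 2
    return {
        "threshold": threshold,
        "left": grow(ordered[:mid], min_split),
        "right": grow(ordered[mid:], min_split),
    }

def count_leaves(node):
    """Number of leaves in a tree produced by `grow`."""
    if "label" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

# Hypothetical one-feature data with alternating class labels.
data = [(float(i), "PE" if i % 2 == 0 else "PP") for i in range(8)]
constrained = grow(data, min_split=5)    # subsets under 5 samples stay leaves
unconstrained = grow(data, min_split=2)  # keeps splitting until leaves are pure
print(count_leaves(constrained), count_leaves(unconstrained))
```

With the constraint, the tree stops at two mixed leaves instead of memorizing every sample, which is the sense in which the rule limits overfitting.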

 

Comment 6:

In Table 2, the number of neurons is specified, but it is not clear how this number was determined. 

Answer:

The revised manuscript now includes a detailed description of the process used to determine the number of neurons in each hidden layer (see Section 4.2).

 

Comment 7: Tables 3, 4, and 5: It is advisable to verify whether the reported standard deviation values (Accuracy Avg-SD, Precision Avg-SD, Recall Avg-SD, F1-score Avg-SD) are indeed so low.

Answer:

We re-ran the classification experiments to verify the average and standard deviation values. The results remained consistent, as the same parameters were used. In the case of neural networks, higher variability was observed, which aligns with the fact that ANNs are initialized with different values each time they are executed.
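A minimal sketch of how an average and standard deviation over repeated runs could be computed is given below. The accuracy values are hypothetical and are not results from the manuscript. They only illustrate that a model repeated with fixed parameters yields zero spread, while a randomly initialized network yields a nonzero one.

```python
from statistics import mean, stdev

def summarize(scores):
    """Return (average, sample standard deviation) of per-run scores."""
    return mean(scores), stdev(scores)

# Hypothetical per-run accuracies, for illustration only.
deterministic_runs = [0.998, 0.998, 0.998, 0.998, 0.998]  # same parameters each run
random_init_runs = [0.991, 0.997, 0.985, 0.994, 0.989]    # different initial weights

print(summarize(deterministic_runs))
print(summarize(random_init_runs))
```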

 

Comment 8:

Although it is possible to achieve high classification performance using the FTIR-PLASTIC-c4 dataset and advanced ML/DL techniques, results such as 100% accuracy for models like CNN_1, MLP_1, and Random Forest may indicate potential overfitting.

Answer:

We acknowledge that achieving accuracy values close to 100% could suggest overfitting. However, these results are primarily due to the strong FTIR signal obtained from the microplastics studied rather than overfitting. The high-quality spectral data allowed the models to be trained effectively, making this dataset particularly suitable for machine learning applications. This explanation has been included in the revised manuscript.
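One common way to separate a strong signal from overfitting is to compare training and held-out accuracy across cross-validation folds: a large positive gap suggests overfitting, while a gap near zero is consistent with well-separated classes. The sketch below does this with a nearest-centroid classifier on synthetic two-class data. The classifier, the data, and all names are assumptions for illustration, not the models or spectra used in the study.

```python
import random

def fit_centroids(X, y):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(sorted(centroids), key=sq_dist)

def accuracy(centroids, X, y):
    hits = sum(predict(centroids, row) == label for row, label in zip(X, y))
    return hits / len(y)

def k_fold_gaps(X, y, k=5, seed=0):
    """Return train-minus-test accuracy for each of k shuffled folds."""
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    gaps = []
    for fold in range(k):
        test_idx = order[fold::k]
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        X_te, y_te = [X[i] for i in test_idx], [y[i] for i in test_idx]
        model = fit_centroids(X_tr, y_tr)
        gaps.append(accuracy(model, X_tr, y_tr) - accuracy(model, X_te, y_te))
    return gaps

# Synthetic, well-separated two-class data (hypothetical).
rng = random.Random(1)
X, y = [], []
for _ in range(10):
    X.append([1.0 + rng.uniform(-0.05, 0.05), rng.uniform(-0.05, 0.05)])
    y.append("PE")
    X.append([rng.uniform(-0.05, 0.05), 1.0 + rng.uniform(-0.05, 0.05)])
    y.append("PP")

gaps = k_fold_gaps(X, y)
print(gaps)
```

The folds here are shuffled but not stratified, and a small gap does not by itself rule out leakage, for instance when near-duplicate spectra fall in both the training and the test folds.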

 

Comment 9: 

In the Results and Discussion section, there are no graphical representations of the predicted and actual target values. It would be beneficial to include a Confusion Matrix for some of the analyzed models in microplastic classification.

Answer:

Thank you for this suggestion. The revised manuscript now includes representative confusion matrices in the Results and Discussion section (Figure 2). These matrices correspond to four different normalization methods and illustrate the performance of three classifiers (best, average, and worst cases).
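For readers unfamiliar with the representation, a confusion matrix simply counts how often each actual class was assigned to each predicted class. The sketch below builds one from two label lists. The class names and predictions are hypothetical and do not reproduce Figure 2.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

# Hypothetical class labels and predictions, for illustration only.
labels = ["PE", "PET", "PP"]
actual = ["PET", "PET", "PP", "PE"]
predicted = ["PET", "PP", "PP", "PE"]

for label, row in zip(labels, confusion_matrix(actual, predicted, labels)):
    print(label, row)
```

Off-diagonal entries mark misclassifications. In this toy case, one PET sample is predicted as PP.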

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

All my comments were addressed. Thank you.

Reviewer 2 Report

Comments and Suggestions for Authors

The paper titled 'FTIR-Based Microplastic Classification: A Comprehensive Study on Normalization and ML Techniques' has been appropriately improved and can be accepted for further processing.

Comments on the Quality of English Language

 The English could be improved to more clearly express the research.
