Plastic Litter Detection in the Environment Using Hyperspectral Aerial Remote Sensing and Machine Learning
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This manuscript proposes a method for detecting plastic litter in the environment using hyperspectral aerial remote sensing and machine learning. The work is interesting; however, the paper requires revisions before possible publication. Below are the reviewer's comments and suggestions aimed at enhancing the clarity, rigor, and overall quality of the manuscript.
1. The advantages of the proposed method are not clearly stated in the abstract.
2. The contributions of this work should be summarized item by item.
3. The authors claim that SWIR always contains the most useful absorption-peak features for plastic detection. However, the manuscript never displays the reflectance spectra of the plastic materials used in the experiments across the full range from visible to SWIR wavelengths (400 nm to 1700 nm). The reviewer suggests that the manuscript present the reflectance spectra of the different plastics.
4. Is there a significant difference between the reflectance spectra of plastics of different colors? You could also measure the reflectance spectra of a standard color checker board, as in the article titled "City-Scale Distance Sensing via Bispectral Light Extinction in Bad Weather".
Actually, we hope that the characteristic absorption peaks at specific wavelengths for plastics can be made observable by addressing comments 3 and 4. This would be interesting for readers if this draft is published.
5. The two camera-spectrometer devices used in the experiments, operating in the 400-1000 nm (VIS/NIR) and 900-1700 nm (SWIR) bands, have different spatial resolutions. How do the authors ensure that the hyperspectral images captured by these two devices have a pixel-to-pixel spatial correspondence? This is important for the detection method.
6. In lines 221-233, the metric used to evaluate the method should be introduced in the results section.
7. The manuscript lacks sufficient quantitative analysis of the experimental results. The data in Tables 1-6 could be used to analyze the results more deeply.
8. On page 13, the description of the strategy for selecting featured wavelengths is not clear enough; please report these main wavelengths if possible.
9. The authors should report the computational complexity and runtime of the proposed method, not just the Kappa scores for the different classifiers.
10. This draft has many minor errors.
In line 91, what is q?
In line 148, there is a second Section 2.2?
Please check the whole draft and revise it thoroughly.
Author Response
We thank the reviewers for their careful and insightful reading of our manuscript, and for the constructive comments that prompted us to perform a major revision of the paper. The main change is a substantial increase in the number of case studies involved in the work. There are now seven: in the previous version we had included four cases, but then discarded one of them because the data were unusable due to an inadequate exposure setting. That case has now been omitted entirely, so the dataset contains the three cubes of the previous version plus four more, acquired under different conditions over more than four years. While most cubes are scans of scenarios where we placed small heaps of real plastic waste sorted by polymer, one of the added scenarios corresponds to a short flight across tunnel greenhouses covered by transparent PP sheet.
Based on these major changes, we reworked most of the experimental part, which allowed us to perform a more reliable leave-one-out procedure on the seven cubes and to demonstrate the application of a generalizing classifier to all cases. To make these changes clearer, we compare below the results section of the first version of the paper with that of the revised one.
In the first version, the following experiments were performed:
- First, we trained and tested on a single cube (same cube, but with separate training and test sets; this was apparently not clear in the manuscript), applying the classifier to an increasing number of features selected by mRMR. In this way, we decided on the number of features to use.
- Then, we trained on a single cube and tested on the others, obtaining unsatisfactory results in many cases. This experiment was performed without any attempt at normalizing the data.
- The third experiment was a leave-one-out procedure on four cubes, which gave quite satisfactory results for three cubes but poor results for the cube that we later discarded.
- Further, we tried several normalization procedures and decided that, since the simplest one is as effective as the others, we would apply it from then on (see the sketch right after this list).
- We showed the results of applying a classifier trained on all three cubes with normalization.
- Lastly, we performed a LOO procedure on the three cubes with normalization.
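As a point of reference for what we mean by the simplest normalization: a per-pixel rescaling of the following kind compensates for global illumination differences between acquisitions. This is a minimal Python sketch (our implementation is in MATLAB), and division by the spectrum mean is an illustrative assumption here, not necessarily the exact variant retained in the paper:

```python
import numpy as np

def normalize_spectra(X):
    """Divide each pixel spectrum (row of X) by its own mean, so that only
    the spectral shape, not the overall illumination level, remains."""
    return X / X.mean(axis=1, keepdims=True)
```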
In the revised version:
- We consistently used seven cubes throughout the experimental section.
- At the beginning, we repeated the procedure for selecting the number of features to be used. In doing so, we corrected two small errors in the table and reworked the graph in a way that we find more convincing.
- We then proceeded directly to choosing the normalization strategy and used the simple normalization from that point on.
- We dropped the procedure of training on a single cube and testing on the others, and instead applied a proper LOO on the seven cubes.
- Finally, after training a classifier on all seven cubes, we show the results for every polymer on one cube, and then the results of applying that classifier to all cubes, both quantitatively, in terms of Kappa scores, and qualitatively, with figures of the results for the "all plastics" case only.
We think that this reorganization of the experiments and of the results section significantly improves the validation of the methodology, and that the paper, revised in this way, responds to the reviewers' major concerns about the significance of the experimental procedures.
In the following, we respond to the reviewer's concerns item by item.
This manuscript proposes a method for detecting plastic litter in the environment using hyperspectral aerial remote sensing and machine learning. The work is interesting; however, the paper requires revisions before possible publication. Below are the reviewer's comments and suggestions aimed at enhancing the clarity, rigor, and overall quality of the manuscript.
- The advantages of the proposed method are not clearly stated in the abstract.
The abstract has been modified to emphasize the original contributions of the paper more clearly.
- The contributions of this work should be summarized item by item.
The main contributions of the paper are now summarized item by item at the end of the introduction.
- The authors claim that SWIR always contains the most useful absorption-peak features for plastic detection. However, the manuscript never displays the reflectance spectra of the plastic materials used in the experiments across the full range from visible to SWIR wavelengths (400 nm to 1700 nm). The reviewer suggests that the manuscript present the reflectance spectra of the different plastics.
We added Fig. 4 to present the spectra in the SWIR band, which are now discussed at the end of Sec. 2.2. We do not display or discuss spectra in the VIS/NIR band because we did not use them in this work. The motivation for using only the SWIR band, which is particularly well suited to plastic polymer recognition, is now discussed more clearly in the introduction, with references to the relevant papers.
- Is there a significant difference between the reflectance spectra of plastics of different colors? You could also measure the reflectance spectra of a standard color checker board, as in the article titled "City-Scale Distance Sensing via Bispectral Light Extinction in Bad Weather".
Color in visible light is determined by the perception of the shape of the spectrum (the relative magnitude at different wavelengths). We work in the IR, but we can regard "color" as an analogous notion of spectral shape. Indeed, we never observed significant dispersion in the spectral shapes of different samples of a given polymer (or other material), even when their visible colors differed, so we would say that there is no "color" effect in SWIR within a given material, or that it is negligible compared with the "color" effect represented by the specific absorption bands of the different plastic polymers. When discussing the spectra displayed in the new Fig. 4, we argue that the small dispersion observed indicates that spectral differences are determined by the polymer, with no "color" effect. Nevertheless, this is an interesting issue that might be examined in greater depth through careful analysis of the spectra of different objects, perhaps more appropriately in a laboratory setting, in the continuation of the work.
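To make the notion of comparing spectral shapes concrete: a standard brightness-invariant similarity measure in hyperspectral work is the spectral angle, sketched below in Python. This is standard background, not necessarily the measure used in the paper:

```python
import numpy as np

def spectral_angle(s1, s2):
    """Angle between two spectra: invariant to overall brightness,
    so it compares spectral shape only (the SWIR analog of 'color')."""
    cos_sim = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
    return np.arccos(np.clip(cos_sim, -1.0, 1.0))  # radians; 0 = identical shape
```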
Actually, we hope that the characteristic absorption peaks at specific wavelengths for plastics can be made observable by addressing comments 3 and 4. This would be interesting for readers if this draft is published.
The spectra shown in Fig. 4 are clearly different from the non-plastic spectra (which are quite flat), because they exhibit evident, specific absorption bands. The individual dips in the spectra (some wide, some sharply localized) can be compared with the locations of the most significant features selected by mRMR (Fig. 9), but it must be kept in mind that the relative magnitude of the response in some bands is also significant, so the information-based feature selection method also selects wavelengths that do not correspond directly to narrow absorption peaks.
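To make the selection mechanism concrete, the following is a minimal Python sketch of the greedy mRMR idea (the paper's implementation is in MATLAB; the function name and parameters are illustrative, and absolute correlation is used as a simple redundancy proxy in place of mutual information between bands):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def mrmr_select(X, y, n_select=20):
    """Greedy mRMR: pick bands maximizing relevance (MI with the class)
    minus mean redundancy (absolute correlation with bands already picked)."""
    relevance = mutual_info_classif(X, y)        # relevance of each band
    corr = np.abs(np.corrcoef(X, rowvar=False))  # band-to-band redundancy proxy
    selected = [int(np.argmax(relevance))]
    while len(selected) < n_select:
        remaining = [i for i in range(X.shape[1]) if i not in selected]
        scores = [relevance[i] - corr[i, selected].mean() for i in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected  # indices of the selected wavelength bands
```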
- The two camera-spectrometer devices used in the experiments, operating in the 400-1000 nm (VIS/NIR) and 900-1700 nm (SWIR) bands, have different spatial resolutions. How do the authors ensure that the hyperspectral images captured by these two devices have a pixel-to-pixel spatial correspondence? This is important for the detection method.
As we did not use the VIS/NIR data, we did not attempt to co-register the cubes.
- In lines 221-233, the metric used to evaluate the method should be introduced in the results section.
We think that the way performance is measured should be considered part of the methodology, so we preferred to keep it in Section 2, also considering that the other reviewers did not object to this organization of the text.
- The manuscript lacks sufficient quantitative analysis of the experimental results. The data in Tables 1-6 could be used to analyze the results more deeply.
We believe that extending the number of cases from 3 to 7 makes the quantitative results more reliable and robust. The tables and the newly added figures are commented on more extensively in the revised manuscript, and the main arguments of the results section are now more consistently based on quantitative data (in particular the Kappa tables). In addition, we provide more qualitative results (figures of classification results), because we think they help in interpreting the quantitative results, which remain the most important basis for validation. We avoided adding excessively detailed and large numeric tables, preferring to concentrate on the key parameters for clarity. We hope that we have interpreted this essential concern properly in the revised version.
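For reference, the Kappa score used as the quantitative measure here is, by standard usage, Cohen's kappa, which corrects the raw agreement between classification and ground truth for the agreement expected by chance:

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

where $p_o$ is the observed fraction of correctly classified pixels and $p_e$ the agreement expected by chance from the marginal class frequencies; $\kappa = 1$ indicates perfect agreement and $\kappa = 0$ chance-level performance.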
- On page 13, the description of the strategy for selecting featured wavelengths is not clear enough; please report these main wavelengths if possible.
The selection strategy is mRMR, discussed in the introduction. Since mRMR is a well-established method, we think that the reference to the original publication (cited hundreds of times) is sufficient to justify its use. We also briefly discuss one important alternative (PCA), which we indeed tried in previous works. As noted above, we report the selected wavelengths graphically, to avoid adding a large and complicated table that would be difficult to read, and we added a brief discussion in the text.
- The authors should report the computational complexity and runtime of the proposed method, not just the Kappa scores for the different classifiers.
We added some information about computational complexity at line 341. We believe that giving absolute runtimes in seconds makes little sense, given that they depend heavily on the computational environment, but the remark we added, in relative terms, is quite important for practical implementation. Broader considerations on computational complexity appear in another paper of ours, currently under review, which deals with a laboratory setting more appropriate for such a study.
- This draft has many minor errors.
In line 91, what is q? Just a typo; it has been removed.
In line 148, there is a second Section 2.2? Please check the whole draft and revise it thoroughly. This has been corrected.
Reviewer 2 Report
Comments and Suggestions for Authors
The submitted article presents a technically and scientifically relevant study on the detection of plastic waste in natural environments using hyperspectral imaging and machine learning techniques. The research is well-structured, methodologically sound, and contributes to an emerging field of environmental monitoring through remote sensing technologies.
The strengths of the article lie in its innovative use of hyperspectral imaging, well-designed experimental setup, and rigorous machine learning-based classification. The authors conducted systematic testing on multiple datasets, evaluated the impact of spectral normalization, and compared classification performance using different algorithms (SVM, LDA, mRMR feature selection). The study demonstrates the effectiveness of SWIR hyperspectral imaging for distinguishing plastics from other materials, which has potential applications in marine and terrestrial pollution monitoring.
The methodology is generally properly designed, with appropriate validation techniques, including cross-validation and Leave-One-Out testing, ensuring the reliability of the results. The findings indicate that combining hyperspectral data with machine learning algorithms enhances detection accuracy, making this approach promising for real-world implementation. The manuscript is supported by recent and high-quality references (2020–2024), sourced from reputable journals, conferences, and official organizations (ESA, UN, IEEE, Remote Sensing, Marine Pollution Bulletin).
However, while the article is technically solid, it has some methodological and contextual limitations that should be addressed to improve its scientific impact. The study lacks comparative analysis with alternative detection methods, such as multispectral imaging, thermal imaging, or radar-based approaches, which could provide a more comprehensive evaluation of hyperspectral imaging’s advantages and limitations. Additionally, the issue of model generalization remains a concern, as the models were tested on a limited dataset and not validated across different geographic locations, potentially affecting their robustness.
Another notable limitation is the sensitivity of the models to varying lighting conditions, as demonstrated by the poor classification performance on the December 2022 dataset due to incorrect exposure settings. This highlights the need for adaptive exposure mechanisms to ensure stable hyperspectral data acquisition. The study also does not evaluate the feasibility of real-time implementation on drones, nor does it discuss potential regulatory and legal challenges related to drone-based environmental monitoring. These aspects should be further explored or acknowledged to strengthen the study’s practical implications.
Overall, the article presents valuable findings in the field of environmental monitoring and hyperspectral imaging. The research is technically sound, well-referenced, and offers a novel approach to plastic waste detection.
However, some methodological and contextual limitations need to be addressed before publication. I recommend considering the article for publication after minor-to-moderate revisions, focusing on the following:
- Expanding the comparative analysis by discussing alternative remote sensing techniques for plastic detection.
- Addressing the model’s generalization capability by testing on diverse geographic datasets.
- Providing additional insights into real-time processing feasibility and computational efficiency on drone platforms.
- Discussing regulatory and legal aspects that may affect drone-based environmental monitoring.
Once these revisions are addressed, the article has strong potential for publication and will contribute significantly to advancements in hyperspectral remote sensing applications for environmental sustainability.
Author Response
We thank the reviewers for their careful and insightful reading of our manuscript, and for the constructive comments that prompted us to perform a major revision of the paper. The main change is a substantial increase in the number of case studies involved in the work. There are now seven: in the previous version we had included four cases, but then discarded one of them because the data were unusable due to an inadequate exposure setting. That case has now been omitted entirely, so the dataset contains the three cubes of the previous version plus four more, acquired under different conditions over more than four years. While most cubes are scans of scenarios where we placed small heaps of real plastic waste sorted by polymer, one of the added scenarios corresponds to a short flight across tunnel greenhouses covered by transparent PP sheet.
Based on these major changes, we reworked most of the experimental part, which allowed us to perform a more reliable leave-one-out procedure on the seven cubes and to demonstrate the application of a generalizing classifier to all cases. To make these changes clearer, we compare below the results section of the first version of the paper with that of the revised one.
In the first version, the following experiments were performed:
- First, we trained and tested on a single cube (same cube, but with separate training and test sets; this was apparently not clear in the manuscript), applying the classifier to an increasing number of features selected by mRMR. In this way, we decided on the number of features to use.
- Then, we trained on a single cube and tested on the others, obtaining unsatisfactory results in many cases. This experiment was performed without any attempt at normalizing the data.
- The third experiment was a leave-one-out procedure on four cubes, which gave quite satisfactory results for three cubes but poor results for the cube that we later discarded.
- Further, we tried several normalization procedures and decided that, since the simplest one is as effective as the others, we would apply it from then on.
- We showed the results of applying a classifier trained on all three cubes with normalization.
- Lastly, we performed a LOO procedure on the three cubes with normalization.
In the revised version:
- We consistently used seven cubes throughout the experimental section.
- At the beginning, we repeated the procedure for selecting the number of features to be used. In doing so, we corrected two small errors in the table and reworked the graph in a way that we find more convincing.
- We then proceeded directly to choosing the normalization strategy and used the simple normalization from that point on.
- We dropped the procedure of training on a single cube and testing on the others, and instead applied a proper LOO on the seven cubes.
- Finally, after training a classifier on all seven cubes, we show the results for every polymer on one cube, and then the results of applying that classifier to all cubes, both quantitatively, in terms of Kappa scores, and qualitatively, with figures of the results for the "all plastics" case only.
We think that this reorganization of the experiments and of the results section significantly improves the validation of the methodology, and that the paper, revised in this way, responds to the reviewers' major concerns about the significance of the experimental procedures.
In the following, we respond to the reviewer's concerns item by item.
The submitted article presents a technically and scientifically relevant study on the detection of plastic waste in natural environments using hyperspectral imaging and machine learning techniques. The research is well-structured, methodologically sound, and contributes to an emerging field of environmental monitoring through remote sensing technologies.
The strengths of the article lie in its innovative use of hyperspectral imaging, well-designed experimental setup, and rigorous machine learning-based classification. The authors conducted systematic testing on multiple datasets, evaluated the impact of spectral normalization, and compared classification performance using different algorithms (SVM, LDA, mRMR feature selection). The study demonstrates the effectiveness of SWIR hyperspectral imaging for distinguishing plastics from other materials, which has potential applications in marine and terrestrial pollution monitoring.
The methodology is generally properly designed, with appropriate validation techniques, including cross-validation and Leave-One-Out testing, ensuring the reliability of the results. The findings indicate that combining hyperspectral data with machine learning algorithms enhances detection accuracy, making this approach promising for real-world implementation. The manuscript is supported by recent and high-quality references (2020–2024), sourced from reputable journals, conferences, and official organizations (ESA, UN, IEEE, Remote Sensing, Marine Pollution Bulletin).
However, while the article is technically solid, it has some methodological and contextual limitations that should be addressed to improve its scientific impact. The study lacks comparative analysis with alternative detection methods, such as multispectral imaging, thermal imaging, or radar-based approaches, which could provide a more comprehensive evaluation of hyperspectral imaging’s advantages and limitations.
We have improved the introduction, adding a discussion of the state of the art limited to our specific purposes (the cited review papers discuss all issues more broadly) and an itemized description of the main contributions of our work. We believe that the existence of very good and recent review papers makes it inappropriate to attempt a broad review in this context, so we preferred to focus on the specific topics relevant to this work.
Additionally, the issue of model generalization remains a concern, as the models were tested on a limited dataset and not validated across different geographic locations, potentially affecting their robustness.
Indeed, we absolutely cannot claim that we have proved the generalization ability of our methodology to all possible scenarios. Nevertheless, in the revised version of the paper we have more than doubled the number of cases considered, even if still in controlled environments. We believe that, at least, the feasibility of obtaining generalized solutions is established in this way, and on this basis we are planning a new experimental campaign focused on case studies taken in completely real environments.
Another notable limitation is the sensitivity of the models to varying lighting conditions, as demonstrated by the poor classification performance on the December 2022 dataset due to incorrect exposure settings. This highlights the need for adaptive exposure mechanisms to ensure stable hyperspectral data acquisition.
This is an objective limitation of the hardware we used: the dynamic range of our SWIR camera is rather limited, so even an automatic mechanical diaphragm adjustment would make operation significantly easier in practice. We plan to acquire a higher-performance device as soon as possible, so for the moment improvements to the current system will be limited to reasonable adjustments.
The study also does not evaluate the feasibility of real-time implementation on drones,
We published a conference paper on this specific issue, cited as [23] in the revised version of the paper. As our real-time implementation has already been treated in a published contribution, we preferred not to add more detail here than this self-citation.
nor does it discuss potential regulatory and legal challenges related to drone-based environmental monitoring.
We added a paragraph to the introduction to discuss motivations for the use of drones, also mentioning the regulatory issues.
These aspects should be further explored or acknowledged to strengthen the study’s practical implications.
Overall, the article presents valuable findings in the field of environmental monitoring and hyperspectral imaging. The research is technically sound, well-referenced, and offers a novel approach to plastic waste detection.
However, some methodological and contextual limitations need to be addressed before publication. I recommend considering the article for publication after minor-to-moderate revisions, focusing on the following:
- Expanding the comparative analysis by discussing alternative remote sensing techniques for plastic detection.
- Addressing the model’s generalization capability by testing on diverse geographic datasets.
- Providing additional insights into real-time processing feasibility and computational efficiency on drone platforms.
- Discussing regulatory and legal aspects that may affect drone-based environmental monitoring.
Once these revisions are addressed, the article has strong potential for publication and will contribute significantly to advancements in hyperspectral remote sensing applications for environmental sustainability.
Reviewer 3 Report
Comments and Suggestions for Authors
The paper experiments with detecting plastic litter using hyperspectral aerial remote sensing combined with machine learning techniques. Both plastic waste detection through remote sensing and the use of hyperspectral sensors on UAVs (for various purposes) are actively researched topics; therefore, the paper has significant research potential.
The authors state that their research was carried out on a diverse dataset (different weather conditions, different seasons, different exposure settings) to prove the generalization capability of their method. Yet only 4 data acquisitions were used in the research, and, as discussed in the Results, one dataset was dropped due to wrong exposure settings. It is not clear why this dropped dataset was included in the paper at all. The suggested method was therefore evaluated on only 3 data acquisitions, which does not sufficiently validate its generalization capability; the evaluation dataset should be significantly enlarged for that.
In the results, Figure 5 shows an evaluation where the SVM classifier was trained and tested on the same dataset. Figures 6 and 7 show results where the SVM classifier was partially trained on the same dataset on which it is tested. To increase the scientific soundness of the method and the results, separate training and validation datasets should be defined.
The paper mentions deep learning methods multiple times and even discusses their strengths and challenges in the last section. However, the authors cite no related work, while multiple studies are available on the topic. Comparing results could also improve the paper.
Author Response
We thank the reviewers for their careful and insightful reading of our manuscript, and for the constructive comments that prompted us to perform a major revision of the paper. The main change is a substantial increase in the number of case studies involved in the work. There are now seven: in the previous version we had included four cases, but then discarded one of them because the data were unusable due to an inadequate exposure setting. That case has now been omitted entirely, so the dataset contains the three cubes of the previous version plus four more, acquired under different conditions over more than four years. While most cubes are scans of scenarios where we placed small heaps of real plastic waste sorted by polymer, one of the added scenarios corresponds to a short flight across tunnel greenhouses covered by transparent PP sheet.
Based on these major changes, we reworked most of the experimental part, which allowed us to perform a more reliable leave-one-out procedure on the seven cubes and to demonstrate the application of a generalizing classifier to all cases. To make these changes clearer, we compare below the results section of the first version of the paper with that of the revised one.
In the first version, the following experiments were performed:
- First, we trained and tested on a single cube (same cube, but with separate training and test sets; this was apparently not clear in the manuscript), applying the classifier to an increasing number of features selected by mRMR. In this way, we decided on the number of features to use.
- Then, we trained on a single cube and tested on the others, obtaining unsatisfactory results in many cases. This experiment was performed without any attempt at normalizing the data.
- The third experiment was a leave-one-out procedure on four cubes, which gave quite satisfactory results for three cubes but poor results for the cube that we later discarded.
- Further, we tried several normalization procedures and decided that, since the simplest one is as effective as the others, we would apply it from then on.
- We showed the results of applying a classifier trained on all three cubes with normalization.
- Lastly, we performed a LOO procedure on the three cubes with normalization.
In the revised version:
- We consistently used seven cubes throughout the experimental section.
- At the beginning, we repeated the procedure for selecting the number of features to be used. In doing so, we corrected two small errors in the table and reworked the graph in a way that we find more convincing.
- We then proceeded directly to choosing the normalization strategy and used the simple normalization from that point on.
- We dropped the procedure of training on a single cube and testing on the others, and instead applied a proper LOO on the seven cubes.
- Finally, after training a classifier on all seven cubes, we show the results for every polymer on one cube, and then the results of applying that classifier to all cubes, both quantitatively, in terms of Kappa scores, and qualitatively, with figures of the results for the "all plastics" case only.
We think that this reorganization of the experiments and of the results section significantly improves the validation of the methodology, and that the paper, revised in this way, responds to the reviewers' major concerns about the significance of the experimental procedures.
In the following, we respond to the reviewer's concerns item by item.
The paper experiments with detecting plastic litter using hyperspectral aerial remote sensing combined with machine learning techniques. Both plastic waste detection through remote sensing and the use of hyperspectral sensors on UAVs (for various purposes) are actively researched topics; therefore, the paper has significant research potential.
The authors state that their research was carried out on a diverse dataset (different weather conditions, different seasons, different exposure settings) to prove the generalization capability of their method. Yet only 4 data acquisitions were used in the research, and, as discussed in the Results, one dataset was dropped due to wrong exposure settings. It is not clear why this dropped dataset was included in the paper at all. The suggested method was therefore evaluated on only 3 data acquisitions, which does not sufficiently validate its generalization capability; the evaluation dataset should be significantly enlarged for that.
We removed the case characterized by improper exposure and added 4 new cases to the analysis. We believe that extending the number of cases in this way has substantially improved the quality and significance of the results, as discussed above.
In the results, Figure 5 shows an evaluation where the SVM classifier was trained and tested on the same dataset. Figures 6 and 7 show results where the SVM classifier was partially trained on the same dataset on which it is tested. To increase the scientific soundness of the method and the results, separate training and validation datasets should be defined.
We had not explained clearly that, even when training and testing on the same hyperspectral cube, we do indeed use separate training and test sets. This has been clarified in the revised paper. Moreover, in this revised version we applied a full leave-one-out procedure on the seven cubes, so the results are now more reliable.
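Schematically, the leave-one-cube-out evaluation works as sketched below (a minimal Python illustration; the actual implementation is in MATLAB, and the normalization and mRMR feature-selection steps are omitted here for brevity):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import cohen_kappa_score

def leave_one_cube_out(cubes, labels):
    """Train on the pixels of all cubes but one, test on the held-out cube.
    `cubes`: list of (n_pixels, n_bands) arrays; `labels`: matching label vectors."""
    kappas = []
    for held_out in range(len(cubes)):
        X_train = np.vstack([c for i, c in enumerate(cubes) if i != held_out])
        y_train = np.concatenate([l for i, l in enumerate(labels) if i != held_out])
        clf = SVC(kernel="rbf").fit(X_train, y_train)
        y_pred = clf.predict(cubes[held_out])
        kappas.append(cohen_kappa_score(labels[held_out], y_pred))
    return kappas  # one Kappa score per held-out cube
```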
The paper mentions deep learning methods multiple times and even discusses their strengths and challenges in the last section. However, the authors cite no related work, while multiple studies are available on the topic. Comparing results could also improve the paper.
We improved the discussion of DL methods in the introduction, adding one relevant reference (other papers already cited also deal with DL). A proper comparison of results was not possible in this work, because it would require a large experimental effort that we are indeed carrying out at present, but that we will only be able to publish a few months from now. Nevertheless, our argument is that ML techniques are more amenable to embedded implementation; therefore, even if they performed worse in terms of accuracy, they would still be preferable for the feasibility of embedded real-time solutions. That is why we concentrate on such a solution in this paper, which focuses on practical applications.
Reviewer 4 Report
Comments and Suggestions for Authors
The paper presents a methodology for detecting plastic litter in various environments using aerial hyperspectral imaging in the short-wave infrared (SWIR) range, combined with machine learning algorithms. The study integrates a push-broom hyperspectral sensor system mounted on a DJI Matrice 600 drone, utilizing spectral bands from 900–1700 nm. Data acquisition is conducted across different environmental conditions, with machine learning classifiers—primarily Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM)—applied to identify plastic types. The authors explore feature selection via the minimum-redundancy/maximum-relevance (mRMR) criterion, dataset generalization across multiple case studies, and simple normalization techniques to improve robustness against illumination variations. The results suggest that real-time plastic detection without complex calibration is feasible and that single-polymer classification may improve overall detection performance.
The introduction section merely lists existing research without summarizing it. Please start with a comprehensive literature review, comparing the advantages and limitations of each method. Then highlight the distinctions of your work from them, emphasizing your innovations.
The description in the methods section lacks detail, making it difficult for readers to replicate the experiment. For example, the authors should add information regarding the input parameters of the LDA and SVM.
Real-world conditions include partially buried plastic, mixed debris, weathering effects, and variations in moisture, which could degrade detection performance. The study assumes that hyperspectral pixels contain only one material, whereas real-world scenarios involve mixed pixels (e.g., plastic debris entangled with vegetation). Therefore, I suggest conducting larger-scale validation experiments in some major areas within the city (such as entire parks). The current validation scenarios are too small, making it difficult to rule out interference from other factors.
In Figure 3(a), the lower left corner of the image is incomplete.
Lines 226-228: equations should be numbered.
Author Response
We thank the reviewers for their careful and insightful reading of our manuscript, and for the constructive comments that prompted us to perform a major revision of the paper. The main change is a substantial increase in the number of case studies involved in the work. There are now seven: in the previous version we had included four cases, but then discarded one of them because the data were unusable due to an inadequate exposure setting. That case has now been omitted entirely, so the dataset contains the three cubes of the previous version plus four more, acquired under different conditions over more than four years. While most cubes are scans of scenarios where we placed small heaps of real plastic waste sorted by polymer, one of the added scenarios corresponds to a short flight across tunnel greenhouses covered by transparent PP sheet.
Based on these major changes, we reworked most of the experimental part, which allowed us to perform a more reliable leave-one-out procedure on the seven cubes and to demonstrate the application of a generalizing classifier to all cases. To make these changes clearer, we compare below the results section of the first version of the paper with that of the revised one.
In the first version, the following experiments were performed:
- First, we trained and tested on a single cube (same cube, but with separate training and test sets; this was apparently not clear in the manuscript), applying the classifier to an increasing number of features selected by mRMR. In this way, we decided on the number of features to use.
- Then, we trained on a single cube and tested on the others, obtaining unsatisfactory results in many cases. This experiment was performed without any attempt at normalizing the data.
- The third experiment was a leave-one-out procedure on four cubes, which gave quite satisfactory results for three cubes but poor results for the cube that we later discarded.
- Further, we tried several normalization procedures and decided that, since the simplest one is as effective as the others, we would apply it from then on.
- We showed the results of applying a classifier trained on all three cubes with normalization.
- Lastly, we performed a LOO procedure on the three cubes with normalization.
In the revised version:
- We consistently used seven cubes throughout the experimental section.
- At the beginning, we repeated the procedure for selecting the number of features to be used. In doing so, we corrected two small errors in the table and reworked the graph in a way that we find more convincing.
- We then proceeded directly to choosing the normalization strategy and used the simple normalization from that point on.
- We dropped the procedure of training on a single cube and testing on the others, and instead applied a proper LOO on the seven cubes.
- Finally, after training a classifier on all seven cubes, we show the results for every polymer on one cube, and then the results of applying that classifier to all cubes, both quantitatively, in terms of Kappa scores, and qualitatively, with figures of the results for the "all plastics" case only.
We think that this reorganization of the experiments and of the results section significantly improves the validation of the methodology, and that the paper, revised in this way, responds to the reviewers' major concerns about the significance of the experimental procedures.
In the following, we respond to the reviewer's concerns item by item.
The paper presents a methodology for detecting plastic litter in various environments using aerial hyperspectral imaging in the short-wave infrared (SWIR) range, combined with machine learning algorithms. The study integrates a push-broom hyperspectral sensor system mounted on a DJI Matrice 600 drone, utilizing spectral bands from 900–1700 nm. Data acquisition is conducted across different environmental conditions, with machine learning classifiers—primarily Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM)—applied to identify plastic types. The authors explore feature selection via the minimum-redundancy/maximum-relevance (mRMR) criterion, dataset generalization across multiple case studies, and simple normalization techniques to improve robustness against illumination variations. The results suggest that real-time plastic detection without complex calibration is feasible and that single-polymer classification may improve overall detection performance.
The introduction section merely lists existing research without summarizing it. Please start with a comprehensive literature review, comparing the advantages and limitations of each method. Then highlight the distinctions of your work from them, emphasizing your innovations.
We have improved the introduction, adding a discussion of the state of the art limited to our specific purposes (the cited review papers discuss all issues more broadly) and an itemized description of the main contributions of our work. We believe that the existence of very good and recent review papers makes it inappropriate to attempt a broad review in this context, so we preferred to focus on the specific topics relevant to this work.
The description in the methods section lacks detail, making it difficult for readers to replicate the experiment. For example, the authors should add information regarding the input parameters of the LDA and SVM.
We now briefly comment on the MATLAB implementation of the classifiers, specifying which hyperparameters of the SVM are optimized automatically, and we added a brief comment on the effects of these parameters on computational complexity. On the other hand, if by "input parameters" the reviewer means the actual data used as input variables for the classifiers, we believe we have made clear which features are selected from the entire cube, having also added short comments on the mRMR-selected features.
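For concreteness, below is a rough scikit-learn analog of the setup described (the paper's implementation is in MATLAB, where the SVM box constraint and kernel scale can be optimized automatically; the parameter grid here is purely illustrative):

```python
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV

# Cross-validated search over the two RBF-SVM hyperparameters, the rough
# analog of MATLAB's automatic box-constraint / kernel-scale optimization.
svm = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10, 100],            # box constraint
                "gamma": ["scale", 0.01, 0.1, 1]},  # kernel scale
    cv=5,
)

# Basic LDA has no hyperparameters to tune.
lda = LinearDiscriminantAnalysis()
```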
Real-world conditions include partially buried plastic, mixed debris, weathering effects, and variations in moisture, which could degrade detection performance. The study assumes that hyperspectral pixels contain only one material, whereas real-world scenarios involve mixed pixels (e.g., plastic debris entangled with vegetation). Therefore, I suggest conducting larger-scale validation experiments in some major areas within the city (such as entire parks). The current validation scenarios are too small, making it difficult to rule out interference from other factors.
The pixels of our data are very small (1-2 cm), so we may assume that most pixels are filled with a single material. The results on the whole images show that the classification has satisfactory accuracy at the pixel level. Indeed, we may neglect errors on pixels at the edges of objects, provided that each object is detected from its internal, or mostly internal, pixels. We plan to investigate the detection of plastics occupying only part of a pixel (possibly a small part) when we proceed to use satellite data, and also when considering scenarios where small objects are widely scattered. This is an important issue that we will address in future work, as briefly noted in the revised conclusions.
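For context, the mixed-pixel situation raised by the reviewer is commonly formalized with the linear mixing model, under which an observed pixel spectrum is a weighted sum of the pure-material spectra present in the pixel:

$$x = \sum_{k=1}^{K} a_k\, s_k + n, \qquad a_k \ge 0, \quad \sum_{k=1}^{K} a_k = 1,$$

where $x$ is the measured spectrum, $s_k$ the endmember spectra, $a_k$ their abundances, and $n$ noise; pixel-level classification as used here implicitly assumes one dominant endmember per pixel.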
The study of controlled scenarios is necessary for proper training (manually labeling complicated real scenes would not be feasible) and for obtaining generalizing classifiers that can then be applied in fully natural environments. In the continuation of the work, we plan to acquire data in real-world conditions, where plastic litter is normally present at very low density, even in rather polluted environments.
In Figure 3(a), the lower left corner of the image is incomplete.
We now explain in the caption that these images were obtained by simple juxtaposition of individual photographs, so they are not necessarily rectangular.
Lines 226-228: equations should be numbered.
The equations have been numbered in the revised version, even though they are never referenced.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
I have no comments.
Comments on the Quality of English Language
The English could be improved to more clearly express the research.
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have significantly revised their paper based on the reviewers' comments and correctly addressed most of my previous concerns. I recommend accepting it for publication.
Reviewer 4 Report
Comments and Suggestions for Authors
I recommend accepting this manuscript.