Hyperspectral Detection and Classification of Stain-Contaminated Waste Textiles

Zou, Jiacheng; He, Haonan; Tian, Wei; Zhu, Chengyan; Ye, Fei; Jin, Xiaoke

doi:10.3390/coatings16060629


by Jiacheng Zou ¹,², Haonan He ¹,², Wei Tian ¹,², Chengyan Zhu ¹,², Fei Ye ³ and Xiaoke Jin ¹,²,*

¹ College of Textile Science and Engineering (International Institute of Silk), Zhejiang Sci-Tech University, Hangzhou 310018, China

² State Key Laboratory of Bio-Based Fiber Materials, Zhejiang Sci-Tech University, Hangzhou 310018, China

³ Huzhou Institute of Quality and Technical Supervision and Inspection (Huzhou Fiber Quality Monitoring Center), Huzhou 313000, China

* Author to whom correspondence should be addressed.

Coatings 2026, 16(6), 629; https://doi.org/10.3390/coatings16060629

Submission received: 26 March 2026 / Revised: 17 May 2026 / Accepted: 20 May 2026 / Published: 22 May 2026

(This article belongs to the Special Issue Smart Materials and Textiles Coatings: Preparation, Properties and Applications)


Highlights

What are the main findings?

Carbon black, protein, and oil stains induce nonlinear spectral distortion on textile substrates.
FD-SVM achieves 100% accuracy but relies heavily on manual preprocessing selection.
RAW-CNN attains 100% accuracy directly from raw spectra without any preprocessing.

What are the implications of the main findings?

HSI enables simultaneous stain detection and fiber identification on waste textiles.
RAW-CNN simplifies real-time prediction for high-throughput waste textile sorting.
The framework offers a practical route for automated recycling-oriented textile inspection.

Abstract

Surface stain contamination poses a critical barrier to the automated, high-precision fiber identification required for industrial-scale waste textile recycling. In this study, a dataset comprising 120 physical specimens (yielding 1200 regions of interest, ROIs) across 12 substrate–stain categories (nine stained and three clean) was constructed by contaminating cotton, polyester, and poly-cotton blend textiles with carbon black, protein, and oil stains. The spectral interference effects of stains, including baseline drift and spectral overlapping induced by physical shielding and chemical absorption, were systematically analyzed. To identify the optimal classification pipeline, three mathematical preprocessing methods (First Derivative, FD; Standard Normal Variate, SNV; and Multiplicative Scatter Correction, MSC) were evaluated alongside Support Vector Machine (SVM) and One-Dimensional Convolutional Neural Network (1D-CNN) models. Among the SVM-based pipelines, the FD-SVM model effectively resolves overlapping absorption peaks and achieves an average accuracy of 98.17% ± 1.33%, but it remains highly dependent on mathematical preprocessing. In contrast, the 1D-CNN model, which employs a progressive stacking architecture of multi-scale convolutional kernels, attains a mean accuracy of 99.58% ± 0.56% under strict specimen-level 10-fold cross-validation. It does so directly from radiometrically calibrated raw spectra, thereby bypassing manual spectral feature engineering. These findings demonstrate that hyperspectral imaging (HSI) coupled with end-to-end deep learning provides a feasible and industrially deployable solution for simultaneous stain detection and fiber identification in waste textile sorting.

Keywords:

waste textiles; hyperspectral imaging; stain interference; One-Dimensional Convolutional Neural Network; support vector machine

Graphical Abstract

1. Introduction

As the world’s largest textile producer and consumer, China accounts for over 50% of global fiber processing volume, generating approximately 20 million tons of waste textiles annually [1]. Nevertheless, the current comprehensive recycling rate of waste textiles in China remains below 25%, lagging behind that of developed regions such as the European Union (where frontrunner countries like Belgium and the Netherlands have achieved separate collection rates ranging from 38% to 50% [2]), and thereby imposing substantial resource depletion and environmental burdens [3,4]. The Implementation Opinions on Accelerating the Circular Utilization of Waste Textiles, issued by China’s National Development and Reform Commission and other government agencies [5], explicitly mandates the establishment of a well-functioning waste textile recycling system by 2030, with breakthroughs in fiber identification and high-efficiency sorting technologies designated as priority tasks. The recycling workflow for waste textiles encompasses collection, sorting, pretreatment, and regeneration, among which the accurate detection of fiber composition and surface stains constitutes the critical prerequisite for efficient sorting and subsequent high-value resource recovery. In practice, the presence of surface stains and compositional heterogeneity on waste textiles significantly increases the difficulty of detection and identification, not only compromising classification accuracy but also introducing adverse effects in downstream regeneration processes. If stains or non-target components enter subsequent stages, they may cause fiber breakage during mechanical recycling and even damage critical components such as carding and opening equipment [6]; in chemical recycling, such contaminants can readily induce catalyst poisoning, severely impeding dissolution and depolymerization reactions [7]. 
There is, therefore, an urgent need for novel detection technologies capable of simultaneously identifying surface stains and fiber compositions.

Five principal detection methods are currently employed: manual inspection, microscopic observation, chemical dissolution, spectroscopic analysis, and image analysis. Manual inspection relies on the experience of sorting personnel who assess sample categories through tactile judgment; however, this approach suffers from low efficiency, strong subjectivity, and susceptibility to misjudgment, rendering it inadequate for increasingly complex contaminated and blended samples. Both microscopic observation and chemical dissolution perform localized identification on sectioned specimens; these destructive methods preclude holistic sample characterization and are excessively time-consuming, failing to meet the requirements of high-throughput industrial operations. Spectroscopic analysis identifies and classifies samples based on their spectral signatures, with the core principle being the measurement of differential interaction intensities—such as absorption and reflection—between a sample and electromagnetic radiation across different wavelength bands, thereby determining its chemical composition. Johns et al. [8] employed autofluorescence combined with K-means clustering analysis and successfully distinguished seven common textile materials using minimal spectral input; compared with conventional spectroscopic methods, this approach significantly improved speed but yielded an overall classification accuracy of 71%. Bonifazi et al. [9] utilized short-wave infrared (SWIR) spectroscopy coupled with partial least squares discriminant analysis (PLS-DA) to achieve accurate classification of wool and silk, though sensitivity remained low for certain blended fabrics and only surface-level information could be captured. 
Overall, although spectroscopic analysis can elucidate material composition at the molecular level, it is constrained by point-based or localized measurement modes and lacks spatial characterization capability, making it difficult to comprehensively address the complexity of stain-contaminated, multi-component waste textiles.

Image analysis relies on image processing and machine learning techniques to extract features such as pattern, color, and texture for sample classification. Furferi et al. [10] achieved automated color recognition and classification of recycled wool through probabilistic neural network (PNN)-based image analysis, though performance was unsatisfactory for dark-colored samples such as black and gray. Tian et al. [11] developed an attention mechanism-integrated convolutional neural network (CNN) that significantly outperformed conventional CNNs in qualitative classification of waste garments; however, this method was limited to visual analysis of garment style categories and was unable to identify fiber compositions or their proportions. It is thus evident that, while image analysis can efficiently extract spatial texture and appearance features, it critically lacks intrinsic chemical spectral information and is highly susceptible to the phenomena of “different spectra for the same material” (where the identical fiber substrate exhibits varying appearances due to different surface stains) and “same spectrum for different materials” (where distinct fiber substrates look identical because their intrinsic features are completely obscured by the same strong contaminant), thereby limiting the essential identification of fiber substrate attributes.

Hyperspectral Imaging (HSI) technology synergistically integrates the advantages of conventional spectroscopy and conventional imaging, enabling the simultaneous acquisition of both spatial and spectral information [12]. Unlike the point-based detection of near-infrared (NIR) spectroscopy and the three-channel acquisition of RGB imaging, HSI records hundreds of continuous narrow-band spectral data points at each pixel, forming a three-dimensional data cube that enables substantially finer material discrimination. This technology has been widely applied in precision agriculture, medical diagnostics, and food safety, and has also demonstrated considerable potential in the field of textile detection. Bonifazi et al. [13] employed a portable hyperspectral instrument combined with principal component analysis (PCA) and PLS-DA, achieving a fiber identification accuracy of up to 99.2%; their study demonstrated that flexible and rapid fiber identification is feasible without large-scale imaging apparatus, while simultaneously noting that partial spectral overlapping among multi-component fabrics limited the identification accuracy of blended textiles. Li et al. [14] employed visible-range hyperspectral reflectance imaging for the non-contact and non-destructive detection of blood stains on various colored substrates. By exploiting the characteristic absorption features of haemoglobin in the 400–500 nm range, the method enabled reliable discrimination of blood from a wide range of visually similar substances, including numerous red-colored contaminants, and demonstrated high sensitivity in detecting both visible and latent stains. Other researchers [15,16] combined HSI with false-color composite and multivariate analysis, as well as with discrete wavelet transform (DWT) and One-Dimensional Convolutional Neural Network (1D-CNN), respectively achieving visualization and rapid, accurate identification of chemical residues on textiles. 
In addition to conventional textile analysis, hyperspectral imaging also shows considerable potential for coated and functionally finished textiles, where surface coatings and finishing treatments can significantly alter optical reflectance behavior, contamination adhesion characteristics, and spectral response patterns. Therefore, robust stain-recognition and fiber-identification methods may further contribute to intelligent quality evaluation, contamination monitoring, and automated inspection in coated textile systems.

Despite these advances in hyperspectral textile analysis, the majority of existing studies were conducted on idealized clean samples, with limited attention devoted to the surface stains typically present on real-world waste textiles. Taking carbon black, protein, and oil stains as examples, these contaminants exhibit characteristic spectral signatures in the near-infrared region; upon binding with fibers, they not only cause spectral overlapping but also induce nonlinear spectral shifts, superposition, and even complete physical shielding, thereby reducing or entirely invalidating the sensitivity of models originally developed for clean samples. Consequently, effectively parsing the composite spectral information of stain-contaminated waste textiles and establishing highly robust classification models constitute a key challenge in advancing this technology from laboratory-scale identification to industrial-scale detection and sorting.

Existing studies that apply hyperspectral imaging and convolutional neural networks to textile analysis generally follow two separate directions: they either classify clean textile materials under ideal conditions or identify surface contaminants such as stains and chemical residues as independent classes. In contrast, this study treats stains not as isolated targets, but as an inherent attribute of waste textiles. By explicitly incorporating multiple types of contaminants (carbon black, protein, and oil) into the classification framework, the problem is reformulated as the recognition of textile substrates under complex interference conditions. Such interference causes significant spectral distortion, overlap, and attenuation, posing a substantially greater challenge than conventional settings. Therefore, the key distinction of this work lies in investigating whether deep learning models can effectively decouple intrinsic fiber spectral features from stain-induced interference, thereby achieving robust classification in realistic, contaminated scenarios.

To address the problem that surface stain interference reduces identification accuracy and hinders subsequent efficient recycling of waste textiles, the core hypothesis of this study is that despite the severe spectral variability and baseline drift introduced by surface contaminants, the intrinsic spectral features of the underlying textile fibers can still be effectively captured and decoupled by an end-to-end 1D-CNN, thereby bypassing the need for manual spectral feature engineering. Therefore, the specific objective of this study is to systematically investigate and compare the robustness of conventional machine learning and deep learning frameworks against complex stain interferences. Guided by this objective, this study selected cotton, polyester, and poly-cotton blend as substrate materials and prepared stained specimens contaminated with carbon black, protein, and oil. A spectral dataset covering 12 substrate–stain combinations was established. Based on hyperspectral characterization of these stained specimens, three preprocessing algorithms were comparatively analyzed in terms of their ability to resolve the spectral features of contaminated samples. Support Vector Machine (SVM) and 1D-CNN classification models were then constructed, and the differences between the resulting conventional machine learning and deep learning frameworks were systematically compared from the perspectives of qualitative classification performance, preprocessing dependency, and computational efficiency. The ultimate aim is to provide an innovative technical pathway for waste textile recycling systems and to lay a theoretical foundation for the development of automated detection and identification equipment.

2. Materials and Methods

2.1. Experimental Materials and Instrumentation

2.1.1. Experimental Materials

Three fabric substrates were selected: cotton, polyester, and 65/35 poly-cotton blend. Carbon black, protein powder, and soybean oil were employed as stain agents, supplemented with corresponding auxiliary reagents to prepare the stain solutions. The detailed specifications, purities, structural parameters (including density, thread count, and thickness), and sources of all experimental materials are listed in Table 1.

2.1.2. Hyperspectral Imaging System Parameters

Hyperspectral data were acquired using a GaiaSorter (Dual) hyperspectral imaging system (Jiangsu Shuangli Hepu Technology Co., Ltd., Wuxi, China). The system principally comprises an Imspector N25E spectral camera (Specim, Spectral Imaging Ltd., Oulu, Finland), a halogen lamp illumination unit, a motorized translation stage, and a dark enclosure, operating in a push-broom scanning mode. The spectral camera covers the short-wave near-infrared (SWNIR) range of 966–2560 nm. However, because the extreme edges of the detector array (<1000 nm and >2400 nm) exhibit significant instrumental dark current and an unacceptably low signal-to-noise ratio (SNR), the effective spectral range retained for all subsequent analyses was strictly constrained to 1000–2400 nm. This range reliably enables the capture of overtone and combination-band absorption features of specimens in the near-infrared region.

To ensure optimal SNR in both spatial resolution and spectral response, the key system parameters were optimized through preliminary experiments and configured as follows: camera exposure time of 6 ms, lens-to-sample working distance of 33.5 cm, translation stage scanning speed of 30 mm/s, and a full-frame spatial resolution of 384 × 288 pixels. With a field of view of approximately 280 mm × 210 mm at this working distance, the corresponding spatial pixel size is approximately 0.73 mm per pixel. Given that a typical textile fiber diameter (10–30 μm) is significantly smaller than this pixel resolution, the hyperspectral system primarily captures macroscopic spectral information at the yarn or fabric level rather than resolving individual fibers. During acquisition, a diffuse reflectance standard white reference panel was used to perform radiometric calibration, thereby eliminating artifacts arising from non-uniform illumination distribution and dark current noise. The optical path configuration and hardware components of the system are illustrated in Figure 1.
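The pixel-size figure quoted above follows directly from the field of view and the frame size. The short check below is only an illustrative sketch; the function and variable names are hypothetical, and the field of view is the approximate value given in the text.

```python
# Field of view and frame size as reported in the text (approximate values).
FOV_MM = (280.0, 210.0)   # width, height in mm
FRAME_PX = (384, 288)     # width, height in pixels


def pixel_size_mm(fov_mm, frame_px):
    """Return the (horizontal, vertical) sampling size in mm per pixel."""
    return tuple(f / p for f, p in zip(fov_mm, frame_px))


px_w, px_h = pixel_size_mm(FOV_MM, FRAME_PX)

# Number of fibre diameters spanned by one pixel for 30 um and 10 um fibres.
fibres_per_pixel = [px_w / d_mm for d_mm in (0.030, 0.010)]

print(f"pixel size: {px_w:.3f} mm x {px_h:.3f} mm")
print(f"fibres per pixel width: {fibres_per_pixel[0]:.0f} to {fibres_per_pixel[1]:.0f}")
```

Under these numbers a single pixel spans a few dozen fiber diameters, which is consistent with the statement that the system records yarn-level or fabric-level spectra.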

2.2. Stain Specimen Preparation

Cotton, polyester, and poly-cotton blend served as substrates for the preparation of three categories of stain specimens: carbon black, oil, and protein stains. A total of nine substrate–stain experimental groups were established, with ten replicate specimens prepared per group, yielding 90 valid stain specimens in total.

2.2.1. Preparation of Simulated Stain Solutions

Three standardized stain solutions were formulated with reference to the artificial soiled fabric preparation method specified in GB/T 13174-2021 [17] (Determination of Detergency and Cyclic Washing Performance of Laundry Detergents), in conjunction with the physicochemical characteristics of contaminants typically found on waste textiles:

(1): Carbon black stain solution: The dispersant Peregal O (fatty alcohol polyoxyethylene ether) was completely dissolved in an ethanol–water mixed solvent, after which carbon black powder at a mass fraction of 0.5% was added. A stable carbon black suspension was obtained following magnetic stirring and dispersion.
(2): Protein stain solution: Protein powder was placed in a beaker, and deionized water was gradually added in batches under thermostatic magnetic stirring until the solution reached a supersaturated state. After standing and filtration to remove undissolved precipitates, a highly viscous protein solution was obtained.
(3): Oil stain solution: Span-80 and Tween-80 were used as a compound emulsifier at a mass ratio of 65:35, with the oil-to-water mass ratio controlled at 3:1. The mixed system was maintained in a constant-temperature water bath at 50 °C, and the aqueous phase was slowly added dropwise into the oil phase under high-speed magnetic stirring until a stable milky emulsion was formed.

2.2.2. Stain Specimen Fabrication

To reproduce the localized and persistent stain features formed during the natural accumulation of waste textiles, stained specimens were prepared by combining pointwise quantitative dropping with air drying and thermal aging, so as to simulate the non-uniform diffusion and aging behavior of contaminants on fabric surfaces. The detailed procedures were as follows:

(1): Carbon black stained specimens: A 2 mL aliquot of carbon black suspension was drawn using a disposable quantitative dropper and vertically dropped onto the center of a clean, flat substrate. A glass rod was used to guide the liquid into an approximately circular stain, which was then left to dry naturally.
(2): Protein stain specimens: The protein solution was preheated to 40 °C to reduce its viscosity, and 2 mL was applied to the substrate surface using a disposable quantitative dropper. The samples were then placed in a forced-air drying oven at 60 °C for thermal aging for 2 h.
(3): Oil stain specimens: After heating the emulsion to 50 °C, 2 mL was dropped onto the center of the substrate using a disposable quantitative dropper. The sample was then left standing to allow full wetting and penetration, followed by accelerated aging at 60 °C for 4 h.

Photographs of all prepared stain specimens are presented in Figure 2.

2.3. Hyperspectral Data Acquisition and Calibration

2.3.1. Image Data Acquisition

Because textile materials are porous and partially light-transmissive, reflected signals from the sample stage may introduce additional noise if the incident light penetrates the specimen. To ensure that the acquired spectral information originated solely from the sample itself, each test specimen was placed on top of multiple layers of clean fabric made of the same substrate material during hyperspectral image acquisition.

Furthermore, to strictly eliminate any stray ambient light interference, the image acquisition process was conducted entirely within a closed darkroom. The ambient temperature and relative humidity were maintained at typical stable indoor laboratory conditions throughout the experiments to prevent environmental fluctuations from affecting the sensor response or sample moisture content.

2.3.2. Reflectance Calibration

Since the HSI system is susceptible to dark current noise from the CCD sensor and uneven illumination from the halogen light source, the raw images often exhibit brightness variation, with a brighter center and darker edges, as well as spectral baseline drift. To obtain the true relative reflectance of the sample surface and eliminate system-induced errors, black-and-white calibration was performed on the acquired raw hyperspectral images. The calibration formula is expressed as Equation (1):

R = \frac{I_{raw} - I_{dark}}{I_{white} - I_{dark}}

(1)

where R denotes the calibrated relative reflectance image, I_raw is the raw hyperspectral image of the sample, I_white is the image of a standard polytetrafluoroethylene (PTFE) white reference acquired under the same integration time and light-source intensity and represents the full-reflection state, and I_dark is the dark-background image acquired with the light source turned off and the lens completely covered, representing the system dark noise. This calibration procedure effectively improved the image signal-to-noise ratio and ensured the physical consistency of data acquired across different batches.
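Equation (1) can be applied element-wise to a whole data cube. The following is a minimal NumPy sketch, not the authors' acquisition software; the function name, the `eps` guard for dead pixels, and the synthetic digital numbers are all assumptions made for illustration.

```python
import numpy as np


def calibrate_reflectance(raw, white, dark, eps=1e-9):
    """Apply Equation (1) element-wise: R = (raw - dark) / (white - dark)."""
    raw = np.asarray(raw, dtype=float)
    white = np.asarray(white, dtype=float)
    dark = np.asarray(dark, dtype=float)
    denom = white - dark
    # Assumption: mark pixels where white and dark readings coincide as NaN
    # instead of dividing by zero.
    denom = np.where(np.abs(denom) < eps, np.nan, denom)
    return (raw - dark) / denom


# Toy cube: 2 lines x 3 pixels x 4 bands of synthetic digital numbers.
rng = np.random.default_rng(0)
dark = np.full((2, 3, 4), 100.0)
white = np.full((2, 3, 4), 4100.0)
true_r = rng.uniform(0.1, 0.9, size=(2, 3, 4))
raw = dark + true_r * (white - dark)

R = calibrate_reflectance(raw, white, dark)
```

In this toy case the recovered `R` equals the reflectance used to synthesize `raw`, which is the behavior Equation (1) is meant to guarantee when the white and dark references are acquired under the same conditions as the sample.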

2.3.3. Region of Interest (ROI) Extraction and Dataset Construction

The calibrated hyperspectral images still contain extraneous information such as the background sample stage. Moreover, to avoid interference from localized extreme values, an ROI extraction approach was employed to construct the spectral dataset. A total of 120 specimens were collected, comprising 90 stain specimens from the nine substrate–stain experimental groups and 30 clean specimens of identical specifications. Using the ENVI 5.6 software platform, ten non-overlapping rectangular ROIs (20 × 20 pixels each) were uniformly selected within the central region of each specimen’s hyperspectral image.

The mean spectral reflectance across all pixels within each ROI was computed and treated as an independent spectral observation. Based on this protocol, a high-dimensional dataset containing 1200 spectral samples was ultimately constructed, encompassing 12 substrate–stain combinations with 100 spectra per category. Crucially, to rigorously prevent spatial data leakage and ensure statistical independence during subsequent model evaluation, a specimen-level data partitioning strategy was strictly enforced. All 10 ROIs extracted from the exact same physical specimen were inherently grouped and assigned exclusively together to either the training, validation, or test set.
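The ROI averaging and the specimen-level partitioning described above can be sketched as follows. This is an illustrative NumPy example rather than the ENVI workflow used in the study; the function names, the random seed, and the round-robin fold assignment are assumptions, and class stratification is deliberately omitted for brevity.

```python
import numpy as np


def roi_mean_spectrum(cube, row, col, size=20):
    """Average all pixel spectra in a size x size ROI with top-left corner (row, col)."""
    patch = cube[row:row + size, col:col + size, :]
    return patch.reshape(-1, cube.shape[2]).mean(axis=0)


def specimen_level_folds(specimen_ids, n_folds=10, seed=0):
    """Assign every specimen, and hence all of its ROIs, to exactly one fold."""
    specimen_ids = np.asarray(specimen_ids)
    shuffled = np.random.default_rng(seed).permutation(np.unique(specimen_ids))
    fold_of_specimen = {sid: i % n_folds for i, sid in enumerate(shuffled)}
    return np.array([fold_of_specimen[sid] for sid in specimen_ids])


# 120 specimens x 10 ROIs = 1200 spectra, matching the dataset size in the text.
specimen_ids = np.repeat(np.arange(120), 10)
folds = specimen_level_folds(specimen_ids)

# Synthetic cube whose spectrum is identical at every pixel.
cube = np.ones((40, 40, 5)) * np.arange(5)
mean_spec = roi_mean_spectrum(cube, row=10, col=10)
```

Because the fold index is looked up per specimen rather than per spectrum, the ten ROIs of one specimen can never be split between training and evaluation data.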

2.4. Model Development

Although reflectance calibration can effectively eliminate system errors introduced by detector dark current and uneven illumination, the raw spectral data remain inevitably affected by multiple physical interferences. On the one hand, the stains deposited on the prepared specimens may alter photon scattering paths and induce diffuse reflection effects. On the other hand, minor instrumental fluctuations and ambient stray light may lead to baseline drift and random noise. These non-target signals often obscure informative spectral features and reduce the discriminative capability of subsequent models. Therefore, mathematical preprocessing of the raw spectra is a necessary step to improve the signal-to-noise ratio and enhance model performance. In this study, spectral preprocessing and subsequent classification model development were performed on the MATLAB R2024a software platform. To ensure a standardized and fair comparison of computational costs, all model training and evaluation procedures were executed on a workstation equipped with an Intel Core i7-13620H processor and 16 GB of RAM, without relying on GPU acceleration.

2.4.1. Spectral Data Preprocessing

(1): FD: To resolve the problem that substrate characteristic peaks in contaminated samples may be obscured or overlapped, the FD algorithm was applied to the spectral data. By calculating the rate of change in spectral reflectance with respect to wavelength, this method can effectively remove wavelength-independent additive baseline drift and sharpen broad overlapping peaks into separable characteristic signals. The calculation is shown in Equation (2):

FD(\lambda_{i}) = \frac{R(\lambda_{i+1}) - R(\lambda_{i})}{\lambda_{i+1} - \lambda_{i}}

(2)

where FD(λ_i) is the first-derivative value at the i-th wavelength point λ_i; R(λ_i) and R(λ_{i+1}) are the reflectance values at the i-th and (i + 1)-th wavelength points, respectively; and the denominator represents the wavelength sampling interval.
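A minimal NumPy sketch of the forward difference in Equation (2) is given below. The wavelength grid and the synthetic spectrum are illustrative assumptions; any smoothing that might accompany the derivative in practice is not described in the text and is not included here.

```python
import numpy as np


def first_derivative(reflectance, wavelengths):
    """Forward difference of Equation (2); the output has one band fewer than the input."""
    reflectance = np.asarray(reflectance, dtype=float)
    wavelengths = np.asarray(wavelengths, dtype=float)
    return np.diff(reflectance, axis=-1) / np.diff(wavelengths)


wavelengths = np.linspace(1000.0, 2400.0, 8)       # nm, illustrative grid
spectrum = 0.5 + 1e-4 * (wavelengths - 1000.0)     # linear ramp, slope 1e-4 per nm
shifted = spectrum + 0.2                           # same spectrum plus an additive offset

fd_a = first_derivative(spectrum, wavelengths)
fd_b = first_derivative(shifted, wavelengths)
```

The two derivative spectra coincide, which illustrates why a wavelength-independent additive baseline drift disappears after FD preprocessing.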

(2): SNV: This transformation was used to reduce the effects of surface scattering noise and optical path-length variation in diffuse reflectance spectra of solid samples. Unlike correction methods that depend on population-level statistics, SNV adopts an independent standardization strategy for each individual spectrum. Specifically, each spectrum is centered and variance-normalized according to Equation (3):

X_{SNV,i} = \frac{x_{i} - \bar{x}}{\sqrt{\frac{\sum_{k=1}^{m} (x_{k} - \bar{x})^{2}}{m - 1}}}

(3)

where m is the number of wavelength points; k = 1, 2, 3, …, m; X_SNV,i is the transformed value at the i-th wavelength point after SNV processing; x_i and x_k denote the reflectance at the i-th and k-th wavelength points of the same spectrum; and x̄ is the mean reflectance of that spectrum over all m wavelength points. By scaling all spectra to a common level, this method can effectively suppress spectral tilt and intensity variation caused by fabric wrinkling or local stain accumulation.
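Equation (3) amounts to a row-wise standardization using the sample standard deviation (denominator m − 1). The sketch below is an illustrative NumPy version with made-up reflectance values; the function name is an assumption.

```python
import numpy as np


def snv(spectra):
    """Row-wise SNV (Equation (3)): centre each spectrum, divide by its sample std."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)   # ddof=1 gives the m - 1 denominator
    return (spectra - mean) / std


base = np.array([0.2, 0.4, 0.5, 0.3, 0.6])
scaled = 2.0 * base + 0.3          # same shape, different gain and offset
z = snv(np.vstack([base, scaled]))
```

Both rows of `z` are identical, showing that a per-spectrum gain and offset are removed without reference to any other sample.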

(3): MSC: This method is likewise employed to eliminate spectral deviations arising from non-uniform particle distribution and differences in surface scattering levels, but it operates on the assumption of a mean reference spectrum. The algorithm first computes the mean spectrum across all samples as a reference baseline, and then performs a univariate linear regression between each measured spectrum and the reference spectrum. The calculation is shown in Equations (4) and (5):

X_{i} = a + b \cdot X_{ref,i} + e_{i}

(4)

X_{MSC,i} = \frac{X_{i} - a}{b}

(5)

where X_i is the measured value of a given spectrum at the i-th wavelength point; X_ref,i is the corresponding value of the mean reference spectrum; a and b are the intercept and slope obtained by regression fitting, respectively; X_MSC,i is the corrected value at the i-th wavelength point after MSC processing; and e_i is the spectral residual not explained by the linear model. By subtracting the intercept a and dividing by the slope b, MSC can effectively correct the additive baseline offset and multiplicative scaling induced by optical path-length variation, thereby aligning the spectral intensity baseline across different sample batches.
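Equations (4) and (5) can be sketched with an ordinary least-squares line fit per spectrum. The NumPy example below is illustrative only; the function name and the synthetic spectra are assumptions. The optional `reference` argument is also an assumption, added so that a reference computed on training data could be reused on held-out data, which the text does not discuss.

```python
import numpy as np


def msc(spectra, reference=None):
    """MSC per Equations (4)-(5): regress each spectrum on the reference, then correct."""
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = spectra.mean(axis=0)           # mean spectrum across all samples
    corrected = np.empty_like(spectra)
    for n, x in enumerate(spectra):
        b, a = np.polyfit(reference, x, deg=1)     # slope b, intercept a
        corrected[n] = (x - a) / b
    return corrected, reference


ref = np.array([0.30, 0.35, 0.50, 0.45, 0.60])
spectra = np.vstack([1.2 * ref + 0.05, 0.8 * ref - 0.02])   # gain and offset variants

corrected, _ = msc(spectra, reference=ref)
corrected_auto, mean_ref = msc(spectra)
```

With an explicit reference both spectra collapse onto `ref`; with the default mean reference they collapse onto the mean spectrum instead, so the choice of reference determines the common baseline.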

2.4.2. Establishment of Classification Models

Hyperspectral data are high-dimensional, integrating imaging and spectroscopy, and contain abundant spatial texture and spectral fingerprint information. However, as the number of spectral bands increases, data redundancy also rises substantially, and the mixing of stains with textile substrates often results in a complex nonlinear mapping between spectral features and class labels. Among currently available modeling algorithms, PLS-DA, Random Forest (RF), and Extreme Learning Machine (ELM) each offer specific advantages. Nevertheless, when confronted with the high heterogeneity and spectral overlapping inherent to the substrate–stain composite systems investigated here, these conventional algorithms may be constrained by limited shallow feature extraction capability or insufficient fitting power for strong nonlinear relationships, making it difficult to achieve optimal robustness. To investigate how different algorithmic architectures differ in overcoming stain interference, two representative classifiers were selected for comparative modeling.

(1): SVM: The SVM model is a classical supervised learning algorithm that achieves class separation by identifying an optimal hyperplane in a high-dimensional feature space and is particularly advantageous for small-sample, high-dimensional classification tasks. Considering the high-dimensional and nonlinear nature of spectral data, the radial basis function (RBF) was selected as the kernel function to map the input vectors into a higher-dimensional space and address linear inseparability. Its mathematical expression is given in Equation (6):

K(x_{i}, x_{j}) = \exp(-\gamma \|x_{i} - x_{j}\|^{2})

(6)

where x_i and x_j are spectral vectors in the feature space, ‖x_i − x_j‖² denotes the squared Euclidean distance, and γ is the kernel parameter controlling the distribution width of the kernel function. During model construction, a grid-search strategy was adopted to optimize the penalty coefficient (C) and kernel parameter (γ) jointly. The search space for C was predefined within {0.1, 1, 10, 100, 1000}, while the kernel parameter γ was evaluated across the discrete set {‘scale’, ‘auto’, 0.001, 0.01, 0.1, 1}. Given the specimen-level 10-fold cross-validation framework, the optimal hyperparameters were determined independently within the training subset of each fold to strictly prevent data leakage. Results indicate that the extracted optimal hyperparameters exhibited high stability across the 10 folds, with the most frequently converged optimal combination for the best-performing FD-SVM model being C = 0.1 and γ = ‘scale’. The one-versus-rest strategy was employed to address the multiclass classification task.
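The kernel in Equation (6) and the search grid listed above can be written out as in the sketch below. It is illustrative only. The text does not define the labels 'scale' and 'auto'; the helper `resolve_gamma` assumes they follow the scikit-learn `SVC` convention (scale = 1 / (n_features · Var(X)), auto = 1 / n_features), which is an assumption rather than something the source confirms. All function names are hypothetical.

```python
import itertools

import numpy as np


def rbf_kernel(A, B, gamma):
    """Equation (6) evaluated for every pair of rows in A and B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    sq = np.maximum(sq, 0.0)            # clip tiny negatives caused by round-off
    return np.exp(-gamma * sq)


def resolve_gamma(gamma, X):
    """Map 'scale'/'auto' to numbers (assumed scikit-learn convention)."""
    X = np.asarray(X, dtype=float)
    if gamma == "scale":
        return 1.0 / (X.shape[1] * X.var())
    if gamma == "auto":
        return 1.0 / X.shape[1]
    return float(gamma)


C_GRID = [0.1, 1, 10, 100, 1000]
GAMMA_GRID = ["scale", "auto", 0.001, 0.01, 0.1, 1]
grid = list(itertools.product(C_GRID, GAMMA_GRID))   # candidate (C, gamma) pairs

X = np.array([[0.0, 0.0], [1.0, 0.0]])
K = rbf_kernel(X, X, gamma=0.5)
```

In a leakage-free setup each of the `grid` candidates would be scored only on the training portion of a fold, with specimen-level grouping preserved in the inner split as well.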

(2): 1D-CNN: The 1D-CNN model is a deep feedforward neural network that performs sliding-window operations along spectral sequences using convolutional kernels. Through local receptive fields and weight-sharing mechanisms, it can automatically extract hierarchical features with translation invariance from high-dimensional spectral data. The core convolution operation can be expressed as Equation (7):

y_{j}^{l} = \sigma\left(\sum_{i=1}^{K} w_{i}^{l} \cdot x_{j+i-1}^{l-1} + b^{l}\right)

(7)

where y_j^l is the output at position j of layer l; x^{l−1} is the input from the preceding layer; K is the kernel size; w denotes the convolutional kernel weight; b is the bias term; and σ is the nonlinear activation function. Compared with conventional fully connected networks, 1D-CNN not only substantially reduces the number of model parameters but also preserves the waveform structure of spectral data. Based on the integrated spatial–spectral characteristics of hyperspectral data, the 1D-CNN architecture designed in this study adopted a progressive stacking architecture of macro-to-micro multi-scale convolutional kernels.
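For a single input channel and a single filter, Equation (7) reduces to a sliding dot product followed by the activation. The NumPy sketch below assumes ReLU as σ, no padding, and a stride of one; these settings and the toy values are illustrative assumptions and not the configuration of the trained network.

```python
import numpy as np


def conv1d_relu(x, w, b):
    """Equation (7) for one channel and one filter: sliding dot product, bias, ReLU."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    k = w.size
    # Position j uses inputs x[j], ..., x[j + k - 1], matching the index j + i - 1.
    out = np.array([np.dot(w, x[j:j + k]) + b for j in range(x.size - k + 1)])
    return np.maximum(out, 0.0)


x = np.array([0.0, 1.0, 3.0, 2.0, 0.0])   # toy spectrum
w = np.array([-1.0, 0.0, 1.0])            # simple slope detector, K = 3
y = conv1d_relu(x, w, b=0.0)              # -> [3., 1., 0.]
```

The same three weights are reused at every position, which is the weight-sharing property that keeps the parameter count low relative to a fully connected layer.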

Strategically, the feature extraction leverages a macro-scale convolutional kernel (size 21) in the initial layer to capture the global spectral profile and prominent absorption valleys. As the network deepens, the kernel sizes are progressively reduced to focus on subtle peak-shape differences. Specifically, within each convolutional block, the operations are executed in a strict sequence: 1D Convolution, Batch Normalization (BN), and Rectified Linear Unit (ReLU) activation. BN mitigates internal covariate shift, max-pooling layers compress feature dimensionality, and Dropout (rate = 0.5) is embedded to prevent overfitting. The model is optimized using the Adam algorithm with a categorical cross-entropy loss. To further guard against overfitting during the specimen-level 10-fold cross-validation, an early stopping criterion is enforced, halting training if the validation loss fails to improve for 15 consecutive epochs. For transparency and reproducibility, the complete network topology and hyperparameter configurations are detailed in Table 2.
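The early stopping rule described above can be sketched in plain Python as follows. This is an illustrative assumption about the bookkeeping, not the authors' training code: the class `EarlyStopping` and the function `train_with_early_stopping` are hypothetical names, and the validation losses are supplied as a precomputed list instead of coming from a real network.

```python
class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=15):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.stale_epochs = 0

    def update(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


def train_with_early_stopping(val_losses, patience=15):
    """Return (epoch at which training stopped, epoch with the best loss)."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(val_losses):
        if stopper.update(epoch, loss):
            return epoch, stopper.best_epoch
    return len(val_losses) - 1, stopper.best_epoch


# Toy loss curve: improves for three epochs, then plateaus.
stopped_at, best_epoch = train_with_early_stopping([1.0, 0.8, 0.6] + [0.7] * 20)
```

In practice the weights from `best_epoch` would typically be restored after stopping; whether the study did so is not stated in the text.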

Given that both SVM and CNN possess strong nonlinear mapping capability, both are expected to achieve high classification accuracy on limited datasets. However, accuracy alone is insufficient to comprehensively assess their suitability for complex industrial scenarios. Therefore, this study does not confine the evaluation to a single accuracy metric; instead, it further examines differences in model robustness under different stain interferences, the interpretability of feature extraction mechanisms, and actual computational efficiency, with the aim of providing broader decision support for waste textile detection and identification technologies.

3. Results and Discussion

3.1. Spectral Feature Analysis of Stained Specimens

The near-infrared spectral response of waste textiles is a coupled manifestation of the fiber substrate and its surface physicochemical properties. Similar spectral-interference mechanisms are also commonly encountered in coated or surface-engineered textiles, where functional coating layers may modify surface scattering behavior, optical absorption, and contaminant interaction characteristics. In the present study, the spectral information of stained specimens represents a superposition of the spectral features of both the fiber substrate and the surface stain. To elucidate the interaction between these two contributors, the mean reflectance data of all samples were extracted over the 1000–2400 nm wavelength range, and the spectral curves were plotted separately for the three substrate categories—cotton, polyester, and poly-cotton blend—as shown in Figure 3.

3.1.1. Spectral Feature Analysis of Fiber Substrates

As indicated by the black curves in Figure 3, the three fabrics exhibited distinct and differentiable spectral fingerprints under clean conditions, which serve as the primary basis for material identification:

(1): Cotton: Dominated by the abundant hydrophilic hydroxyl groups (–OH) in cellulose molecules, the cotton spectrum displays broad and deep characteristic absorption valleys near 1480 nm and 1940 nm, corresponding to the first overtone and combination-band stretching vibrations of O–H bonds, respectively. Additionally, a characteristic combination-band peak near 2100 nm arises from the coupling of internal O–H bending and C–O stretching vibrations within the cellulose structure.
(2): Polyester: As a hydrophobic synthetic fiber, polyester exhibits extremely weak water absorption at 1940 nm. Instead, owing to the presence of aromatic rings and methylene groups in its molecular chain, polyester displays sharp and distinctive dual-valley structural peak profiles at 1660 nm (C–H first overtone) and 2130 nm (C–H combination band), forming a pronounced contrast with cotton.
(3): Poly-cotton blend: The spectral curve of the blended fabric exhibits a characteristic additive effect, with the overall waveform lying intermediate between those of pure cotton and pure polyester. It retains the broad hydrophilic valley of cotton at 1940 nm while also displaying the characteristic sharp peak of polyester at 1660 nm, reflecting the spectral superposition behavior of the mixed components.

3.1.2. Analysis of Stain-Induced Interference Effects

Comparison of the reflectance curves obtained from different stain-loaded fabrics—cotton, polyester, and poly-cotton blend (red, blue, and green curves in Figure 3)—reveals that surface stain loading induced varying degrees of spectral distortion. These spectral alterations are not random noise but originate from the characteristic near-infrared absorption of specific molecular structures and functional groups inherent to the contaminants. From the perspective of surface engineering, these stain layers may produce surface effects similar to those observed in irregular transient coating systems, including additional absorption, scattering, and shielding phenomena that substantially modify the original spectral response of textile substrates. Three distinct interference effects were identified:

(1): Carbon black: Regardless of the fiber substrate, the spectral reflectance after carbon black loading decreased abruptly across the entire wavelength range and exhibited pronounced flattening. This is attributable to the amorphous carbon and microcrystalline graphite structures in carbon black, which give rise to electronic energy-level transitions that produce strong, non-selective absorption from the visible to the near-infrared region. Following this near-infrared light absorption, the amorphous carbon typically undergoes rapid non-radiative relaxation, converting the absorbed photon energy into local heat [18]. This broadband absorption almost completely shields the interaction between incident light and the underlying fiber, thereby severely masking the substrate fingerprint features.
(2): Oil stain: The principal constituent of oil stains is triglyceride, which is rich in methylene groups (–CH₂–) from long-chain fatty acids. At the physical-optical level, the oil film fills the interstitial spaces between fibers, alters the effective surface refractive index, and causes an overall downward shift in the spectral baseline. At the molecular–structural level, the dense C–H bonds in the oil generate specific absorption near 1720 nm (first overtone) and 2300 nm (combination band). Because the polyester substrate also contains abundant C–H bonds, the characteristic absorption positions of the two sources overlap substantially, causing the oil-stain spectrum to manifest as a complex superposition onto the substrate spectrum. This renders the direct waveform-based differentiation between oil-contaminated polyester and clean polyester particularly challenging.
(3): Protein stain: Compared with oil and carbon black stains, the spectral interference of protein stains is more complex owing to the presence of specific functional groups within the protein macromolecular backbone. The backbone contains abundant peptide bonds (–CO–NH–), and the characteristic N–H and C=O bonds exhibit strong absorption near 2060 nm (Amide II band) and 2180 nm (Amide I/III combination band). As observed in Figure 3, these exogenous characteristic peaks closely flank the combination-band feature of cotton (2100 nm) and the C–H combination band of polyester (2130 nm), producing a pronounced cross-overlapping effect. The introduction of these chemical groups causes the characteristic peaks in the composite spectrum to broaden, split, or even shift in position, which can readily trigger the misidentification phenomenon of “different spectra for the same material.”

In summary, stains encountered in real-world recycling scenarios not only degrade the spectral signal-to-noise ratio but also disrupt the intrinsic features of the substrate through physical shielding (carbon black), baseline shift combined with bond-energy overlap (oil stain), and amide-bond spectral overlapping (protein stain). The resulting complex coupled spectra—composed of substrate fingerprints and stain interferences—cannot be reliably identified through single-band analysis or simple linear correlation methods. It is therefore essential to introduce effective spectral preprocessing techniques to correct physical scattering noise and to combine them with highly robust machine learning models, in order to achieve in-depth decoupling and accurate classification of both fiber compositions and stain categories within a multi-dimensional data space.

3.2. Comparative Analysis of Preprocessing Methods

To evaluate the denoising and feature-enhancement effects of different preprocessing algorithms on stain-contaminated spectra, a qualitative analysis was first conducted from the perspective of spectral morphology. Figure 4 presents the raw spectra (RAW) and the preprocessed spectral curves after FD, SNV, and MSC treatment for specimens of all stain types on the three substrate categories: cotton, polyester, and poly-cotton blend.

Examination of the raw spectral images in Figure 4a,e,i reveals that, owing to light scattering induced by fabric surface texture and non-uniform stain diffusion, the spectral curves of clean, protein-stained, and oil-stained specimens (black, red, and green curves) on all three substrates exhibit varying degrees of baseline drift along the ordinate. This drift is most pronounced in the cotton–protein specimens. Notably, the carbon black spectral curves (blue curves) across all three substrates are uniformly flat, with reflectance values substantially lower than those of the other specimens on the same substrate, confirming that the shielding effect of carbon black severely suppresses the spectral response.

Regarding the baseline drift issue, Figure 4c,d,g,h,k,l show that both the SNV and MSC algorithms demonstrate pronounced convergence effects when processing clean, protein-stained, and oil-stained specimens, effectively reducing the intra-class variability among individual samples. However, a closer comparison of the spectral morphologies after SNV and MSC processing reveals a fundamental difference in how the two algorithms handle spectrally similar versus spectrally extreme samples. MSC performs correction by fitting each spectrum to the mean reference spectrum; while correcting scattering effects, it preserves the physical relative magnitude of each spectrum, such that real inter-class differences among clean and the three stain categories are retained, although no additional enhancement of spectral features is achieved compared with the raw spectra. In contrast, SNV adopts a single-sample independent standardization strategy. When applied to morphologically similar samples (clean, protein, and oil), it reduces intra-class variance and compresses the reflectance differences between classes across most wavelength bands, thereby highly compacting the spectral curves of different categories. However, when applied to the spectrally extreme carbon black samples, SNV not only amplifies the residual substrate spectral response that was not completely shielded by carbon black, but also disproportionately stretches the noise, resulting in a strongly oscillatory waveform with dramatically inflated variance. Examination of the FD images in Figure 4b,f,j reveals that, compared with SNV and MSC, the FD algorithm performs a more thorough morphological reshaping of the spectra by computing the rate of change in reflectance with respect to wavelength. It forcibly eliminates irrelevant background drift and converts the broad absorption valleys in the raw spectra into sharp extremum peaks, thereby substantially sharpening the spectral profile features.
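The behavior of the three preprocessing methods discussed above can be illustrated with a minimal Python sketch. Several choices here are assumptions, since the exact implementations are not specified in this section: FD is approximated by an adjacent-band finite difference without smoothing, SNV uses the sample standard deviation, and MSC regresses each spectrum on the mean of the supplied spectra unless a reference is passed explicitly. All function names and toy values are hypothetical.

```python
import math


def first_derivative(spectrum):
    """Adjacent-band finite difference (assumes uniform band spacing)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def snv(spectrum):
    """Standard Normal Variate: centre and scale a spectrum by its own stats."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]


def msc(spectra, ref=None):
    """Multiplicative Scatter Correction against a reference spectrum."""
    n_bands = len(spectra[0])
    if ref is None:  # default reference: mean spectrum of the supplied set
        ref = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_bands)]
    ref_mean = sum(ref) / n_bands
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_bands
        # Least-squares fit s ~ intercept + slope * ref, then invert it.
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


base = [0.2, 0.5, 0.3, 0.8]              # toy spectrum (made up)
shifted = [v + 0.1 for v in base]        # constant baseline offset
scaled = [2.0 * v + 0.1 for v in base]   # multiplicative + additive scatter
```

On these toy inputs, FD gives the same result for `base` and `shifted`, while SNV and MSC map `base` and `scaled` onto the same curve. Under cross-validation, the MSC reference would normally be computed from the training subset and passed in through `ref` when correcting test spectra.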

To quantitatively evaluate the effectiveness of the preprocessing methods beyond visual inspection, the Fisher discriminant ratio and the Silhouette coefficient were introduced to objectively measure class separability in the spectral feature space. Specifically, the Fisher ratio quantifies the ratio of inter-class distance to intra-class dispersion, while the Silhouette coefficient (ranging from −1 to 1) evaluates how well a sample aligns with its own class compared to neighboring classes. As shown in Table 3, the raw spectral data exhibited poor separability.
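For readers who wish to reproduce these separability measures, the sketch below shows one possible computation in plain Python. The exact Fisher ratio formula used in the study is not given in this section, so the multiclass form used here (between-class scatter divided by within-class scatter, both as sums of squared Euclidean distances) is an assumption; the Silhouette coefficient follows its usual definition with Euclidean distance. The function names and toy data are hypothetical.

```python
import math


def _dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def _mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def fisher_ratio(X, y):
    """Between-class scatter / within-class scatter (assumed definition)."""
    overall = _mean_vec(X)
    between = within = 0.0
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centre = _mean_vec(rows)
        between += len(rows) * _dist(centre, overall) ** 2
        within += sum(_dist(x, centre) ** 2 for x in rows)
    return between / within


def silhouette(X, y):
    """Mean silhouette over all samples; needs >= 2 classes, y given as a list."""
    scores = []
    for i, (x, label) in enumerate(zip(X, y)):
        own = [_dist(x, X[j]) for j in range(len(X)) if y[j] == label and j != i]
        if not own:  # singleton class: silhouette defined as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to own class
        b = min(                 # mean distance to the nearest other class
            sum(_dist(x, X[j]) for j in range(len(X)) if y[j] == other)
            / y.count(other)
            for other in set(y) if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


labels = [0, 0, 1, 1]
separated = [[0, 0], [0, 1], [10, 0], [10, 1]]  # toy: two compact clusters
mixed = [[0, 0], [10, 1], [10, 0], [0, 1]]      # toy: classes interleaved
```

On the toy data, the separated configuration yields a large Fisher ratio and a Silhouette value close to 1, whereas the interleaved configuration yields a Fisher ratio of zero and a negative Silhouette value.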

Mechanistically, while SNV and MSC effectively normalize global spectral intensity to mitigate physical scattering, they are generally less effective at resolving the severely overlapping absorption peaks caused by complex stains, resulting in more marginal quantitative improvements in the Fisher ratio and Silhouette score. In contrast, the FD transformation substantially mitigates constant baseline drift and helps sharpen local overlapping absorption features. By doing so, FD facilitates improved linear separability of the spectral data, leading to a more noticeable increase in both metrics. This quantitative and mechanistic improvement is a key contributing factor to the superior classification performance observed in the subsequent FD-SVM model. Given that qualitative morphological analysis of spectra cannot fully represent the high-dimensional feature extraction logic of classification algorithms, the present study retains four data configurations—raw spectra (RAW), FD, SNV, and MSC—as independent inputs and employs quantitative classification metrics to identify the optimal combination of preprocessing method and classifier.

3.3. Analysis of SVM Model

To rigorously prevent spatial data leakage and evaluate model robustness, the 1200 samples were analyzed using a specimen-level 10-fold cross-validation strategy rather than a simple random split. The SVM model results corresponding to each preprocessing method are summarized in Table 4, and the aggregate test-set confusion matrices are presented in Figure 5. It is worth noting that the aggregate confusion matrices are generated by accumulating the classification results from the strictly independent test set of each fold during the cross-validation process.
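The idea of a specimen-level split can be sketched as follows: ROIs are grouped by the physical specimen they came from, and whole specimens, not individual ROIs, are assigned to folds. This is a simplified illustration under stated assumptions; it does not stratify by class, and the function name `specimen_level_folds`, the seed, and the integer specimen IDs are hypothetical.

```python
import random
from collections import defaultdict


def specimen_level_folds(specimen_ids, n_folds=10, seed=0):
    """Yield (train_idx, test_idx); all ROIs of a specimen share one fold."""
    by_specimen = defaultdict(list)
    for idx, sid in enumerate(specimen_ids):
        by_specimen[sid].append(idx)
    specimens = sorted(by_specimen)
    random.Random(seed).shuffle(specimens)
    for k in range(n_folds):
        test_specimens = set(specimens[k::n_folds])
        test_idx = [i for s in test_specimens for i in by_specimen[s]]
        train_idx = [
            i for s in specimens if s not in test_specimens
            for i in by_specimen[s]
        ]
        yield train_idx, test_idx


# 120 specimens x 10 ROIs each = 1200 samples, matching the dataset size.
specimen_ids = [s for s in range(120) for _ in range(10)]
folds = list(specimen_level_folds(specimen_ids))
```

With these sizes each fold holds out 12 specimens (120 ROIs) and trains on the remaining 1080 ROIs, so no specimen contributes ROIs to both sides of a split. Library utilities such as scikit-learn's `GroupKFold` or `StratifiedGroupKFold` provide comparable grouping behavior.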

In terms of overall classification accuracy, the RAW-SVM model built on unprocessed spectra performed poorly, achieving an average test-set accuracy of only 85.17% ± 3.14%. Furthermore, an analysis of model complexity revealed severe overfitting in the RAW-SVM model, which utilized a massive average of 605.4 support vectors (SVs) per fold. This indicates that the RBF kernel was forced to construct an overly complex decision boundary to memorize the noisy raw spectra. As shown in the aggregate confusion matrix in Figure 5a, significant confusion occurred among carbon black-stained substrates: cotton-carbon specimens were frequently misclassified as poly-cotton blend-carbon (60 instances), and conversely, blend-carbon was misclassified as cotton-carbon (50 instances). This indicates that the strong non-selective absorption and physical shielding of carbon black almost completely masked the underlying substrate differences between pure cotton and poly-cotton blend. In addition, poly-cotton blend-oil specimens were often misclassified as poly-cotton blend-protein (16 instances), indicating that without preprocessing, baseline drift and physical scattering interference cause severe spectral overlapping of oil and protein characteristic peaks. These represent the critical identification blind spots within the RAW-SVM model.

After the introduction of preprocessing algorithms, the classification accuracy improved markedly across all models. The SNV-SVM model achieved an average accuracy of 93.67% ± 1.07% (Figure 5c). The MSC-SVM model raised the average accuracy to 94.83% ± 2.52%; however, as shown in Figure 5d, misclassification of poly-cotton blend-oil specimens as polyester-oil (19 instances) emerged, indicating that its capacity to separate overlapping bond energies under baseline shifts remained limited. By comparison, the FD-SVM model exhibited the best performance, attaining an average accuracy of 98.17% ± 1.33% on the test set (Figure 5b). Crucially, the FD-SVM model required an average of only 125.7 support vectors per fold, compared with 605.4 for RAW-SVM, indicating a substantial reduction in model complexity and suggesting improved generalization capability. This result suggests that the FD algorithm effectively sharpened and amplified the subtle morphological differences between blended and pure substrates, substantially reducing the compositional confusion observed in the RAW and MSC models.

Regarding computational efficiency, the average fold training times of the RAW-SVM, MSC-SVM, and SNV-SVM models ranged between 11.25 and 16.02 s, whereas the FD-SVM model required the longest training time of 24.68 s. This is primarily because the FD algorithm, while sharpening spectral features, simultaneously amplified high-frequency random noise during the model training phase, thereby prolonging the grid-search optimization process for identifying the optimal hyperparameters and imposing a greater computational burden on the SVM model compared with SNV and MSC.

Overall, in the context of chemical recycling of waste textiles, if a model such as RAW-SVM or MSC-SVM misclassifies a poly-cotton blend as pure polyester, non-glycolyzable cellulose impurities would be introduced into the polyester recovery line, potentially causing downstream catalytic process failure. Although the FD-SVM model incurs a longer grid-search training time, its highly robust and accurate performance across test samples under the specific experimental conditions of this study, combined with rapid test-set inference, makes it the optimal choice among the SVM-based models for ensuring both detection accuracy and process safety.

3.4. Analysis of 1D-CNN Model

To examine the adaptive feature-extraction capability of deep learning under different stain interferences, as well as its dependence on manual spectral preprocessing, the 1D-CNN model based on the progressive stacking architecture of macro-to-micro multi-scale convolutional kernels proposed above was evaluated using different input datasets. Under the rigorous specimen-level 10-fold cross-validation framework, the classification accuracy and convergence epochs of the model under different preprocessing conditions are summarized in Table 5.

As shown by the experimental results in Table 5, the established 1D-CNN model exhibited excellent classification performance under all four input conditions. The RAW-CNN model achieved a highly robust average test-set accuracy of 99.58% ± 0.56%, indicating that the hierarchical structure of the deep convolutional network was able to autonomously learn filtering weights functionally analogous to baseline correction and feature sharpening in conventional preprocessing methods. As a result, it could directly extract effective material fingerprint features from raw spectra affected by carbon black shielding, oil contamination, and other interferences. Further analysis of the convergence behavior, as reflected by the number of training epochs—with the early stopping criterion triggered when the validation loss failed to decrease for 15 consecutive epochs—revealed that these preprocessing methods in fact imposed certain negative effects on the deep learning model. The RAW-CNN model reached a stable convergent state after an average of only 93.0 epochs, demonstrating that the network possessed strong adaptive normalization capability with respect to spectral baseline drift and reflectance-intensity variation. In contrast, the FD-CNN model required an average of 162.8 epochs to converge, representing an increase of approximately 75% in training cost relative to the raw-data input. This observation is consistent with the pattern previously found in the SVM models: while FD sharpens spectral features, it inevitably amplifies high-frequency random noise, forcing the convolutional kernels to expend more iterations in filtering irrelevant fluctuations and thereby slowing the optimization process.

To visually validate the model’s feature-enhancement effect on the samples, the t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm, which is well suited for nonlinear data, was further introduced for dimensionality reduction analysis. To represent the 10-fold cross-validation appropriately, the deep features from a representative validation fold were extracted for this visualization. The raw spectral data, as well as the deep features extracted from the Flatten layer (i.e., the abstract output immediately following the third max-pooling layer, prior to the final fully connected layer) of the RAW-CNN from the training set and the test set, were visualized, and the results are shown in Figure 6.

As shown in Figure 6a, in the raw spectral space, the data points from the 12 classes were highly intermixed under the nonlinear superposition effects introduced by stains, and clear inter-class boundaries were almost absent. By contrast, Figure 6b shows that the training samples exhibited well-defined cluster separation in the deep feature space. This transition from overlap in the raw spectral space to clear separation in the learned feature space arose from the synergistic dimensionality reduction achieved by the progressive stacking of multi-scale convolutional kernels together with the max-pooling layers, which enabled the network to effectively decouple nonlinear physical shielding noise and extract the intrinsic chemical fingerprints of both surface stains and fiber materials. Notably, the poly-cotton blend classes that were highly prone to confusion in the SVM models were separated into clusters with clear boundaries by this mechanism. As shown in Figure 6c, when the model was applied to the unseen test set, it still maintained highly consistent cluster separation, suggesting that the embedded Batch Normalization (BN) and Dropout mechanisms played an important regularization role. This transformation from severe overlap to clear separation demonstrates that, under stain interference, the deep features extracted by RAW-CNN without relying on manual mathematical feature engineering possess strong generalization capability and robustness.

3.5. Comprehensive Evaluation of Support Vector Machine and Convolutional Neural Network Models

To evaluate the application potential of conventional machine learning and deep learning in industrial waste textile detection and identification scenarios, and considering that both SVM and 1D-CNN possess strong nonlinear mapping capability, the present study did not limit the comparison to classification accuracy alone. Instead, the best-performing model from each classifier family was comprehensively assessed from multiple dimensions, as summarized in Table 6.

As can be directly seen from Table 6, although both FD-SVM and RAW-CNN ultimately achieved excellent recognition accuracy, the logic by which they reached this performance was fundamentally different. FD-SVM represents a strongly preprocessing-dependent strategy, whose high performance is built upon manual trial-and-error selection of the optimal preprocessing algorithm, namely FD. Such fixed preprocessing parameters may lose effectiveness when stain distribution becomes more heterogeneous or when the noise pattern changes. By contrast, RAW-CNN demonstrates strong adaptive capability. Without relying on manual mathematical feature engineering, it directly extracts discriminative features from radiometrically calibrated spectra through the convolutional layers embedded within the network, thereby exhibiting greater robustness against stain interference and stronger generalization potential for unseen samples.

In terms of computational cost and industrial applicability, upfront investment and real-time output must be considered separately. FD-SVM required less training time (24.68 s), whereas RAW-CNN required approximately 63 s for training under the same hardware conditions, reflecting the computational cost associated with the autonomous construction of a deep feature space. However, in industrial automated detection and identification, model training is a one-time preparatory task, whereas system throughput is ultimately determined by millisecond-level real-time prediction speed. FD-SVM follows a conventional multistep pipeline of “mathematical preprocessing first, classification prediction later”, which increases the number of operations and the time cost of each spectral measurement. In contrast, RAW-CNN directly classifies radiometrically calibrated spectra and thereby simplifies the real-time prediction workflow. This strategy, which trades a higher one-time training cost for a simpler online prediction workflow, closely matches the practical requirements of high-throughput waste textile recycling. Overall, RAW-CNN minimizes intermediate processing-related interference and time loss while maintaining excellent classification accuracy, and was therefore identified as the optimal model in this study. It can provide more stable and efficient algorithmic support for the development of automated waste textile detection and identification equipment.

From an industrial deployment perspective, several practical constraints must be considered beyond model accuracy. First, in terms of real-time implementation, the inference stage of the trained 1D-CNN model can be executed at millisecond-level latency on standard CPUs, and can be further accelerated using GPUs or embedded AI hardware (e.g., edge computing modules), making it compatible with high-speed conveyor-based sorting systems. The simplified end-to-end workflow of RAW-CNN eliminates intermediate preprocessing steps, thereby reducing pipeline complexity and improving system stability under continuous operation. Second, regarding system integration, hyperspectral imaging units can be incorporated into existing automated sorting lines, where spectral acquisition, model inference, and mechanical actuation (e.g., air-jet or robotic separation) are performed in a synchronized manner. Such detection frameworks may also provide practical value for intelligent inspection of coated textiles and functional finishing processes, particularly in applications requiring rapid assessment of surface uniformity, contamination distribution, or coating-related spectral variability. In real-world automated detection scenarios, the robustness of the model to stain interference is critical, as real-world waste textiles often exhibit highly variable contamination patterns, illumination fluctuations, and surface heterogeneity. Third, in terms of cost and scalability, although hyperspectral systems generally involve higher initial investment compared to conventional RGB imaging, their ability to provide rich spectral information enables more accurate material identification and reduces downstream sorting errors. This can potentially offset the initial cost through improved recycling efficiency and material recovery rates. 
Finally, it should be noted that large-scale deployment may still face challenges such as sensor noise at longer wavelengths, environmental variability, and the need for periodic model updates when new textile or contamination types are introduced. Therefore, future work will focus on enhancing model robustness under dynamic industrial conditions and optimizing hardware–software co-design for real-time applications.

4. Conclusions

In this study, HSI was used to construct a spectral dataset covering 12 substrate–stain combinations. The interference effects of stains on the spectral characteristics of waste textiles were analyzed, and the practical performance of different preprocessing algorithms combined with SVM and 1D-CNN classifiers was comprehensively evaluated. The main conclusions are as follows:

(1): Different types of stains introduce complex interference effects on the near-infrared spectra of textiles, significantly altering their intrinsic spectral characteristics. Carbon black causes strong broadband absorption and physical shielding, while oil and protein stains introduce characteristic absorption features that overlap with substrate signals, leading to spectral distortion and reduced separability. These results demonstrate that stain contamination can substantially degrade the reliability of conventional spectral discrimination methods, highlighting the necessity of robust feature extraction strategies under interference conditions.
(2): Preprocessing methods exhibit distinct roles in balancing noise suppression and feature enhancement in mixed spectral systems. Among the evaluated methods, FD preprocessing showed strong capability in enhancing subtle spectral features and improving class separability, particularly under severe interference conditions. However, its effectiveness is closely dependent on the classifier architecture, and the resulting FD-SVM model achieved near-perfect classification performance under the specific experimental conditions of this study. This finding indicates that preprocessing strategies should be carefully matched with downstream models rather than universally applied.
(3): The 1D-CNN model demonstrated strong capability for handling stain-contaminated textile spectra through automatic extraction of discriminative features from raw data without manual preprocessing. Compared with conventional SVM-based approaches, it provides a simplified processing pipeline and improved robustness to spectral variability. This end-to-end learning framework highlights the potential of deep learning methods to enable efficient and stable identification of contaminated textiles in automated detection scenarios.
(4): It should be noted that this study is based on a controlled laboratory-prepared dataset, which introduces certain limitations in representing real-world conditions. Specifically, the carbon black particle size was fixed at 5 μm, the stain volume was standardized at 2 mL, and each stain type was applied at a single concentration. In addition, the stain regions were uniformly distributed in localized circular patterns. While such controlled conditions ensure reproducibility and facilitate systematic analysis, they do not fully capture the complexity of real waste textiles, which may involve unknown stain types, heterogeneous spatial distribution, variable contamination levels, and mixed stains (e.g., coexistence of oil and protein).

Therefore, although the proposed model demonstrates strong performance under controlled conditions, its behavior in real industrial environments is expected to be influenced by these additional sources of variability. Future work will focus on expanding the dataset to include more diverse and realistic contamination scenarios, incorporating multiple stain concentrations and mixed interference conditions, and conducting rigorous external validation. These efforts are essential for improving model robustness and facilitating reliable deployment in large-scale waste textile recycling systems. Beyond the current waste textile recycling context, future studies may further investigate the applicability of the proposed framework to coated and functionally finished textiles, particularly regarding the influence of surface treatments and coating layers on spectral recognition behavior under contamination conditions.

Author Contributions

Conceptualization, X.J. and J.Z.; methodology, J.Z., H.H. and W.T.; software, J.Z. and H.H.; validation, J.Z., H.H., W.T. and F.Y.; formal analysis, X.J., J.Z., H.H. and W.T.; investigation, J.Z., H.H., W.T. and C.Z.; resources, X.J., F.Y. and C.Z.; data curation, J.Z., H.H., W.T. and C.Z.; writing—original draft preparation, J.Z.; writing—review and editing, X.J.; visualization, J.Z. and H.H.; supervision, X.J. and C.Z.; project administration, X.J. and C.Z.; funding acquisition, X.J. and F.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Eyas Program Incubation Project of Zhejiang Provincial Administration for Market Regulation (CY2023324).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Schematic diagram of the hyperspectral imaging system.

Figure 2. Photographs of the stain specimens on cotton, polyester, and poly-cotton blended substrates (from left to right): (a–c) carbon black; (d–f) protein; and (g–i) oil stains.

Figure 3. Average spectral curves (with shaded bands representing ±1 standard deviation to illustrate intra-class data dispersion) of different substrates under various stain conditions: (a) Cotton; (b) Polyester; (c) Poly-cotton blend.

Figure 4. Preprocessed spectra (RAW, FD, SNV, and MSC, from left to right) for different fiber substrates: (a–d) cotton, (e–h) polyester, and (i–l) poly-cotton blend.

Figure 5. Aggregate confusion matrices of SVM models on the test set: (a) RAW-SVM; (b) FD-SVM; (c) SNV-SVM; (d) MSC-SVM.

Figure 6. t-SNE visualizations from a representative fold: (a) raw spectra; (b) RAW-CNN deep features of the training set; (c) RAW-CNN deep features of the test set.

Table 1. Specifications and sources of the experimental materials.

Name	Specification/Purity	Manufacturer/Source
Cotton	Plain woven fabric, 100% cotton; Areal mass: 200 g/m²; Warp density: 110 ends/inch; Weft density: 90 picks/inch; Thickness: 0.32 mm	Hongda Weaving Factory, Shijiazhuang, China
Polyester	Plain woven fabric, 100% polyester; Areal mass: 150 g/m²; Warp density: 170 ends/inch; Weft density: 170 picks/inch; Thickness: 0.18 mm	Hongda Weaving Factory, Shijiazhuang, China
Poly-cotton blend	Plain woven fabric, 65/35 polyester/cotton blend; Areal mass: 180 g/m²; Warp density: 150 ends/inch; Weft density: 120 picks/inch; Thickness: 0.27 mm	Hongda Weaving Factory, Shijiazhuang, China
Carbon black	Particle size: 5 μm	Shanghai Aladdin Biochemical Technology Co., Ltd., Shanghai, China
Span-80	Chemically pure (CP)	Shanghai Aladdin Biochemical Technology Co., Ltd., Shanghai, China
Tween-80	Chemically pure (CP)	Shanghai Aladdin Biochemical Technology Co., Ltd., Shanghai, China
Peregal O	Industrial grade	Nantong Hantai Chemical Co., Ltd., Nantong, China
Anhydrous ethanol	Analytical reagent (AR)	Zhejiang Tengyu New Materials Technology Co., Ltd., Huzhou, China
Protein powder	Food grade	Xindongkang Nutrition Technology Co., Ltd., Changsha, China
Soybean oil	Food grade	Yihai Kerry Arawana Holdings Co., Ltd., Shanghai, China

Table 2. Detailed architecture and training configurations of the proposed 1D-CNN model.

Layer/Operation	Filters/Units	Kernel Size	Stride	Padding	Activation	Output Shape
Input	-	-	-	-	-	N, 1
Conv1D 1 + BN	16	21	1	Valid (None)	ReLU	N-20, 16
Max Pooling 1	-	2	2	-	-	(N-20)/2, 16
Dropout 1	Rate = 0.5	-	-	-	-	(N-20)/2, 16
Conv1D 2 + BN	32	5	1	Valid (None)	ReLU	(N-24)/2, 32
Max Pooling 2	-	2	2	-	-	(N-24)/4, 32
Dropout 2	Rate = 0.5	-	-	-	-	(N-24)/4, 32
Conv1D 3 + BN	64	3	1	Valid (None)	ReLU	(N-32)/4, 64
Max Pooling 3	-	2	2	-	-	(N-32)/8, 64
Dropout 3	Rate = 0.5	-	-	-	-	(N-32)/8, 64
Flatten	-	-	-	-	-	1D Vector
Dense (Output)	12	-	-	-	Softmax	12

Note: Optimizer: Adam; Learning Rate: 0.001; Batch Size: 32; Max Epochs: 300; Loss Function: Categorical Cross-entropy. BN = Batch Normalization.
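
The output shapes in Table 2 follow from standard length arithmetic for unpadded ("valid") convolutions and stride-2 pooling. The sketch below traces that arithmetic layer by layer for a concrete input length. The band count of 224 is an illustrative assumption, not a value taken from the table, and the helper names are hypothetical. Because each layer operates on the already-pooled length and pooling uses floor division, the step-by-step values may differ by a few samples from the closed-form expressions in the table.

```python
def conv1d_valid_len(length: int, kernel: int, stride: int = 1) -> int:
    """Output length of a 1D convolution with 'valid' (no) padding."""
    return (length - kernel) // stride + 1


def max_pool_len(length: int, pool: int = 2, stride: int = 2) -> int:
    """Output length of unpadded 1D max pooling (floor division)."""
    return (length - pool) // stride + 1


# (kernel size, number of filters) for the three Conv1D blocks in Table 2.
BLOCKS = [(21, 16), (5, 32), (3, 64)]


def trace_shapes(n_bands: int):
    """Return (layer name, length, channels) for each shape-changing layer."""
    shapes = [("Input", n_bands, 1)]
    length, filters = n_bands, 1
    for i, (kernel, filters) in enumerate(BLOCKS, start=1):
        length = conv1d_valid_len(length, kernel)
        shapes.append((f"Conv1D {i} + BN", length, filters))
        length = max_pool_len(length)
        shapes.append((f"Max Pooling {i}", length, filters))
        # Dropout and batch normalization do not change the shape.
    shapes.append(("Flatten", length * filters, 1))
    return shapes


N_BANDS = 224  # assumed number of spectral bands, for illustration only
for name, length, channels in trace_shapes(N_BANDS):
    print(f"{name:<16} -> ({length}, {channels})")
```

The flattened vector length is what determines the number of weights in the final 12-unit softmax layer, so this trace is also a quick way to estimate the model size for a given sensor.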

Table 3. Quantitative evaluation of class separability under different preprocessing methods.

Preprocessing Method	Fisher Discriminant Ratio	Silhouette Coefficient
RAW	395.05	0.5164
FD	1031.36	0.8788
SNV	536.21	0.5561
MSC	557.06	0.6568
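
To make the first separability metric in Table 3 concrete, the following minimal sketch computes a trace-based Fisher ratio: the between-class scatter divided by the within-class scatter. This particular formulation is an assumption made for illustration, since the exact definition behind the table's values is not restated here, and the toy data and function name are hypothetical. The example only shows the qualitative behavior: tighter clusters around the same class centers give a larger ratio.

```python
from collections import defaultdict


def fisher_ratio(samples, labels):
    """Trace-based Fisher ratio (assumed form): between / within scatter."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    def mean(rows):
        # Column-wise mean of a list of equal-length feature vectors.
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    global_mean = mean(samples)
    between = 0.0
    within = 0.0
    for rows in by_class.values():
        class_mean = mean(rows)
        # Class-size-weighted distance of the class mean from the global mean.
        between += len(rows) * sq_dist(class_mean, global_mean)
        # Spread of the class members around their own mean.
        within += sum(sq_dist(x, class_mean) for x in rows)
    return between / within


labels = ["a", "a", "b", "b"]
loose = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
tight = [[0.0, 0.5], [0.0, 1.5], [4.0, 0.5], [4.0, 1.5]]

print(fisher_ratio(loose, labels))  # same class centers, wider spread
print(fisher_ratio(tight, labels))  # same class centers, narrower spread
```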

Table 4. Classification results of the SVM models.

Model	Test Set Accuracy (%)	Standard Deviation (%)	Average Number of Support Vectors (SVs)	Average Training Time (s)
RAW-SVM	85.17	3.14	605.4	16.02
FD-SVM	98.17	1.33	125.7	24.68
SNV-SVM	93.67	1.07	461.4	13.72
MSC-SVM	94.83	2.52	367.0	11.25

Table 5. Classification results of the 1D-CNN models.

Model	Test Set Accuracy (%)	Standard Deviation (%)	Convergence Epochs
RAW-CNN	99.58	0.56	93.0
FD-CNN	98.75	1.41	162.8
SNV-CNN	98.08	2.61	116.7
MSC-CNN	98.33	1.79	108.2
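
The accuracies and standard deviations in Tables 4 and 5 are reported over a specimen-level 10-fold cross-validation, meaning that all ROIs cut from one physical specimen stay on the same side of each train/test split. The sketch below illustrates that idea under stated assumptions: 120 specimens with exactly 10 ROIs each, a simple shuffled round-robin assignment without class stratification, and hypothetical function and variable names. It is not the authors' splitting code.

```python
import random


def specimen_level_folds(specimen_ids, n_folds=10, seed=0):
    """Yield (train_idx, test_idx); each specimen appears in one test fold only."""
    unique = sorted(set(specimen_ids))
    random.Random(seed).shuffle(unique)
    # Round-robin assignment of shuffled specimens to folds.
    fold_of = {spec: i % n_folds for i, spec in enumerate(unique)}
    for k in range(n_folds):
        test = [i for i, s in enumerate(specimen_ids) if fold_of[s] == k]
        train = [i for i, s in enumerate(specimen_ids) if fold_of[s] != k]
        yield train, test


# Assumed layout: 120 specimens x 10 ROIs = 1200 ROI spectra.
roi_specimen_ids = [spec for spec in range(120) for _ in range(10)]

folds = list(specimen_level_folds(roi_specimen_ids))
for k, (train, test) in enumerate(folds):
    train_specs = {roi_specimen_ids[i] for i in train}
    test_specs = {roi_specimen_ids[i] for i in test}
    # No specimen contributes ROIs to both sides of the split.
    assert train_specs.isdisjoint(test_specs)
    print(f"fold {k}: {len(train)} train ROIs, {len(test)} test ROIs")
```

Splitting by specimen rather than by ROI avoids the optimistic bias that arises when near-identical spectra from the same piece of fabric land in both the training and test sets.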

Table 6. Comprehensive assessment between SVM and CNN.

Evaluation Dimension	FD-SVM	RAW-CNN
Test set accuracy	98.17% ± 1.33%	99.58% ± 0.56%
Model training cost	24.68 s (CPU i7-13620H only)	93.0 Epochs (≈63 s, CPU i7-13620H only)
Preprocessing dependency	Strong (FD preprocessing required; raw data accuracy only 85.17%)	None (Bypasses manual mathematical feature engineering)
Workflow per single detection	Mathematical preprocessing first, classification prediction later	Direct classification from radiometrically calibrated spectra

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

