Polymers are encountered in everyday life and are of interest for many applications. In this study, polymer classification was used as an example to investigate direct model transferability. Resin kits containing 46 materials representing the most important plastic resins used in industry today were used, and each material was treated as one class. Three resin kits were used to show prediction performance on different physical samples of the same material. The samples were measured by three randomly chosen MicroNIR™ OnSite spectrometers (labeled as Unit 1, Unit 2 and Unit 3).
2.1.2. Direct Model Transferability of the Classification Models
The performance of the polymer classification models was evaluated at four levels: same-unit-same-kit, same-unit-cross-kit, cross-unit-same-kit, and cross-unit-cross-kit. To account for variation in sample shape and thickness, each resin sample was scanned at five specified locations. In addition, at each position the sample was scanned in two orientations with respect to the MicroNIR™ lamps to account for any directionality in the structure of the molding. For each position and orientation, three replicate scans were acquired, totaling thirty scans per sample per spectrometer. Prediction was performed for every spectrum in the validation set. For the same-unit-same-kit performance, models built with data collected from four locations on each sample in one resin kit by one spectrometer were used to predict data collected from the remaining location on each sample in the same resin kit by the same spectrometer; the total number of predictions was 276 across all 46 materials for each case. For the same-unit-cross-kit performance, models built with all the data collected from one resin kit by one spectrometer were used to predict all the data collected from a different resin kit by the same spectrometer. For the cross-unit-same-kit performance, models built with all the data collected from one resin kit by one spectrometer were used to predict all the data collected from the same resin kit by a different spectrometer. For the cross-unit-cross-kit performance, models built with all the data collected from one resin kit by one spectrometer were used to predict all the data collected from a different resin kit by a different spectrometer. For each of these three cross cases, the total number of predictions was 1380 across all 46 materials.
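The prediction counts above follow directly from the scanning design. As a quick check (my arithmetic, not code from the study), the per-case totals can be computed as:

```python
# Scanning design described in the text.
N_MATERIALS = 46
N_LOCATIONS = 5
N_ORIENTATIONS = 2
N_REPLICATES = 3

# Thirty scans per sample, per spectrometer.
scans_per_sample = N_LOCATIONS * N_ORIENTATIONS * N_REPLICATES  # 30

# Same-unit-same-kit: models trained on four locations predict the held-out
# fifth location -> 1 location x 2 orientations x 3 replicates per sample.
same_unit_same_kit = N_MATERIALS * 1 * N_ORIENTATIONS * N_REPLICATES  # 276

# Cross-kit and/or cross-unit cases: all thirty scans per sample are predicted.
cross_case = N_MATERIALS * scans_per_sample  # 1380
```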
Five different classification algorithms were used to build the models: PLS-DA, SIMCA, TreeBagger, SVM and hier-SVM.
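To illustrate one of these approaches, the sketch below trains a linear SVM classifier on synthetic "spectra" using scikit-learn. The data, channel count, preprocessing, and kernel choice are all illustrative assumptions, not the study's actual pipeline or settings:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic example: 4 classes, 30 spectra each, 125 channels (assumed values).
n_classes, n_per_class, n_channels = 4, 30, 125

# Each "material" gets a distinct baseline level plus random noise.
X = np.vstack([rng.normal(c, 0.1, size=(n_per_class, n_channels))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

# Standardize each channel, then fit a linear SVM.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X, y)
training_accuracy = model.score(X, y)
```

In practice the model would be evaluated on an independent validation set (e.g., spectra from a different kit or unit), exactly as the four performance levels above describe.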
The prediction performance was evaluated in terms of prediction success rates and the number of missed predictions; the representative results are summarized in Table 1 and Table 2, respectively. The prediction success rates were calculated by dividing the number of correct predictions by the total number of predictions. The number of missed predictions is presented to make the differences clearer, since with a large number of total predictions a small difference in prediction success rate can correspond to a considerable difference in the number of missed predictions. It should be noted that in a few cases the total number of predictions was not exactly 276 or 1380, because extra spectra were collected unintentionally during the experiments and no spectra were excluded from analysis. To keep the comparison consistent, in these tables all the models were developed using data from Kit 1 for the different spectrometers; the prediction data were collected using different resin kits and different spectrometers for the four levels of performance.
The same-unit-same-kit cases were control cases, presented as the diagonal elements for each algorithm in the left three columns of the tables. As expected, 100% prediction success rates and 0 missed predictions were obtained for all algorithms, except for one PLS-DA case (Unit 1, Kit 1 for both modeling and testing) with only 1 missed prediction. The same-unit-cross-kit cases showed the true prediction performance of the models for each spectrometer, since independent test samples were used. The results are presented as the diagonal elements for each algorithm in the right three columns of the tables. All the models showed very good same-unit-cross-kit predictions. Although SIMCA showed the best performance, the differences in performance between algorithms were very small. It should be noted that the resin kits include samples made of the same type of material but with different properties, indicating that the MicroNIR™ spectrometers have the resolution to resolve minor differences between these polymer materials. For the cross-kit cases, Kit 2 was used for Unit 1 and Unit 2, while Kit 3 was used for Unit 3, because Kit 2 was no longer available at the time of data collection with Unit 3. Nonetheless, the conclusions about the cross-kit performance were not impacted by this.
The direct model transferability was first demonstrated by the cross-unit-same-kit results, which are presented as the non-diagonal elements for each algorithm in the left three columns of the tables. Except for the PLS-DA algorithm, all the algorithms showed good performance. In general, the order of performance was hier-SVM > SVM > SIMCA > TreeBagger >> PLS-DA. With the hier-SVM algorithm, the worst case had only 28 missed predictions out of 1380, and one third of the cases showed perfect predictions.
The direct model transferability was further demonstrated by the most stringent cross-unit-cross-kit cases, which best reflect real-world use. The results are presented as the non-diagonal elements for each algorithm in the right three columns of the tables. Other than the PLS-DA algorithm, all the algorithms showed good performance, though slightly worse than the cross-unit-same-kit results, with some exceptions. In general, the order of performance was hier-SVM > SVM > TreeBagger ≈ SIMCA >> PLS-DA.
Besides the representative results shown in these tables, all possible combinations of datasets were analyzed, comprising 6 same-unit-same-kit cases, 6 same-unit-cross-kit cases, 8 cross-unit-same-kit cases, and 16 cross-unit-cross-kit cases for each algorithm. The conclusions were similar to those presented above. For the most stringent cross-unit-cross-kit cases, the mean prediction success rates across all cases were 98.15%, 97.00%, 96.74%, 95.83%, and 80.19% for hier-SVM, SVM, TreeBagger, SIMCA, and PLS-DA, respectively. The high prediction success rates for hier-SVM, SVM, TreeBagger and SIMCA indicate good direct model transferability for polymer classification with MicroNIR™ spectrometers. To achieve the best results, hier-SVM should be used, but the conventional SIMCA algorithm, which is available to most NIR users, is also sufficient.