Article

Development of a Machine Learning Model to Predict the Color of Extruded Thermoplastic Resins

1 Graduate School of Organic Materials Science, Yamagata University, 4-3-16 Jonan, Yonezawa 992-8510, Yamagata, Japan
2 Omni-Plus System Limited, 994 Bendemeer Road, 01-03 B-Central, Singapore 339943, Singapore
3 Matwerkz Technologies Pte Ltd., 994 Bendemeer Road, 01-03 B-Central, Singapore 339943, Singapore
4 Research Center for GREEN Materials and Advanced Processing, Yamagata University, 4-3-16 Jonan, Yonezawa 992-8510, Yamagata, Japan
* Authors to whom correspondence should be addressed.
Polymers 2024, 16(4), 481; https://doi.org/10.3390/polym16040481
Submission received: 17 January 2024 / Revised: 2 February 2024 / Accepted: 6 February 2024 / Published: 8 February 2024

Abstract

The conventional color-matching process involves compounding polymers with pigments and then preparing plaques via injection molding before measuring the color with an offline spectrophotometer. If the color fails to meet the L*, a*, and b* standards, the color-matching process must be repeated. This study aims to develop a machine learning model capable of predicting offline color from inline color measurements, thereby significantly reducing the time required for the color-matching process. The inline color data were measured using an inline process spectrophotometer, while the offline color data were measured using a bench-top spectrophotometer. The results showed that Bagging with Decision Tree Regression and Random Forest Regression can predict the offline color data with aggregated color differences (dE) of 10.84 and 10.75, respectively. Compared to other machine learning methods, Bagging with Decision Tree Regression and Random Forest Regression excel due to their robustness, their ability to handle nonlinear relationships, and the insight they provide into feature importance. This study offers valuable guidance for applying Bagging with Decision Tree Regression and Random Forest Regression to correlate inline and offline color data, potentially reducing time and material waste in color matching. Furthermore, it facilitates timely corrections when color discrepancies are observed via inline measurements.

1. Introduction

Color analysis is a crucial tool with a myriad of applications. It plays a pivotal role in determining tolerances for automotive coatings, ensuring satisfaction with the end products. This is particularly important because automobiles are composed of a diverse range of materials, and it is essential to verify that a color coating maintains a consistent, uniform appearance when applied to different materials with varying textures [1]. Additionally, Ariño et al. explored the impact of plastic texture on color perception and concluded that the texture of a plastic exerts a noteworthy influence on how its color is perceived [2].
To create a standard for color communication during color analysis, the International Commission on Illumination (CIE) developed the CIE L* a* b* color space in 1976. The CIE 1976 L* a* b* color space is a three-dimensional, approximately uniform color space produced by plotting the rectangular coordinates L*, a*, and b* [3]. L* indicates lightness, a* is the red/green coordinate, and b* is the yellow/blue coordinate. The positive a* axis points approximately towards red stimuli, the negative a* axis towards green stimuli, the positive b* axis towards yellow stimuli, and the negative b* axis towards blue stimuli. L* is associated with the luminance of the stimulus, making it a basic indicator of lightness [4]. The differences in L*, a*, and b* between two specimens, also referred to as Delta Values, are calculated using Equations (1)–(3).
$$\Delta L^* = L^*_{\mathrm{Sample}} - L^*_{\mathrm{Standard}} \quad (1)$$
$$\Delta a^* = a^*_{\mathrm{Sample}} - a^*_{\mathrm{Standard}} \quad (2)$$
$$\Delta b^* = b^*_{\mathrm{Sample}} - b^*_{\mathrm{Standard}} \quad (3)$$
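For concreteness, the Delta Values and the total color difference dE* (the Euclidean distance in L* a* b* space) can be computed in a few lines of NumPy. The sketch below is illustrative only; the numeric triplets are example readings, not tied to any particular formulation.

```python
import numpy as np

def delta_lab(sample, standard):
    """Delta Values per Equations (1)-(3): component-wise sample minus standard."""
    return np.asarray(sample, dtype=float) - np.asarray(standard, dtype=float)

def delta_e(sample, standard):
    """Total color difference dE* (Euclidean distance in L*a*b* space)."""
    return float(np.linalg.norm(delta_lab(sample, standard)))

# Illustrative triplets only: a molded chip measured against its color standard.
chip, standard = (86.68, -0.49, 8.87), (86.36, 0.32, 8.24)
print(delta_lab(chip, standard))          # [ 0.32 -0.81  0.63]
print(round(delta_e(chip, standard), 2))  # 1.07
```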
Historically, the measurement of color has typically been carried out via offline color measurements using offline bench-top spectrophotometers [5]. To achieve this, the materials must undergo molding into plates after the extrusion process. However, the preparation of samples for offline color measurement is a labor-intensive and time-consuming task, resulting in delayed measurement reports. This delay carries a significant risk of producing products that may not meet specifications during the waiting period [6].
An offline bench-top spectrophotometer serves as a specialized instrument, tailored for conducting color measurements and analyses in a laboratory or controlled environment. In contrast to inline spectrophotometers that are seamlessly integrated into production lines for real-time polymer melt flow measurements, the offline bench-top variant excels in delivering precise and accurate color measurements within a stationary setting.
The offline bench-top spectrophotometer incorporates a spherical interior. In this design, light from the source is diffused by the sphere before illuminating the color chip, and the light reflected at an 8-degree angle from the specimen is captured by the detector, enabling precise and accurate color measurements. Two commonly employed measurement geometries in offline bench-top spectrophotometers are SCI and SCE [7].
Specular Component Included (SCI): In SCI measurements, the spectrophotometer captures all reflected light, including both specular and diffuse components. This effectively eliminates the impact of specular reflection from the surface, allowing the measurement to focus solely on color rather than appearance. As a result, SCI is universally adopted by companies for formulating color recipes.
Specular Component Excluded (SCE): In SCE measurements, the spectrophotometer selectively records only the diffuse reflection of light from the material’s surface, excluding specular reflection. This approach incorporates the surface appearance into the measurement. Consequently, SCE proves more valuable for quality control in the production process, especially when a balance between color and appearance is crucial.
Figure 1 shows the working principle of an offline bench-top spectrophotometer with (a) Specular Component Excluded geometry and (b) Specular Component Included geometry.
Conversely, inline color measurement involves conducting direct color measurements on the already-pigmented polymer melt, preferably within the compounding extruder itself [5], by using an inline process spectrophotometer (IPS). This allows operators to examine the polymer during production [8]; they are alerted as soon as the color begins to drift out of specification, so that corrections can be made immediately to minimize product rejects and wastage.
The IPS works by illuminating the molten polymer within the die with light from the source at Angle 2, which travels through fiber optics and a Reflection Polymer Melt Probe (RPMP). The signal reflected from the polymer melt is then captured at Angle 1 and transported back to the IPS [9]. Angle 1 aims to closely approximate the sphere geometry (commonly referred to as diffuse/8°) of a bench-top spectrophotometer, which views the specimen at 8 degrees; owing to equipment constraints, however, Angle 1 is the closest angle that can be attained. Figure 2 shows the working principle of the inline process spectrophotometer.
However, previous research observed that the scales of inline and offline color measurements are distinctly different [10]. Specifically, a given color change registers as minimal in inline measurements but as significantly more pronounced in offline measurements. This discrepancy emphasizes the need for refined approaches to handling data from these two distinct measurement methods. Addressing this incongruity will enable the prediction of the CIE L*, a*, and b* values of the output solid polymer from the inline color measurement, allowing corrections in case of any detected deviations and averting the rejection of the entire production batch.
In recent times, there has been a discernible shift towards the application of machine learning algorithms and artificial intelligence to model and optimize the relationship between input and output variables. Illustrating this trend is Lee’s study, in which an Artificial Neural Network (ANN) was implemented to predict product properties such as mass, diameter, and height [11]. Shams-Nateri’s study demonstrated an application of Neural Networks to predict the cross-sectional color of yarns from their longitudinal color [12]. Jeon constructed machine learning models to predict the melting temperature after plasticization [13]. Joo devised three models to predict the physical properties of PP composites, employing three distinct machine learning (ML) methods: Multiple Linear Regression (MLR), Deep Neural Network (DNN), and Random Forest (RF) [14].
The utilization of machine learning algorithms to develop predictive models from training data demonstrates significant potential for enhancing product quality and minimizing waste and downtime in the polymer processing industries [15]. However, a common drawback that has been observed in many of the machine learning approaches and highlighted in the literature is the opaque nature of these algorithms. Often, it becomes challenging to discern the reasons behind the model’s accurate predictions, as they provide little insight into the underlying process factors and relationships influencing the output [16].
The objective of this study is to design a machine learning model to predict offline color measurement data using inline color measurements and material dosages as input parameters. To achieve this, Bagging with Decision Tree Regression, Deep Neural Network, Multiple Linear Regression, and Random Forest Regression are used as the machine learning models. The performance of each model is evaluated using the aggregated dE, a metric similar to the root mean square error (RMSE). The insights gained from this study will facilitate the real-time monitoring and prediction of offline color data during compounding through the utilization of inline color data. This approach enables timely corrections to be implemented in the event of any detected deviations.

2. Materials and Methods

2.1. Materials

In this study, a compounding process, followed by an injection molding process, was conducted to gather a well-diversified set of data for training the machine learning models. The materials employed in this study include polycarbonate resin, dispersing agent, and pigments. Polycarbonate (PC) (Makrolon® 2807) was supplied by Covestro, Singapore. PC Makrolon 2807 has a density of 1.20 g/cm3 and a melt flow rate (MFR) of 10 g/10 min (measured at 300 °C/1.2 kg). Polycarbonate was chosen for its high usage in engineering plastic manufacturing. Ethylene Bis Stearamide (EBS) L-205F dispersing agent was supplied by DP Chemicals Pte Ltd., Singapore. Pigments, which included Tiona 288, Raven 1010, Heliogen Green K8730, Ultramarine Blue 05, Solvent Yellow 114, and Plast Red 8355, were supplied by Hexachem (M) Sdn. Bhd, Selangor, Malaysia, and DP Chemicals Pte Ltd., Singapore.
Formulations crafted for PC experiments are tabulated in Table 1. The components in each formulation were first manually hand-tumbled to ensure uniformity before feeding them into the extruder.

2.2. Compounding Equipment

Compounding was performed by using an intermeshing co-rotating twin screw extruder (Coperion GmbH, Stuttgart, Germany). It has a 26 mm screw diameter and an L-to-D ratio of 44, is powered by a 27 kW motor, and features 11 heating zones for the barrel along with one for the die. The barrel temperatures were set at 260–280 °C for PC, with a screw speed of 230 rpm. Upon exiting the die, the extrudate was quenched in cold water, dried using air, and then converted into pellets via a pelletizer. The pellets were then molded via injection molding (Sumitomo C250, Singapore) with a clamp tonnage of 100 tons into a cuboid color chip (95 mm by 55 mm by 2 mm), as shown in Figure 3. The dimensions of the color chip were selected based on the industrial standard in the polymer compounding industry. The injection barrel temperature was set at 260–280 °C at an injection speed of 120 mm/s, with a mold temperature of 100 °C. The specimens were conditioned at 23 ± 2 °C for 24 h before color measurements.

2.3. Color Measurement

In our experiment, the color measurement of the polymer melt was conducted using Equitech’s EQUISPEC™ Inline Process Spectrophotometer (IPS) (Equitech, Charlotte, NC, USA) with a Reflection Polymer Melt Probe (RPMP). The RPMP was mounted at the die head of the extruder, where ample shear force consistently caused fresh polymer melt to shear across the probe.
The data acquisition rate for the color measurement of the polymer melt was set at one reading every 2 s. The CIE L* a* b* color readings [3] from the spectrophotometer were recorded as the inline measurements, using D65 as the standard illuminant [17] and a standard observer angle of 10 degrees. The IPS has a measurement uncertainty of 0.01 unit for CIE L* a* b* color readings. The mean data were collected only after the readings had stabilized (approximately 5 min after start-up) and are shown in Table A1. The measurement period was 5 min.
For offline color measurement, we used an X-Rite Ci7800 bench-top spectrophotometer [18] (X-Rite—Southeast Asia and Pacific, Singapore) with a 400 nm UV filter, equipped with Color iMatch professional software (Version 10.7.2). A 10° supplementary standard observer and the D65 illuminant [17] were used, coupled with SCI mode. Given that surface texture can induce diffusion and scattering of light, influencing color appearance [19], the SCI mode was exclusively preferred for assessing color rather than appearance. The CIE L* a* b* color readings from the spectrophotometer were documented as the offline measurements. The bench-top spectrophotometer used in this study has a measurement uncertainty of 0.01 unit for CIE L* a* b*. The mean data were calculated as the average reading of 10 color chips for each dosage and are shown in Table A1.

3. Machine Learning Architectures

In this paper, four machine learning models were employed for predictions: Bagging with Decision Tree Regression, Deep Neural Network, Multiple Linear Regression, and Random Forest Regression.

3.1. Bagging with Decision Tree Regression

The Bagging with Decision Tree Regression model is a combination of Bagging Regression and Decision Tree Regression.
A Decision Tree Regression is a predictive model that maps the features of an input to make decisions or predictions [20]. In the context of regression, it is used to predict a continuous outcome based on input features [21]. The tree structure consists of nodes representing decisions based on features and leaves representing the predicted outcomes [22]. Figure 4 shows an example of the Decision Tree Regression that was generated in this study. The decision tree starts with a root condition of Solvent Yellow 114 dosage ≤ 0.005, the split that best minimizes the mean squared error (MSE). It then creates further splitting conditions, in this case inline a* ≤ 3.78 and Raven 1010 ≤ 0.013, aiming to reduce the variance in the predicted values. The recursive splitting process continues, forming a binary tree structure. The goal is to iteratively partition the data into subsets that exhibit lower variance in the target variable. As the tree grows, leaf nodes contain the predicted values for the target variable, which might be the mean or median of the target values in the leaf. During the prediction phase, a new data point traverses the tree, following the path of decisions until it reaches a leaf node. The predicted value is then the value associated with that leaf.
Bagging Regression is an ensemble learning technique that involves training multiple decision trees on randomly drawn subsets of the training data [23]. Figure 5 shows the working principle of Bagging Regression. In this process, random samples of the data are drawn to build each decision tree, and this is repeated many times (1000 times in this paper), resulting in a collection of diverse decision trees. When making predictions for new data, the Bagging Regression aggregates the outputs of these individual trees, typically by averaging, to provide a more robust and generalized prediction. The randomness introduced by resampling helps reduce overfitting, making the model more effective and resilient. The parameters for Bagging with Decision Tree Regression used in this paper are summarized in Table 2.
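As a concrete sketch, the Table 2 configuration maps onto scikit-learn roughly as follows. This is an assumed reconstruction rather than the authors’ published code: the stand-in training arrays, the `estimator=` keyword (scikit-learn ≥ 1.2; older releases call it `base_estimator=`), and the MultiOutputRegressor wrapper for the three color targets are all assumptions.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_train, y_train = rng.random((74, 11)), rng.random((74, 3))  # stand-ins for the real data

# Parameters follow Table 2: 1000 base estimators, random_state 42, all cores.
model = MultiOutputRegressor(
    BaggingRegressor(
        estimator=DecisionTreeRegressor(random_state=42),
        n_estimators=1000,
        random_state=42,
        n_jobs=-1,
    )
)
model.fit(X_train, y_train)                 # y: offline L*, a*, b*
y_pred = model.predict(X_train[:2])         # predictions averaged over the 1000 trees
```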

3.2. Deep Neural Network

A Deep Neural Network makes predictions through a process called forward propagation, which involves passing the input data through the network’s layers of interconnected neurons. Figure 6 shows the working principle of the Deep Neural Network. In this paper, the network is trained for 50 epochs, with each epoch processing batches of 32 samples at a time, considering the small sample size. The Deep Neural Network comprises three layers: an input layer with 128 neurons using Rectified Linear Unit (ReLU) activation, a hidden layer with 64 neurons and ReLU activation, and an output layer with three neurons corresponding to the targets (offline L*, a*, b*). The network is compiled using the Adam optimizer and the mean squared error loss function, commonly chosen for regression problems. Once trained, the Deep Neural Network is utilized to make predictions on new dataset features. The parameters for the Deep Neural Network in this paper are summarized in Table 3.
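A minimal Keras sketch of the architecture described above might look like the following; it is an assumed reconstruction with stand-in data, using the mean squared error loss specified in the text.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X_train, y_train = rng.random((74, 11)), rng.random((74, 3))  # stand-ins for the real data

# 11 inputs (8 material dosages + inline L*, a*, b*) -> 128 -> 64 -> 3 outputs.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(11,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3),  # offline L*, a*, b*
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0)
y_pred = model.predict(X_train[:2])
```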

3.3. Multiple Linear Regression

Multiple Linear Regression makes predictions by combining the weighted sum of multiple input features with a constant term. In our paper, the input features are the material dosage and inline L* a* b*. The model learns these weights during training to minimize the difference between its predictions and the actual target values, allowing it to generalize and make accurate predictions on new data by considering multiple input features simultaneously. The parameters for Multiple Linear Regression in this paper are summarized in Table 4.
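In scikit-learn terms, the Table 4 settings correspond to a plain LinearRegression. The sketch below is an assumed reconstruction with stand-in data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_train, y_train = rng.random((74, 11)), rng.random((74, 3))  # stand-ins for the real data

# fit_intercept=True per Table 4; input normalization is off by default in
# recent scikit-learn releases, matching the Normalize=False entry.
mlr = LinearRegression(fit_intercept=True)
mlr.fit(X_train, y_train)       # the three targets are fitted jointly
print(mlr.coef_.shape)          # (3, 11): one weight per feature for each of L*, a*, b*
y_pred = mlr.predict(X_train[:2])
```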

3.4. Random Forest Regression

Random Forest Regression, an ensemble learning technique, refines Bagging principles by introducing more randomization into the construction of individual decision trees. In contrast to Bagging with a Decision Tree Regression, Random Forest Regression selects only a random subset of features, not all, for splitting a node. This deliberate feature subset randomness aims to decrease correlations between trees, enhancing the overall robustness. However, it may potentially miss crucial features. The working principle of Random Forest Regression is illustrated in Figure 7, where a subset of features is randomly chosen for each tree’s training. This process is iterated 1000 times. When given new data, the Random Forest Regression aggregates outputs from individual trees to provide a more robust prediction. The parameters for Random Forest Regression in this paper are summarized in Table 5.
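A corresponding scikit-learn sketch, again an assumed reconstruction with stand-in data, is shown below; Table 5 does not specify max_features, so the value used here to enforce per-split feature subsampling is an assumption (recent scikit-learn defaults to all features for regression).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_train, y_train = rng.random((74, 11)), rng.random((74, 3))  # stand-ins for the real data

# n_estimators and random_state follow Table 5; max_features="sqrt" is assumed.
rf = RandomForestRegressor(n_estimators=1000, random_state=42, max_features="sqrt")
rf.fit(X_train, y_train)  # multi-output (L*, a*, b*) is supported natively

# Impurity-based feature importances provide the interpretability that
# opaque models lack (cf. Section 1).
for idx, importance in enumerate(rf.feature_importances_):
    print(f"feature {idx}: {importance:.3f}")
```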

4. Machine Learning Methodology

4.1. Data Exploration through the Pearson Correlation Coefficient

To enhance prediction accuracy, it is crucial to understand the linear relationships between the features (material dosages and inline color data) and the target variables (offline color data). The Pearson correlation coefficient (PCC) was employed for this purpose. The PCC measures the strength and direction of the linear relationship between two variables and was computed using Equation (4).
$$r = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}} \quad (4)$$
where $X_i$ and $Y_i$ are the individual data points of the variables X and Y, and $\bar{X}$ and $\bar{Y}$ are the means of X and Y, respectively.
This analysis was primarily undertaken to identify any linear relationships and the necessity of data augmentation for improved model performance. Moreover, the statistical significance of these correlations was assessed to ensure that the observed relationship is not due to random chance but reflects a genuine association in the data.
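Equation (4) is straightforward to compute directly, and scipy.stats.pearsonr additionally returns the p-value used for the significance check described above. The sketch below uses illustrative numbers only.

```python
import numpy as np
from scipy import stats

def pearson_r(x, y):
    """Pearson correlation coefficient, per Equation (4)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# Illustrative numbers only: a pigment dosage series against offline b* readings.
dosage = [0.05, 0.10, 0.25, 0.50, 1.00]
offline_b = [12.15, 10.64, 8.24, 6.13, 5.56]
r, p_value = stats.pearsonr(dosage, offline_b)  # scipy also returns the p-value
print(pearson_r(dosage, offline_b), r, p_value)
```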
Table 6 illustrates the correlations between each material and the offline a* value. It is evident that materials that are strongly correlated with the offline a* value include the red pigment, indicating a positive linear relationship, and the blue pigment, indicating a negative linear relationship. This discovery piques interest, given that the a* value is typically impacted by the dosage of both red and green pigments. Nevertheless, the difference in correlation coefficients between green and blue pigments is not considerable. Therefore, it is reasonable to propose that the blue pigment tends to exhibit a greenish tone.
Table 7 illustrates the correlations between each material and the offline b* value. The yellow pigment (revealing a positive linear relationship) and blue pigment (revealing a negative linear relationship) exhibit strong correlations with the b* value. This aligns with the CIE L* a* b* color space, where the yellow pigment contributes a positive b* value, and the blue pigment contributes a negative b* value.
Table 8 presents the correlations between each material and the offline L* value. The results show that the white pigment (indicating a positive linear relationship), as well as the blue and black pigments (indicating a negative linear relationship), demonstrate strong correlations with the L* value. The positive association with the L* value aligns with the CIE L* a* b* color space, where the white pigment contributes to a positive L* value.
It is noteworthy that, apart from the black pigment, the blue pigment also significantly contributes to the negative L* value. This observation suggests that the blue pigment could serve as an alternative to the black pigment in contributing to a negative L* value.
Table 9 displays the correlations between the inline and offline color data, revealing robust associations between the two. These strong correlations affirm the potential of inline color data to predict offline color characteristics. The correlations between material dosage, inline L* a* b*, and offline L* a* b* values are critical for identifying relevant features for model training, guiding the approach towards more precise and reliable color prediction.
In summary, these findings not only demonstrate significant linear relationships between the chemical components and color data but also adhere to the established principles of the CIE L* a* b* color space. This enhances the understanding of how different materials influence color properties, which is vital for developing more accurate machine learning models for color prediction.

4.2. Dataset Allocation

Of the complete dataset generated from the color measurements, which comprises 83 color formulations as presented in Table 1, 74 samples were designated for training and the remaining 9 samples for testing to assess model performance. The order of the data was randomized before splitting to reduce overfitting and improve generalization.
Each dataset comprises 11 features and 3 target variables, as illustrated in Figure 8. The features are categorized into two groups: the dosage of each material and inline L* a* b*. The target variables include offline L* a* b* values. These datasets will be employed for model training and performance evaluation.
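A minimal sketch of this split with scikit-learn follows; the stand-in arrays and the random seed are assumptions, as the paper does not report a seed.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = rng.random((83, 11)), rng.random((83, 3))  # stand-ins for the 83 formulations

# test_size=9 reproduces the 74/9 split; shuffle=True randomizes the order first.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=9, shuffle=True, random_state=42
)
print(X_train.shape, X_test.shape)  # (74, 11) (9, 11)
```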

4.3. Evaluation Metric

The performance of each model was assessed using the aggregated dE, a domain-specific analogue of the root mean square error (RMSE), the standard regression metric measuring the average magnitude of the errors between predicted and actual values. The aggregated dE gauges the average color difference between the predicted and actual values of the test dataset and is calculated using Equation (5); Equation (6) shows the formula for RMSE. Lower dE values signify greater accuracy in the model prediction.
$$\mathrm{Aggregated}\ dE^* = \sqrt{\frac{\sum_{i=1}^{n}\left[(\hat{L}^*_i - L^*_i)^2 + (\hat{a}^*_i - a^*_i)^2 + (\hat{b}^*_i - b^*_i)^2\right]}{n}} \quad (5)$$
where $\hat{L}^*_1, \hat{L}^*_2, \ldots, \hat{L}^*_n$ are the predicted L* values and $L^*_1, L^*_2, \ldots, L^*_n$ are the actual L* values; $\hat{a}^*_i$ and $a^*_i$, and $\hat{b}^*_i$ and $b^*_i$, are defined analogously for the a* and b* values; and $n$ is the number of samples.
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}(\hat{y}_i - y_i)^2}{n}} \quad (6)$$
where $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n$ are the predicted values, $y_1, y_2, \ldots, y_n$ are the actual values, and $n$ is the number of samples.
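Both metrics are simple to implement. The NumPy sketch below reflects one reading of Equations (5) and (6), with the aggregated dE* computed as an RMSE over the pooled L*, a*, b* errors.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, per Equation (6)."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt((err ** 2).mean()))

def aggregated_de(y_true, y_pred):
    """Aggregated dE*, per Equation (5): an RMSE over the pooled L*, a*, b* errors.

    Both arguments have shape (n_samples, 3) with columns L*, a*, b*.
    """
    sq = (np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)) ** 2
    return float(np.sqrt(sq.sum(axis=1).mean()))
```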

5. Results

5.1. Performance of Machine Learning Model

Table 10 displays the aggregated dE of each model for the full sample size of 83, with 9 test samples. Both Bagging with Decision Tree Regression and Random Forest Regression exhibit the lowest aggregated dE values, of 10.84 and 10.75, respectively. In contrast, the Deep Neural Network yields a higher aggregated dE, indicating overfitting caused by the limited sample size. Multiple Linear Regression also exhibits a high aggregated dE owing to its inability to capture the complex, nonlinear relationships present in the dataset, limiting its predictive accuracy.
In summary, Bagging with Decision Tree Regression and Random Forest Regression exhibit the lowest aggregated dE values. However, the observed color difference remains too high for practical production use. The impact of the sample size on reducing the color difference will be explored in the next section to assess the feasibility of achieving more satisfactory results.

5.2. Impact of Sample Size on Machine Learning Accuracy

To understand the effect of the sample size on model accuracy, a systematic analysis was conducted of how increasing sample sizes influence the aggregated dE for the models referenced in this paper. Each model architecture was trained and evaluated at various sample sizes. These samples were obtained as random subsets of the total training samples, selected without replacement, ensuring the uniqueness and variability of each sample set. Figure 9 illustrates the aggregated dE plotted against the number of samples for each model type.
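This sweep can be expressed compactly, as in the sketch below; the stand-in data, the choice of Random Forest as the model factory, the seed, and the step size are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def aggregated_de(y_true, y_pred):
    """Aggregated dE*, per Equation (5)."""
    sq = (np.asarray(y_pred) - np.asarray(y_true)) ** 2
    return float(np.sqrt(sq.sum(axis=1).mean()))

rng = np.random.default_rng(0)
X_train, y_train = rng.random((74, 11)), rng.random((74, 3))  # stand-ins for the real data
X_test, y_test = rng.random((9, 11)), rng.random((9, 3))

# Hypothetical factory: any of the four models could be returned here.
make_model = lambda: RandomForestRegressor(n_estimators=100, random_state=42)

# Train on growing random subsets (drawn without replacement) and score on the
# fixed test set, mirroring the analysis behind Figure 9.
for n_samples in range(10, len(X_train) + 1, 8):
    subset = rng.choice(len(X_train), size=n_samples, replace=False)
    model = make_model()
    model.fit(X_train[subset], y_train[subset])
    print(n_samples, round(aggregated_de(y_test, model.predict(X_test)), 2))
```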
In Figure 9a, a decline in the aggregated dE is observed for Bagging with Decision Tree Regression as the sample size increases. This trend suggests enhanced predictive accuracy, likely due to the model’s exposure to a broader range of feature variations within the larger datasets.
Conversely, Figure 9b demonstrates a decrease in the aggregated dE for the Deep Neural Network model up to a sample size of 45. Beyond this point, the aggregated dE increases. There might be several explanations for this, but it is likely an overfitting issue, as the model learns the noise of the additional data instead of capturing the underlying patterns [24].
Figure 9c reveals an initial rise in the aggregated dE for Multiple Linear Regression with increasing sample sizes, followed by a decrease after reaching 45 samples. This pattern is consistent with findings in Knofczynski’s research, underscoring the necessity of a minimum sample size for accurate predictions [25]. Smaller sample sizes might result in misleading outcomes due to insufficient data representation.
Finally, Figure 9d shows that the Random Forest Regression exhibits a trend that is akin to Bagging with Decision Tree Regression. The aggregated dE decreases as more samples are introduced, which is expected given their common underlying mechanism based on Decision Tree Regression.
These observations suggest that Bagging with Decision Tree Regression and Random Forest Regression deliver the largest and most consistent reductions in aggregated dE for a given increase in dataset size compared to the other models.
Table 11 highlights the pros and cons of our study compared with previous studies.

6. Conclusions

In this study, machine learning algorithms were developed to predict offline color data using both inline color measurements during polymer melt compounding and offline color measurements on injection-molded cuboid color chips. Four machine learning models, namely, Bagging with Decision Tree Regression, Deep Neural Network, Multiple Linear Regression, and Random Forest Regression, were employed with the input of measurement data and material dosage.
Among these models, Bagging with Decision Tree Regression and Random Forest Regression demonstrated notable effectiveness, achieving the lowest aggregated dE values of 10.84 and 10.75, respectively. As the current aggregated dE values are still too high for production-level application, further analysis of the effect of the sample size on model prediction accuracy was required. Bagging with Decision Tree Regression and Random Forest Regression show a consistent reduction in aggregated dE values with increasing sample size, suggesting that the underlying mapping from inline to offline color can be captured progressively more accurately as the training set grows.
This methodology suggests a potentially more efficient approach to ensure color chip conformity during production. As the model performance improves with the training dataset size, the minimization of material and time wastage becomes more achievable. Overall, the results indicate a promising avenue for integrating machine learning into color quality control processes within the polymer manufacturing industry.

Author Contributions

Conceptualization, P.K.N., Y.W.L. and H.I.; methodology, P.K.N., Y.W.L. and H.I.; software, P.K.N. and Q.S.G.; validation, M.F.S. and Y.W.L.; formal analysis, P.K.N.; investigation, P.K.N. and S.T.; resources, P.K.N., M.F.S. and Q.S.G.; data curation, P.K.N.; writing—original draft preparation, P.K.N.; writing—review and editing, P.K.N., Y.W.L., M.F.S., Q.S.G., S.T. and H.I.; visualization, P.K.N. and Q.S.G.; supervision, Y.W.L., H.I. and S.T.; project administration, M.F.S. and S.T.; funding acquisition, P.K.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Authors Puay Keong Neo, Moi Fuai Soon, and Qing Sheng Goh were employed by the company Omni-Plus System Limited, Singapore. Author Yew Wei Leong was employed by the company Matwerkz Technologies Pte Ltd., Singapore. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A

Table A1. Dataset of inline and offline color readings.
Formulation | Inline L* | Inline a* | Inline b* | Offline L* | Offline a* | Offline b*
114.372.116.3386.68−0.498.87
224.740.455.7283.06−0.8112.15
331.44−0.066.4384.94−0.1710.64
439.170.149.7186.360.328.24
544.480.9213.1488.01−0.546.13
648.251.9816.3890.95−0.365.56
749.932.5518.0592.94−0.445.23
850.572.7318.6994.06−0.425.02
950.942.8219.1194.69−0.434.58
1048.461.3312.7475.55−0.423.21
1145.141.4111.1566.74−0.41.29
1241.251.449.8059.63−0.370.29
1333.541.186.8844.66−0.17−1.03
1424.031.084.1927.950.39−0.44
1518.001.805.1026.480.380.07
1614.401.965.2570.81−56.0510.94
1714.551.655.0563.96−68.3416.37
1814.731.054.8252.7−70.0620.14
1914.860.604.7641.81−53.3815.94
2014.960.294.7132.2−26.837.7
2115.070.004.6426.06−5.420.62
2215.14−0.154.5924.97−1.18−0.9
2315.24−0.274.6324.87−0.66−1.19
2415.781.986.4024.841.093.15
2515.881.986.3523.970.41.98
2615.191.966.1923.980.41.97
2715.201.986.2424.030.41.94
2815.261.996.2624.030.41.95
2915.292.016.3324.110.41.91
3015.702.056.5324.120.411.94
3115.442.146.6023.960.42.01
3216.272.136.4181.44−3.391.34
3316.392.096.3376.34−6.06−6.03
3416.502.106.2565.04−10.42−22.51
3516.562.156.1255.55−10.49−36.02
3616.562.196.0942.420.09−50.14
3716.592.216.0434.1516.07−54.19
3816.582.245.9630.7820.03−50.65
3916.602.295.8827.1619.01−40.4
4016.731.946.3582.68−10.379.52
4117.651.198.4482.12−6.7985.79
4218.180.4710.7679.871.6690.88
4319.27−0.5914.3676.6811.5690.26
4420.13−1.2917.6173.521.7487.14
4520.83−1.6921.0170.1430.6482.29
4621.54−1.4722.9867.6836.1178.3
4716.862.1533.5264.2642.2372.51
4823.203.4921.0082.69−5.1361.86
4926.163.9635.0580.77−1.1569.22
5024.525.5140.4876.429.2276.01
5122.497.1243.6972.317.2475.34
5217.752.9238.7739.12−3.0521.52
5314.721.6534.56260.693.3
5416.862.1533.5225.080.541.71
5544.78−0.567.7275.89−5.33−6.27
5643.13−1.175.2971.97−6.19−12.14
5738.48−2.03−0.7160.6−5.24−26.93
5833.61−2.12−6.2452.94−1.99−35.12
5926.99−0.62−0.8038.43−2.25−8.83
6019.751.171.7023.980.68−0.97
6115.501.884.6123.620.650.53
6224.01−7.169.5067.86−39.253.2
6324.80−8.099.8862.57−44.738.02
6422.68−6.969.2150.62−46.065.83
6520.43−5.858.6843−39.766.06
6616.84−4.228.0234.84−18.752.12
6716.12−2.087.0124.68−0.581.04
6818.071.026.2524.95−1.370.83
6914.633.757.2461.463.16−6.31
7014.824.317.5355.571.351.46
7115.034.807.9251.4269.8417.81
7215.185.228.1748.0865.431.76
7315.285.618.3944.8661.1836.52
7415.405.888.5341.356.0533.22
7515.456.028.5839.1252.529.7
7615.116.088.4836.647.5525.41
7738.7626.415.8165.7351.643.92
7835.3231.164.2561.3958.126.59
7927.5833.643.2653.2862.6714.54
8023.1529.584.3849.9461.2717.73
8119.7518.423.1134.6425.154.53
8216.017.915.6228.0410.248.2
8315.105.796.9027.167.417.06

References

1. Kirchner, E.J.J.; Ravi, J. Setting tolerances on color and texture for automotive coatings. Color Res. Appl. 2014, 39, 88–98.
2. Arino, I.; Johansson, S.; Kleist, U.; Liljenström-Leander, E.; Rigdahl, M. The Effect of Texture on the Pass/Fail Colour Tolerances of Injection-Molded Plastics. Color Res. Appl. 2007, 32, 47–54.
3. ISO/CIE 11664-4:2019(E); Colorimetry—Part 4: CIE 1976 L*a*b* Colour Space. International Commission on Illumination (CIE): Vienna, Austria, 2019.
4. Schanda, J. Colorimetry: Understanding the CIE System; John Wiley & Sons: Hoboken, NJ, USA, 2007; p. 496.
5. Reshadat, R.; Balke, S.T.; Calidonio, F.; Dobbin, C.J. In-line Color Monitoring of Pigmented Polyolefins During Extrusion. I. Assessment. In Coloring Technology for Plastics; William Andrew: Norwich, NY, USA, 1999; pp. 141–148.
6. Krumbholz, N.; Hochrein, T.; Vieweg, N.; Hasek, T.; Kretschmer, K.; Bastian, M.; Mikulics, M.; Koch, M. Monitoring polymeric compounding processes inline with THz time-domain spectroscopy. Polym. Test. 2009, 28, 30–35.
7. X-Rite, Incorporated. Measuring “True” Color; Should I Use SCE or SCI? Available online: https://www.xrite.com/service-support/measuringtruecolorshouldiusesceorsci (accessed on 30 June 2023).
8. Reshadat, R.; Desa, S.; Joseph, S.; Mehra, M.; Stoev, N.; Balke, S.T. In-line near-infrared monitoring of polymer processing. Part I: Process/monitor interface development. Appl. Spectrosc. 1999, 53, 1412–1418.
9. Equitech. Probes for Different Applications. Available online: https://equitechintl.com/products/probes/ (accessed on 4 October 2023).
10. Keong, N.P. Inline colour monitoring of thermoplastic extrusion: Correlation of colour measurement and rheological behavior. In Proceedings of the 13th SPSJ International Polymer Conference (IPC2023), Hokkaido, Japan, 18–21 June 2023.
11. Lee, J.; Kim, J.; Kim, J. A Study on the Architecture of Artificial Neural Network Considering Injection-Molding Process Steps. Polymers 2023, 15, 4578.
12. Shams-Nateri, A.; Amirshahi, S.; Latifi, M. Prediction of Yarn Cross-Sectional Color from Longitudinal Color by Neural Network. Res. J. Text. Appar. 2006, 10, 25–35.
13. Jeon, J.; Rhee, B.; Gim, J. Melt Temperature Estimation by Machine Learning Model Based on Energy Flow in Injection Molding. Polymers 2022, 14, 5548.
14. Joo, C.; Park, H.; Kwon, H.; Lim, J.; Shin, E.; Cho, H.; Kim, J. Machine Learning Approach to Predict Physical Properties of Polypropylene Composites: Application of MLR, DNN, and Random Forest to Industrial Data. Polymers 2022, 14, 3500.
15. Munir, N.; Nugent, M.; Whitaker, D.; McAfee, M. Machine Learning for Process Monitoring and Control of Hot-Melt Extrusion: Current State of the Art and Future Directions. Pharmaceutics 2021, 13, 1432.
16. Munir, N.; McMorrow, R.; Mulrennan, K.; Whitaker, D.; McLoone, S.; Kellomäki, M.; Talvitie, E.; Lyyra, I.; McAfee, M. Interpretable Machine Learning Methods for Monitoring Polymer Degradation in Extrusion of Polylactic Acid. Polymers 2023, 15, 3566.
17. ISO/CIE 11664-2:2022(E); Colorimetry—Part 2: CIE Standard Illuminants. International Commission on Illumination (CIE): Vienna, Austria, 2022.
18. X-Rite, Incorporated. Ci7800 Sphere Benchtop Spectrophotometer. Available online: https://www.xrite.com/categories/benchtop-spectrophotometers/ci7x00-family/ci7800 (accessed on 30 June 2023).
19. Agate, S.; Williams, A.; Dougherty, J.; Velev, O.D.; Pal, L. Polymer Color Intelligence: Effect of Materials, Instruments, and Measurement Techniques—A Review. ACS Omega 2023, 8, 23257–23270.
20. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
21. Breiman, L. Pasting small votes for classification in large databases and on-line. Mach. Learn. 1999, 36, 85–103.
22. Ho, T. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 832–844.
23. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
24. Amini, M.; Abbaspour, K.C.; Khademi, H.; Fathianpour, N.; Afyuni, M.; Schulin, R. Neural network models to predict cation exchange capacity in arid regions of Iran. Eur. J. Soil Sci. 2005, 56, 551–559.
25. Knofczynski, G.T.; Mundfrom, D. Sample Sizes When Using Multiple Linear Regression for Prediction. Educ. Psychol. Meas. 2007, 68, 431–442.
26. Ao, Y.; Li, H.; Zhu, L.; Ali, S.; Yang, Z. The linear random forest algorithm and its advantages in machine learning assisted logging regression modeling. J. Pet. Sci. Eng. 2019, 174, 776–789.
27. Tu, J.V. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J. Clin. Epidemiol. 1996, 49, 1225–1231.
28. Pramanik, S.; Chowdhury, U.N.; Pramanik, B.K.; Huda, N. A comparative study of bagging, boosting and C4.5: The recent improvements in decision tree learning algorithm. Asian J. Inf. Technol. 2010, 9, 300–306.
29. Machova, K.; Puszta, M.; Barcak, F.; Bednar, P. A comparison of the bagging and the boosting methods using the decision trees classifiers. Comput. Sci. Inf. Syst. 2006, 3, 57–72.
30. Mijwel, M.M. Artificial neural networks advantages and disadvantages. Mesop. J. Big Data 2021, 2021, 29–31.
31. Vittinghoff, E.; Glidden, D.V.; Shiboski, S.C.; McCulloch, C.E. Linear, Logistic, Survival, and Repeated Measures Models. In Regression Methods in Biostatistics; Springer: New York, NY, USA, 2011; p. 509.
32. Langsetmo, L.; Schousboe, J.T.; Taylor, B.C.; Cauley, J.A.; Fink, H.A.; Cawthon, P.M.; Kado, D.M.; Ensrud, K.E.; Osteoporotic Fractures in Men (MrOS) Research Group. Advantages and disadvantages of random forest models for prediction of hip fracture risk versus mortality risk in the oldest old. JBMR Plus 2023, 7, e10757.
Figure 1. Working principle of an offline bench-top spectrophotometer with (a) Specular Component Excluded and (b) Specular Component Included.
Figure 2. Working principle of inline process spectrophotometer.
Figure 3. Image of injection-molded color chip used in this study.
Figure 4. Working principle of decision trees.
Figure 5. Working principle of Bagging with Decision Tree Regression.
Figure 6. Working principle of Deep Neural Networks.
Figure 7. Working principle of Random Forest.
Figure 8. Summary of features and target variables.
Figure 9. Plot of root mean square error against sample size for (a) Bagging with Decision Tree Regression, (b) Deep Neural Network, (c) Multiple Linear Regression, and (d) Random Forest Regression.
Table 1. Formulation of polycarbonate with different pigments to build the dataset.
Formulation | PC Makrolon 2807 | EBS L-205F | Tiona 288 | Raven 1010 | Heliogen Green K 8730 | Ultramarine Blue 05 | Solvent Yellow 114 | Plast Red 8355
11000000000
299.650.30.0500000
399.60.30.100000
499.450.30.2500000
599.20.30.500000
698.70.3100000
797.70.3200000
896.70.3300000
994.70.3500000
1098.70.30.9990.0010000
1198.70.30.9950.0050000
1298.70.30.990.010000
1398.70.30.960.040000
1498.70.30.70.30000
1598.70.30.50.50000
1699.690.3000.01000
1799.680.3000.02000
1899.650.3000.05000
1999.60.3000.1000
2099.50.3000.2000
2199.30.3000.4000
2299.10.3000.6000
2398.70.3001000
2499.690.300.010000
2599.680.300.020000
2699.650.300.050000
2799.60.300.10000
2899.50.300.20000
2999.30.300.40000
3099.10.300.60000
3198.70.3010000
3299.690.30000.0100
3399.680.30000.0200
3499.650.30000.0500
3599.60.30000.100
3699.50.30000.200
3799.30.30000.400
3899.10.30000.600
3998.70.3000100
4099.690.300000.010
4199.680.300000.020
4299.650.300000.050
4399.60.300000.10
4499.50.300000.20
4599.30.300000.40
4699.10.300000.60
4798.70.3000010
4898.70.30.950000.050
4998.70.30.90000.10
5098.70.30.70000.30
5198.70.30.50000.50
5298.70.30.480.02000.50
5398.70.300.025000.9750
5498.70.300.05000.950
5598.70.30.95000.0500
5698.70.30.9000.100
5798.70.30.7000.300
5898.70.30.5000.500
5998.70.30.480.0200.500
6098.70.300.02500.97500
6198.70.300.0500.9500
6298.70.30.95000.0500
6398.70.30.9000.100
6498.70.30.7000.300
6598.70.30.5000.500
6698.70.30.480.0200.500
6798.70.300.02500.97500
6898.70.300.0500.9500
6999.690.3000000.01
7099.680.3000000.02
7199.650.3000000.05
7299.60.3000000.1
7399.50.3000000.2
7499.30.3000000.4
7599.10.3000000.6
7698.70.3000001
7798.70.30.9500000.05
7898.70.30.900000.1
7998.70.30.700000.3
8098.70.30.500000.5
8198.70.30.480.020000.5
8298.70.300.0250000.975
8398.70.300.050000.95
Table 2. Machine learning model architecture of Bagging with Decision Tree Regression.
Parameter | Value
Random state for Decision Tree Regression | 42
Number of base estimators | 1000
Random state for Bagging Regression | 42
Number of parallel jobs | −1
Table 3. Machine learning model architecture of Deep Neural Networks.
Parameter | Value
Number of hidden layers | 2
Number of input layer neurons | 11
Number of hidden layer neurons | 192
Number of output layer neurons | 3
Hidden layer activation function | ReLU
Optimizer | Adam
Loss function | RMSE
Training iterations (epochs) | 50
Batch size | 32
Table 4. Machine learning model architecture of Multiple Linear Regression.
Parameter | Value
fit_intercept | True
normalize | False
copy_X | True
Number of jobs | None
Table 5. Machine learning model architecture of Random Forest.
Parameter | Value
Random state for Random Forest | 42
Number of base estimators | 1000
Table 6. Pearson correlation coefficients between chemical components and the offline a* value.
Chemical Component | Offline Color Data | Pearson Correlation Coefficient
Plast Red 8355 | Offline a* value | 0.355185
Solvent Yellow 114 | Offline a* value | 0.119745
EBS L-205F | Offline a* value | 0.021655
PC Makrolon 2807 | Offline a* value | 0.001232
Raven 1010 | Offline a* value | −0.048207
Tiona 288 | Offline a* value | −0.074003
Heliogen Green K 8730 | Offline a* value | −0.097658
Ultramarine Blue 05 | Offline a* value | −0.106822
Table 7. Pearson correlation coefficients between chemical components and the offline b* value.
Chemical Component | Offline Color Data | Pearson Correlation Coefficient
Solvent Yellow 114 | Offline b* value | 0.3597
PC Makrolon 2807 | Offline b* value | 0.105443
Plast Red 8355 | Offline b* value | 0.047905
EBS L-205F | Offline b* value | 0.012755
Tiona 288 | Offline b* value | −0.059283
Heliogen Green K 8730 | Offline b* value | −0.074992
Raven 1010 | Offline b* value | −0.102322
Ultramarine Blue 05 | Offline b* value | −0.359825
Table 8. Pearson correlation coefficients between chemical components and the offline L* value.
Chemical Component | Offline Color Data | Pearson Correlation Coefficient
Tiona 288 | Offline L* | 0.447425
Solvent Yellow 114 | Offline L* | −0.008293
PC Makrolon 2807 | Offline L* | −0.130144
EBS L-205F | Offline L* | −0.162846
Plast Red 8355 | Offline L* | −0.225973
Heliogen Green K 8730 | Offline L* | −0.23764
Ultramarine Blue 05 | Offline L* | −0.354951
Raven 1010 | Offline L* | −0.359394
Table 9. Pearson correlation coefficients between inline and offline color data.
Inline Color Data | Offline Color Data | Pearson Correlation Coefficient
Inline L* | Offline L* | 0.583606
Inline a* | Offline a* | 0.576646
Inline b* | Offline b* | 0.522276
Table 10. Machine learning models and their aggregated dE values.
Model | Aggregated dE
Bagging Regression with Decision Tree Regression | 10.84
Deep Neural Networks | 22.90
Multiple Linear Regression | 25.39
Random Forest | 10.75
Table 11. Comparison of pros and cons between the current and previous studies.
Machine Learning Model | Bagging with Decision Tree Regression | Neural Networks | Multiple Linear Regression | Random Forest Regression
Pros | Less probability of overfitting [20]; simple model [20]; robust to the effect of noisy data [28] | Incorporates multi-task learning, where learning does not occur solely for one task [11]; able to implicitly detect complex nonlinear relationships between dependent and independent variables [27] | Fast calculation speed [14] | Effective for learning with limited samples [26]; robust for learning with strong data error [26]; feasible for nonlinear or approximately linear problems [26]
Cons | Significant computational complexity [29]; loss of simplicity compared to a simple decision tree [29] | Unexplained behaviors of the model [30]; unknown training duration [30]; proneness to overfitting [27] | Assumes data are normally distributed, homogeneous in variance, and independent of one another [31] | Significant computational resources, as several splittings and evaluations of candidate splits are required [20]; no equations linking the variables with the predicted variable [32]