Preliminary Examinations for the Identification of U.S. Domestic and International Cotton Fibers by Near-Infrared Spectroscopy

Cotton has been a large cash crop in the United States and abroad for many years. Part of the widespread interest in and utility of this product is due to its attractive chemical and physical properties for use in textiles. The textile industry could benefit from a quick, reliable method to distinguish U.S. from foreign cottons so that the appropriate tariffs can be levied on non-American cottons. In addition, there is interest in preventing cotton identity theft. Thus, an accurate and precise instrumental method to identify the country of origin of cotton would be of interest. This study provides an analytical method to identify domestic and foreign cotton fibers using near-infrared (NIR) spectroscopy coupled with principal component analysis (PCA). Samples of American cottons were evaluated along with a representative set of international samples. The results provide a proof of concept indicating that PCA can be used to separate the respective domestic and foreign cotton groups.


Introduction
Currently, the methods available to identify cotton are time-consuming, destructive and laborious. Previous methods that have been used include the burning test, solubility tests and staining tests [1].

The burning test is useful, but has the limitation of destroying the sample, and in the case of blended yarns, it allows the more flammable yarn to burn first. Furthermore, the burning test is used only to roughly categorize fiber type, such as cotton, wool or man-made textiles. When used alone, the solubility tests have the limitation that they can only identify generic types of fiber [1]. Other cotton fiber identification methods include the American Association of Textile Chemists and Colorists methods (TM020 and TM020A), which yield qualitative and quantitative results, respectively [2]. In addition, cotton fiber identity theft is a large economic problem facing the U.S. [3]. Cotton identity theft is characterized by passing non-U.S. cotton off as American premium cotton to make a higher profit or to take advantage of trade regulations afforded to U.S. cottons [4]. Current manual methods to identify cotton fiber involve a large degree of variability and are labor-intensive [3]. Moreover, measurements that identify cotton fiber by origin (U.S./domestic or non-U.S./international) are not available. Therefore, new techniques are needed to enable users and governmental agencies to know the source of the cotton, domestic or international.
The NIR region encompasses the spectral range of 800-2500 nm (4000-12,500 cm⁻¹), where the subregions are defined as first, second and third overtones and combination bands, which correspond to fundamental spectral bands in the mid-infrared region [5][6][7][8]. The most commonly used NIR spectral region encompasses 1100-2500 nm. The primary absorbances observed in the NIR spectral region arise from C-H, N-H and O-H groups, and the first, second and third overtones in the NIR region can be related to specific fundamental frequencies in the mid-infrared (MIR) spectral region. Furthermore, NIR spectroscopy offers many advantageous features, including little to no sample preparation, accuracy, precision, non-destruction of samples, ease of use, flexibility of multiple sampling systems (e.g., fiber optic probe, rotating sphere) and the option of analyzing different size samples of pepper (0.841-mm standard diameter), powder (0.177-mm standard diameter) and large, raw cotton samples, which is particularly useful for analyzing a heterogeneous sample, such as cotton.
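The two ways of stating the NIR range above (nanometres and wavenumbers) are related by a simple reciprocal conversion. The short sketch below illustrates that arithmetic only; the function names are hypothetical and are not taken from any instrument software.

```python
def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    # 1 cm = 1e7 nm, so wavenumber (cm^-1) = 1e7 / wavelength (nm)
    return 1e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 back to a wavelength in nanometres."""
    if wavenumber_cm <= 0:
        raise ValueError("wavenumber must be positive")
    return 1e7 / wavenumber_cm


# Limits of the NIR region quoted in the text
nir_limits_nm = (800.0, 2500.0)
nir_limits_cm = tuple(nm_to_wavenumber(w) for w in nir_limits_nm)
print(nir_limits_cm)  # (12500.0, 4000.0)
```

Note that the order reverses: the short-wavelength end (800 nm) corresponds to the high-wavenumber end (12,500 cm⁻¹).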
Near-infrared (NIR) spectroscopy has been used extensively to discriminate/measure both qualitative and quantitative properties of natural fibers, synthetic fibers, textiles and textile auxiliaries. The advantages of using NIR to determine more than one key cotton parameter, along with its precision and accuracy, were reported by Camjani and Muller [9]. In a different textile study, NIR was applied toward the classification of carpet face fiber, which was instrumental in the success of the carpet recycling program [10]. Still other studies reported on the application of NIR aimed at determining fiber fineness, maturity, micronaire and fiber contaminants [11,12]. Another examination quite beneficial to the textile industry provided an overview of the analysis of textiles using NIR [13]. NIR spectroscopy has been used to quantitatively measure cotton quality parameters, both in and outside the laboratory [14][15][16][17][18][19][20]. Blend analyses, which include determining cotton content, have been performed with NIR spectroscopy [21,22]. Several authors have used NIR spectroscopy to study trash and contamination in cotton (both botanical and field trash/contamination) [23][24][25]. However, even with the breadth of studies mentioned here, no literature was found dedicated to identifying the source of origin of cotton samples.
Principal component analysis (PCA) is a chemometric method that involves the construction of linear multivariate models of complex data sets [26]. These models are developed based on principal components, which are eigenvectors. The PCA method models both relevant and random variations in the data set [27]. In addition, PCA can reveal relationships between observed and hypothetical variables, while most other chemometric methods are centered on relationships among observed variables only [28]. PCA interrelationships are signified graphically by the formation of coherent subgroups along the principal component axes.
Previously, PCA has been successfully used to classify white and colored Australian cottons using NIR and Raman spectroscopy [14,15]. The present study seeks to reveal the feasibility of using NIR spectroscopy and chemometrics, specifically PCA, to discriminate between, separate and identify domestic and international cotton samples.
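As an illustration of the description above, the following minimal sketch computes principal components as eigenvectors of the covariance matrix (obtained here through a singular value decomposition) and projects each spectrum onto them to obtain scores. It uses NumPy and purely synthetic data; the function name `pca_scores` and the simulated "spectra" are assumptions for illustration and do not represent the cotton data or the software used in this study.

```python
import numpy as np


def pca_scores(spectra: np.ndarray, n_components: int):
    """Return (scores, loadings, explained_variance_ratio) for row-wise spectra."""
    # Mean-center each variable (column) before decomposition
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the eigenvectors of the covariance matrix (the loadings)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components]
    # Scores are the coordinates of each sample along the principal components
    scores = centered @ loadings.T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Synthetic example: two groups sharing one band shape, differing by a small feature
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 50)
base = np.exp(-((axis - 0.5) ** 2) / 0.02)
group_a = base + 0.01 * rng.standard_normal((20, 50))
group_b = base + 0.2 * np.sin(2 * np.pi * axis) + 0.01 * rng.standard_normal((20, 50))
spectra = np.vstack([group_a, group_b])

scores, loadings, explained = pca_scores(spectra, n_components=2)
print(scores.shape)  # (40, 2)
```

Plotting one column of `scores` against another gives a score plot, in which coherent subgroups appear as clusters along the principal component axes.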

Cotton Samples
A diverse sample set from several countries of origin was prepared, and a total of 205 raw cotton samples were analyzed. Samples of high, medium and/or low grade from foreign countries included Australia, Zimbabwe, Brazil, China, Japan, India, Pakistan, South Africa, Spain, Paraguay, Turkey, Tanzania and Zambia from crop years 2007 and 2009 (n = 54). U.S. cottons included in the study were from Texas, Georgia, Louisiana and Mississippi from crop years 2001-2004 and 2009 (n = 164). All samples were conditioned for a minimum of 24 h and measured at standard conditions (21 ± 1 °C and 65% ± 2% relative humidity (RH)). Five measurements were made on each sample, and the average spectrum for each sample was obtained.

FT-NIR Spectroscopy
The FT-NIR spectra were acquired using a benchtop Bruker MPA instrument equipped with an integrating sphere accessory (Bruker Optics, Billerica, MA, USA). Measurements were taken by placing the fiber on top of the integrating sphere with a weight placed on top to provide equivalent pressure across the sample surface. The spectra were collected over a spectral range of 4000-12,500 cm⁻¹ (800-2500 nm), with a resolution of 8 cm⁻¹ and 32 scans. NIR spectra were collected using the Bruker OPUS IDENT software package [29].

Chemometrics
NIR spectra were imported into the Unscrambler chemometric software package, version 9.7 (Camo, Woodbridge, NJ, USA). In the Unscrambler software, the spectra were separated into groups representing U.S. (domestic) and international cottons. Various preprocessing methods, including standard normal variate (SNV) with and without a second derivative, were investigated to normalize the spectral data. The principal component analysis was carried out using full cross-validation with leave-one-out sampling to test the validity of the calibration model.
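The two preprocessing steps named above can be sketched as follows. SNV centers and scales each spectrum by its own mean and standard deviation; the second derivative is approximated here with a simple central finite difference. This is an assumption made for brevity: the source does not state which derivative algorithm was used in the Unscrambler software, and smoothing derivatives such as Savitzky-Golay are common in practice. The function names and test spectra are hypothetical.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def second_derivative(spectra, spacing=1.0):
    """Central finite-difference second derivative along the spectral axis.

    Assumes evenly spaced points; the two edge points are dropped.
    """
    spectra = np.asarray(spectra, dtype=float)
    return (spectra[:, 2:] - 2.0 * spectra[:, 1:-1] + spectra[:, :-2]) / spacing ** 2


# Two synthetic spectra with the same shape but different gain and offset
x = np.linspace(-1.0, 1.0, 101)
curve = x ** 2
raw = np.vstack([1.0 * curve + 0.5, 3.0 * curve + 2.0])

corrected = snv(raw)                                 # both rows become identical
d2 = second_derivative(raw, spacing=x[1] - x[0])     # constant offsets vanish
print(np.allclose(corrected[0], corrected[1]))       # True
```

In this toy case SNV removes both the multiplicative and the additive difference between the two spectra, while the second derivative removes the additive offset but retains the gain.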

Fiber Physical Property Measurements
The primary fiber physical properties and quality attributes of each cotton sample were measured using the Uster® High Volume Instrument-1000 (HVI™-1000) and Uster® Advanced Fiber Information System (AFIS) (Uster Technologies, Inc., Knoxville, TN, USA). Vendor-recommended protocols and procedures were used. For the HVI™ measurements, each sample was measured five times (5 replicates), and the average of each property was reported. For the AFIS measurements, each sample was measured five times (5 replicates) with 5000 fibers per replicate (5 × 5000 measurements), and the average of each property was reported.

NIR Investigations
The NIR absorption spectra of several diverse international and domestic cottons are depicted in Figure 1. It is readily observed that the spectra are very similar, with potentially only minor spectral differences. This was to be expected based on the similar molecular makeup of the foreign and domestic cottons. Pre-processing manipulations attempting to separate the international and U.S. cottons were applied to the NIR absorbance spectra using the Bruker OPUS IDENT software program. The OPUS IDENT software allowed for the creation of spectral libraries and sub-libraries, which could separate out sample spectral data. Based on our previous success in uniquely identifying cotton fiber and trash components frequently found commingled with cotton lint, the OPUS IDENT software was explored as a means to classify U.S. and international cottons.

In order to obtain a robust OPUS IDENT spectral library of foreign and U.S. cottons, many samples had to be included. Thus, 122 samples of U.S. cottons were included in the calibration set, and 42 samples were included in the U.S. cotton prediction set. For the international cottons, 40 samples were included in the calibration set, and 14 samples were included in the prediction set. In the U.S. sample sets, cotton samples were from Mississippi, Louisiana, Georgia and Texas. Representative samples included in the foreign cotton sample sets were from India, China, Zambia, Paraguay, Pakistan and Zimbabwe, among others. In the spectral library, threshold values were calculated for the domestic and foreign cottons. These threshold values were calculated based on statistical measurements of the spectra. These statistical measurements, including the Euclidean distances, were aimed at limiting the variability between reference spectra of different groups, while capitalizing on the similarities that existed between spectra in a specific group. The Euclidean distance was a measure of the similarity between the calibration spectra of a specific group in the library. In effect, calculating the Euclidean distances set up the confidence limit of the threshold values. If an individual reference spectrum was below the confidence limit of the average reference spectrum for a certain group ("hit quality"), the individual spectrum was said to be a member of that group. After the thresholds were determined for each respective group in the library, validating the library was necessary to determine how well the library members were separated [28]. All data points were included, and consequently, no data were considered as outliers.
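The distance-based library logic described above can be illustrated with a small sketch. Each group is summarized by its average spectrum and a threshold derived from the Euclidean distances of its calibration spectra to that average; an unknown spectrum is assigned to every group whose threshold it falls within. The threshold rule used here (largest calibration distance plus a fraction `k` of the standard deviation of the distances) is an assumption for illustration only, since the source does not give the exact formula used by OPUS IDENT, and the three-point "spectra" are invented.

```python
import math
import statistics


def mean_spectrum(spectra):
    """Point-by-point average of a list of equal-length spectra."""
    return [statistics.fmean(column) for column in zip(*spectra)]


def build_group(spectra, k=0.25):
    """Summarize a group by its average spectrum and a distance threshold.

    Hypothetical rule: threshold = max calibration distance + k * SD of distances.
    """
    center = mean_spectrum(spectra)
    distances = [math.dist(s, center) for s in spectra]
    threshold = max(distances) + k * statistics.stdev(distances)
    return {"center": center, "threshold": threshold}


def identify(spectrum, library):
    """Return the names of all groups whose threshold the spectrum falls within."""
    return [
        name
        for name, group in library.items()
        if math.dist(spectrum, group["center"]) <= group["threshold"]
    ]


library = {
    "domestic": build_group([[0.10, 0.20, 0.30], [0.12, 0.19, 0.31], [0.09, 0.21, 0.29]]),
    "international": build_group([[0.50, 0.60, 0.70], [0.52, 0.58, 0.71], [0.49, 0.61, 0.69]]),
}

print(identify([0.11, 0.20, 0.30], library))  # ['domestic']
print(identify([0.30, 0.40, 0.50], library))  # []
```

When the average spectra of two groups lie close together relative to their thresholds, a spectrum can fall within both groups at once, and a unique identification is no longer possible.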
Pre-processing of the spectral data is often performed to enhance slight differences between samples. Many pre-processing methods were explored, including vector normalization, derivative math, as well as combinations of these two manipulations. Figure 2 depicts a representative group of spectra to which second derivative pre-processing was applied. Again, as with the NIR absorbance spectra, minor differences were observed, but much spectral overlap and similarity between the foreign and domestic samples were also observed. Thus, the OPUS IDENT method alone did not successfully classify domestic and foreign cottons, due to the high similarity of the NIR spectra. A new method that could uniquely distinguish U.S. cottons from international cottons was therefore investigated.
Principal component analysis was then applied to the NIR spectra, as depicted in the score plot in Figure 3. The U.S. cottons are labeled US2, and the international cottons are labeled Intl. The score plot displays principal components 2 and 8 (PC2 and PC8). Based on this score plot, complete separation of international and U.S. cottons was observed. It appears that there are two groups of international cottons and one large group of U.S. cottons. Since the NIR OPUS IDENT spectral library could not be used to explain the differences observed with PCA, other explanations were explored.
Crop year was examined first (Figure 4). As can be seen in Figure 4, I09 lies toward the left (western) side of the PC2 axis, I07 lies toward the upper (northern) end of the PC8 axis and U09 lies toward the center and right (eastern) side of the PC2 axis. In addition, a cluster of samples exists between the center and the left side of the plot. However, these distinctions were not definitive. Therefore, crop year is not the defining factor for the separation of the foreign and domestic groups, due to the overlap observed for these groups.
Thus, to further elucidate potential sources of the observed separations/differences, standard fiber physical property measurements were performed on all samples. The physical properties and contaminants were analyzed using the Uster® Advanced Fiber Information System (AFIS) and Uster® High Volume Instrument (HVI). HVI and AFIS data for the domestic and international cottons were included in this study to investigate whether the country of origin of cottons can be determined using conventional techniques. The AFIS data are presented in Table 1. Briefly, representative AFIS parameters include seed coat nep count (SCN), mean length by number in inches (L(n)), coefficient of variation of mean length by number in % (L(n)CV), short fiber content by number in % (SFC(n)), mean length by weight in inches (L(w)), coefficient of variation of mean length by weight in % (L(w)CV), short fiber content by weight in % (SFC(w)), upper quartile length by weight in inches (UQL(w)) and visible foreign matter in % (VFM). A few minor differences in mean values were observed between the domestic cottons and the international cottons. For example, the AFIS numerical data suggest that the international cottons, on average, tend to have more trash/trash components (e.g., dust, VFM) associated with them. However, all samples were within one standard deviation for each AFIS fiber property. The relationships between the principal components and the AFIS parameters were then examined more closely. No clear trends were observed between trash/trash components and PC2 and PC8 (significant overlap).
The potential relationships between HVI parameters and the NIR spectral principal components were also investigated. Table 2 shows the numerical HVI data for domestic and international cottons. Representative HVI parameters include the uniformity index (UI) and the upper half mean length (UHML). Overall, the international cottons had higher mean values for micronaire, [...] Therefore, in this study, the observed NIR principal component separation between the international and domestic cottons was not a result of commonly measured fiber properties and contaminants, based on the HVI and AFIS results.

Conclusions
A program was implemented to determine the feasibility of applying NIR spectroscopy and chemometrics to classify cottons based on their production in the U.S. (domestic cottons) or outside the U.S. (international cottons). NIR spectroscopy and principal component analysis (PCA) were employed to determine the feasibility of separating a diverse set of cottons into "domestic" and "international" groupings. As a proof of concept, PCA combined with second derivative pre-processing successfully separated and identified the domestic and international cottons. Crop year, HVI parameters and AFIS parameters were investigated to determine their effect on the separation of domestic and international cottons. None of these factors explained the complete separation of U.S. and international cottons.

Figure 2. Representative NIR second derivative spectra of international and U.S. cottons.

Figure 4. Group separations of domestic and international cottons based on crop year. The U.S. cottons are represented by crop years 2001 (U01), 2002 (U02), 2003 (U03), 2004 (U04), 2005 (U05) and 2009 (U09). The international cottons are represented by crop years 2007 (I07) and 2009 (I09).