Calibration Transfer Based on Affine Invariance for NIR without Transfer Standards

Calibration transfer is an important topic in practical applications of near-infrared (NIR) spectroscopy. However, most transfer methods rely on standard samples, which are expensive and difficult to obtain. To address this problem, this paper proposes a calibration transfer method based on affine invariance without transfer standards (CTAI). The method corrects the difference between two instruments by an affine transformation. CTAI first establishes a partial least squares (PLS) model on the master instrument to obtain the score matrices and predicted values of the two instruments; the regression coefficients between each score vector and the predicted values are then computed for the master instrument and the slave instrument, respectively. Next, the angles and biases between the regression coefficients of the master instrument and the corresponding regression coefficients of the slave instrument are calculated. Finally, new samples are predicted by an affine transformation based on the obtained angles and biases. CTAI was compared with five other methods on two NIR spectral datasets. The experimental results show that, in general, CTAI is more robust and achieves the lowest root mean square error of prediction on the test sets (RMSEP). In addition, the Wilcoxon signed rank test shows that CTAI is generally better than the other methods and at least statistically equivalent to them.


Introduction
With its high efficiency, low cost and non-destructive nature, near-infrared (NIR) spectroscopy has been widely used in the quality control of food and pharmaceuticals [1][2][3][4]. Multivariate calibration methods, such as principal component regression (PCR) [5,6] and partial least squares (PLS) [7][8][9][10], are commonly used to obtain quantitative or qualitative information from near-infrared spectra. However, changes in the instrument or in the measurement conditions may make an existing model poorly applicable. Recalibration can solve this problem, but it is time-consuming and labor-intensive. To reduce the cost of recalibration, calibration transfer has been widely studied and applied [11]. Calibration transfer addresses two main situations: (1) a single calibration model is used to predict spectra measured on multiple instruments; (2) new spectra are measured on the same instrument after a period of time.
A number of calibration transfer methods have been proposed. They fall into two categories: those that require transfer standards and those that do not. Methods in the first category require a set of samples to be measured on both the master and the slave instrument. A great variety of such methods have been proposed. For example, slope and bias correction (SBC) [12,13] assumes a linear relationship between the predicted values of different instruments. First, the regression coefficient between the spectra and the response values on the master instrument is calculated. Then the predicted values of the master and slave instruments are computed with this regression coefficient. Finally, a linear equation is fitted between the two sets of predicted values. Piecewise direct standardization (PDS), proposed by Wang et al., corrects the spectral differences [14]. In PDS [15][16][17][18], each wavelength of the master instrument is related to a wavelength window of the slave instrument, and a banded transfer matrix is formed from the regression coefficients of each window. This rests on the assumption, consistent with observations across various transfer methods, that the spectral correlation between master and slave is limited to small regions. The key parameters of PDS are the window size and the number of standard samples. Because multiple regression models have to be built, a large amount of computation is required. Liang et al. proposed a calibration transfer method for near-infrared spectra based on canonical correlation analysis [19]. The PLS model is built on the master instrument calibration set, and part of the calibration sets of the master and slave instruments is taken as standard samples. The features are then extracted by canonical correlation analysis (CCA) [20,21]. The relationship between master and slave data is established with ordinary least squares (OLS) [22,23], and the test set is finally corrected.
For CCA, SBC and PDS, good results can be achieved with standard samples, but standard samples are difficult to obtain in some cases. Other transfer methods, such as calibration transfer via extreme learning machine auto-encoder (TEAM) [24], calibration transfer by generalized least squares (GLSW) [25] and spectral space transform (SST) [26,27], also require standard samples, although their principles differ.
The second category comprises methods without transfer standards. For example, multiplicative scatter correction (MSC) [28][29][30], proposed by Bouveresse et al., first calculates the mean spectrum of the calibration set as the reference spectrum; a linear relationship is then fitted between each spectrum and the reference spectrum to obtain a slope and a bias; finally, the slope and bias are used to correct the slave spectra. Although MSC does not require standard samples, it has difficulty handling complex situations. MSC is a transfer method based on pre-processing; other pre-processing approaches include finite impulse response (FIR) filtering [31] and multivariate filtering via orthogonal signal correction (OSC) [32,33]. TCR [34] is also a standard-free method, which combines transfer component analysis (TCA) [35] and ordinary least squares (OLS). The basic idea of TCA is to project the data of two instruments into a reproducing kernel Hilbert space in which the two data distributions are as close as possible while the key attributes of the original data are preserved. TCR is a robust model with good generalization ability, but it does not achieve more accurate predictions. Other techniques in this category include kernel principal component analysis (KPCA) [36,37] and domain generalization via invariant feature representation (DICA) [38]. Unlike the above methods, this paper studies the relationship between the regression coefficients that link the feature vectors to the predicted values on two spectrometers. The samples used by the calibration transfer method based on affine invariance without transfer standards (CTAI) are shown in Figure 1A.
The response values of the slave spectrometer are not required, and no mapping between master and slave samples is needed. The samples are further processed with the PLS model, from which the spectral features and predicted values are obtained; the processed samples are shown in Figure 1B. A linear model between each feature vector and the predicted values is then fitted for each instrument, and the relationship between the predicted values of the two instruments is derived from these linear models. The procedure is as follows: first, the PLS model is built on the master instrument; second, the score matrices and predicted values are extracted with the PLS model; third, the angles and biases between the two sets of regression coefficients are calculated; finally, the predicted values are corrected by affine transformation. If the concentration ranges of the master spectra and the slave spectra are the same, CTAI can achieve more accurate predictions and a more robust model than other methods, even without standard samples. The predictive performance of CTAI is verified on two near-infrared (NIR) datasets.
Figure 1. (A) The data assumed to be available; (B) the data after processing with the PLS model of the master instrument.

Analysis of the Corn Dataset
The training errors, prediction errors, cross-validation errors, biases and correlation coefficients of predicted vs. actual values for the PLS models of the corn dataset are shown in Table 1. All results show large correlation coefficients and small biases, which reflects a good linear relationship between the spectra and the measured values of the corn dataset. There are no significant differences between the root mean square error of the calibration set (RMSEC), the minimum root mean square error of cross-validation (RMSECV) and the root mean square error of the test set (RMSEP), which indicates that neither over-fitting nor under-fitting occurs and that the number of latent variables was selected reasonably. Moreover, the $RMSEP_m$ of the PLS model on instrument m5spec is smaller than that on instrument mp6spec. For most calibration transfer methods, it is important that the master instrument gives the more accurate predictions. Thus, m5spec is the more reasonable choice for the master instrument and mp6spec for the slave instrument.
To assess the predictive performance of CTAI more fully, MSC, TCR, CCA, SBC and PDS were also tested. In this work, PLS was used to compute the transformation function in PDS. The optimal number of latent variables of each PLS model is shown in Table 1. The optimal dimensionality of the subspace in TCR is 4, 6, 10 and 10. The optimal window size of PDS is 3 in all cases. The number of standard samples was varied in the range [5,30]; once the model is stable, the number of standard samples is chosen according to the smallest RMSEC.
As shown in Table 2, the correlation coefficients $r_{pre}$ and the corresponding $p_{pre}$ values indicate that the predicted values of the master instrument and the slave instrument are linearly correlated. Moreover, $t_{pre}$ is greater than the critical t value, so a bias adjustment of the predicted results should be applied. Furthermore, without any correction, the RMSE of prediction of the slave instrument is larger than that of the master instrument. The CTAI correction results in a significant reduction in the RMSE of prediction. The same situation can be found between $y_m$ and $y_n$, which further shows that the bias adjustment is very important. For the corn dataset, the effect of the CTAI correction is illustrated in Figure 2: the corrected predicted values lie closer to the straight line, and the RMSEP is greatly reduced.
Moreover, the results listed in Tables 3 and 4 show the differences between the methods on the 16 corn test samples. In general, CTAI exhibits the best predictive performance compared with the other five methods. When moisture is used as the property, CTAI achieves the lowest RMSEP (0.21095). More specifically, the RMSEP improvements provided by CTAI with respect to MSC, TCR, CCA, SBC and PDS are as high as 87.35%, 46%, 9.48%, 50.45% and 12.96%, respectively. Although the differences are not statistically significant, CTAI greatly improves the predictive accuracy compared with CCA and TCR. There is a significant difference at the 95% confidence level between CTAI and MSC, SBC and PDS. When oil is used as the property, there is no significant difference between RMSEC and RMSEP for the different transfer methods, so over-fitting does not occur. CTAI also produces the lowest RMSECV (0.08141) and RMSEC (0.08233).
The Wilcoxon signed rank test reveals that CTAI is significantly different from MSC and TCR and performs similarly to CCA, SBC and PDS. It is noticeable that the RMSEP improvement rates of CTAI compared with CCA, SBC and PDS are 27.98%, 1.52% and 13.28%, respectively. The other properties behave similarly to oil; CTAI achieves better predictive performance.
The master instrument; $RMSEP_{u,pre}$: RMSEP of the uncorrected slave instrument relative to the primary instrument prediction; $RMSEP_{pre}$: RMSEP of the CTAI-corrected slave instrument relative to the primary instrument prediction; $k_{pre}$: slope between the predicted values of the uncorrected slave instrument and the primary prediction; $r_{pre}$: correlation coefficient of the uncorrected slave prediction relative to the master prediction; $p_{pre}$: p value of the test corresponding to the Pearson correlation coefficient; $t_{pre}$: result of the one-sample t-test between the uncorrected slave prediction and the master prediction; $RMSEP_u$: RMSEP of the uncorrected slave instrument relative to the primary actual values; RMSEP: RMSEP of the CTAI-corrected slave instrument relative to the primary actual values; k: slope between the predicted values of the uncorrected slave instrument and the primary actual values; r: Pearson correlation coefficient of the uncorrected slave prediction relative to the primary actual values; p: p value of the test corresponding to the Pearson correlation coefficient; t: result of the one-sample t-test between the uncorrected slave prediction and the master actual values; $t_{critical\_value}$: the critical t value for n-1 degrees of freedom at the significance level alpha = 0.05.
Table 3. Summary of the root mean square error of the test set (RMSEP) and the root mean square error of the calibration set (RMSEC) of the different methods. m5spec was used as the master spectra and mp6spec as the secondary spectra for the corn dataset. The protein content was chosen as the property for the wheat dataset.
To compare the predictive stability of the various methods, Figures 3-6 show plots of measured vs. predicted values for the calibration set and the test set. The better a model predicts, the closer its points lie to the straight line. When moisture is used as the property, Figure 3 shows that the CTAI points are in general closer to the straight line than those of the other models, which confirms that CTAI achieves the best overall performance. When oil is used as the property, CTAI provides satisfactory results in both the calibration set and the test set, which again confirms that CTAI achieves more accurate predictions. In addition, CTAI also achieves a good standard error compared with the other methods. From the above discussion, we conclude that CTAI achieves the best performance among all models and has better generalization ability.

Analysis of the Wheat Dataset
The RMSEP of the PLS model is listed in Table 1. The predictive performance of instrument B1 is better than that of B3, and that of B3 is better than that of B2. Thus, three combinations (B1-B2; B1-B3; B3-B2) of the instruments B1, B2 and B3 are used to analyze the wheat dataset. In every combination, the first instrument is the master instrument and the second is the slave instrument. For the PLS models, the optimal number of latent variables is 14, 15 and 15, respectively, and the corresponding optimal dimensionality of the subspace in TCR is 17, 12 and 17, respectively. Moreover, the optimal window size of PDS for B1-B2, B1-B3 and B3-B2 is 3, 9 and 13, respectively.
For the three combinations of instruments (B1-B2; B1-B3; B3-B2), Table 2 shows that between $y_m$ and $y_n$ the correlation coefficients $r_{pre}$ are large and the $p_{pre}$ values are close to zero. Hence, there is a linear relationship between the predicted values of the two instruments for the wheat dataset. For all combinations, the absolute value of t is greater than $t_{critical\_value}$, so there is a significant bias between the uncorrected predicted values of the slave instrument and the predicted values of the master instrument, and the predicted values of the slave instrument can be corrected by affine transformation. The experimental results show that the prediction performance is significantly improved by CTAI. The same phenomenon is found for the uncorrected predicted values of the slave instrument relative to the actual values of the master instrument. Furthermore, Figure 7 shows the difference between the uncorrected and corrected predicted values for B1-B2, B1-B3 and B3-B2. It can be seen that CTAI plays an important role in the correction of the predicted values.
In addition, Table 3 lists the results of the different methods for the calibration set and the test set. For B1-B2, CTAI produces the lowest RMSEP (0.41419) and the second lowest RMSEC (0.55682). For PDS and CCA, it is worth noting that the RMSEP is significantly larger than the RMSEC; therefore, the predictive performance of PDS and CCA is poor under this setting. Further, statistical testing is used to evaluate the RMSEP difference between CTAI and the other methods for the wheat dataset. The Wilcoxon signed rank test was performed at the significance level alpha = 0.05. Table 4 shows that CTAI is statistically significantly different from CCA, SBC and PDS. In addition, the improvement rates of prediction provided by CTAI over MSC and TCR are up to 55.07% and 52.32%, respectively. For the combination B1-B3, CTAI displays the lowest RMSEP (0.68215), followed by TCR (0.72996) and SBC (0.79294). For PDS, under-fitting still exists under this setting; for CCA, this phenomenon also exists, but it is not particularly serious. The Wilcoxon signed rank test shows that CTAI is significantly different from MSC, CCA, SBC and PDS (Table 4). Compared with TCR, the RMSEP improvement rate of CTAI reaches 6.55%. For the last combination (B3-B2), CTAI achieves the best RMSEP and RMSEC. Further, except for PDS, the differences between CTAI and the other models are statistically significant at the 95% confidence level. Compared with PDS, the RMSEP improvement of CTAI is as high as 79.05%. It is also worth noting that PDS shows no under-fitting under the current setting, yet its predicted results are still poor. Therefore, the predictive performance of PDS is poor for the wheat dataset under the current model.
To further display the predictive abilities of the different models, the correlation between measured and predicted values is shown in Figures 8-10. Zero differences between measured and predicted values result in points on the straight line of the plot.
It can be seen that good correlations are found between expected and predicted concentrations, which confirms the good performance of CTAI. CTAI achieved the lowest standard error for the three combinations. Moreover, the predictive abilities of PDS and CCA are poor for the wheat dataset. SBC, PDS and CCA require standard samples, and TCR requires reference values of the slave instrument samples, both of which are expensive and difficult to obtain. Since CTAI requires neither, its performance is all the more notable.

Corn Dataset
The corn dataset, which contains 80 samples, was measured on three NIR spectrometers (m5, mp5 and mp6). Four properties are available for each sample: moisture, oil, protein and starch. The wavelength range is 1100-2498 nm with an interval of 2 nm (700 channels). The spectra measured on m5spec were used as the master spectra, and the spectra measured on mp6spec were used as the secondary spectra. The data can be obtained from http://www.eigenvector.com/data/Corn/. The dataset was divided into a calibration set of 64 samples and a test set of 16 samples with the Kennard-Stone (KS) algorithm. The NIR spectra are shown in Figure 11A, which represents the difference between m5 and mp6.
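The Kennard-Stone split used above can be sketched as follows. This is a minimal NumPy illustration under assumptions, not the authors' code: the function name `kennard_stone` is hypothetical, and Euclidean distance on the raw spectra is assumed as the distance measure. The selected indices form the calibration set and the remaining ones the test set.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return (selected, remaining) sample indices chosen by Kennard-Stone.

    Assumption: Euclidean distance between rows of X (one spectrum per row).
    """
    X = np.asarray(X, dtype=float)
    # Squared pairwise distances via the Gram matrix (avoids an n x n x p array).
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # Start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(X.shape[0]) if k not in selected]
    # Repeatedly add the sample whose nearest selected neighbour is farthest away.
    while len(selected) < n_select:
        nearest = d2[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected, remaining
```

For the corn data this would be called as `kennard_stone(X_master, 64)`, where `X_master` is a placeholder name for the 80 x 700 master spectra.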

Wheat Dataset
The wheat dataset was used as the shootout data for the International Diffuse Reflectance Conference (IDRC) 2016, and the protein content was chosen as the property. Related information about the wheat dataset is available at http://www.idrc-chambersburg.org/content.aspx?page_id=22&club_id=409746&module_id=191116. A total of 248 samples of the wheat dataset from three different NIR instrument manufacturers (B1, B2 and B3) were analyzed. With the KS algorithm, 198 samples were chosen as the calibration set and the remaining samples formed the test set. The wavelength range is 570-1100 nm with an interval of 0.5 nm. The spectral differences between B1 and B2, between B1 and B3, and between B2 and B3 are shown in Figure 11B, Figure 11C and Figure 11D, respectively.

Determination of the Optimal Parameters
The number of latent variables of the PLS model in CTAI is allowed to take values in the range [1,15] and is determined by 10-fold cross-validation: the number that gives the lowest RMSECV is selected.
Five methods were used for comparison. For SBC, CCA, PDS and MSC, the range of latent variables and the parameter optimization of the PLS model are the same as for CTAI. In particular, the window size in PDS is searched from 3 to 16 in increments of 2 and is selected by 5-fold cross-validation. In addition, the dimensionality of the TCA space in TCR is searched in the range [1,24], and the optimization criteria are the same as described in [24].
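The selection of the number of latent variables by 10-fold cross-validation can be sketched as below. This is an illustrative NumPy-only version under assumptions: the paper reports using the sklearn package for cross-validation, whereas here a compact PLS1 fit is written out so that the example is self-contained, and the function names and the random fold assignment are hypothetical choices.

```python
import numpy as np

def pls1_coefficients(X, y, n_lv):
    """Compact PLS1 fit; returns (beta, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xk.T @ yc
        w = w / np.linalg.norm(w)          # weight vector
        t = Xk @ w                         # score vector
        tt = t @ t
        p = Xk.T @ t / tt                  # X-loading
        q.append(yc @ t / tt)              # y-loading
        Xk = Xk - np.outer(t, p)           # deflate X
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean

def rmsecv(X, y, n_lv, k=10, seed=0):
    """Root mean square error of k-fold cross-validation for n_lv latent variables."""
    n = len(y)
    order = np.random.default_rng(seed).permutation(n)
    sq_err = 0.0
    for fold in np.array_split(order, k):
        train = np.setdiff1d(order, fold)
        beta, x_mean, y_mean = pls1_coefficients(X[train], y[train], n_lv)
        pred = (X[fold] - x_mean) @ beta + y_mean
        sq_err += np.sum((pred - y[fold]) ** 2)
    return float(np.sqrt(sq_err / n))

def select_latent_variables(X, y, max_lv=15, k=10):
    """Return the number of latent variables in [1, max_lv] with the lowest RMSECV."""
    scores = [rmsecv(X, y, a, k) for a in range(1, max_lv + 1)]
    return int(np.argmin(scores)) + 1, scores
```

The same folds are reused for every candidate number of latent variables, so that the RMSECV values are directly comparable.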

Model Performance Evaluation
In this experiment, the root mean square error (RMSE) is employed as the indicator for parameter selection and model evaluation. RMSEC is the training error, RMSECV denotes the cross-validation error and RMSEP indicates the prediction error on the test set. The RMSE is calculated as:
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}}$$
where $\hat{y}_i$ is the predicted value, $y_i$ is the measured value and n represents the number of samples. Bias and standard error (SE) are also used as reference indicators for model evaluation. The bias and SE are as follows:
$$bias = \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)}{n}, \qquad SE = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i - bias)^2}{n - 1}}$$
Moreover, the Pearson correlation coefficient and the corresponding test are used to determine whether there is a linear relationship between the master instrument and the slave instrument. A one-sample t-test is also used to determine whether a bias adjustment of the predicted results should be applied [11].
To further compare CTAI with the other methods, another parameter, h, is introduced to quantify the rate of improvement, defined as follows:
$$h = \frac{RMSEP_{other} - RMSEP}{RMSEP_{other}} \times 100\%$$
where RMSEP represents the prediction error of CTAI and $RMSEP_{other}$ represents that of the other method. In addition, the Wilcoxon signed rank test at the 95% confidence level is used to determine whether there is a significant difference between CTAI and the other methods.
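The evaluation measures of this section can be sketched as follows. The code assumes the usual definitions: RMSE as the root of the mean squared residual, bias as the mean residual, SE as the sample standard deviation of the residuals about the bias, and the one-sample t statistic computed on the residuals against zero. The function names are illustrative only.

```python
import numpy as np

def rmse(y_pred, y_true):
    """Root mean square error between predicted and measured values."""
    e = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(e ** 2)))

def bias(y_pred, y_true):
    """Mean residual (predicted minus measured)."""
    e = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(e))

def standard_error(y_pred, y_true):
    """Standard deviation of the residuals about the bias (n - 1 in the denominator)."""
    e = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.sum((e - e.mean()) ** 2) / (len(e) - 1)))

def one_sample_t(y_pred, y_ref):
    """t statistic for the null hypothesis that the mean residual is zero."""
    e = np.asarray(y_pred, dtype=float) - np.asarray(y_ref, dtype=float)
    return float(e.mean() / (e.std(ddof=1) / np.sqrt(len(e))))
```

A bias adjustment is indicated when the absolute value of the t statistic exceeds the critical t value for n-1 degrees of freedom at the chosen significance level.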
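As a worked check of the improvement rate, the sketch below assumes that h is the relative reduction in RMSEP in percent, (RMSEP_other - RMSEP) / RMSEP_other x 100. Under this assumption, the values reported for the wheat combination B1-B3 (CTAI 0.68215, TCR 0.72996) give the 6.55% quoted in the text. The function name is hypothetical.

```python
def improvement_rate(rmsep_ctai, rmsep_other):
    """Relative RMSEP improvement of CTAI over another method, in percent."""
    return (rmsep_other - rmsep_ctai) / rmsep_other * 100.0

# CTAI vs. TCR on B1-B3, using the RMSEP values reported in the text.
h_tcr = improvement_rate(0.68215, 0.72996)
print(round(h_tcr, 2))  # 6.55
```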

Computational Environment
All experimental procedures were implemented in Python (version 2.7) and run on an Acer notebook with a 2.60 GHz Intel(R) Core(TM) i5-3230M CPU, 8 GB RAM and the Microsoft Windows 7 operating system (Acer Incorporated, Taiwan, China). Normalization and cross-validation were performed using the sklearn package. The Wilcoxon signed rank test was implemented using the scipy package, and the other programs were written by the authors.

Notation
In the following text, matrices are represented by bold capital letters (e.g., $\mathbf{X}$), column vectors by bold lower case letters (e.g., $\mathbf{y}$) and scalars by italic letters (e.g., $a$). The transposition operation is indicated by the superscript T.

Overview of PLS
PLS is used to establish a linear relationship between the input space and the response space. A key step in building a PLS model is determining the optimal number of latent variables. The latent variables are linear combinations of the original variables, computed so that they contain a maximum of relevant information about the relation between X and y. Mathematically, this is expressed by the following objective function:
$$\max_{w} \; w^{T} X^{T} y y^{T} X w \quad \text{s.t.} \quad w^{T} w = 1$$
where w represents the weight vector. This is a maximization problem under one constraint, which can be solved with the Lagrange multiplier method. When a PLS model is built between a spectral matrix $X \in \mathbb{R}^{n \times p}$ and a concentration vector $y \in \mathbb{R}^{n \times 1}$, the model is called PLS1 (n denotes the number of samples and p the number of spectral variables). In the algorithm, the first weight vector is the dominant eigenvector of the matrix $X^{T} y y^{T} X$. From the second latent variable on, each latent variable is required to be orthogonal (uncorrelated) to the previous ones. Hence, each subsequent weight vector is the dominant eigenvector of $X^{T} y y^{T} X$ computed on the deflated X, and the sequence of steps is repeated for each latent variable. The PLS1 model is built as follows:
$$X = T P^{T} + E, \qquad y = T Q^{T} + F$$
where T is the score matrix, P and Q represent the X-loading matrix and the y-loading vector, respectively; E and F denote the residuals; A is the optimal number of latent variables of the master instrument PLS model. Finally, the regression coefficient β of the model can be written as follows:
$$\beta = W (P^{T} W)^{-1} Q^{T}$$
where $W = [w_1, w_2, \ldots, w_A]$ represents the weight matrix.
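The PLS1 steps described above can be sketched in NumPy as follows. This is a minimal NIPALS-style illustration, not the authors' implementation: the function names and the dictionary layout are hypothetical, and mean-centering of X and y is assumed. For a single response, the dominant eigenvector of $X^{T} y y^{T} X$ is simply the normalized vector $X^{T} y$, which is what the loop computes on the deflated X.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 and return weights W, loadings P and q, scores T and beta."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yc = X - x_mean, y - y_mean          # assumed mean-centering
    n, p = X.shape
    W = np.zeros((p, n_components))
    P = np.zeros((p, n_components))
    T = np.zeros((n, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = Xk.T @ yc
        w /= np.linalg.norm(w)               # dominant eigenvector of Xk'yy'Xk
        t = Xk @ w                           # score vector
        tt = t @ t
        p_a = Xk.T @ t / tt                  # X-loading
        q[a] = yc @ t / tt                   # y-loading
        Xk = Xk - np.outer(t, p_a)           # deflation keeps scores orthogonal
        W[:, a], P[:, a], T[:, a] = w, p_a, t
    beta = W @ np.linalg.solve(P.T @ W, q)   # beta = W (P'W)^-1 q
    return {"W": W, "P": P, "q": q, "T": T, "beta": beta,
            "x_mean": x_mean, "y_mean": y_mean}

def pls1_scores(model, X):
    """Project spectra onto the latent variables of a fitted model."""
    Xc = np.asarray(X, dtype=float) - model["x_mean"]
    return Xc @ model["W"] @ np.linalg.inv(model["P"].T @ model["W"])

def pls1_predict(model, X):
    """Predict the response with the regression coefficient beta."""
    Xc = np.asarray(X, dtype=float) - model["x_mean"]
    return Xc @ model["beta"] + model["y_mean"]
```

With a model fitted on the master instrument, `pls1_scores` and `pls1_predict` can be applied to the slave spectra as well, which yields the score matrices and predicted values of both instruments under the master model.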

Affine Transformation
This paper focuses on the rotation and translation properties of the two-dimensional affine transformation [39]. After the transformation, a straight line remains a straight line and parallel lines remain parallel. An affine transformation is a transformation of coordinates. The derivation below is based on Figure 12.
Figure 12. Derivation of affine transformation. In the coordinate system, the counterclockwise rotation of P is equivalent to the clockwise rotation of the coordinate system.
Point P in the original coordinate system (black) is (x, y). A counterclockwise rotation of the point P is equivalent to a clockwise rotation of the coordinate system. Thus, the point P in the black coordinate system is equivalent to the point P in the red coordinate system after the rotation. Based on this conclusion, the coordinates of the point P can be determined by elementary plane geometry, and the offsets along the X-axis and the Y-axis are then added to this position. The formula is as follows:
$$x' = x \cos\theta - y \sin\theta + \Delta x, \qquad y' = y \cos\theta + x \sin\theta + \Delta y \quad (7)$$
where θ is the angle of rotation, Δx is the offset on the X axis and Δy is the offset on the Y axis; x′ and y′ are the coordinates in the new coordinate system.
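Equation (7) can be illustrated with a few lines of Python. The function name `affine_2d` is hypothetical; the code only applies the rotation by θ followed by the translation (Δx, Δy) to a single point.

```python
import math

def affine_2d(x, y, theta, dx=0.0, dy=0.0):
    """Rotate the point (x, y) counterclockwise by theta, then translate by (dx, dy)."""
    x_new = x * math.cos(theta) - y * math.sin(theta) + dx
    y_new = y * math.cos(theta) + x * math.sin(theta) + dy
    return x_new, y_new
```

For example, rotating the point (1, 0) by 90 degrees moves it to (0, 1) up to rounding error, and adding the offsets (2, 3) then gives (2, 4).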

Calibration Transfer Method based on Affine Transformation
Based on the inputs and outputs $X^m$, $y^m$ from the master instrument, and the inputs $X^s$ from the slave instrument, our task is to predict the unknown outputs $\hat{y}^s$ of the slave instrument. We assume that $X^m$ and $X^s$ are the spectra of two similar substances, and that $y^m$ and $\hat{y}^s$ are in the same range. Due to the difference between the two instruments, the observed spectral data are different. The observations from the perspective of the master instrument model are as follows:

$$\hat{y}^m = F(X^m) = X^m \beta^m = \sum_{i=1}^{A} t_i^m q_i^m, \qquad \hat{y}^s = F(X^s) = X^s \beta^m = \sum_{i=1}^{A} t_i^s q_i^m \tag{8}$$

where F is the linear prediction function, which is obtained by partial least squares in this paper; $\beta^m$ is the coefficient of the master model, and $\hat{y}^m$, $t_i^m$ and $q_i^m$ are the predicted values, the i-th column score vector and the corresponding y-loading, respectively. Accordingly, $\hat{y}^s$ and $t_i^s$ are the biased predicted values and the i-th biased column score vector for the slave instrument, respectively. Therefore, both the score vectors and the predicted values of the two instruments are different. As a result, the relationship between each score vector and the predicted values contains a bias that needs to be corrected.
When correcting the bias, direct calculation will produce large errors. In order to solve this problem, the score vectors and predicted values of the master and slave instruments are transformed into the range [0, 1], which keeps the different values on the same scale. The corresponding equations are given as follows:

$$\begin{aligned}
t_i^{m\text{-}norm} &= \frac{t_i^m - \min(t_i^m)}{\max(t_i^m) - \min(t_i^m)}, & \hat{y}^{m\text{-}norm} &= \frac{\hat{y}^m - \min(\hat{y}^m)}{\max(\hat{y}^m) - \min(\hat{y}^m)}, \\
t_i^{s\text{-}norm} &= \frac{t_i^s - \min(t_i^s)}{\max(t_i^s) - \min(t_i^s)}, & \hat{y}^{s\text{-}norm} &= \frac{\hat{y}^s - \min(\hat{y}^s)}{\max(\hat{y}^s) - \min(\hat{y}^s)}
\end{aligned} \tag{9}$$

The two linear regression equations between the score vector and the predicted values are as follows:

$$\hat{y}^{m\text{-}norm} = \tan\theta_i^m \, t_i^{m\text{-}norm} + b_i^m, \qquad \hat{y}^{s\text{-}norm} = \tan\theta_i^s \, t_i^{s\text{-}norm} + b_i^s \tag{10}$$

where $\tan\theta_i^m$ and $\tan\theta_i^s$ are the regression coefficients (slopes) computed on the two instruments; $b_i^m$ and $b_i^s$ are the intercepts. The difference between the two instruments is illustrated in Figure 13. The blue line is the regression line between the score vector and the predicted values. The black and red coordinate systems are the observations of the master and slave instruments, and the same line appears different from the two points of view. The unknown angles and biases between the two instruments are solved as follows: Firstly, the regression coefficient $\beta^m$, the weight matrix $W^m$ and the loading matrix $P^m$ of the PLS model are obtained. Secondly, a linear regression is performed for both the master and the slave instrument, and the slopes and intercepts are determined, respectively.
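The scaling and regression step can be illustrated with a short NumPy sketch. It assumes that the transformation into [0, 1] is a min-max scaling applied to each vector separately, which is an assumption of this example; the function names `minmax` and `slope_angle_and_intercept` are hypothetical.

```python
import numpy as np

def minmax(v):
    """Scale a vector into [0, 1] (assumed min-max scaling)."""
    v = np.asarray(v, dtype=float)
    span = v.max() - v.min()
    if span == 0:
        raise ValueError("constant vector cannot be scaled into [0, 1]")
    return (v - v.min()) / span

def slope_angle_and_intercept(t, y_hat):
    """Regress scaled predictions on one scaled score vector.

    Returns (theta, b) such that y_norm is approximately tan(theta) * t_norm + b.
    """
    t_norm, y_norm = minmax(t), minmax(y_hat)
    # Augmented matrix [t_norm, 1] so that the intercept is estimated jointly
    T_aug = np.column_stack([t_norm, np.ones_like(t_norm)])
    (slope, intercept), *_ = np.linalg.lstsq(T_aug, y_norm, rcond=None)
    return np.arctan(slope), intercept
```

Calling the function once with the master scores and predictions and once with the slave scores and predictions gives the two angle and intercept pairs whose difference is used for the correction.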
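Based on the PLS model of the master instrument, the score matrices and predicted values are calculated as shown below:

$$T^m = X^m W^m \left((P^m)^T W^m\right)^{-1}, \quad \hat{y}^m = X^m \beta^m, \qquad T^s = X^s W^m \left((P^m)^T W^m\right)^{-1}, \quad \hat{y}^s = X^s \beta^m \tag{11}$$

where $T^m$ and $T^s$ represent the score matrices of the two instruments. The score matrix $T^m$, the predicted values $\hat{y}^m$, the score matrix $T^s$ and the predicted values $\hat{y}^s$ are pre-processed using Equation (9).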
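A minimal sketch of this projection step is given below. The key point is that the slave spectra are passed through the weight matrix, loading matrix and coefficient vector of the master model. The function name `scores_and_predictions` is hypothetical, and centering both instruments with the master calibration means (`x_mean`, `y_mean`) is an assumption of this example that the text does not state.

```python
import numpy as np

def scores_and_predictions(X, W, P, beta, x_mean, y_mean):
    """Project spectra X onto a fitted PLS model (W, P, beta).

    x_mean and y_mean are the centering values of the master calibration
    set (assumed here to be reused for the slave spectra as well).
    """
    Xc = np.asarray(X, dtype=float) - x_mean
    rotation = W @ np.linalg.inv(P.T @ W)     # W (P^T W)^{-1}
    T = Xc @ rotation                         # score matrix
    y_hat = Xc @ beta + y_mean                # predicted values
    return T, y_hat
```

Calling the function with the master spectra and then with the slave spectra, while keeping `W`, `P` and `beta` fixed, yields the two score matrices and the two prediction vectors that are compared in the following step.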
For each column score vector and the predicted values, least squares is used to compute the corresponding slope and intercept. The equations are as follows:

$$\begin{bmatrix} \tan\theta_i^m \\ b_i^m \end{bmatrix} = \left((T^m_{aug})^T T^m_{aug}\right)^{-1} (T^m_{aug})^T \hat{y}^{m\text{-}norm}, \qquad \begin{bmatrix} \tan\theta_i^s \\ b_i^s \end{bmatrix} = \left((T^s_{aug})^T T^s_{aug}\right)^{-1} (T^s_{aug})^T \hat{y}^{s\text{-}norm} \tag{12}$$

where $T^m_{aug}$ is the augmented matrix $[t_i^{m\text{-}norm}, \mathbf{1}]$; $T^s_{aug}$ is the augmented matrix $[t_i^{s\text{-}norm}, \mathbf{1}]$; $\mathbf{1}$ is the column vector of all ones. Finally, the angles and biases between the two instruments are obtained by Equation (13), where $\Delta\theta_i$ is the angle between the two regression lines and $\Delta b_i$ is the corresponding bias. The score matrix and predicted values of the test set are extracted by Equation (11). The angles and biases obtained by Equation (13) are then substituted into the affine transformation to correct the predicted values. Since the rotation angle is defined relative to the origin of the coordinate system, each sample needs to be adjusted before rotation. The equation is shown as follows: where the matrix $M_i$ = Each column score vector is processed separately with the predicted values, and a prediction matrix is obtained. The mean of the prediction matrix over its columns gives the final predicted values.
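The correction step can be sketched as follows, under assumptions that should be checked against Equations (13) and (14), since the exact expressions are not reproduced here. The sketch assumes that the angle and bias differences are taken as master minus slave, and that each slave point is shifted so that the slave intercept lies at the origin, rotated by a plain 2-D rotation matrix, and shifted back with the bias added. All function names are hypothetical, and the values stay in the scaled [0, 1] units.

```python
import numpy as np

def angle_and_bias_difference(theta_m, b_m, theta_s, b_s):
    # ASSUMPTION: differences are master minus slave
    return theta_m - theta_s, b_m - b_s

def correct_component(t_norm, y_norm, b_s, d_theta, d_b):
    """Map slave points (t, y) for one score vector towards the master line.

    ASSUMPTION: rotate about the slave intercept (0, b_s), then add the bias.
    Returns the corrected scores and the corrected predicted values.
    """
    pts = np.column_stack([np.asarray(t_norm, dtype=float),
                           np.asarray(y_norm, dtype=float) - b_s])
    rotation = np.array([[np.cos(d_theta), -np.sin(d_theta)],
                         [np.sin(d_theta),  np.cos(d_theta)]])
    rotated = pts @ rotation.T
    return rotated[:, 0], rotated[:, 1] + b_s + d_b

def average_over_components(corrected_columns):
    """Mean of the prediction matrix over its columns (one column per score vector)."""
    return np.mean(np.column_stack(corrected_columns), axis=1)
```

With these assumptions, any point lying exactly on the slave regression line is moved onto the master regression line, which is the geometric effect the affine correction is meant to achieve.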
Therefore, according to the expansion of the predicted values, β s is as follows:

Summary of CTAI
Given the calibration set of the master $(X^m_{cal}, y^m_{cal})$, the calibration set of the slave $X^s_{cal}$ and the test set $(X^s_{test}, y^s_{test})$.

1. The PLS model is built on the calibration set $(X^m_{cal}, y^m_{cal})$, and the coefficient $\beta^m$, the weight matrix $W^m$ and the loading matrix $P^m$ are obtained.

2. Modeling of the affine transformation, which uses the two datasets $(X^m_{cal}, y^m_{cal})$ and $X^s_{cal}$. The angle $\Delta\theta_i$ and the bias $\Delta b_i$ between the master and slave instruments are computed by Equation (13).

3. Prediction of the test set.
(a) $(T^s_{test}, \hat{y}^s_{test})$ is obtained by Equation (11).
(b) The matrix $M_i$ is introduced to correct the predicted values by Equation (14).
(c) The corrected predicted values are accumulated, and their mean values are the final result.

Conclusions
In this study, the relationship between the regression coefficients of score vectors and predicted values on different instruments was investigated, and a calibration transfer method based on affine invariance without transfer standards (CTAI) was proposed. Based on the PLS model of the master instrument, the score matrix and the predicted values of the master spectra, as well as the pseudo score matrix and the pseudo predicted values of the slave spectra, are obtained. Then, the angles and biases between the coefficients of the master instrument and the corresponding coefficients of the slave instrument are computed. Finally, new samples are corrected by affine transformation. Different transfer methods were tested with two NIR datasets. CTAI achieves the lowest RMSEP and standard error, and the results of the statistical difference test indicate that CTAI is generally better than the other methods, which indicates that CTAI can correct the difference between instruments. Hence, the proposed method may provide an efficient way for calibration transfer when standard samples are unavailable in practical applications.