A Numerical Procedure for Multivariate Calibration Using Heteroscedastic Principal Components Regression
Abstract
1. Introduction
2. Methodology
2.1. Theoretical Framework
2.2. The Proposed Heteroscedastic Technique
2.3. Experimental
3. Results and Discussion
3.1. Preliminary Geometrical Interpretation of H-PCR
3.2. Analysis of NIR Spectra
- The covariance matrices of the measurement fluctuations depend on the measurement conditions and on the spectral region considered;
- Increasing the stirring velocity increases the variability of the spectral measurements because of the unavoidable shaking of mechanical parts and the possible formation of air bubbles (as in real monitoring environments);
- Increasing the temperature increases the variability of the spectral measurements, possibly because of the lower viscosity of the system and the higher rate of air bubble formation;
- Spectral measurements in the NIR region are subject to strongly correlated fluctuations, so the measurement errors are not independent, and this correlation must be taken into account in quantitative analyses.
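The observations listed above can be illustrated with a minimal sketch that estimates the covariance and correlation matrices of replicate spectra collected at a single measurement condition. The data here are synthetic and the sizes, noise levels, and variable names (`spectra`, `Vx`, `corr`) are assumptions made for illustration; they are not the measurements or the code used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 200 replicate spectra, 5 wavelengths.
n_replicates, n_wavelengths = 200, 5

# Synthetic replicates: a baseline shift shared by all wavelengths
# (e.g., a bubble or vibration affecting the whole spectrum)
# plus a small independent noise term per wavelength.
baseline = rng.normal(0.0, 0.05, size=(n_replicates, 1))
independent = rng.normal(0.0, 0.01, size=(n_replicates, n_wavelengths))
spectra = 1.0 + baseline + independent

# Covariance matrix of the measurement fluctuations (variables in columns).
Vx = np.cov(spectra, rowvar=False)

# Correlation matrix: off-diagonal values close to 1 indicate that
# the errors at different wavelengths are not independent.
corr = np.corrcoef(spectra, rowvar=False)
off_diag = corr[~np.eye(n_wavelengths, dtype=bool)]

print(Vx.shape)
print(off_diag.min())
```

Repeating the calculation for replicates taken at a different temperature or stirring speed would yield a different `Vx`, which is the heteroscedastic behavior described in the list.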
3.2.1. Classical Least Squares (CLS)
3.2.2. Analysis of the Vx Matrix
3.3. PLS, PCR and H-PCR Calibrations
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Nomenclature
Symbol | Description |
---|---|
y | Vector of model responses |
xi | Available data (or inputs) |
n | Total number of data points |
σij | Standard deviation at experimental condition i |
σ² | Variance |
μ | Contraction factor for convergence control |
ρ | Correlation factor |
Fobj | Objective function |
λ | Eigenvalues |
Λ | Diagonal matrix of eigenvalues |
D | Matrix of eigenvectors |
φi | Fraction of variability along the ith direction |
Vxi | Matrix of variance |
p | Principal direction |
amin, amax | Amplitude interval |
bmin, bmax | Lag interval |
fmin, fmax | Frequency interval |
δ | Tolerance |
σr | Reduction factor |
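Several of the symbols above (λ, Λ, D and φi) are related through the eigendecomposition of a covariance matrix. The sketch below shows that relation for a small, hypothetical matrix `Vx`; it assumes that φi is the ith eigenvalue divided by the sum of all eigenvalues, which is the usual definition of the fraction of variability explained by a principal direction and may differ in detail from the definition adopted in the paper.

```python
import numpy as np

# Hypothetical symmetric, positive-definite covariance matrix.
Vx = np.array([[4.0, 1.2, 0.0],
               [1.2, 1.0, 0.0],
               [0.0, 0.0, 0.5]])

# eigh is intended for symmetric matrices and returns eigenvalues
# in ascending order; reorder so the largest comes first.
eigvals, D = np.linalg.eigh(Vx)
order = np.argsort(eigvals)[::-1]
eigvals, D = eigvals[order], D[:, order]

# Diagonal matrix of eigenvalues, so that Vx = D @ Lam @ D.T.
Lam = np.diag(eigvals)

# Assumed definition: fraction of variability along each direction.
phi = eigvals / eigvals.sum()

print(phi)
```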
Xylene Concentration [v/v%] | Toluene Concentration [v/v%] | Temperature [°C] | Stirring Speed [rpm] | | |
---|---|---|---|---|---|
0 | 100 | 30 | 250 | 350 | 450 |
0 | 100 | 60 | 250 | 350 | 450 |
0 | 100 | 90 | 250 | 350 | 450 |
10 | 90 | 30 | 250 | 350 | 450 |
10 | 90 | 60 | 250 | 350 | 450 |
10 | 90 | 90 | 250 | 350 | 450 |
20 | 80 | 30 | 250 | 350 | 450 |
20 | 80 | 60 | 250 | 350 | 450 |
20 | 80 | 90 | 250 | 350 | 450 |
30 | 70 | 30 | 250 | 350 | 450 |
30 | 70 | 60 | 250 | 350 | 450 |
30 | 70 | 90 | 250 | 350 | 450 |
40 | 60 | 30 | 250 | 350 | 450 |
40 | 60 | 60 | 250 | 350 | 450 |
40 | 60 | 90 | 250 | 350 | 450 |
50 | 50 | 30 | 250 | 350 | 450 |
50 | 50 | 60 | 250 | 350 | 450 |
50 | 50 | 90 | 250 | 350 | 450 |
60 | 40 | 30 | 250 | 350 | 450 |
60 | 40 | 60 | 250 | 350 | 450 |
60 | 40 | 90 | 250 | 350 | 450 |
70 | 30 | 30 | 250 | 350 | 450 |
70 | 30 | 60 | 250 | 350 | 450 |
70 | 30 | 90 | 250 | 350 | 450 |
80 | 20 | 30 | 250 | 350 | 450 |
80 | 20 | 60 | 250 | 350 | 450 |
80 | 20 | 90 | 250 | 350 | 450 |
90 | 10 | 30 | 250 | 350 | 450 |
90 | 10 | 60 | 250 | 350 | 450 |
90 | 10 | 90 | 250 | 350 | 450 |
100 | 0 | 30 | 250 | 350 | 450 |
100 | 0 | 60 | 250 | 350 | 450 |
100 | 0 | 90 | 250 | 350 | 450 |
Concentration [wt Fraction] | Prediction Variance [wt Fraction]² | F Value * |
---|---|---|
0 | 0.001424 | 4.387 |
0.10 | 0.001723 | 5.331 |
0.20 | 0.002378 | 7.329 |
0.30 | 0.002293 | 7.068 |
0.40 | 0.007016 | 21.625 |
0.50 | 0.001521 | 4.688 |
0.60 | 0.02823 | 8.700 |
0.70 | 0.002033 | 6.267 |
0.80 | 0.001014 | 3.127 |
0.90 | 0.0000418 | 1.289 |
1.00 | 0.000324 | 1.000 |
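The footnote that defines the F value is not reproduced here. One plausible reading, offered only as an assumption, is that each F value is the ratio between the prediction variance in that row and the reference variance at a concentration of 1.00 (whose F value is 1.000). The sketch below recomputes the ratio under that assumption and lists the rows where it differs from the tabulated value by more than 1%.

```python
# (concentration, prediction variance, tabulated F value), copied from the table.
rows = [
    (0.00, 0.001424, 4.387),
    (0.10, 0.001723, 5.331),
    (0.20, 0.002378, 7.329),
    (0.30, 0.002293, 7.068),
    (0.40, 0.007016, 21.625),
    (0.50, 0.001521, 4.688),
    (0.60, 0.02823, 8.700),
    (0.70, 0.002033, 6.267),
    (0.80, 0.001014, 3.127),
    (0.90, 0.0000418, 1.289),
    (1.00, 0.000324, 1.000),
]

# Assumption: the reference variance is the one at concentration 1.00.
reference_variance = rows[-1][1]

mismatches = []
for conc, variance, f_tabulated in rows:
    ratio = variance / reference_variance
    # Flag rows where the recomputed ratio is more than 1% off.
    if abs(ratio - f_tabulated) / f_tabulated > 0.01:
        mismatches.append((conc, round(ratio, 3), f_tabulated))

print(mismatches)
```

Under this assumption, nine of the eleven rows agree to within rounding, while the rows at 0.60 and 0.90 differ from the tabulated F values by roughly a factor of ten, so those two variance entries may be worth checking against the original table.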
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Monteiro, A.d.R.D.; Feital, T.d.S.; Pinto, J.C. A Numerical Procedure for Multivariate Calibration Using Heteroscedastic Principal Components Regression. Processes 2021, 9, 1686. https://doi.org/10.3390/pr9091686