Article

Miniaturized Near-Infrared Analyzer for Quantitative Detection of Trace Water in Ethylene Glycol

1 School of Optoelectronic Engineering, Guilin University of Electronic Technology, Guilin 541004, China
2 Guangxi Key Laboratory of Optoelectronic Information Processing, Guilin 541004, China
3 School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(11), 6023; https://doi.org/10.3390/app15116023
Submission received: 7 March 2025 / Revised: 14 May 2025 / Accepted: 26 May 2025 / Published: 27 May 2025

Abstract

To address the limitations of traditional Fourier-transform infrared (FTIR) spectrometers, including bulky size, high cost, and unsuitability for on-site industrial detection, this study developed a Fourier-transform near-infrared (FT-NIR) absorption testing system based on Micro-Electro-Mechanical System (MEMS) technology for detecting trace water in ethylene glycol. The modeling performance of three algorithms, Support Vector Machine Regression (SVMR), Principal Component Regression (PCR), and Partial Least Squares Regression (PLSR), was systematically evaluated, and PLSR was identified as the optimal algorithm. To improve the predictive accuracy for trace water, spectral data were preprocessed using smoothing combined with first-derivative processing, and the characteristic absorption wavelength bands were selected using interval Partial Least Squares (iPLS). Cross-batch external validation demonstrated a Limit of Detection (LOD) of 0.026% at 95% confidence, which satisfies the rapid screening requirements for water exceedances (>0.1%) in industrial applications. These findings provide a technical foundation for developing handheld, in situ water detection devices.

1. Introduction

Ethylene glycol is a critical chemical raw material and a strategic resource for various industries. It is used in the production of polyesters, explosives, and ethylene glycol aldehyde, as well as antifreeze, plasticizers, hydraulic fluids, and solvents [1]. Controlling water content during ethylene glycol production is essential for ensuring the stability and efficiency of the process. Water content is also a key indicator for assessing the quality grade of polyester-grade ethylene glycol. According to relevant standards, first-grade ethylene glycol must contain less than 0.1% (w) water [2], which makes water measurement pivotal for quality control. Regularly measuring the water content of ethylene glycol samples during production enables manufacturers to promptly identify and address issues, optimize production processes, and enhance both product quality and output. Currently, the standard method for measuring water content in industrial liquids is Karl Fischer (K-F) titration. While effective, this traditional technique is relatively complicated and time-consuming [3,4].
Water molecules demonstrate significant absorption characteristics in the near-infrared (NIR) region, primarily due to O-H stretching and combination vibrations [5,6,7]. This property makes NIR spectroscopy a powerful tool for quantitative moisture analysis. The NIR wavelength range, extending from 780 to 2500 nm, captures overtone and combination vibration absorption information from hydrogen-containing chemical groups such as C-H, O-H, and N-H, encompassing almost all organic compounds and their mixtures [8,9,10,11,12].
When materials are irradiated with NIR light at wavelengths sensitive to water, different moisture levels produce varying absorption intensities. Consequently, measuring light intensity allows for the indirect determination of moisture content. The strong absorption of O-H groups in the NIR region renders NIR spectroscopy particularly effective for determining trace water content. By analyzing the reflected or transmitted light intensity after NIR irradiation, moisture content can be assessed effectively [13,14,15].
Numerous studies highlight the versatility of NIR spectroscopy in moisture analysis. For instance, Lanjewar et al. employed portable NIR spectroscopy combined with machine learning to predict water adulteration in milk [16]. Mallet et al. used NIR spectroscopy with nonlinear methods to rapidly and robustly characterize moisture effects in raw organic waste [17]. Cichosz et al. applied simple mathematical models in conjunction with NIR spectroscopy to calculate moisture content in cellulose [18]. Furthermore, Sheng et al. implemented a data fusion strategy that combined micro-NIR spectroscopy and machine vision for the rapid prediction of moisture content during black tea drying [19]. In summary, NIR spectroscopy offers high precision, quick response times, no-contact online measurement, and poses no radiation risk to humans. These advantages make it widely applicable across various production fields, revolutionizing moisture analysis [20,21].
Conventional near-infrared water analyzers such as those described above are typically large and rely on a light path structure that incorporates a halogen lamp and diffraction grating, often with motorized components [22]. This design limits their portability and confines their use to laboratory environments. In contrast, Fourier-transform near-infrared (FT-NIR) spectrometers based on MEMS technology offer significant advantages [23,24]. For example, a product launched by the Hamamatsu Corporation measures just 8.7 cm × 4.9 cm and achieves a spectral resolution (FWHM) of 5.7 nm. Although this resolution is slightly lower than that of traditional instruments, it is adequate for routine testing. Successful applications of portable spectrometers include online monitoring of water content and mass transfer rates in fluidized beds with portable FT-NIR instruments [25], as well as water detection in biodiesel using portable Fourier-transform infrared (FTIR) spectrometers [26]. These applications have shown excellent detection accuracy and stability. Such case studies underscore the potential of MEMS-based FT-NIR systems for solvent water detection, overcoming the limitations of traditional FT-NIR spectrometers, which tend to be too large and expensive for widespread on-site use [27,28,29].
This study focuses on ethylene glycol solutions by developing an absorption testing system using a portable near-infrared MEMS spectrometer. The absorption spectra of ethylene glycol were measured in the wavelength range of 1100 to 2500 nm. To address the incoherent and redundant information present in the spectra, interval partial least squares (iPLS) was employed to optimize the selection of characteristic wavelength variables and identify wavelength bands that are highly correlated with water content. Building on this, a quantitative analysis model for water content was established using partial least squares regression (PLSR), enabling efficient detection of water in ethylene glycol.

2. Materials and Methods

2.1. Experimental Materials and Instruments

  • Materials: ethylene glycol (analytical grade, purity ≥99.8%, National Pharmaceutical Group Chemical Reagent Co., Ltd., Nanjing, China) and mineral water (Hangzhou Wahaha Group Co., Ltd., Hangzhou, China).
  • Instruments: A C15511-01 Fourier-transform near-infrared spectrometer utilizing MEMS technology (Hamamatsu Corporation, San Jose, CA, USA) was employed, with a spectral range of 1100 to 2500 nm and a resolution of 25 cm⁻¹. The specific parameters of the spectrometer are shown in Table 1. Other equipment included a halogen light source (fiber-optic output) and a quartz cuvette with a path length of 10 mm.

2.2. Sample Preparation

According to the experimental requirements, ethylene glycol was used as the matrix, and the required amount of water was precisely weighed using an analytical balance and added to the ethylene glycol. Each prepared solution was stirred on a magnetic stirrer at 500 rpm for 3 min, resulting in 36 samples of ethylene glycol with varying water concentrations (referred to as theoretical values). The water concentration ranged from 0.002% to 1%, ensuring a gradient distribution of water content across the samples. Among these, 28 samples were designated as the training set for model development, and 8 samples were reserved as the validation set for model verification. A detailed overview is provided in Table 2.

2.3. Near-Infrared Spectral Acquisition

The laboratory-built testing setup is depicted in Figure 1 and comprises a halogen light source, a cuvette holder, and a miniature Fourier-transform near-infrared spectrometer with a spectral range of 1100 to 2500 nm and a resolution of 25 cm⁻¹. First, the background spectrum was recorded using an empty quartz cuvette with a 10 mm path length to eliminate interference. The ethylene glycol aqueous solution samples were then placed in the cuvette, and their absorption spectra were collected. Each sample was measured six times to ensure data reliability. Laboratory temperature and humidity were maintained at relatively constant levels throughout the spectral acquisition process.
This detection scheme simplifies the experimental setup and can be integrated into a compact, lightweight design for rapid on-site detection, which makes experimental operation more convenient and flexible.
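The acquisition procedure above (one background scan, then six repeated scans per sample) can be illustrated with a minimal sketch. It assumes the spectrometer exports raw intensity values and that the six scans are averaged after conversion to absorbance; the article states neither detail, and the function name `absorbance_from_intensity` and the example numbers are hypothetical.

```python
import numpy as np


def absorbance_from_intensity(sample_scans, background, eps=1e-12):
    """Convert repeated raw intensity scans into one mean absorbance spectrum.

    sample_scans: array of shape (n_scans, n_wavelengths)
    background:   array of shape (n_wavelengths,), the empty-cuvette reference
    """
    sample_scans = np.asarray(sample_scans, dtype=float)
    background = np.asarray(background, dtype=float)
    # Transmittance of every scan relative to the background spectrum
    transmittance = sample_scans / np.clip(background, eps, None)
    # Absorbance A = -log10(T); clipping avoids log of zero
    absorbance = -np.log10(np.clip(transmittance, eps, None))
    # Average the repeated scans (assumed here, not stated in the article)
    return absorbance.mean(axis=0)


# Hypothetical example: six scans over five wavelength points
background = np.full(5, 1000.0)
scans = np.full((6, 5), 100.0)
mean_abs = absorbance_from_intensity(scans, background)
```

With a tenfold drop in intensity at every wavelength, each point of `mean_abs` equals an absorbance of 1.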

2.4. Near-Infrared Modeling Methods

2.4.1. Spectral Data Preprocessing

Raw near-infrared spectral data contain valuable information about chemical composition, but they can also be influenced by interferences such as background noise, baseline drift, and molecular scattering. To mitigate these interferences and enhance the predictive accuracy of the model, preprocessing of spectral data is a crucial step prior to modeling [30,31]. In this study, several preprocessing techniques were employed, including Savitzky–Golay first derivative (FD), Savitzky–Golay smoothing (S-G-S), standard normal variate (SNV), and unit vector normalization [32]. Each method has unique characteristics and yields different effects: the FD effectively removes baseline drift and background interference, the S-G-S improves the signal-to-noise ratio, the SNV corrects for light scattering, and normalization mitigates the effects of path length variations or sample dilution on the spectra.
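The four preprocessing operations named above can be sketched as follows. This is an illustration rather than the authors' code: it uses SciPy's `savgol_filter` for the Savitzky–Golay steps, and the window length (11 points) and polynomial order (2) are assumed values that the article does not report. The synthetic spectra are placeholders.

```python
import numpy as np
from scipy.signal import savgol_filter


def sg_smooth(X, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing applied along the wavelength axis."""
    return savgol_filter(X, window_length, polyorder, axis=1)


def sg_first_derivative(X, window_length=11, polyorder=2):
    """Savitzky-Golay first derivative; removes constant baseline offsets."""
    return savgol_filter(X, window_length, polyorder, deriv=1, axis=1)


def snv(X):
    """Standard normal variate: center and scale each spectrum individually."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std


def unit_vector_normalize(X):
    """Scale each spectrum to unit Euclidean length."""
    X = np.asarray(X, dtype=float)
    return X / np.linalg.norm(X, axis=1, keepdims=True)


# Synthetic placeholder spectra: a band near 1940 nm plus noise
rng = np.random.default_rng(0)
wavelengths = np.linspace(1100, 2500, 200)
band = np.exp(-((wavelengths - 1940) / 40.0) ** 2)
X = band * rng.uniform(0.5, 1.5, size=(4, 1)) + 0.01 * rng.normal(size=(4, 200))

# Smoothing followed by the first derivative
X_fd = sg_first_derivative(sg_smooth(X))
```

Adding a constant offset to `X` leaves the first-derivative output unchanged, which is the baseline-removal property the text attributes to FD.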

2.4.2. Model Establishment and Analysis

In this study, we first established models using partial least squares regression (PLSR), principal component regression (PCR), and support vector machine regression (SVMR) based on raw data without preprocessing, and then compared their performance. By evaluating the correlation coefficient for the training set (Rc), the root mean square error of calibration (RMSEC), the correlation coefficient for the validation set (Rp), and the root mean square error of prediction (RMSEP), the optimal algorithm (PLSR) was selected. Subsequently, the model’s performance was further optimized by integrating various spectral preprocessing methods (such as smoothing and the first derivative), ultimately constructing a high-precision predictive model. Additionally, the Prediction Residual Error Sum of Squares (PRESS) was introduced as a key evaluation metric to quantify the cumulative prediction errors across validation samples, providing a comprehensive measure of the model’s generalization ability. The closer the R² value is to 1 and the lower the root mean square error, the stronger the model’s predictive capability, while the PRESS value offers supplementary validation of the model’s robustness from a statistical perspective [33].
All multivariate calibration models (PLSR, PCR, and SVMR) were developed using Python 3.8 programming language within the PyCharm Community Edition 2020.3.5 x64 integrated development environment (JetBrains, Prague, Czech Republic). The scikit-learn library (version 1.0.2) was employed for algorithm implementation, including spectral preprocessing (Savitzky–Golay smoothing, first-derivative transformations) and model training. Interval Partial Least Squares (iPLS) analysis was conducted using custom scripts adapted from the pyChemometrics package (version 0.1.5). Statistical calculations for the Limit of Detection (LOD) utilized SciPy’s statistical module (version 1.7.3), ensuring rigorous validation of results. Data visualization and spectral analysis were performed using Matplotlib (version 3.5.1) and pandas (version 1.3.5).

2.4.3. Limit of Detection (LOD)

The Limit of Detection (LOD) refers to the minimum concentration or quantity of a target substance that can be reliably detected at a specified confidence level [34]. As a critical metric for assessing the sensitivity of an analytical method, the LOD ensures the quality and reliability of experimental data, particularly when dealing with complex datasets and multivariate regression models [35]. To calculate the LOD rigorously, we employed a prediction error-based approach in conjunction with cross-validation to enhance the robustness of our results. The specific formula used is as follows:
$$\mathrm{LOD} = \frac{t_{\alpha,\,n-p} \cdot \mathrm{RMSEP}}{\mathrm{slope}} \qquad (1)$$
where $t_{\alpha,\,n-p}$ represents the critical value of Student’s t-distribution at significance level $\alpha$ with $n-p$ degrees of freedom (n: total number of samples; p: number of latent variables).
RMSEP (Root Mean Square Error of Prediction) quantifies the predictive accuracy of the model on an external validation set. Slope denotes the regression coefficient of the PLSR calibration curve, reflecting the linear relationship between the target concentration and the spectral response.
This formulation ensures statistical rigor in determining the detection threshold while aligning with industrial requirements for practical applicability.
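As a minimal sketch of the calculation, the formula can be evaluated with SciPy's Student's t-distribution. The function name `limit_of_detection` is hypothetical, and the inputs are the values reported in Section 3.7 of this article (n = 13, p = 3, RMSEP = 1.154 × 10⁻⁴ as a mass fraction, slope = 0.997). A two-sided critical value is assumed because it reproduces the reported t ≈ 2.228.

```python
from scipy import stats


def limit_of_detection(rmsep, slope, n_samples, n_latent, alpha=0.05):
    """LOD = t(alpha, n - p) * RMSEP / slope, using a two-sided critical value."""
    dof = n_samples - n_latent
    # Two-sided critical value: for alpha = 0.05 and 10 degrees of freedom
    # this is approximately 2.228
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, dof)
    return t_crit * rmsep / slope


# Values taken from Section 3.7; RMSEP is expressed as a mass fraction
lod_fraction = limit_of_detection(rmsep=1.154e-4, slope=0.997,
                                  n_samples=13, n_latent=3)
lod_percent = 100.0 * lod_fraction
```

Expressed as a percentage, `lod_percent` rounds to 0.026, matching the LOD reported in the article.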

3. Results and Discussion

3.1. Division of Sample Sets

In this study, approximately 80% of the samples were randomly assigned to a training set, while the remaining samples (approximately 20%) were allocated to a validation set. To improve the model’s predictive capability and accuracy, we intentionally included the samples with the highest and lowest water content in the training set. As a result, 28 sample groups were used for model training, with 8 groups set aside for validation. The detailed distribution of the sample groups is summarized in Table 2.
As shown in Table 2, the water content of the samples ranged from 0.002% to 1.000%. The training set encompassed the entire concentration range (0.002% to 1.000%), while the validation set covered 0.005% to 0.850%. This distribution ensures a comprehensive evaluation of the model’s generalizability across varying water levels.
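One way to implement a random split that keeps the extreme concentrations in the training set is sketched below. The helper `split_with_extremes`, the seed, and the evenly spaced concentrations are hypothetical; the article does not describe how its random split was generated.

```python
import numpy as np


def split_with_extremes(y, n_val, seed=0):
    """Random train/validation split that forces the samples with the
    lowest and highest reference values into the training set."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    forced_train = {int(np.argmin(y)), int(np.argmax(y))}
    # Only non-extreme samples are eligible for the validation set
    candidates = np.array([i for i in range(len(y)) if i not in forced_train])
    val_idx = np.sort(rng.choice(candidates, size=n_val, replace=False))
    val_set = set(val_idx.tolist())
    train_idx = np.array([i for i in range(len(y)) if i not in val_set])
    return train_idx, val_idx


# Placeholder concentrations spanning the range used in the study (% water)
water = np.linspace(0.002, 1.0, 36)
train_idx, val_idx = split_with_extremes(water, n_val=8)
```

For 36 samples and `n_val=8`, this yields the same 28/8 partition sizes as in the study, with both endpoints of the concentration range guaranteed to be in the training set.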

3.2. Absorption Fingerprint Analysis

Figure 2 presents the raw near-infrared (NIR) spectra of 36 ethylene glycol samples. The observed absorption peaks result from molecular anharmonic vibrations, which facilitate transitions from the ground state to higher energy levels [36]. The spectral profiles indicate that the absorption intensity at 1940 nm significantly increases with water content. This peak is attributed to the O-H stretching and combination band absorption of water molecules, serving as a primary indicator of water-specific spectral responses and illustrating the high sensitivity of the NIR spectroscopy to trace amounts of water [37].
The characteristic absorption peak of ethylene glycol is observed at 1700 nm, corresponding to the C-H combination band. This sharp peak, which is distinct from water-related signals with minimal overlap, serves as a reliable marker for the identification of ethylene glycol. Furthermore, in the 2200–2500 nm range, the combination bands of C-H and O-H vibrations in ethylene glycol are clearly discernible. These peaks result from the combination modes of C-H and O-H bonds, providing essential spectral evidence for the recognition of ethylene glycol.
By comparing the water absorption peak at 1940 nm with the C-H characteristic peak of ethylene glycol at 1700 nm, the spectral signals of water and ethylene glycol can be clearly distinguished, allowing for precise differentiation between the two components.

3.3. Establishment and Optimization of Water Content Prediction Models

3.3.1. Comparison of Different Models

Three near-infrared spectroscopy models were developed to analyze the water content of ethylene glycol using Partial Least Squares Regression (PLSR), Support Vector Machine Regression (SVMR), and Principal Component Regression (PCR) across the entire spectral range. The performance of these models was assessed and compared based on key metrics, including the training set correlation coefficient (Rc), root mean square error of calibration (RMSEC), validation set correlation coefficient (Rp), root mean square error of prediction (RMSEP), and Prediction Residual Error Sum of Squares (PRESS). A correlation coefficient closer to 1 indicates stronger predictive capability, while a smaller root mean square error signifies higher model accuracy. The comparative results are summarized in Table 3.
As shown in Table 3, the partial least squares regression (PLSR)-based model exhibited a lower root mean square error of prediction (RMSEP) than principal component regression (PCR) and support vector machine regression (SVMR). The superiority of PLSR stems from its mathematical framework: by maximizing the covariance between spectral variables and water content, it effectively addresses the multicollinearity prevalent in near-infrared (NIR) spectral data. Unlike PCR (which relies solely on dimensionality reduction of the spectral variables) and SVMR (which depends on kernel function mapping), PLSR extracts the key information associated with water content even when the variables are highly correlated, thereby enhancing predictive robustness. Although deep learning approaches such as artificial neural networks (ANN) may offer higher accuracy in large-sample scenarios, their computational complexity (requiring >10⁴ iterations) and overfitting risks with small sample sizes (n = 36) render them impractical for this study.

3.3.2. Comparison of Different Pretreatment Methods

Following the identification of PLSR as the optimal modeling method in the preceding comparison, preprocessing of the NIR spectral data was conducted to eliminate external interference and enhance model performance. The NIR spectral acquisition can be affected by various external factors, including sample color, particle size, physical state, and light scattering effects, all of which can compromise spectral accuracy to varying degrees. Baseline drift and signal fluctuations, primarily induced by these factors, necessitate appropriate preprocessing to mitigate their impact. Techniques such as the Savitzky–Golay first derivative (FD), Savitzky–Golay smoothing (S-G-S), Standard Normal Variate (SNV), and unit vector normalization were applied to the raw spectra. Each method possesses distinct characteristics, and their combinations produce varied outcomes.
As shown in Table 4, the model that combined FD with PLSR demonstrated the best predictive performance, achieving Rc = 0.989 and RMSEC = 1.414 × 10⁻²% for the training set, as well as Rp = 0.992 and RMSEP = 1.451 × 10⁻²% for the validation set. As illustrated in Figure 3, the spectra preprocessed with smoothing and FD showed a significant enhancement of the O-H combination band absorption peak near 1940 nm, which is closely associated with water content. This approach effectively suppressed noise and baseline drift, accentuating subtle variations in water-related spectral features while minimizing background interference. Consequently, this preprocessing method allows the PLSR model to achieve optimal performance in predicting trace water concentrations in ethylene glycol solutions.

3.4. Selection of Feature Bands

Selecting appropriate spectral bands for quantitative analysis is a critical preprocessing step in NIR spectroscopy. Full-spectrum data often contain excessive redundant information, which can compromise model accuracy and stability while increasing processing time and computational demands, thereby hindering rapid online detection. Consequently, wavelength screening is essential to prioritize bands that contain characteristic absorption peaks of the target analyte, ultimately enhancing predictive precision [38,39].
In this study, the spectral range of 1100–2500 nm was divided into equidistant sub-bands. The number of sub-bands was incrementally increased until further segmentation ceased to enhance feature band selection. Ultimately, the range was partitioned into 12 equidistant sub-bands. Partial Least Squares Regression (PLSR) was applied to each sub-band to develop quantitative calibration models for ethylene glycol water content [40]. Feature bands were identified by comparing the root mean square error of cross-validation (RMSECV) across sub-bands. Those with RMSECV values less than or comparable to the full-spectrum model were deemed optimal.
As shown in Figure 4, the 1800–1900 nm and 1900–2050 nm bands exhibited the lowest RMSECV values (both below 0.5), indicating a strong responsiveness to water content. To further refine the model, these adjacent bands were merged into a continuous spectral region (1800–2050 nm), which was ultimately designated as the primary feature band for modeling.

3.5. Model Development and Evaluation for Full-Spectrum and Feature Bands

Based on the previously discussed selected feature bands, we compared the modeling performance across three spectral ranges: the full spectrum (1100–2500 nm), 1100–1470 nm, and 1800–2050 nm. The 1100–1470 nm range includes the O-H stretching first overtone absorption peak of water molecules at 1450 nm, while the 1800–2050 nm range contains the O-H combination band characteristic peak at 1940 nm. Three Partial Least Squares Regression (PLSR) models corresponding to these bands were developed to predict ethylene glycol water content. The models’ performance was evaluated using the training set correlation coefficient (Rc), root mean square error of calibration (RMSEC), validation set correlation coefficient (Rp), and root mean square error of prediction (RMSEP).
As shown in Table 5, the PLSR model utilizing the 1800–2050 nm feature band outperformed the full-spectrum model, exhibiting higher Rc and Rp values alongside significantly lower RMSEC and RMSEP. This improvement is primarily attributed to the inclusion of the O-H combination band (1940 nm), which demonstrates the strongest spectral response to water content. In contrast, the 1100–1470 nm band performed less effectively, showing higher RMSEC and RMSEP values than the full-spectrum model. This discrepancy likely results from the weaker signal intensity of the O-H first overtone peak (1450 nm) in the 1100–1470 nm range, which is more susceptible to interference from the ethylene glycol matrix and other factors, thereby diminishing prediction accuracy.
These findings confirm that the 1800–2050 nm band, characterized by the prominent O-H combination absorption peak at 1940 nm, is the optimal spectral interval for quantitative water analysis in ethylene glycol. While the 1100–1470 nm band is unsuitable for standalone use due to weak signals and potential interference, the full-spectrum model still offers practical value in certain scenarios, providing balanced performance.

3.6. Model Validation

To verify the reliability and generalizability of the models, additional tests were conducted using independently prepared ethylene glycol samples with varying water content. The validation set consisted of 13 such samples, with water content ranging from 0.003% to 0.7%. As shown in Figure 5 and Table 6, respectively, the PLSR model achieved a coefficient of determination of Rp² = 0.998 and a root mean square error of prediction (RMSEP) of 1.154 × 10⁻²%, confirming strong consistency between the predicted and measured values. Table 6 shows that when the water concentrations of these samples were higher than 0.02%, their relative prediction errors were less than 8%.

3.7. Calculation of the Limit of Detection (LOD)

The Limit of Detection (LOD) was calculated using the method described in Section 2.4.3 to evaluate the sensitivity of the analytical approach. Key parameters included the following:
  • Significance level: α = 0.05 (95% confidence);
  • External validation dataset: n = 13 samples;
  • Number of latent variables: p = 3, resulting in degrees of freedom df = n − p = 10;
  • Critical t-value: $t_{0.05,\,10} \approx 2.228$ (two-sided, determined using Python’s SciPy library);
  • Root Mean Square Error of Prediction (RMSEP): 1.154 × 10⁻⁴ as a mass fraction (equivalent to 1.154 × 10⁻²%);
  • Slope of the calibration curve: 0.997.
These values are substituted into Equation (1):
$$\mathrm{LOD} = \frac{t_{\alpha,\,n-p} \cdot \mathrm{RMSEP}}{\mathrm{slope}} = \frac{2.228 \times 1.154 \times 10^{-4}}{0.997} \approx 2.58 \times 10^{-4} = 0.026\%$$
Notably, the calculated LOD (0.026%) integrates both prediction error (RMSEP) and confidence level (α = 0.05), providing a statistically robust detection threshold rather than relying solely on experimental concentration limits. The derived LOD (0.026%) satisfies the rapid screening requirements for water exceedances in industrial ethylene glycol, where the quality control standard for premium-grade products is typically ≤0.1%.
While the model achieves reliable predictions for water content ≥ 0.026% (R² = 0.997, RMSEP = 0.0115%), performance degrades significantly below this threshold because spectral noise dominates the signal. Relative errors for samples < 0.026% should be interpreted cautiously, as these concentrations fall outside the model’s intended operational range. These results emphasize the system’s suitability for rapid screening of non-compliant ethylene glycol (H2O > 0.1%), while highlighting the need for advanced signal processing to address low-concentration limitations.

4. Conclusions

This study demonstrates the successful integration of a MEMS-based NIR spectrometer with chemometric algorithms for rapid water quantification in ethylene glycol. The optimized PLSR model achieved a detection limit (LOD) of 0.026%, roughly one quarter of the industrial control threshold (0.1%), ensuring reliable identification of non-compliant batches. By employing interval Partial Least Squares (iPLS), the water absorption bands at 1450 nm and 1940 nm, corresponding to the O-H first overtone and the O-H combination band, respectively, were evaluated, and the 1800–2050 nm region containing the 1940 nm band was prioritized, enhancing model interpretability compared to conventional “black-box” approaches. Despite these advancements, limitations exist: predictions for ultra-low concentrations (<0.026%, e.g., sample 13 with 0.003% H2O) exhibited amplified errors (375.63%) due to spectral noise dominance, and validation under controlled laboratory conditions may not fully represent industrial field environments with temperature fluctuations or particulate interference.
To address these challenges and advance the technology toward industrial adoption, future efforts will focus on three interconnected objectives. First, algorithmic enhancements will integrate hybrid CNN-PLSR architectures to capture nonlinear spectral relationships in expanded datasets, targeting an LOD reduction to 0.01%. Concurrently, multi-matrix adaptability will be prioritized via transfer learning frameworks, enabling rapid adaptation to related glycol-based solvents (e.g., propylene glycol) with minimal target-domain samples (<50). Critically, the current model supports qualitative screening of non-compliant samples (H2O ≥ 0.026%) and provides reliable quantitative analysis within 0.05–0.7% (RMSEP = 0.0115%, R2 = 0.997). For trace-level detection (<0.026%), cross-validation via GC-MS or sample pre-concentration is strongly recommended to ensure data credibility. These innovations aim to bridge the gap between laboratory validation and field deployment, ensuring scalability across diverse industrial scenarios.

Author Contributions

Conceptualization, Q.L. and Y.R.; methodology, Q.L. and Z.G.; validation, Q.L., Y.R. and Z.G.; data curation, Q.L.; writing—original draft preparation, Q.L.; writing—review and editing, Y.R., D.L., B.C. and Q.L.; visualization, Z.G.; supervision, Y.R. and D.L.; project administration, Y.R. and D.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant no. 62275058), the Guangxi Science and Technology Program Project (No. AD25069073), the Natural Science Foundation of Guangxi Province (No. 2023GXNSFAA026259), the Innovation Project of GUET Graduate Education (2023YCXS228), the Funding of Guangxi Key Laboratory of Opto-electronic Information Processing (No. GD24105), and the Middle-aged and Young Teachers’ Basic Ability Promotion Project of Guangxi (2024KY0802, 2025KY0240).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yang, Q.; Yang, Q.; Xu, S.; Zhu, S.; Zhang, D. Technoeconomic and environmental analysis of ethylene glycol production from coal and natural gas compared with oil-based production. J. Clean. Prod. 2020, 273, 123120.
  2. Yang, H.B. Determination and Exploration of Water Content in the EG Products Process. Guangzhou Chem. Ind. 2012, 40, 127–128.
  3. Rivera-Quintero, P.; Patience, G.S.; Patience, N.A.; Boffito, D.C.; Banquy, X.; Schieppati, D. Experimental Methods in Chemical Engineering: Karl Fischer Titration. Can. J. Chem. Eng. 2024, 102, 2980–2997.
  4. Kestens, V.; Conneely, P.; Bernreuther, A. Vaporisation coulometric Karl Fischer titration: A perfect tool for water content determination of difficult matrix reference materials. Food Chem. 2008, 106, 1454–1459.
  5. Liu, J.X. Practical Near-Infrared Spectroscopy Analysis Technology; Science Press: Beijing, China, 2008.
  6. Pyper, J.W. The determination of moisture in solids: A Selected Review. Anal. Chim. Acta 1985, 170, 159–175.
  7. Wang, S.; Altaner, C.; Feng, L.; Liu, P.; Song, Z.; Li, L.; Gui, A.; Wang, X.; Ning, J.; Zheng, P. A Review: Integration of NIRS and Chemometric Methods for Tea Quality Control—Principles, Spectral Preprocessing Methods, Machine Learning Algorithms, Research Progress, and Future Directions. Food Res. Int. 2025, 205, 115870.
  8. Lu, W.Z. Modern Near-Infrared Spectroscopy Analysis Technology, 2nd ed.; China Petrochemical Press: Beijing, China, 2007; Volume 36.
  9. Alamar, P.D.; Carames, E.T.S.; Poppi, R.J.; Pallone, J.A.L. Quality evaluation of frozen guava and yellow passion fruit pulps by NIR spectroscopy and chemometrics. Food Res. Int. 2016, 85, 209–214.
  10. Baranwal, Y.; Roman-Ospino, A.D.; Keyvan, G.; Ha, J.M.; Hong, E.P.; Muzzio, F.J.; Ramachandran, R. Prediction of Dissolution Profiles by Non-Destructive NIR Spectroscopy in Bilayer Tablets. Int. J. Pharm. 2019, 565, 419–436.
  11. Bittner, M.; Krahmer, A.; Schenk, R.; Springer, A.; Gudi, G.; Melzig, M.F. NIR Spectroscopy of Actaea racemosa L. Rhizome—En Route to Fast and Low-Cost Quality Assessment. Planta Med. 2017, 83, 1085–1096.
  12. Rodionova, O.Y.; Titova, A.V.; Demkin, N.A.; Balyklova, K.S.; Pomerantsev, A.L. Qualitative and quantitative analysis of counterfeit fluconazole capsules: A non-invasive approach using NIR spectroscopy and chemometrics. Talanta 2019, 195, 662–667.
  13. Liang, J.; Yu, X.; Hong, W.; Cai, Y. Information Extraction of UV-NIR Spectral Data in Waste Water Based on Large Language Model. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2024, 318, 124475.
  14. Wongpromrat, P.; Phuphanutada, J.; Lapcharoensuk, R. Monitoring of Salinity of Water on the THA CHIN River Basin Using Portable Vis-NIR Spectrometer Combined with Machine Learning Algorithms. J. Mol. Struct. 2023, 1287, 135720.
  15. Czarnecki, M.A.; Beć, K.B.; Grabska, J.; Huck, C.W.; Mazurek, S.; Orzechowski, K. State of Water in Various Environments: Aliphatic Ketones. MIR/NIR Spectroscopic, Dielectric and Theoretical Studies. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2023, 302, 123057.
  16. Lanjewar, M.G.; Parab, J.S.; Kamat, R.K. Machine Learning Based Technique to Predict the Water Adulterant in Milk Using Portable Near Infrared Spectroscopy. J. Food Compos. Anal. 2024, 131, 106270.
  17. Mallet, A.; Charnier, C.; Latrille, É.; Bendoula, R.; Roger, J.-M.; Steyer, J.-P. Fast and Robust NIRS-Based Characterization of Raw Organic Waste: Using Non-Linear Methods to Handle Water Effects. Water Res. 2022, 227, 119308.
  18. Cichosz, S.; Masek, A.; Dems-Rudnicka, K. Simple and Effective Mathematical Models for Cellulose Water Content Calculation from Absorbance/Wavenumber Shifts in NIR Spectrum. Polym. Test. 2023, 117, 107874.
  19. Sheng, X.; Zan, J.; Jiang, Y.; Shen, S.; Li, L.; Yuan, H. Data Fusion Strategy for Rapid Prediction of Moisture Content During Drying of Black Tea Based on Micro-NIR Spectroscopy and Machine Vision. Optik 2023, 276, 170645.
  20. Lee, C.; Polari, J.J.; Kramer, K.E.; Wang, S.C. Near-infrared (NIR) spectrometry as a fast and reliable tool for fat and moisture analyses in olives. ACS Omega 2018, 3, 16081–16088.
  21. Karunathilaka, S.R.; Fardin-Kia, A.R.; Roberts, D.; Mossoba, M.M. Determination of water in olive oil: Rapid FT-NIR spectroscopic procedure based on the Karl-Fischer reference method. J. Oleo Sci. 2020, 69, 1373–1380. [Google Scholar] [CrossRef]
  22. Yue, J.; Zhang, H.; Gao, L.; Tian, W.; Luo, J.; Nie, L.; Li, L.; Wu, A.; Zang, H. Benchtop and Different Miniaturized Near-Infrared Spectrometers Application Study: Calibration Transfer and 2D-COS for In-Situ Analysis of Moisture Content in HPMC. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2025, 333, 125889. [Google Scholar] [CrossRef] [PubMed]
  23. Bec, K.B.; Grabska, J.; Pfeifer, F.; Siesler, H.W.; Huck, C.W. Rapid On-Site Analysis of Soil Microplastics Using Miniaturized NIR Spectrometers: Key Aspect of Instrumental Variation. J. Hazard. Mater. 2024, 480, 135967. [Google Scholar] [CrossRef] [PubMed]
  24. Shen, X.; Lan, S.; Zhao, Y.; Xiong, Y.; Yang, W.; Du, Y. Characterization of Skin Moisture and Evaluation of Cosmetic Moisturizing Properties Using Miniature Near-Infrared Spectrometer. Infrared Phys. Technol. 2023, 132, 104759. [Google Scholar] [CrossRef]
  25. Avila, C.R.; Ferré, J.; de Oliveira, R.R.; de Juan, A.; Sinclair, W.E.; Mahdi, F.M.; Hassanpour, A.; Hunter, T.N.; Bourne, R.A.; Muller, F.L. Process Monitoring of Moisture Content and Mass Transfer Rate in a Fluidised Bed with a Low Cost Inline MEMS NIR Sensor. Pharm. Res. 2020, 37, 84. [Google Scholar] [CrossRef]
  26. Mirghani, M.E.S.; Kabbashi, N.A.; Alam, M.Z.; Qudsieh, I.Y.; Alkatib, M.F.R. Rapid method for the determination of Moisture content in biodiesel using FTIR spectroscopy. J. Am. Oil Chem. Soc. 2011, 88, 1897–1904.
  27. Zhang, Z.; Ding, Y.; Hu, F.; Liu, Z.; Lin, X.; Fu, J.; Zhang, Q.; Zhang, Z.-H.; Ma, H.; Gao, X. Constructing In-Situ and Real-Time Monitoring Methods During Soy Sauce Production by Miniature Fiber NIR Spectrometers. Food Chem. 2024, 460, 140788.
  28. Gorla, G.; Ferrer, A.; Giussani, B. Process Understanding and Monitoring: A Glimpse into Data Strategies for Miniaturized NIR Spectrometers. Anal. Chim. Acta 2023, 1281, 341902.
  29. Jin, X.; Wang, L.; Zheng, W.; Zhang, X.; Liu, L.; Li, S.; Rao, Y.; Xuan, J. Predicting the Nutrition Deficiency of Fresh Pear Leaves with a Miniature Near-Infrared Spectrometer in the Laboratory. Measurement 2022, 188, 110553.
  30. Zaukuu, J.-L.Z.; Nkansah, A.A.; Mensah, E.T.; Agbolegbe, R.K.; Kovacs, Z. Non-Destructive Authentication of Melon Seed (Cucumeropsis mannii) Powder Using a Pocket-Sized Near-Infrared (NIR) Spectrophotometer with Multiple Spectral Preprocessing. J. Food Compos. Anal. 2024, 134, 106425.
  31. Yang, Z.; Xiao, H.; Zhang, L.; Feng, D.; Zhang, F.; Jiang, M.; Sui, Q.; Jia, L. Fast Determination of Oxides Content in Cement Raw Meal Using NIR Spectroscopy Combined with Synergy Interval Partial Least Square and Different Preprocessing Methods. Measurement 2020, 149, 106990.
  32. Ezenarro, J.; Riu, J.; Ahmed, H.J.; Busto, O.; Giussani, B.; Boqué, R. Measurement Errors and Implications for Preprocessing in Miniaturised Near-Infrared Spectrometers: Classification of Sweet and Bitter Almonds as a Case of Study. Talanta 2024, 276, 126271.
  33. Zhang, M.; Eddy, C.; Deanda, K.; Finkelstein, M.; Picataggio, S. Metabolic engineering of a pentose metabolism pathway in ethanologenic Zymomonas mobilis. Science 1995, 267, 240–243.
  34. Clua-Palau, G.; Jo, E.; Nikolic, S.; Coello, J.; Maspoch, S. Finding a Reliable Limit of Detection in the NIR Determination of Residual Moisture in a Freeze-Dried Drug Product. J. Pharm. Biomed. Anal. 2020, 183, 113163.
  35. Wu, Z.; Sui, C.; Xu, B.; Ai, L.; Ma, Q.; Shi, X.; Qiao, Y. Multivariate Detection Limits of On-Line NIR Model for Extraction Process of Chlorogenic Acid from Lonicera japonica. J. Pharm. Biomed. Anal. 2013, 77, 16–20.
  36. Wu, Z.S.; Ouyang, G.Q.; Shi, X.Y.; Ma, Q.; Wan, G.; Qiao, Y. Absorption and quantitative characteristics of C-H bond and O-H bond of NIR. Opt. Spectrosc. 2014, 117, 703–709.
  37. Sani, E.; Dell’Oro, A. Optical Constants of Ethylene Glycol Over an Extremely Wide Spectral Range. Opt. Mater. 2014, 37, 36–41.
  38. Pereira, A.F.C.; Pontes, M.J.C.; Gambarra Neto, F.F.; Santos, S.R.B.; Galvão, R.K.H.; Araújo, M.C.U. NIR Spectrometric Determination of Quality Parameters in Vegetable Oils Using iPLS and Variable Selection. Food Res. Int. 2008, 41, 341–348.
  39. Xu, Y.; Dong, Y.; Liu, J.; Wang, C.; Li, Z. Combination of Near Infrared Spectroscopy with Characteristic Interval Selection for Rapid Detection of Rice Protein Content. J. Food Compos. Anal. 2025, 137, 106995.
  40. Nørgaard, L.; Saudland, A.; Wagner, J.; Nielsen, J.P. Interval Partial Least-Squares Regression (iPLS): A Comparative Chemometric Study with an Example from Near-Infrared Spectroscopy. Appl. Spectrosc. 2000, 54, 413–419.
Figure 1. Schematic diagram of the experimental test setup.
Figure 2. Raw near-infrared (NIR) spectra of ethylene glycol samples.
Figure 3. NIR spectra of ethylene glycol samples after first-derivative smoothing preprocessing.
Figure 4. Feature band selection plot using interval Partial Least Squares (iPLS).
Figure 5. Scatter plot of predicted vs. measured water content for samples from different batches.
Table 1. Information and data parameters of the C15511-01 instrument.

| Model | C15511-01 |
| --- | --- |
| Optical interferometer | Michelson interferometer |
| Photodetector | InGaAs PIN photodiode |
| Wavelength range | 1100–2500 nm |
| Spectral resolution | 5.7 nm |
| Signal-to-noise ratio (SNR) | >10,000 |
| Spectral acquisition modes | Absorption spectrum/Reflection spectrum |
| Dimensions | 49 mm × 57 mm × 76 mm |
Table 2. Ethylene glycol water content statistics.

| Dataset | Number of Samples | Minimum Concentration (%) | Maximum Concentration (%) |
| --- | --- | --- | --- |
| Full Dataset | 36 | 0.002 | 1 |
| Training Set | 28 | 0.002 | 1 |
| Validation Set | 8 | 0.005 | 0.85 |
Table 3. Comparison of near-infrared spectroscopy model performance for ethylene glycol water content.

| Model | LVs | Training Set R_C | Training Set RMSEC (%) | Validation Set R_P | Validation Set RMSEP (%) | PRESS |
| --- | --- | --- | --- | --- | --- | --- |
| PLSR | 3 | 0.994 | 3.024 × 10⁻² | 0.991 | 1.873 × 10⁻² | 0.713 × 10⁻⁴ |
| SVMR | - | 0.986 | 3.624 × 10⁻² | 0.985 | 4.924 × 10⁻² | 1.477 × 10⁻⁴ |
| PCR | 3 | 0.987 | 4.068 × 10⁻² | 0.986 | 2.868 × 10⁻² | 0.889 × 10⁻⁴ |
Table 4. Influence of different spectral pretreatment methods on model performance.

| Pretreatment Method | Training Set R_C | Training Set RMSEC (%) | Validation Set R_P | Validation Set RMSEP (%) |
| --- | --- | --- | --- | --- |
| Normalization | 0.978 | 1.554 × 10⁻² | 0.991 | 1.492 × 10⁻² |
| S-G Smoothing | 0.976 | 1.329 × 10⁻² | 0.988 | 1.539 × 10⁻² |
| SNV | 0.987 | 2.227 × 10⁻² | 0.989 | 2.416 × 10⁻² |
| FD + Smoothing | 0.989 | 1.414 × 10⁻² | 0.992 | 1.451 × 10⁻² |
| FD + Normalization | 0.994 | 1.558 × 10⁻² | 0.993 | 1.662 × 10⁻² |
Table 5. Model evaluation results for different feature bands.

| Spectral Range (nm) | Training Set R_C | Training Set RMSEC (%) | Validation Set R_P | Validation Set RMSEP (%) |
| --- | --- | --- | --- | --- |
| 1100–2500 | 0.987 | 3.024 × 10⁻² | 0.988 | 1.873 × 10⁻² |
| 1100–1470 | 0.985 | 3.678 × 10⁻² | 0.987 | 4.209 × 10⁻² |
| 1800–2050 | 0.991 | 1.074 × 10⁻² | 0.994 | 1.212 × 10⁻² |
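As a rough illustration of how the bands in Table 5 compare, the following sketch computes the percent change in validation RMSEP for each band relative to the full 1100–2500 nm range. The RMSEP values are copied from Table 5; the names `rmsep_by_band` and `relative_change` are hypothetical and are not taken from the authors' processing code.

```python
# RMSEP values (%) copied from Table 5; keys are spectral ranges in nm.
rmsep_by_band = {
    "1100-2500": 1.873e-2,
    "1100-1470": 4.209e-2,
    "1800-2050": 1.212e-2,
}


def relative_change(reference: float, value: float) -> float:
    """Percent change of `value` relative to `reference` (negative means lower error)."""
    return (value - reference) / reference * 100.0


# Use the full spectral range as the baseline for comparison.
full_range = rmsep_by_band["1100-2500"]
changes = {band: relative_change(full_range, v) for band, v in rmsep_by_band.items()}

# Band with the smallest validation error.
best_band = min(rmsep_by_band, key=rmsep_by_band.get)

for band, pct in changes.items():
    print(f"{band} nm: RMSEP = {rmsep_by_band[band]:.5f} % ({pct:+.1f} % vs. full range)")
print("Lowest RMSEP:", best_band, "nm")
```

With these table values, the 1800–2050 nm band lowers RMSEP by roughly a third relative to the full range, while the 1100–1470 nm band more than doubles it.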
Table 6. Water content test results for samples from different batches.

| Sample Number | Theoretical Value (%) | Predicted Value (%) | Relative Error (%) ¹ |
| --- | --- | --- | --- |
| 1 | 0.700 | 0.683 | 2.46 |
| 2 | 0.500 | 0.489 | 2.23 |
| 3 | 0.300 | 0.283 | 5.85 |
| 4 | 0.200 | 0.209 | 4.57 |
| 5 | 0.100 | 0.094 | 5.91 |
| 6 | 0.070 | 0.076 | 8.42 |
| 7 | 0.050 | 0.054 | 7.75 |
| 8 | 0.030 | 0.032 | 5.91 |
| 9 | 0.020 | 0.021 | 6.78 |
| 10 | 0.010 | 0.011 | 11.46 |
| 11 | 0.007 | 0.010 | 48.73 |
| 12 | 0.005 | 0.008 | 55.32 |
| 13 | 0.003 | 0.014 | 375.63 |

¹ Note: LOD (Limit of Detection) = 0.026%. For samples labeled with “<LOD” (Nos. 9–13), their predicted values are dominated by instrument noise. The relative error calculations are provided for reference only and should not be used to evaluate model performance.
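The relative errors in Table 6 and the LOD cut-off in its footnote can be checked with a short calculation. The sketch below assumes the relative error is |predicted − theoretical| / theoretical × 100 and that a sample counts as below the LOD when its theoretical value is under 0.026%; both are assumptions that happen to match the footnote (Nos. 9–13), not a statement of the authors' procedure. Because the table values are rounded to three decimals, the recomputed errors differ slightly from the reported ones.

```python
LOD = 0.026  # limit of detection in % water, from the Table 6 footnote

# (sample number, theoretical %, predicted %) copied from Table 6
samples = [
    (1, 0.700, 0.683), (2, 0.500, 0.489), (3, 0.300, 0.283),
    (4, 0.200, 0.209), (5, 0.100, 0.094), (6, 0.070, 0.076),
    (7, 0.050, 0.054), (8, 0.030, 0.032), (9, 0.020, 0.021),
    (10, 0.010, 0.011), (11, 0.007, 0.010), (12, 0.005, 0.008),
    (13, 0.003, 0.014),
]


def relative_error(theoretical: float, predicted: float) -> float:
    """Absolute relative error in percent (assumed definition)."""
    return abs(predicted - theoretical) / theoretical * 100.0


rows = []
for number, theoretical, predicted in samples:
    rows.append({
        "sample": number,
        "rel_error": relative_error(theoretical, predicted),
        # Assumption: compare the theoretical value against the LOD.
        "below_lod": theoretical < LOD,
    })

below = [r["sample"] for r in rows if r["below_lod"]]
quantifiable = [r for r in rows if not r["below_lod"]]
max_err = max(r["rel_error"] for r in quantifiable)

for r in rows:
    flag = "<LOD" if r["below_lod"] else ""
    print(f"{r['sample']:>2}  {r['rel_error']:7.2f} %  {flag}")
print("Samples below LOD:", below)
print(f"Largest relative error above LOD: {max_err:.2f} %")
```

Restricting the summary to samples at or above the LOD follows the footnote's advice that the errors for Nos. 9–13 should not be used to judge the model.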

Luo, Q.; Guo, Z.; Lin, D.; Chang, B.; Ruan, Y. Miniaturized Near-Infrared Analyzer for Quantitative Detection of Trace Water in Ethylene Glycol. Appl. Sci. 2025, 15, 6023. https://doi.org/10.3390/app15116023