Application of Visible Aquaphotomics for the Evaluation of Dissolved Chemical Concentrations in Aqueous Solutions

Abstract: This paper introduces novel research in aquaphotomics, extending the study of water-light interactions to the visible spectral range. This approach can potentially reduce the cost and increase the speed of spectral measurements, while providing additional information by extending the useful range in spectrophotometry. To demonstrate our method, we investigated the applicability of the visible spectral range for the quantification of NaCl dissolved in aqueous samples. Spectral measurements were conducted using a visible spectrometer in the range of 380-730 nm. The evaluation of molecular species concentration was based on multivariate analysis (MVA). Principal component analysis (PCA) showed a separation of all groups of samples by salt concentration. The partial least squares regression (PLSR) model presented high accuracy and a relationship between spectral variables in the visible range and NaCl concentration in water. The validity of the regression model was confirmed through independent prediction of NaCl concentration values in test samples with unknown concentrations. The presented results demonstrate the success of the approach in evaluating concentration changes in visible light, and thus extend the measurable spectral range of such analysis.


Introduction
In this paper, we extend the application of spectrophotometric and aquaphotomics methods into the visible range. Compared with classical near-infrared spectroscopy, the ability to detect changes in visible light is lower; however, the evaluation and interpretation using multivariate data analysis show statistically significant differences.
Spectrophotometry is one of the most useful methods for quantitative analysis in various fields. The novel method for the application of near-infrared spectroscopy, introduced by Professor Tsenkova in 2005, is called aquaphotomics, which has been successfully used to study and systematize the knowledge about water-light interactions [1,2]. The approach used in aquaphotomics is different from the traditional spectroscopy approach. In aquaphotomics, changes in the water spectral patterns are used as the main source of information, while classical spectroscopy methods only look for changes in the absorption of particular types of molecules. The change in the concentration of a particular analyte is reflected in the changes of absorbance at several water absorbance bands, which are then used to build the prediction model [3][4][5]. The majority of aquaphotomics studies have been done by near-infrared (NIR) spectroscopy [6][7][8][9][10].
A study of the mechanisms of light absorption by water molecules showed that overtone and combination vibrational absorption bands of water molecules occur in the visible range, resulting in six bands [11][12][13][14]. These bands arise from the v1, v2, and v3 vibrational modes of the water molecule, where v1 is the symmetric O-H stretch, v2 is the bend (scissors mode), and v3 is the asymmetric O-H stretch; a and b are integers ≥ 0 that represent the order. The wavelengths of the peaks of the visible vibrational absorption spectrum of liquid water together with their assignments are presented in Table 1. There is a small peak at 739 nm (av1 + bv3; a + b = 4), which corresponds to the third overtone band, plus a smaller fourth overtone band at 606 nm (av1 + bv3; a + b = 5), an extremely small fifth overtone band at 514 nm (av1 + bv3; a + b = 6), a seventh harmonic of the oxygen-hydrogen (OH) stretch at 449 nm (av1 + bv3; a + b = 7), an eighth harmonic of the OH stretch at 401 nm (av1 + bv3; a + b = 8), and a combined overtone band at 660 nm (av1 + v2 + bv3; a + b = 4) [15]. The absorption of the overtone bands of water within the visible spectrum is quite small (0.01-0.3 m⁻¹), but never reaches exactly zero. The absorption of pure water, free of scattering effects, was measured and examined by Pope and Fry in 1997 [16]. That research confirmed the existence of shoulders at 449 and 401 nm, which are due to the seventh and eighth harmonics of the OH stretch. The wavelengths mentioned represent the apexes of the water peaks; however, changes in water absorption caused by dissolved chemicals affect the water spectral pattern as a whole. Based on this, it is assumed that the aquaphotomics approach can be applied in the visible range to obtain information related to the higher water overtones.
In our previous research, we examined the influence of the path length on the water spectral characteristics in the visible range of the spectrum [15] in order to choose an optimal technical path length, and we investigated the correlation between water spectral characteristics in the visible range [17]. Building on this knowledge, the current manuscript shows the accessibility of the visible range of the spectrum for the quantitative analysis of chemicals dissolved in water.
Biological systems can be studied using a non-destructive and integrative approach based on aquaphotomics, i.e., the interaction between water and biomolecules in which spectroscopic techniques combined with multivariate analysis represent a powerful tool [3]. Near-infrared spectroscopy is successfully used for the quantitative analysis of many problems, including dissolved molecular species concentrations [18]. Visible aquaphotomics is proposed as a potential tool to increase the speed and reduce the cost of qualitative measurements. The aim of this study was to determine the application of visible spectroscopy and an aquaphotomics approach using an inexpensive spectrometer with a limited number of investigated bands to measure salt content in water.

Samples
Sodium chloride (NaCl, M = 58.44 g·mol⁻¹, purity min. 99.5% mass/mass) was purchased from Reanal (Budapest, Hungary). Purified water (MQ) was produced by a Milli-Q apparatus before the experiments (resistivity >18 MΩ·cm; Direct-Q, Millipore, Molsheim, France). Aqueous solutions were prepared from NaCl in a range between 2% and 20% w/v by direct dilution. First, stock solutions were prepared, and then they were further diluted with added MQ to reach the appropriate concentrations. This procedure was repeated three times to obtain three sets (repetitions) of samples, resulting in 30 samples of aqueous NaCl solution.
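The dilution step can be illustrated with the usual C1·V1 = C2·V2 relation. The sketch below is only an illustration: the stock concentration (20% w/v), the final volume per sample (50 mL), and the evenly spaced concentration levels (2% to 20% in steps of 2%) are assumptions made for the example; the text states only the 2-20% w/v range and that three repetitions gave 30 samples (i.e., 10 samples per repetition).

```python
def dilution_volumes(stock_pct, target_pct, final_ml):
    """Return (stock_ml, water_ml) needed to dilute a w/v stock to target_pct."""
    if not 0 < target_pct <= stock_pct:
        raise ValueError("target must be positive and not exceed the stock")
    stock_ml = final_ml * target_pct / stock_pct  # C1 * V1 = C2 * V2
    return stock_ml, final_ml - stock_ml


# Hypothetical values, not taken from the paper.
STOCK_PCT = 20.0   # assumed stock concentration, % w/v
FINAL_ML = 50.0    # assumed final volume of each sample, mL
levels = [2.0 * k for k in range(1, 11)]  # assumed levels: 2, 4, ..., 20 % w/v

plan = {c: dilution_volumes(STOCK_PCT, c, FINAL_ML) for c in levels}
for c, (stock_ml, water_ml) in plan.items():
    print(f"{c:4.0f}% w/v: {stock_ml:5.1f} mL stock + {water_ml:5.1f} mL MQ water")

# Three repetitions of the assumed ten levels give the 30 samples in the text.
n_samples = 3 * len(levels)
```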

Spectral Acquisition
A ColorMunki (X-Rite) spectrophotometer was used to collect transflectance spectra. The ColorMunki works in the visible range (380-730 nm in steps of 10 nm, i.e., 36 wavelengths) and provides the measured spectrum as the relative intensity at each wavelength. Spectral acquisition was performed in transflectance mode using an optical glass window cuvette providing a 4.2 mm thickness of the tested sample (8.4 mm technical path length). As a holder, we used the reflectance cuvette from the metriNIR instrument (Figure 1). This cuvette has a circular shape and consists of a "bath" for the sample and a "head" with a reference white reflector. The light passes through the glass at the bottom of the cuvette and through the sample, is reflected from the white surface of the reflector, and returns to the receiver. The thickness of the cuvette can be changed by using reflectors of different heights. The cuvette thickness was chosen based on the results of a previous experiment [15], in which we measured the spectra of samples using a cuvette with changeable thickness.
The path length has a strong influence on the absorption, and the groups of spectra measured with different path lengths lie far from each other on the principal component analysis scatterplot (Figure 2). The longest path length shows the largest variance due to poor repeatability [19]. The 4.2 mm cuvette thickness provided good repeatability together with sufficiently high values of useful signal.
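The band count and the technical path length quoted above follow from simple arithmetic, sketched here. The function and variable names are illustrative only; the 410 nm cut-off applied in the second step is the one described later in the data analysis section.

```python
def band_grid(start_nm, stop_nm, step_nm):
    """Return the list of band centers from start_nm to stop_nm inclusive."""
    return list(range(start_nm, stop_nm + 1, step_nm))


# Full instrument range: 380-730 nm in 10 nm steps.
full_bands = band_grid(380, 730, 10)

# Bands kept for analysis once the region up to 410 nm is discarded.
used_bands = [w for w in full_bands if w > 410]

# In transflectance mode the light crosses the sample layer twice.
sample_thickness_mm = 4.2
technical_path_mm = 2 * sample_thickness_mm

print(len(full_bands), "bands measured,", len(used_bands), "bands analyzed")
print("technical path length:", technical_path_mm, "mm")
```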
All spectral acquisitions were performed at room temperature, and the temperature and humidity of the room were monitored using a Voltcraft DL-121TH Multi-Data logger to reveal any substantial changes in environmental conditions. Three consecutive scans were conducted for each measurement. Milli-Q water was measured between every 5 samples to detect the influence of environmental changes and device recalibration.

Statistical Data Analysis
For each sample, the reference spectrum was taken just before the measurement. As a reference, we used spectra measured from the white reflector.
Subtractive correction of environmental changes was applied. For this, from each sample's spectrum we subtracted the spectrum of the pure Milli-Q water measured just before the set of 5 samples (EC_(i)) and added back the overall average spectrum of the Milli-Q samples measured during the entire experiment, calculated by the equation
$$EC_{avg} = \frac{1}{n}\sum_{i=1}^{n} EC_{(i)},$$
where EC_avg is the calculated average of all relative intensities among all Milli-Q water spectra measured during the experiment, EC_(i) is the relative intensities of the Milli-Q water spectrum measured just before each set of 5 samples, and n is the number of Milli-Q water spectra. Then, subtractive correction of environmental changes can be presented as the equation
$$I_{c(i)} = I_{m(i)} - EC_{(i)} + EC_{avg},$$
where I_c(i) is the relative intensities after applying subtractive correction and I_m(i) is the measured relative intensities of the sample spectrum. Relative absorption was calculated by the Lambert-Beer equation from the relative intensity provided by the ColorMunki spectrometer. For our data, we used the equation
$$A_{(i)} = -\log_{10}\frac{I_{c(i)} - I_{black}}{I_{ref(i)} - I_{black}},$$
where A_(i) is the calculated relative absorbance of the aqueous sample, I_ref(i) is the relative intensities of the measured spectrum of the white reference, I_black is the relative intensities measured with the spectrophotometer lens totally closed, used to avoid device error in the dataset, and I_c(i) is the measured relative intensities of the sample after applying subtractive correction.
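The correction and absorbance steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' processing script. It assumes that the correction has the form I_c = I_m - EC_i + EC_avg and that absorbance is the negative base-10 logarithm of the dark-corrected ratio of sample to white reference; the log base and the exact placement of the dark term are assumptions. All numbers are invented three-band spectra.

```python
import math


def mean_spectrum(spectra):
    """Band-wise average of equally long spectra (plays the role of EC_avg)."""
    n = len(spectra)
    return [sum(band) / n for band in zip(*spectra)]


def subtractive_correction(i_m, ec_i, ec_avg):
    """I_c = I_m - EC_i + EC_avg, applied band by band."""
    return [m - e + a for m, e, a in zip(i_m, ec_i, ec_avg)]


def relative_absorbance(i_c, i_ref, i_black):
    """A = -log10((I_c - I_black) / (I_ref - I_black)), band by band."""
    return [-math.log10((c - b) / (r - b)) for c, r, b in zip(i_c, i_ref, i_black)]


# Invented data: two Milli-Q spectra and one sample measured after the first.
water_spectra = [[0.80, 0.82, 0.84], [0.78, 0.80, 0.82]]
sample = [0.70, 0.72, 0.74]
white_ref = [1.00, 1.00, 1.00]
dark = [0.00, 0.00, 0.00]

ec_avg = mean_spectrum(water_spectra)
corrected = subtractive_correction(sample, water_spectra[0], ec_avg)
absorbance = relative_absorbance(corrected, white_ref, dark)
print([round(a, 4) for a in absorbance])
```

Subtracting the most recent water spectrum removes slow drift (temperature, recalibration), while adding back the overall water average keeps the corrected intensities on the original scale so that the logarithm remains well defined.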
To analyze the data, we used the Aquap2 R-studio package [20]. The data obtained during our experiment and used for the analysis can be found in the supplementary materials in the form required by the Aquap2 R-studio package: class and numerical variables in Table S1 and measured spectral data (after applying the correction described above) in Table S2. In our experiments, the spectrum was measured from 380 to 730 nm. To avoid the UV/VIS boundary region [21], we used only measurements above 410 nm for the processing. Analysis of the resulting spectral range (420-730 nm) followed the standard aquaphotomics pipeline: inspection of the raw spectra, principal component analysis, partial least squares regression modeling, and independent prediction by the regression model.
Partial least squares regression (PLSR) models were used to predict NaCl concentrations based on the spectral characteristics of the aqueous samples. For the prediction, two separate PLSR models were built. The first model was calculated using all 90 measured spectra from 30 samples (three consecutive scans for each sample) to determine possible spectral outliers. The second model was calculated after removing the outliers (5 outliers were detected), using 85 measurements. To exclude outliers, the built-in function pls.exOut of the Aquap2 package was used. Outlier detection was performed by calculating a boxplot for each PLSR-predicted concentration level; points lying more than 1.5 times the interquartile range from the median were identified as outliers. For both models, three latent variables were used to minimize the chance of overfitting. Both models were validated by leave-one-out cross-validation performed sample-wise: the three consecutive spectra of one sample were excluded from the training dataset at a time, the model was trained, then the spectra of the next sample were excluded, and so on.
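The outlier rule described above can be sketched as follows. This is not the implementation of pls.exOut, whose internals are not given in the text; it simply follows the verbal description (per concentration level, flag predictions more than 1.5 times the interquartile range away from the median). Note that a conventional boxplot measures the fences from the quartiles rather than from the median, so the two rules can differ. The function name and the example values are hypothetical.

```python
import statistics
from collections import defaultdict


def flag_outliers(levels, predicted, k=1.5):
    """Return indices whose prediction lies more than k * IQR from the
    median of the predictions sharing the same concentration level."""
    by_level = defaultdict(list)
    for idx, level in enumerate(levels):
        by_level[level].append(idx)

    flagged = []
    for idxs in by_level.values():
        values = [predicted[i] for i in idxs]
        if len(values) < 4:
            continue  # too few points for a meaningful quartile estimate
        q1, _, q3 = statistics.quantiles(values, n=4)
        median = statistics.median(values)
        limit = k * (q3 - q1)
        flagged.extend(i for i in idxs if abs(predicted[i] - median) > limit)
    return sorted(flagged)


# Invented predictions: nine spectra (3 samples x 3 scans) at each of two levels.
levels = [2.0] * 9 + [4.0] * 9
predicted = [1.9, 2.0, 2.1, 2.0, 1.95, 2.05, 2.0, 2.1, 3.5,
             3.9, 4.0, 4.1, 4.0, 3.95, 4.05, 4.0, 4.1, 4.0]

outlier_idx = flag_outliers(levels, predicted)
print("flagged spectra:", outlier_idx)
```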
To simulate the use of the regression model in real conditions, which requires predicting the concentration of NaCl in solutions that were not previously presented to the model, independent prediction was used. For this purpose, the independent model was trained on two out of three repetition sets and tested on the third one.
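Both validation schemes in this section hold out whole groups of spectra rather than single spectra: one sample (three scans) at a time for cross-validation, and one repetition set at a time for independent prediction. A minimal sketch of such group-wise splitting is given below; the model fitting itself is omitted, the layout of 3 repetitions × 10 concentration levels × 3 scans is assumed from the sample counts given earlier, and the sketch uses all 90 spectra (before outlier removal).

```python
def grouped_folds(groups):
    """Yield (held_out_group, train_idx, test_idx), holding out one group at a time."""
    for g in sorted(set(groups)):
        test_idx = [i for i, x in enumerate(groups) if x == g]
        train_idx = [i for i, x in enumerate(groups) if x != g]
        yield g, train_idx, test_idx


# Assumed layout of the 90 spectra: (repetition, concentration level, scan).
records = [(rep, level, scan)
           for rep in (1, 2, 3)
           for level in range(10)
           for scan in range(3)]

sample_ids = [(rep, level) for rep, level, _ in records]   # 30 distinct samples
repetitions = [rep for rep, _, _ in records]               # 3 repetition sets

# Sample-wise cross-validation: 30 folds, each holding out 3 scans.
cv_folds = list(grouped_folds(sample_ids))

# Independent prediction: train on two repetition sets, test on the third.
independent_folds = list(grouped_folds(repetitions))

print(len(cv_folds), "cross-validation folds")
for rep, train_idx, test_idx in independent_folds:
    print(f"test on repetition {rep}: {len(train_idx)} train / {len(test_idx)} test spectra")
```

Keeping all scans of one sample on the same side of the split prevents near-identical replicate spectra from leaking between training and test data.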
Both the first inspection of the raw spectra and a simple principal component analysis (PCA) showed possible grouping. Figure 3 shows the PCA scatterplot of aqueous samples with different NaCl concentrations. This example illustrates one repetition of samples with the same concentration range (Section 2.1) measured in the visible range (420-730 nm) using a cuvette providing 4.2 mm thickness. PCA successfully found a linear combination of the different spectral characteristics that separates groups of aqueous solutions with different NaCl concentrations. The first two principal components show a clear trend. With an increased number of repetitions of each sample, the variance within each group grows, the groups no longer appear well separated in lower dimensions, and it becomes necessary to use more principal components to cover the variance. To obtain clear images, we preprocessed the dataset by standard methods (removing outliers, averaging, applying subtractive correction, and separating it into training and testing groups for modeling and validation).

Results and Discussion
The absorbance spectra of the tested aqueous solution of NaCl for the 420-730 nm spectral range are presented in Figure 4. Spectra are colored according to concentration, where a concentration of 0% is pure Milli-Q water. It can be observed that the changes in salt concentration give baseline shifts and wavelength-dependent variations. PLSR models used for predicting NaCl concentration using all measured spectra and after removing the outliers are presented in Figure 5.
Cross-validation results of the models are presented in Table 2. The coefficient of determination (R²) represents the accuracy of the model, and the root-mean-square error of cross-validation (RMSE_CV) shows the precision of the cross-validation (the minimal step in NaCl concentration that can be accurately predicted by the model).
Table 2. Performance of PLSR model developed for prediction of NaCl concentration in aqueous samples.
Figure 5. Estimation using PLSR models to regress on dissolved NaCl in aqueous samples using spectra from ColorMunki in range of 420-730 nm, built on spectra of (a) all 90 measured samples and (b) 85 measured spectra after removing outliers. Outliers are in red, and ellipses show groups from which outliers were removed.
Figure 6. Estimation using PLSR model for independent prediction (after outlier removal, using spectra in range of 420-730 nm), using two-thirds of the data for training, where the first set was used for testing.
To validate the results of PLSR, the regression model was created with two groups (repetitions) of aqueous solutions and used to estimate the concentration of the third group, which had not been presented to the model before. After removing the outliers, three independent models were created in which the first group (Figure 6), second group ( Figure S1), or third group ( Figure S2) was used for the test. All of these models were trained separately, without presenting the test group. The results of cross-validation and independent prediction of models are presented in Table 2.
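For reference, the two figures of merit reported for the models can be computed from reference and predicted concentrations as sketched below. The text does not state which definition of R² the Aquap2 package reports (for example, one minus the ratio of residual to total sum of squares, or the squared correlation coefficient); the first form is assumed here, and the numbers are invented.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed as 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Invented reference and predicted NaCl concentrations, % w/v.
reference = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(f"R2 = {r_squared(reference, predicted):.2f}, "
      f"RMSE = {rmse(reference, predicted):.2f} % w/v")
```

Applied to predictions collected from the held-out folds, the same RMSE function gives RMSE_CV; applied to an independent test set, it gives the prediction error.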
Regression analysis showed a strong relationship between spectral variables in the visible range and salt concentration in water, obtained using a spectrometer with only 32 measured bands. This suggests the possibility of using the visible part of the spectrum for the analysis of some water parameters in specific tasks. It also shows that the visible range of the water spectrum contains reasonable and useful information for qualitative analysis.
Our results demonstrate reasonable predictive accuracy compared with similar studies. For instance, Achata and colleagues [22] conducted a study to investigate hyperspectral imaging in the visible and near-infrared spectral ranges (450-1664 nm) coupled with chemometrics for the classification of brined and non-brined pork loins and prediction of brining salt (NaCl) concentration. They used brining solutions at concentrations of 5%, 10%, and 15% salt (w/v), prepared using vacuum-dried NaCl and distilled water. For the measurements, a hyperspectral imaging system in the visible-near-infrared range of 400-1000 nm with a spectral resolution of 5 nm was used. PLSR models were developed for the prediction of brine salt concentration. The PLSR model in the visible range (450-960 nm) obtained in that study with the highest R² and the lowest root-mean-square error provided the following results: R²_C = 0.86, RMSE_C = 2.2% w/v, R²_CV = 0.84, RMSE_CV = 2.3% w/v, R²_P = 0.76, RMSE_P = 3.5% w/v. Compared with that study, we used a shorter spectral range and a larger step and, therefore, a smaller number of bands. The ColorMunki device used in our experiment measures the spectrum in the range of 380-730 nm with a step of 10 nm, resulting in 36 bands. A single measurement on this device takes around two to three seconds. Thus, we speed up the measurement process and decrease the necessary computational burden. A possible limitation of our method is related to the larger measurement step, which could lead to the loss of some information. Nevertheless, the results presented in this article confirm the applicability of a smaller amount of measured data for the examination and prediction of molecular species concentrations dissolved in water.

Conclusions
Visible aquaphotomics performs quicker analysis, which reduces cost, and is easier to handle compared with the near-infrared approach. After previous attempts to investigate the accessibility of visible aquaphotomics, this paper presents the first reliable results showing the potential to expand the aquaphotomics approach to the visible range. Therefore, it is reasonable to more deeply investigate the information provided by the visible part of the spectrum. The next step is to expand the scope of the possible application of this method. The main question is to select or determine potential groups of dissolvable compounds distinguishable by the methods of visible aquaphotomics and evaluate the measurement precision and accuracy.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/photonics8090391/s1, Figure S1: Estimation using PLSR model for independent prediction (after outlier removal, using spectra in range of 420-730 nm), using two-thirds of the data for training, with the second set used for testing; Figure S2: Estimation using PLSR model for independent prediction (after outlier removal, using spectra in range of 420-730 nm), using two-thirds of the data for training, with the third set used for testing; Table S1: Data structured for Aquap2 R-studio package, containing information about aqueous samples; Table S2: Calculated absorption from measured spectra of aqueous samples used for analysis by Aquap2 R-studio package.