Application of FTIR Spectroscopy for Quantitative Analysis of Blood Serum: A Preliminary Study

The aim of this study was to analyze the possibility of simultaneous determination of the concentration of components from the characteristics of FTIR spectra using the example of a model blood serum. To prepare model solutions, a set of freeze-dried control sera based on bovine blood serum was used, certified for approximately 38 parameters. Based on the values of the absorbance and areas of absorption bands in the FTIR spectra of model solutions, a regression equation was constructed by solving a nonlinear problem using the generalized reduced gradient method. By using the absorbance of the absorption bands at 1717 and 3903 cm−1 and the areas of the absorption bands at 616, 3750, and 3903 cm−1, it is possible to simultaneously determine the concentrations of 38 components with an error of less than 0.1%. The results obtained confirm the potential clinical use of FTIR spectroscopy as a reagent-free express method for the analysis of blood serum. However, its practical implementation requires additional research, in particular, analysis of real blood serum samples and validation of the method.


Introduction
Plasma and serum remain the main clinical specimens of interest. They contain over 300 types of proteins, as well as carbohydrates, lipids, amino acids, and over 100,000 metabolites in various concentrations [1]. In addition to being a rich source of biomarkers for the diagnosis of diseases, plasma and serum can show imbalances of endogenous components that are of great clinical importance [2]. Determination of the clinical parameters of serum and blood plasma is widely used for the diagnosis of various diseases, as well as for monitoring the effectiveness of treatment [3]. As a result, the demand for clinical assays is growing, leading to the use of automatic analyzers, many of which are based on colorimetric reactions and ELISA determinations. These methods involve the use of expensive and specific reagents. Given the widespread nature of such studies, there is a clear need for new and cheap alternative analytical procedures, especially as screening tools in situations where economic resources are limited and diagnostic evidence is required at the point of care.
For these purposes, vibrational spectroscopy methods, in particular infrared (IR) spectroscopy, can be used, since they do not require labeling, are economical, easy to operate, and require minimal sample preparation. Their sensitivity to subtle changes in biochemical composition makes them well suited as diagnostic tools, and recent advances in technology and data analysis enable fast and non-invasive analysis of body fluids [4][5][6][7][8][9]. However, the determination of clinical parameters in sera using vibrational spectroscopy can be difficult due to the high complexity of the matrix and the low concentration of some analytes, and chemometric algorithms are usually required to eliminate matrix effects [10,11]. In general, the idea of determining the clinical parameters of serum from IR spectra is not new, and the potential for quantitative and semi-quantitative analysis of metabolites in plasma, serum, and whole blood has been confirmed by a number of studies [12][13][14][15]. However, modeling of blood serum has not been performed to date.
In this work, we modeled normal blood serum (SPINREACT normal, 38 components) and analyzed the possibility of simultaneous determination of the concentration of components from the characteristics of IR spectra.

Preparation of Model Solutions
To prepare model solutions, we used a set of freeze-dried control sera based on bovine blood serum, certified for approximately 38 parameters (SPINREACT, S.A. Ctra. Santa Coloma, Girona, Spain). The original model serum solution was diluted with bidistilled water 2, 3, 5, 7, and 10 times (in triplicate).
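The dilution series above fixes the concentration of every certified component in each model solution. A minimal sketch of that arithmetic follows; the stock concentration is a hypothetical placeholder rather than a certified SPINREACT value, and including the undiluted solution (factor 1) in the series is an assumption.

```python
# Dilution factors from the text; factor 1 (undiluted) is an assumption.
DILUTION_FACTORS = [1, 2, 3, 5, 7, 10]


def diluted_concentrations(stock, factors=DILUTION_FACTORS):
    """Return the concentration expected after diluting `stock` by each factor."""
    return [stock / f for f in factors]


# Hypothetical stock concentration (arbitrary units), for illustration only.
example_stock = 5.0
for factor, conc in zip(DILUTION_FACTORS, diluted_concentrations(example_stock)):
    print(f"dilution x{factor}: {conc:.3f}")
```

Because all components are diluted together, their ratios stay constant across the series, which is the limitation acknowledged in the Discussion.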

Acquisition and Processing of IR Spectra
Samples of model solutions with a volume of 50 µL were applied to a zinc selenide substrate and dried in an oven at 37 °C for 60 min. The infrared absorption spectra were registered in the range of 500-4000 cm−1 using an FT-801 Fourier IR spectrometer (Simex, Saint Petersburg, Russia). Spectra were recorded with 32 scans at a resolution of 4 cm−1. A background (air) measurement was taken for every sample processed. The peaks corresponding to CO2 vibrations were removed using the "straight line generation" option in the ZaIR 3.5 software (Simex, Russia). Three spectra were compared for each sample, and the results were presented as an averaged spectrum. ZaIR 3.5 software (Simex, Russia) was also used to carry out baseline correction and normalization of the FTIR spectra. Raw spectra were pre-processed using a simple two-point linear subtraction baseline correction method and were then vector normalized.
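The pre-processing chain described above (averaging of replicate spectra, two-point linear baseline subtraction, vector normalization) can be sketched as follows. This is an illustrative approximation, not the ZaIR 3.5 implementation; the function names and the choice of the first and last points as baseline anchors are assumptions.

```python
import math


def average_spectra(replicates):
    """Average replicate spectra point by point (all on the same wavenumber grid)."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


def two_point_baseline(wavenumbers, absorbance, i_left=0, i_right=-1):
    """Subtract the straight line through two anchor points (assumed: the ends)."""
    x0, y0 = wavenumbers[i_left], absorbance[i_left]
    x1, y1 = wavenumbers[i_right], absorbance[i_right]
    slope = (y1 - y0) / (x1 - x0)
    return [a - (y0 + slope * (x - x0)) for x, a in zip(wavenumbers, absorbance)]


def vector_normalize(absorbance):
    """Scale the spectrum to unit Euclidean (L2) norm."""
    norm = math.sqrt(sum(a * a for a in absorbance))
    return [a / norm for a in absorbance]


# Toy data: three replicate "spectra" on a four-point grid (hypothetical values).
wavenumbers = [500.0, 1000.0, 1500.0, 2000.0]
replicates = [[1.0, 4.0, 6.0, 3.0], [1.0, 2.0, 4.0, 3.0], [1.0, 3.0, 5.0, 3.0]]
mean_spectrum = average_spectra(replicates)
corrected = two_point_baseline(wavenumbers, mean_spectrum)
normalized = vector_normalize(corrected)
```

After vector normalization every spectrum has the same overall norm, so differences between dilutions appear in the relative band intensities rather than in the absolute signal level.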
We selected absorption bands that are present in the FTIR spectra of serum regardless of dilution. Absorbance (H) and area (S) were determined for the respective absorption bands.
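A minimal sketch of how a band absorbance (H) and a band area (S) might be extracted from a processed spectrum is given below. The window half-width and the integration limits are hypothetical, since the text does not state them, and the trapezoidal rule is an assumed integration method.

```python
def band_height(wavenumbers, absorbance, center, half_width=2.0):
    """Return the maximum absorbance within center +/- half_width (cm^-1)."""
    window = [a for x, a in zip(wavenumbers, absorbance)
              if abs(x - center) <= half_width]
    if not window:
        raise ValueError("no data points inside the requested window")
    return max(window)


def band_area(wavenumbers, absorbance, low, high):
    """Integrate absorbance over [low, high] with the trapezoidal rule."""
    points = sorted((x, a) for x, a in zip(wavenumbers, absorbance)
                    if low <= x <= high)
    return sum((x1 - x0) * (a0 + a1) / 2.0
               for (x0, a0), (x1, a1) in zip(points, points[1:]))


# Toy band centered at 616 cm^-1 (hypothetical absorbance values).
toy_wavenumbers = [600.0, 608.0, 616.0, 624.0, 632.0]
toy_absorbance = [0.0, 1.0, 3.0, 1.0, 0.0]
height_616 = band_height(toy_wavenumbers, toy_absorbance, center=616.0)
area_616 = band_area(toy_wavenumbers, toy_absorbance, low=600.0, high=632.0)
```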

Regression Model Development
At the first stage, the S and H values of those absorption bands that showed a high correlation with the concentrations of the components (B_i) were selected. These were H20, H49, S2, S42, and S49, where H20 and H49 are the absorbances of the absorption bands at 1717 and 3903 cm−1, and S2, S42, and S49 are the areas of the absorption bands at 616, 3750, and 3903 cm−1. At the second stage, a program was written in Delphi to select combinations K of these parameters that correlate strongly with B_i as the concentration changes (1,2). Products and quotients of the variables were considered because multiplication and division of variables can enhance correlations, whereas multiplication by a constant and addition or subtraction of variables do not affect the correlations. From all combinations K, those strongly correlated with each B_i (Pearson's correlation coefficient r > 0.999) were combined into a weighted average sum chosen to minimize the sum of squared deviations from B_i. The vector of weight coefficients in this sum was found by solving a nonlinear problem using the generalized reduced gradient method.
The next stage yielded the weight coefficients C_i at which the squared deviation of B_i from the value of K multiplied by C_i is minimal. Finally, the model was tested, and a constant error was corrected by subtraction, since the calculated values were consistently about 5% too high (3).
The model was developed based on model solutions of normal blood serum (SPINREACT normal), while model solutions of pathological serum (SPINREACT pathologic) were used for testing. The concentrations of all components in the test model solution fell within the concentration range of the initial model solutions.
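The screening step described above can be illustrated with a short sketch that forms products and quotients of the spectral parameters and keeps those whose Pearson correlation with a component concentration exceeds 0.999. It does not reproduce the authors' Delphi program: restricting the search to pairwise combinations is an assumption, all names are hypothetical, and the subsequent weighted sum fitted by the generalized reduced gradient method is not shown.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def candidate_combinations(features):
    """Yield (name, values) for single features, pairwise products and quotients."""
    for name, values in features.items():
        yield name, values
    for (n1, v1), (n2, v2) in combinations(features.items(), 2):
        yield f"{n1}*{n2}", [a * b for a, b in zip(v1, v2)]
        yield f"{n1}/{n2}", [a / b for a, b in zip(v1, v2)]
        yield f"{n2}/{n1}", [b / a for a, b in zip(v1, v2)]


def screen(features, concentration, threshold=0.999):
    """Keep combinations whose Pearson r with the concentration exceeds threshold."""
    kept = {}
    for name, values in candidate_combinations(features):
        r = pearson_r(values, concentration)
        if r > threshold:
            kept[name] = r
    return kept


# Toy data (hypothetical): H20 is proportional to the concentration, S2 is not.
concentration = [1.0, 2.0, 3.0, 4.0]
features = {"H20": [2.0, 4.0, 6.0, 8.0], "S2": [1.0, 1.5, 1.2, 1.1]}
kept = screen(features, concentration)
```

The sketch assumes nonzero, non-constant feature values; a zero denominator or a constant series would need explicit handling in real data.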
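For a single component, the coefficient that minimizes the squared deviation between B and C·K has a closed form, and a constant proportional overestimate can be removed afterwards. The sketch below illustrates both steps under stated assumptions: the bias is treated as a fixed fraction of the predicted value and is estimated from paired predicted and reference data, which may differ from the correction the authors actually applied.

```python
def fit_coefficient(k_values, b_values):
    """Least-squares C minimizing sum((B - C*K)**2) for one coefficient."""
    numerator = sum(k * b for k, b in zip(k_values, b_values))
    denominator = sum(k * k for k in k_values)
    return numerator / denominator


def relative_bias(predicted, reference):
    """Mean signed error expressed as a fraction of the predicted value."""
    return sum((p - r) / p for p, r in zip(predicted, reference)) / len(predicted)


def correct_bias(predicted, bias):
    """Remove a constant proportional overestimate from the predictions."""
    return [p * (1.0 - bias) for p in predicted]


# Toy data (hypothetical): B is exactly twice K, so the fitted C is 2.
coefficient = fit_coefficient([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# Predictions that are uniformly 5% too high are pulled back to the reference.
reference = [10.0, 20.0, 40.0]
predicted = [1.05 * r for r in reference]
bias = relative_bias(predicted, reference)
corrected_predictions = correct_bias(predicted, bias)
```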
Comparison of serum spectra in the 1800-1000 cm−1 region with a series of standards of individual substances (albumin, urea, glucose, etc.) at their normal serum concentration levels shows a great similarity between the spectra of serum, model solutions, and albumin, which indicates that the main contribution in all spectra is from proteins (70% of the non-aqueous portion of the serum). Useful identification bands could be observed for urea at 1630 and 1460 cm−1 and for glucose between 960 and 1180 cm−1. Typical lipid bands can be identified in the wavenumber range 1700-1800 cm−1 and between 2600 and 3000 cm−1. Uric acid has a specific band at 1570 cm−1, and creatinine has two intense bands at 1720 and 1556 cm−1. However, typical bands of individual compounds were clearly observed in FTIR spectra only at concentration levels 10-100 times higher than those corresponding to normal values. Since the samples are complex mixtures of several compounds with many functional groups, the spectra are a complex set of FTIR absorption bands.
It should be noted that the maximum differences between the FTIR spectra of serum and model solutions are observed in regions III and IV and are due to the absence of a number of carbohydrates and nucleic acids in model mixtures (Figure 1B).
At the next stage, FTIR spectra of model solutions with different concentrations of constituent substances were obtained (Figure 2). The intensities of the absorption bands do not change in a uniform way upon dilution of model solutions (Figure 2A). The FTIR spectra are strongly dominated by the proteins contained in the serum, which are present in high concentrations in comparison with other low molecular mass components. In fact, the amide I peak at 1650 cm−1 has the highest intensity in the entire spectrum. The positions of the absorption bands remain unchanged during the dilution process (Figure 2A). However, some of the absorption bands in the FTIR spectra of model solutions become more intense with a decrease in the protein content.
To construct the regression equation, five parameters from four absorption bands were selected: H20 and H49 are the absorbances of the absorption bands at 1717 and 3903 cm−1, and S2, S42, and S49 are the areas of the absorption bands at 616, 3750, and 3903 cm−1 (Figure 2). The absorption bands selected in the construction of the model have not previously been mentioned in the literature for semi-quantitative or quantitative analysis of blood serum. The absorption band at 616 cm−1 can be attributed to the ring deformation of phenyl [16,17]. The absorption band at 1717 cm−1 corresponds to C=O stretching vibrations (amide I, DNA, RNA, and purine bases) [18]. The absorption band at 3750 cm−1 can be attributed to the vibrations of the free OH group, while no assignment was found for the 3903 cm−1 band.
The coefficients (C_i) for the calculation are given in Table 1. Table 2 shows a complete list of the determined components of the model solution, the concentration range that was used in constructing the model, as well as the true concentration of the test model solution and that found during testing. According to the data presented, in all cases the error in determining the concentration did not exceed 0.1% (Table 2).
Note: * The concentration range of the components corresponds to their real concentration in blood serum.

Discussion
Methods for determining a number of serum parameters using FTIR spectroscopy have already been developed for blood [19][20][21]. Thus, blood glucose, an important parameter for the control of diabetes and other common diseases, has been determined in serum with acceptable accuracy [22][23][24]. It has been shown that spectroscopy in the mid-IR range can serve as an alternative basis for the clinical measurement of urea and glucose in blood serum [25]. The literature reports promising results on important serum parameters such as urea, total protein, albumin, triglycerides, or total cholesterol [26][27][28][29]. Nevertheless, the possibility of simultaneous determination of several blood parameters has not been sufficiently studied. For example, analysis of dry serum deposits using transmission spectroscopy revealed the possibility of quantitative determination of eight serum analytes [30] and simultaneous determination of malaria parasitemia, glucose, and urea [31]. A reagent-free method for the simultaneous and direct detection of three analytes in human blood (glucose, triglycerides, and total cholesterol) based on FT-Raman spectroscopy has been proposed [32]. The present study shows that the potential for quantitative determination of blood serum parameters is much wider; in particular, it is possible to simultaneously determine a larger number of analytes with greater accuracy.
In light of routine clinical laboratory use, relative prediction errors can be compared with standard deviations of reference concentrations, which primarily reflect physiological changes in the population [33]. Thus, the standard deviation of the reference values for total protein is 5.4% of the average concentration, while according to FTIR spectroscopy the protein concentration can be predicted with a relative prediction error of 4.7%. Similar conclusions can be drawn for HDL-cholesterol and uric acid, for which the relative prediction errors exceed the biological variability among donors of the studied population by 30% or less. In contrast, the variation in values for cholesterol, triglycerides, LDL-cholesterol, and urea is four times less than the biological variation in concentrations. It is for these parameters that the mid-infrared range can be a valuable quantification tool. On average, the accuracy of the determination ranges from 4% for total protein to 16% for LDL-cholesterol [33]. Another study provides comparable values for the accuracy of determining a number of blood analytes: total protein 2.2%, albumin 4.0%, glucose 18.5%, urea 19%, and total cholesterol 15.7% [3]. The least accurate is the determination of the triglyceride content by FTIR spectroscopy (error 43.1%) [3]. The low error values obtained by us for all analytes are due to the fact that solutions of normal serum with a balanced average ratio of analytes were taken for modeling, while in real serum samples deviations in the content of individual components are possible even within the normal range, including extreme values against the background of various diseases. When switching to real blood serum samples, there will undoubtedly be a loss of accuracy; however, the low error values on the model system provide a margin for it.
Overall, infrared spectral datasets are rich in information, highlighting underlying biological and structural differences. Combined with powerful multivariate analysis approaches, they can distinguish between disease classes by extracting relevant information. Spectral data analysis has used a variety of data mining approaches such as principal component analysis (PCA), random forest (RF), and support vector machine (SVM), all of which demonstrate the ability to distinguish biofluid samples of patients from those of non-diseased controls [34]. Partial least squares regression (PLSR) is currently one of the most commonly used methods for quantitative modeling due to its ability to detect systematic variation in influencing factors and generate quantitative predictive models [35]. This allows unknowns to be predicted using latent variables extracted from the regression model [36,37]. A robust regression model has been described for establishing multivariate calibration based on a nonlinear iterative partial least squares (NIPALS) algorithm with orthogonal signal correction (OSC) and sample set partitioning based on joint x-y distance (SPXY) [32]. Thus, FTIR spectroscopy is able to detect minor differences in biofluid samples with minimal sample preparation, and numerous proof-of-principle studies have highlighted the potential clinical use of this method. However, broad uptake of FTIR spectroscopy has not happened due to a variety of factors, including a lack of acceptance in the clinical environment.
In this study, we have demonstrated for the first time the possibility of simultaneous determination of 38 parameters in blood serum. The concentration range of the components corresponds to their real concentration in the blood serum. This study is preliminary and has a number of limitations. Modeling was carried out on systems in which the ratio between the components did not change when their concentrations changed. In real clinical samples, this consistency will not be observed. At the next stage of the study, we plan to analyze model systems with different ratios of constituent substances, as well as with extreme contents of individual components, as observed in various pathologies. We assume that the results obtained will require changes to the model. Only after that will the proposed model be tested on real clinical blood serum samples. At the same time, we predict a significant decrease in the accuracy of determining the concentrations of individual components of serum. It should be noted that even in the case of successful testing and validation of the method, its clinical use is possible mainly for express analysis and in any case will require additional confirmation by standard laboratory methods.

Conclusions
Modeling of normal blood serum was carried out. It was shown that using the absorbance of the absorption bands at 1717 and 3903 cm−1 and the areas of the absorption bands at 616, 3750, and 3903 cm−1, it is possible to simultaneously determine the concentrations of 38 components with an error of less than 0.1%. The results obtained confirm the potential clinical use of FTIR spectroscopy as a reagent-free express method for the analysis of blood serum. However, its practical implementation requires additional research, in particular, analysis of real blood serum samples and validation of the method.