The New Approach to a Pattern Recognition of Volatile Compounds: The Inflammation Markers in Nasal Mucus Swabs from Calves Using the Gas Sensor Array

This paper discusses the application of two approaches (direct and inverse) to the identification of volatile substances by means of a gas sensor array in the headspace over nasal mucus swab samples taken from calves with differing degrees of respiratory damage. We propose a unique method to visualize sensor array data for qualitative analysis, based on the spectra of cross mass-sensitivity parameters. The traditional method requires the sensor array to be trained first on the vapors of the individual substances (database accumulation), which are then identified in the analyzed bio-samples by comparing the analysis results to the database; this method has shown unsatisfactory performance. The proposed inverse approach is more informative for the pattern recognition of volatile substances in the headspace of mucus samples. The projection of the calculated parameters of the sensor array for individual substances onto the principal component space, acquired while processing the sensor array output from nasal swab samples, has allowed us to divide animals into groups according to the clinical diagnosis of their lung condition (healthy respiratory system, bronchitis, or bronchopneumonia). The substances detected in the gas phase of the nasal swab samples (cyclohexanone, butanone-2, 4-methyl-2-pentanone) were correlated with the clinical state of the animals, and were consistent with the reference data on disease markers in exhaled air established for destructive organism processes.


Introduction
Various mathematical algorithms, from classical chemometrics to computer vision, have been widely used for data processing from sensor systems [1]. The processing algorithms for large data arrays generally require cloud resources, while simpler ones may be incorporated into portable devices, including portable analyzers. Beyond the traditional method used in search of close or similar samples, based on data distribution in space, more detailed information is required in order to estimate a sample's composition, and the presence of particular compounds and markers within it. Such a task can be solved by means of traditional analytical chemistry methods, such as the standard additive method, the internal standard method, or via sensor array training and the identification of specific responses to the calibration function [2][3][4][5][6]. However, due to the instability and variability of gaseous mixture compositions (especially the ones emitted from samples of complex natures), the training method with the cross-selectivity of sensors is inaccurate for individual substances. Nevertheless, such an approach is still widely used in a great

Device and Sensor Array Characteristics
The analysis of the gas phase over the biosamples was carried out with the odor analyzer «Diagnost-Bio-8» (Ltd. "Sensino", Kursk, Russia) in the "frontal analyte input" mode (frontal spontaneous intake of highly volatile compounds, VCs, into the pre-sensory space of a closed detection cell), Figure 1. The sensor array included a set of eight piezoelectric BAW-type quartz crystal microbalance resonators with a 10.0 MHz basic oscillation frequency. In order to vary the sensitivity of the sensors, the silver electrodes of 5 mm diameter used to hold a quartz crystal were covered with various solid-state nanostructured sorbents («Living system», «LS©», Russia) [30,31]. In particular, sensors 1 and 8 were modified with carboxylated carbon nanotube phases of different masses, marked in the tables and in the text as MCNT1 and MCNT2; sensors 2 and 7 were covered with phases of zirconium nitrate of different masses (Zr1 and Zr2); sensor 3 had a dicyclohexane-18-crown-6 (DCH18Cr6) sorbent film; sensors 4 and 5 were modified with bio hydroxyapatite phases of different masses (HA1, HA2); and, finally, sensor 6 was covered with polyethylene glycol succinate (PEGsc).
The coatings were uniformly deposited onto the electrodes of the piezoelectric quartz resonators, previously degreased with acetone or chloroform, by immersion in sorbent solutions. The deposition included the following steps: (i) the basic oscillation frequency of the piezoelectric resonator, F0, Hz, was accurately recorded; (ii) a suspension of 0.5 g of sorbent in 10 mL of solvent was prepared and kept in an ultrasonic bath for 15 min at 90 W power; (iii) the Ag electrode surfaces of the piezoelectric sensor were exposed to the suspension for 15 s; (iv) the coatings were dried vertically and held in an oven at 50 °C for 40 min; (v) the final coating mass (Δm) was calculated from the measured sensor oscillation frequency according to the Sauerbrey equation [34]:
$$\Delta F = -\frac{2.27 \cdot 10^{-6}\, F_0^{2}\, \Delta m}{A},$$
where ΔF is the change in the sensor oscillation frequency after the coating deposition, MHz; F0 is the base oscillation frequency, MHz; 2.27·10^−6 is a calibration constant of the piezoelectric quartz resonator at normal conditions, cm²/g; A is the area of the Ag electrodes, cm². The film mass deposited on the electrodes was up to 20 µg/cm².
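The coating-mass step of the deposition procedure can be illustrated with a minimal Python sketch. It assumes the common form of the Sauerbrey relation in which both frequencies are entered in Hz and the mass comes out in grams; the text quotes MHz, so the unit choice here is an assumption made for illustration. The function names and the −4540 Hz input shift are hypothetical, the latter chosen only because it corresponds to the stated upper bound of 20 µg/cm² under this assumption.

```python
import math

# Calibration constant quoted in the text; treated here as cm^2/(g*Hz),
# which is an assumption about the intended units.
SAUERBREY_K = 2.27e-6


def electrode_area_cm2(diameter_mm: float) -> float:
    """Area of a circular electrode, in cm^2, from its diameter in mm."""
    radius_cm = diameter_mm / 10.0 / 2.0
    return math.pi * radius_cm ** 2


def coating_mass_g(delta_f_hz: float, f0_hz: float, area_cm2: float) -> float:
    """Invert delta_F = -K * F0^2 * delta_m / A for the coating mass (g)."""
    return -delta_f_hz * area_cm2 / (SAUERBREY_K * f0_hz ** 2)


# Hypothetical example: 10 MHz resonator, 5 mm electrode, -4540 Hz shift.
area = electrode_area_cm2(5.0)
mass = coating_mass_g(delta_f_hz=-4540.0, f0_hz=10.0e6, area_cm2=area)
surface_density_ug_cm2 = mass / area * 1e6
print(f"area = {area:.4f} cm^2, film load = {surface_density_ug_cm2:.1f} ug/cm^2")
```

The sign convention follows the equation: a mass gain lowers the resonance frequency, so a negative frequency shift yields a positive mass.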
As demonstrated in our previous investigations [32], the chosen sensors show a high sensitivity to various classes of highly volatile organic compounds (alcohols, aldehydes, acids, ketones, amines, arenes, etc.). More details on sensor manufacturing, including technical characteristics, reproducibility and sorbent synthesis, are given in [33]. The sensor array response was registered in the form of chrono-frequency-grams, the output curves of the piezoelectric quartz sensors over the total measurement time, which represent the variation of the sensor vibration frequency over time (−ΔF, Hz) (Figure 1c). The active measurement time was 80 s, and during this period the baseline responses of the sensors were stable (±1 Hz).

Analysed Samples
The nasal swab samples were taken from 5 red-spotted breed calves aged from 14 to 20 days; the sampling was repeated 7 days later (the total number of samples was 10). The sampling of nasal mucus took 4-5 s and followed the methodology developed in our laboratories, using sterile cotton swabs of the kind usually employed in bacteriological research, which were then stored in sterile tubes. Approximately 20-30 min prior to the analysis, the cotton swabs with collected nasal mucus were removed from the storage tubes and placed on a glass Petri plate; thereafter, the consecutive sample measurements were conducted.

Measurements with Sensor Array
Samples of calves' nasal swabs were examined with a natural (without forced flow) frontal input of vapors into the pre-sensory space of the detection cell at 20 ± 1 °C. The time between the moment the cotton swab was removed from the storage tube and the measurement itself was strictly controlled to 10 s. The measurements were performed in combined mode: the sensors were kept above a sample with a frontal input of volatile components into the pre-sensory space during the first 80 s, followed by the spontaneous desorption of volatile compounds from the sensor coatings in the open detection cell without a sample between 81 s and 200 s. The sorption/desorption of vapors of distilled water and of twenty-one volatile compounds (p. a., Alfa Aesar), the markers of respiratory pathologies [10][11][12][13][14][15][16][17][18][19][20][21][22], were preliminarily tested in a wide range of concentrations, as shown in Table 1. During these preliminary tests, 1 to 3 µL of an individual substance was injected onto the plate tightly adjacent to the detection cell, and the sensor signals were recorded from the moment the substance was introduced into the detection cell according to the measurement mode described above: the first 80 s of vapor sorption were followed by spontaneous desorption for the next 120 s. The detection limits of substances, defined as a significant change (more than 3σ) in sensor signals compared to a blank measurement, correspond to the minimum concentrations of substances in Table 1.
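The 3σ criterion used for the detection limits can be written down in a few lines. The sketch below is only an illustration: the function name, the use of the sample standard deviation of repeated blank runs as σ, and the blank readings themselves are assumptions, not values from the study.

```python
import statistics


def exceeds_detection_limit(signal_hz, blank_signals_hz, k=3.0):
    """True if the signal differs from the blank mean by more than k sigma."""
    blank_mean = statistics.mean(blank_signals_hz)
    blank_sigma = statistics.stdev(blank_signals_hz)  # sample standard deviation
    return abs(signal_hz - blank_mean) > k * blank_sigma


# Hypothetical blank readings (Hz), scattered within the +/-1 Hz baseline.
blank = [1.0, 0.0, -1.0, 0.5, -0.5]
weak_response = exceeds_detection_limit(2.0, blank)
clear_response = exceeds_detection_limit(5.0, blank)
print(weak_response, clear_response)
```

The lowest concentration at which a sensor response passes this check would then be taken as the detection limit for that substance.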

Clinical and Laboratory Tests of Calves
The general clinical state of the calves was evaluated at each stage of the investigation according to the point system (WI score) developed at the University of Wisconsin-Madison (USA) [35], followed by the obligatory laboratory control of hematological and biochemical blood parameters of inflammation, such as leukocytosis, the shift of the leukocyte formula to the left and the increased concentration of haptoglobin. These last measurements were performed at FSBSE «All-Russian Scientific Research Veterinary Institute of Pathology, Pharmacology and Therapy», Russia. Additionally, the respiratory failure index of calves was estimated according to the recommendations given in [36], and a chest X-ray was performed using a high-frequency portable X-ray device 50Ma (Brand Shinova model MX101). All the results of the clinical examination and the laboratory analyses of the animals are available on request.

Data Treatment
The maximum changes in the vibration frequency of the sensors, ΔFmax (Figure 1c), were calculated from the sensors' chrono-frequency-grams registered with the device software and were then used to calculate the parameters of sorption efficiency A(i/j) (where i, j are the numbers of sensors in the array) [37,38]. More details on A(i/j) parameter estimation are given in our recent work [39]. Briefly, A(i/j) determines the sorption efficiency of an individual compound as the ratio of the maximum signals of a pair of sensors in a determined range of concentrations for a fixed measurement time:
$$A(i/j) = \frac{\Delta F_{\max,i}}{\Delta F_{\max,j}},$$
where ΔFmax,i and ΔFmax,j represent the maximum changes in the oscillation frequency of sensors i and j, respectively, expressed in Hz. The parameters A(i/j) are independent of the analyte concentration and constant within the range of linear response and constant sensitivity of the sensors; it is therefore necessary to assume a priori the concentration range of the substances to be determined. In order to identify one substance in a mixture, at least one A(i/j) value is required. Overall, 28 sorption efficiency parameters A(i/j) were calculated for each sample, corresponding to all the possible combinations of 8 sensors without repetitions.
Data processing was carried out with the Unscrambler program (v.10.0.4, CAMO Software AS, Oslo, Norway) using principal component analysis (PCA) with full cross-validation of the models. The singular value decomposition (SVD) algorithm was employed in PCA. Mathematical applications of the SVD include computing the pseudoinverse, matrix approximation, and determining the rank, range, and null space of a matrix. In order to compare the data obtained for the sorption of pure substances and nasal swabs, projected PCA was implemented. The results from sample projections onto an existing PCA model can be interpreted in the same way as usual PCA results. For this, however, the loading values must be fixed based on the established PCA model.
Then, the new data were projected through the fixed PCA loadings and the new scores were computed for the projected samples. The main difference of the projected PCA compared to standard PCA results is that the variance plot now depicts calibration, validation and projection.
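The projected PCA described above can be sketched with NumPy. This is not the Unscrambler implementation: it is a minimal illustration that assumes auto-scaling with the calibration set's own mean and standard deviation, omits cross-validation, and uses random numbers in place of real A(i/j) matrices. The function names `fit_pca` and `project` and the matrix sizes are hypothetical.

```python
import numpy as np


def fit_pca(calibration, n_components=2):
    """Auto-scale the calibration set and compute PCA loadings by SVD."""
    x = np.asarray(calibration, dtype=float)
    mean = x.mean(axis=0)
    scale = x.std(axis=0, ddof=1)
    scale[scale == 0] = 1.0  # guard against constant columns
    z = (x - mean) / scale
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[:n_components].T  # one column per principal component
    return {
        "mean": mean,
        "scale": scale,
        "loadings": loadings,
        "scores": z @ loadings,
        "explained": (s ** 2)[:n_components] / np.sum(s ** 2),
    }


def project(model, new_samples):
    """Scores of new samples computed with the fixed loadings of the model."""
    z_new = (np.asarray(new_samples, dtype=float) - model["mean"]) / model["scale"]
    return z_new @ model["loadings"]


# Placeholder data: 22 calibration rows and 9 projected rows of 28 A(i/j) values.
rng = np.random.default_rng(0)
substances = rng.normal(size=(22, 28))
swabs = rng.normal(size=(9, 28))

model = fit_pca(substances)
swab_scores = project(model, swabs)
print(model["explained"], swab_scores.shape)
```

Keeping the loadings, mean and scale inside one model object makes it explicit that nothing is re-estimated when the second data set is projected.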

The Clinical State of the Calves
Prior to the sensor analysis, the health state and the degree of respiratory organ damage were established for each calf at each test point according to the results of the clinical and laboratory tests. Table 2 summarizes the established diagnoses. Moreover, in order to evaluate the similarity of respiratory system conditions in animals with analogous clinical diagnoses, the results of clinical and laboratory tests were processed by PCA, as shown in Figure 2. On the PCA score plot in Figure 2, which represents 71% of the total variance on the first and second principal components, PC1 and PC2, three groups of animals can be clearly distinguished. Group 1 (green circled) includes animal № 4 (two control points, № 4.1 and № 4.2) and animal № 5 (the first control point, № 5.1) and represents the group without clinical symptoms of lung damage. Group 2 (red circled) contains animals № 2 and № 3 (two control points each: № 2.1, 2.2, 3.1, 3.2) in an acute phase of pneumonia. Group 3 (blue circled) corresponds to chronic lung damage and is represented by animal № 1 (two control points). As can be noted from Table 2, group 1 includes animals with bronchi inflammation (№ 4.2 and № 5.1); however, according to clinical indicators, only slight deviations from a healthy state are observed, which do not necessitate any drug treatment.
The nasal mucus from calves was then analyzed with the piezoelectric sensor array. The mucus samples were collected twice with a one-week interval from all five calves with different diagnoses with respect to the animal's health state. Nasal mucus, as a biosample, is not stable in its qualitative and quantitative composition. The volatile composition of the gaseous phase over the mucus depends on the amount of water in a sample and the sample's viscosity. For this reason, it is impossible to use the absolute values of sensor signals to compare the qualitative composition of the gas phase over the tested samples. We therefore decided to use a different approach: the calculation of the sensor efficiency parameters A(i/j) for individual volatile compounds and the projection of the sensor array output data onto the PCA space obtained by processing the sensor array signals for vapors of pure substances. Additionally, the efficiency of the inverse procedure, which consists in projecting the multivariate sensor array signals for vapors of pure substances onto the principal component space obtained for the nasal swab samples, was estimated.

Sensors Efficiency Parameters A(i/j) for Individual Volatile Compounds
The approach based on the calculation of the A(i/j) sensor parameters of sorption efficiency has shown its effectiveness and reliability for identifying individual highly volatile molecules in gas mixtures [29,37,38,39]. The fundamental difference between the proposed approach to substance identification and the one proposed earlier for the analysis of equilibrium gas phases is that the reference values for substance identification are determined taking into account wide limits of substance concentrations (from 1 ppm to 10 ppm) and the number of parameters for which the calculated value falls within the reference values for the substance. The reproducibility of the maximum sensor signals in vapors of the selected test substances was estimated by the coefficients of variation, which did not exceed 25% for the lower range of concentrations of substances reported in Table 1. With an increase in the concentration of substances in the detection cell, the coefficient of variation of the sensor signal was lower than 6.5% [36,40]. Table 3 presents the reference limits of the A(i/j) parameters for various classes of volatile compounds over the wide concentration range. The set of A(i/j) parameters is unique for each substance, and it is defined by the presence and by the amount of the substance in the gas phase over a biosample.
Table 3. Reference values of sorption efficiency parameters A(i/j) for different classes of volatile substances in the concentration ranges indicated in Table 1.

Columns of Table 3: A(i/j); Water; Ethyl acetate; Acetone; Ketones; Alcohols; Acetaldehyde; Organic acids; Ammonia; Amines.
To identify a substance or a group of substances, a kind of spectrum is formed out of all the A(i/j) parameters ordered in a strictly defined sequence. As a result, the cross mass-sensitivity parameter spectra are obtained, as shown in Figure 3. In the spectrum, the position number of the sensor in the array is used as a reference point; when the order of the sensors changes, analytical information is not lost, and only the sequence of parameters in the row is adjusted. It is apparent from the example in Figure 3 that the proposed spectra differ greatly for various substances. The absolute values of the A(i/j) parameters in the spectra may change depending on the vapor concentration in the detection cell; yet the spectrum shape, that is, the ratio of the A(i/j) parameter values relative to each other, remains the same for a given substance. In this way, the A(i/j) parameter spectra allow one to solve identification tasks with a cross-sensitive sensor array.
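A spectrum of this kind, together with the count of parameters that fall inside reference limits, can be sketched as follows. The sensor signals and the reference intervals are invented for illustration and are not taken from Table 3; the function names and the pair ordering (i < j, sensors numbered from 1) are likewise assumptions.

```python
from itertools import combinations


def a_spectrum(delta_f_max):
    """Ordered list of ((i, j), A(i/j)) for all sensor pairs with i < j."""
    if any(value == 0 for value in delta_f_max):
        raise ValueError("a zero sensor signal makes A(i/j) undefined")
    n = len(delta_f_max)
    return [
        ((i + 1, j + 1), delta_f_max[i] / delta_f_max[j])
        for i, j in combinations(range(n), 2)
    ]


def count_matches(spectrum, reference_limits):
    """Number of parameters whose value lies inside its reference interval."""
    hits = 0
    for pair, value in spectrum:
        if pair in reference_limits:
            low, high = reference_limits[pair]
            if low <= value <= high:
                hits += 1
    return hits


# Hypothetical maximum signals of the eight sensors, Hz.
signals_hz = [12.0, 30.0, 6.0, 24.0, 18.0, 15.0, 20.0, 10.0]
spectrum = a_spectrum(signals_hz)

# Hypothetical reference intervals for one substance class.
reference = {(1, 2): (0.3, 0.5), (1, 3): (1.8, 2.2), (2, 3): (1.0, 2.0)}
matches = count_matches(spectrum, reference)
print(len(spectrum), matches)
```

Under a strictly proportional response, multiplying all eight signals by the same factor leaves every ratio unchanged, which is the property the identification relies on.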

Figure 3. Spectra of A(i/j) cross mass-sensitivity parameters for vapors of (a) cyclohexanone and (b) distilled water.

At the first step, the uniqueness of the cross mass-sensitivity parameter spectra for vapors of pure substances was evaluated by PCA. The data array for the individual substances was preliminarily auto-scaled in order to reduce the influence of random errors. On the PCA score plot in Figure 4a, which represents the first two PCs with a total explained variance of 70%, several groups of individual compounds, such as distilled water, organic acids, ammonia and methylamine, as well as normal C3-C5 alcohols, heavy ketones (including cyclic ones), and other substances, are clearly separated. Meanwhile, the score plot of the first and third PCs did not provide any additional information on the separation of substances into groups.
As can be seen from the loadings plot in Figure 4b, the A(i/j) parameters for the eighth and second sensors, modified with MCNT2 and Zr1 coatings, respectively, had the most significant influence on PC1, while the A(i/j) parameters for the fourth and fifth sensors with the biohydroxyapatite coating had the highest influence on PC2. Additionally, the biohydroxyapatite-coated sensors have shown the greatest sensitivity to vapors of alcohols, acids, aldehydes, and acetone. Sensors 2 and 7, coated with zirconium oxide nitrate, have revealed a high sensitivity to the vapors of nitrogen- and sulfur-containing compounds; sensor 1, coated with processed MCNT1, showed a selective response to the vapors of both cyclic and heavy ketones, and of alcohols. Notably, the differentiation of substances increases with coating mass variation. Therefore, the selected sensor set is suitable for the detection of volatile metabolites in the gas phase, including the ones connected to destructive processes, and can be applied to control the appearance and accumulation of particular substances in the calf's nasal swabs.

Qualitative Evaluation of the Headspace Composition over the Nasal Swabs Samples
In the next step, the outputs of the gas sensor array over the nasal swab samples were projected onto the principal component space obtained with the A(i/j) spectra in order to establish the similarity of the mucus gas phase composition for calves with different clinical conditions. Additionally, the spectra of cross mass-sensitivity parameters A(i/j) were used to determine the different volatile substances in mucus samples. Since the different mucus samples were analyzed on different days, the matrix of A(i/j) parameters was normalized and scaled to the standard deviation by the days of the experiment in order to reduce the influence of the time drift of the sensors in the array with an open detection cell. Thereafter, the data array for nasal swab samples was auto-scaled in the same way as the data for individual substances.
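One way to read the day-wise normalization is as centering and scaling of each A(i/j) column separately within every measurement day. The sketch below implements that reading only; whether the original processing also centered the data within each day is an assumption, and the function name and the small example matrix are hypothetical.

```python
import numpy as np


def standardize_by_day(a_matrix, days):
    """Center and scale each column within every measurement day."""
    x = np.asarray(a_matrix, dtype=float)
    days = np.asarray(days)
    out = np.empty_like(x)
    for day in np.unique(days):
        rows = days == day
        block = x[rows]
        mean = block.mean(axis=0)
        if block.shape[0] > 1:
            std = block.std(axis=0, ddof=1)
        else:
            std = np.ones(x.shape[1])  # a single sample cannot be scaled
        std = np.where(std == 0, 1.0, std)
        out[rows] = (block - mean) / std
    return out


# Hypothetical two-parameter matrix measured on two days with a drift offset.
a_values = [[1, 10], [2, 20], [3, 30], [11, 5], [12, 7], [13, 9]]
day_labels = [1, 1, 1, 2, 2, 2]
corrected = standardize_by_day(a_values, day_labels)
print(corrected)
```

In this toy example the offset between the two days disappears after the correction, which is the intended effect on sensor drift.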

Projection of Cross Mass-Sensitivity Parameters Spectra for Nasal Swabs Samples onto the Principal Components Space for Spectra of Individual Substances
The nasal swab sample from calf № 3 on the first day of the experiment was excluded from the data set as an outlier due to its anomalous properties, probably connected with contamination of the sample during transportation. Based on the results of the projection, the variance explained by the first principal component was found to be equal to 5%, which indicates a low similarity between the spectrum changes of the cross mass-sensitivity parameters for biosamples and for individual substances, as shown in Figure 5.
Hence, the sample separation takes place mainly along the second principal component. The spectra from almost all nasal swab samples were projected into one group, close in the principal component space to the spectra of vapors of diethylamine, 1-phenylbutanone-2, acetaldehyde, and acetone. The projections of the samples from calf № 2 were placed near the projection for acetaldehyde vapor. At the second monitoring point, the samples for calves № 3, 4, and 5 were close to the vapors of 1-phenylbutanone-2 in the principal component space, while the projection of the sample from calf № 4.1 was close only to the acetone projection. The percentage of the explained variance for the first two principal components (34%) for nasal swab samples projected onto the PCA model for individual substances indicates that the greatest alterations and differences in the substance spectra do not coincide with the spectral changes for nasal mucus samples.
The poor identification and description of the real biosamples based on the model for individual substances might be explained by the significant impact from the matrix of the biosample itself, which influences the redistribution of volatile components and their ratio in the gas phase, and thus alters A(i/j) parameters. Physical and chemical properties of individual compounds are incomparable with analogous properties for the biosample matrix. Evidently, the application of the traditional direct approach using the PCA model based on individual substances is unsatisfactory, since the sample grouping of nasal swabs from calves does not confirm their clinical state and the established diagnosis. Therefore, we decided to apply an inverse approach: to project the cross mass-sensitivity parameters spectra for individual substances onto the principal component space of real biosamples.
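The comparison between the direct and the inverse projection can be made quantitative by asking how much of the projected data set's variation the fixed loadings of the other set capture. The sketch below uses one possible definition (sum of squared projected scores over the total sum of squares of the scaled projected data); the software used in the study may compute its projection variance differently. The function name is hypothetical and random numbers stand in for the real A(i/j) matrices.

```python
import numpy as np


def projected_explained_variance(model_data, projected_data, n_components=2):
    """Share of the projected set's variation captured by each model PC."""
    x = np.asarray(model_data, dtype=float)
    mean = x.mean(axis=0)
    scale = x.std(axis=0, ddof=1)
    scale = np.where(scale == 0, 1.0, scale)
    _, _, vt = np.linalg.svd((x - mean) / scale, full_matrices=False)
    loadings = vt[:n_components].T
    # The projected set is scaled with the model's mean and standard deviation.
    z = (np.asarray(projected_data, dtype=float) - mean) / scale
    scores = z @ loadings
    return (scores ** 2).sum(axis=0) / (z ** 2).sum()


# Placeholder matrices: 22 pure-substance rows and 9 swab rows, 28 parameters.
rng = np.random.default_rng(1)
substances = rng.normal(size=(22, 28))
swabs = rng.normal(size=(9, 28))

direct = projected_explained_variance(substances, swabs)   # swabs onto substances
inverse = projected_explained_variance(swabs, substances)  # substances onto swabs
print(direct, inverse)
```

With real data, a low share in one direction and a high share in the other would indicate which model space describes both data sets better.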

Projection of Cross Mass-Sensitivity Parameters Spectra for Individual Substances on the Principal Component Space Obtained on Real Biosamples
At first, the PCA model was constructed for nasal swab samples. As is apparent from Figure 6, on the PCA score plot of the first two PCs, with an explained variance of 65%, the groups of samples consistent with the different clinical states and diagnoses of the animals can be clearly distinguished.
Accounting for the third principal component increases the explained variance up to 83% and also permits the isolation of chronic pathologies, which is reflected in the similarity of the samples from animal № 1 at different control points in the plot. The loadings of the A(i/j) parameters for sensor 7 coated with the MCNT2 film, for the bio hydroxyapatite HA1 and HA2 films of sensors 4 and 5, and for the PEGsc film-modified sensor 8 were the highest in distinguishing amines of various structures and had the greatest impact on the first two principal components (Figure 7), as was also the case for the PCA model for individual substances (see Figure 4b). Since the distribution of nasal swabs along the first PC coincides well with the clinical states of animals, the A(i/j) parameters of the sensor signals are informative not only for the determination of compound markers of pathological processes, but very probably may also discriminate the composition of the volatile fraction of the biosample matrix (mucin, proteins).
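Loadings of the kind discussed above are also what the 0.2 cutoff in this section is applied to. A minimal sketch of such a screening is given below: it keeps the parameters whose loading magnitude on PC1 or PC2 exceeds a threshold in two models at once. Comparing absolute values is an assumption, the function name is hypothetical, and the loading values are invented for illustration.

```python
import numpy as np


def informative_parameters(loadings_a, loadings_b, names, threshold=0.2):
    """Names whose |loading| on PC1 or PC2 exceeds the threshold in both models."""
    a = np.abs(np.asarray(loadings_a, dtype=float))[:, :2]
    b = np.abs(np.asarray(loadings_b, dtype=float))[:, :2]
    keep = (a.max(axis=1) > threshold) & (b.max(axis=1) > threshold)
    return [name for name, flag in zip(names, keep) if flag]


# Hypothetical loadings (rows: parameters, columns: PC1, PC2).
names = ["A(1/3)", "A(1/4)", "A(2/8)", "A(6/7)"]
substance_loadings = [[0.30, 0.05], [-0.25, 0.10], [0.15, 0.12], [0.05, 0.02]]
swab_loadings = [[0.22, -0.10], [0.05, 0.31], [0.40, 0.10], [0.30, 0.25]]

selected = informative_parameters(substance_loadings, swab_loadings, names)
print(selected)
```

A second call with a lower threshold could be used in the same way to pick up parameters with intermediate loadings.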
To better visualize the differences in the gas phase content over the mucus samples, we selected from the total set those A(i/j) parameters that best differentiate the chemical composition of the gas phase according to the clinical condition of an animal. For this purpose, the most representative loadings, and hence the most informative A(i/j) parameters, whose loading value exceeds 0.2 on the first or second principal component in both plots, were selected from the loadings plots in Figures 4b and 7. In total, six parameters were selected: A(1/3), A(1/4), A(1/5), A(2/6), A(5/7), and A(6/8). To account for the influence of the mucus sample matrix on the composition of the volatile fraction, the significant parameters of the PCA model for the vapors of the individual substances (loading value of 0.1-0.2 on the first or second principal component) whose loading values coincide with those of the PCA model for the nasal swab samples, namely A(2/8), A(6/7), A(6/8), and A(7/8), were additionally chosen.
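The loading-based selection rule can be sketched as follows. This is an illustration under stated assumptions: the function name `select_parameters` is hypothetical, the absolute value of the loading is compared with the threshold (the text only says the value "exceeds 0.2"), and the loading values are made-up numbers, not those from Figures 4b and 7.

```python
import numpy as np

def select_parameters(names, loadings_a, loadings_b, threshold=0.2):
    """Return names whose |loading| exceeds `threshold` on PC1 or PC2 in BOTH models.

    loadings_a, loadings_b: arrays of shape (n_parameters, 2) with the PC1 and PC2
    loadings of the two PCA models (individual substances and nasal swabs).
    """
    strong_a = (np.abs(loadings_a) > threshold).any(axis=1)   # strong in model A
    strong_b = (np.abs(loadings_b) > threshold).any(axis=1)   # strong in model B
    return [n for n, keep in zip(names, strong_a & strong_b) if keep]

# Illustrative, invented loading values
names = ["A(1/3)", "A(1/4)", "A(2/8)"]
substance_loadings = np.array([[0.31, 0.05], [0.10, -0.27], [0.15, 0.12]])
swab_loadings = np.array([[0.25, 0.02], [-0.22, 0.08], [0.30, 0.01]])

selected = select_parameters(names, substance_loadings, swab_loadings)
print(selected)  # ['A(1/3)', 'A(1/4)']
```

In this toy input, A(2/8) is dropped because its loadings in the first model stay below the threshold on both components, even though it is strong in the second model.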
Next, from the ten selected parameters, new spectra of cross mass-sensitivity were built for two extreme states, the calf with bronchopneumonia (№ 5.2) and the calf demonstrating respiratory health (№ 4.1), as shown in Figure 8a. Because of the high affinity of the sensors with MCNT and HA films to volatile biomolecules, the values of parameters A(1/3), A(2/6), A(5/7), and A(6/7) were scaled to the maximum values of the corresponding parameters in biosamples and individual substances to allow a visual evaluation of the differences between the spectra. These spectra reflect the A(i/j) parameters that collectively differ the most for the gas phase composition over nasal swab samples of calves in different clinical conditions. Using the same set, we built spectra of cross mass-sensitivity for vapors of pure individual substances. It was established that the water spectrum (Figure 8b) is qualitatively similar to the spectrum of the nasal swab sample from the conditionally healthy calf (sample № 4.1), while the spectrum of cyclohexanone (Figure 8c), a marker of destructive processes [19,20], resembles the spectrum of the biosample from calf № 5.2 with bronchopneumonia. The similarity of these spectra may indicate the correctness of the proposed approach of narrowing the information area for individual substances using parameters calculated for nasal swab samples. Finally, the cross mass-sensitivity parameter spectra for individual substances were projected onto the principal component space for nasal swab samples. As can be seen from Figure 9, in the obtained projection the explained variance for individual substances is quite small (around 11%), yet a logical dependence of the change in the sorption properties of the substances is reflected along the first principal component: water < alcohols, acids < ethers, light ketones, aldehydes < heavy ketones, cyclic compounds.
Since calf № 4 was a respiratory healthy subject at the first control point, volatile markers of inflammation processes were not detected in its nasal swabs, and the content of other metabolites was below 1 ppmv. Therefore, the projection of cross mass-sensitivity parameter spectra for individual substances showed that the gas phase composition over this sample corresponds only to water vapor. According to the model, the mass-sensitivity parameter spectra for samples from calves with acute bronchopneumonia (№ 2.1, 2.2, 5.2) are closest to the spectra of heavy ketones and cyclic compounds (1-phenyl-butanon-2, 3-methylcyclohexanone, benzyl amine); sample № 3.2 is the closest to the spectra of butyric acid, acetaldehyde, and ethanol, which indicates the presence of pathogenic agents and destructive processes in organs and tissues [18][19][20]. The mass-sensitivity parameter spectra for butanol-2, 4-methylpentanone-2 and 5-methyl hexanone-2 are close to the spectra for samples from calves with acute bronchitis (4.2, 5.1), which indicates metabolic alterations in the upper respiratory tract and the presence of inflammation [22]. In chronic lung damage (calf № 1 at two control points), the A(i/j) spectra are near the spectra for ethyl acetate, pentanol-1, propanol-1, butyric acid and water. When the shift over time of the nasal swab samples from one animal in the principal component space is considered, it can be seen that the deterioration of the clinical condition of the animal is accompanied by an increase in the amount of heavy branched and cyclic ketones, amines and alcohols. This corresponds to the reference data [16,22] and to the chest X-ray results of the calves (the area of lung damage increases in the order calf 1 < 3 < 2).
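The inverse projection discussed above can be sketched as follows: a PCA model is fitted only on the swab spectra, and the spectra of individual substances are then projected onto that fixed space without refitting. The sketch makes several assumptions not confirmed by the text: mean-centering as the only preprocessing, Euclidean distance in the score plane as the measure of "closeness" (the paper reads closeness from the plot), hypothetical function names, and synthetic data.

```python
import numpy as np

def fit_pca(X, n_components=2):
    """Fit PCA on biosample spectra; return the mean and the loading vectors."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T           # loadings: (n_parameters, n_components)

def project(X, mean, loadings):
    """Project spectra onto an existing principal component space (no refitting)."""
    return (X - mean) @ loadings

def nearest_substances(sample_score, substance_scores, names, k=3):
    """Names of the k substances closest to one sample in the score space."""
    d = np.linalg.norm(substance_scores - sample_score, axis=1)
    return [names[i] for i in np.argsort(d)[:k]]

# Synthetic stand-in spectra: 10 swab samples and 4 substances, 10 parameters each
rng = np.random.default_rng(1)
swabs = rng.normal(size=(10, 10))
substances = rng.normal(size=(4, 10))
names = ["water", "ethanol", "acetone", "cyclohexanone"]

mean, loadings = fit_pca(swabs)                           # model built on biosamples only
swab_scores = project(swabs, mean, loadings)
substance_scores = project(substances, mean, loadings)    # inverse projection
closest = nearest_substances(swab_scores[0], substance_scores, names, k=2)
print(closest)
```

Because the substances are centered with the swab mean and multiplied by the swab loadings, they do not influence the model; they are only located within the space defined by the biosamples, which is the distinction between the inverse and the direct approach.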
To quantify the degree of similarity between the truncated spectra of cross mass-sensitivity for nasal swab samples and for individual substances, the similarity parameters δ, which estimate the degree of coincidence of two sets of A(i/j) parameters, were calculated according to [40]; the obtained values are listed in Table 4. The parameters δ were calculated according to Equation (3), where x_i and y_i are the values of the A(i/j) parameters for the biosample and the test substance, respectively, and n is the number of A(i/j) parameters used in the comparison. It was established that a value of the similarity parameter δ higher than 0.444 with 10 parameters compared indicates a significant coincidence (by t-test, p ≤ 0.05) of the spectra of cross mass-sensitivity for nasal swab samples with the spectra for individual substances. In this case, the coincidence of more than three identification parameters together with a high value of the similarity parameter unambiguously indicates the presence of a substance in the gas phase. It was also found that large amounts of acetone (7 ppmv or more) and methyl ethyl ketone were reliably identified in the gas phase over nasal swab samples from calves with respiratory inflammation (samples № 1.2, 2.2, 4.2). Additionally, 3-methylcyclohexanone was present in the gas phase of nasal swab samples from calves with a chronic source of infection (samples № 1.1, 1.2, 2.1, 3.2, 5.1), while cyclopentanone was identified in the gas phase over samples from calves with bronchopneumonia (№ 2.2). Water vapor was identified in all samples except sample 3.1, owing to its contamination.

Conclusions
In the present work, the application of a piezoelectric gas sensor array in combination with frontal sample input has made it possible to evaluate the presence and negative dynamics of alterations in animal health, up to severe cases of bronchopneumonia. It was demonstrated that the volatile fraction of nasal mucus swab samples, tested with the sensor array, contains useful information on the presence of pronounced sources of inflammation in a calf's body. The employed sensor array was highly informative for evaluating the metabolism of the upper respiratory tract and for distinguishing chronic or subclinical forms of respiratory diseases. Given its simple operation, fast response, lack of sample pretreatment, and user-friendly software, the array can be applied in situ (directly on a farm) to monitor an animal's health. The use of spectra of cross mass-sensitivity parameters A(i/j), calculated from the signals of the sensor array, has made it possible to differentiate and identify water vapor and the vapors of metabolites of natural and pathogenic processes in the tested biosamples. The traditional approach to identifying substances in biosamples, which is used in other analytical methods that include mandatory sample pretreatment and data post-processing by chemometric methods, cannot be applied when samples are analyzed without sample preparation. The proposed approach, in which the cross mass-sensitivity parameter spectra for test substances are projected onto the principal component space obtained from similar spectra for swab biosamples (inverse projection), describes the presence of metabolites in the gas phase more adequately and agrees with the results of the clinical and laboratory studies, especially when changes in the clinical state of an animal are assessed, which confirms the correctness of the suggested approach.
Moreover, the proposed method can be used to narrow the information area in an untargeted search for metabolites; the inverse projection algorithm can be used to process a large number of bioassay results. Regression models may be constructed to predict specific metabolites quantitatively. The reproducibility, limit of detection, and sensitivity of the analysis will depend on the cross mass-sensitivity parameters A(i/j) calculated from the signals of the sensor array. When the direct and inverse approaches to the pattern recognition of volatile compounds in the gas phase over biological samples are compared, the inverse approach is the one that adequately reflects the state of the biological samples. Its main advantage over the direct methodology is the increased sensitivity of substance recognition, owing to the extraction from the data set for test substances of the most significant information, which correlates with the clinical state of the animal. We believe that the sensitivity and specificity of substance identification as binary response variables will be higher with the inverse algorithm, and we will test this hypothesis in a subsequent study. The disadvantage, however, is the need to build a PCA model for biosamples that correctly classifies the samples on the score plot. Tests of the developed method for routine swab analysis are currently under way in our laboratories and will be presented in the next article.