A Chemiresistor Sensor Array Based on Graphene Nanostructures: From the Detection of Ammonia and Possible Interfering VOCs to Chemometric Analysis

Sensor arrays are currently attracting the interest of researchers because they can overcome the limited selectivity of single sensors, which is a requirement in specific applications. Among the materials used to develop sensor arrays, graphene has not so far been extensively exploited, despite its remarkable sensing capability. Here we present the development of a graphene-based sensor array prepared by dropcasting graphene nanostructure and nanocomposite solutions on interdigitated substrates, with the aim of investigating the capability of the array to discriminate several gases related to specific applications, including environmental monitoring, food quality tracking, and breathomics. This goal is achieved in two steps. First, the sensing properties of the array have been assessed through ammonia exposures, by drawing the calibration curves, estimating the limit of detection, which has been found in the ppb range for all sensors, and investigating stability and sensitivity. Then, after performing exposures to acetone, ethanol, 2-propanol, sodium hypochlorite, and water vapour, chemometric tools have been exploited to investigate the discrimination capability of the array, including principal component analysis (PCA), linear discriminant analysis (LDA), and Mahalanobis distance. PCA shows that the array is able to discriminate all the tested gases with an explained variance around 95%, while with an LDA approach the array can be trained to recognize unknown gas contributions with an accuracy higher than 94%.


Introduction
In the last few decades, gas sensors have become important in everyday life as front-end devices in the IoT (Internet of Things), and the interest of researchers in their development has therefore noticeably increased. Market demands have pushed scientists to develop single sensors that are extremely sensitive, low cost, reliable, stable, and highly selective, with rapid response and recovery [1,2], exploiting different materials, from metal oxide nanostructures [3–5] and semiconductor layers [6,7] to polymers [8,9] and carbon-based nanomaterials [10–12], as well as different working configurations, i.e., capacitors [13,14], field effect transistors [15,16], or chemiresistors [8,14].
For several applications, including breathomics, food quality tracking, or environmental monitoring, a single sensor's high selectivity is particularly important, since the analyte under investigation should be recognized and detected among many interfering gases [17,18]. Additionally, some applications specifically require the capability of a device to simultaneously monitor different compounds [17,19].
For these application fields, the development of a single chemiresistor sensor may not be the ultimate solution, on one hand because of the difficulty of maintaining a high selectivity in the presence of many interfering gases [17,18,20], and on the other hand because of the inability of a single chemiresistor sensor to detect more than one specific analyte at the same time [20–23].
In order to prepare the sensors, 15 droplets of each graphene-based solution have been drop-cast on 5 interdigitated alumina substrates and left to dry by solvent evaporation at room temperature. The prepared samples will be labelled as Gr dispersion, Gr_nanoplatelets, Gr_Fe3O4, Gr_CoPt, and Gr_TiO2.

Sample Characterization
Raman spectra have been collected on the CGS device with a Renishaw-Invia system, equipped with a 532 nm laser source. The laser light has always been focused onto the sample with a 100× objective. An 1800 lines/mm grating and a laser power of 1 mW have been used for the measurements.

Gas Sensor Measurements
The five prepared samples have been mounted on a properly designed platform and they can work simultaneously (scheme in Figure 1c), allowing a direct comparison of the behaviours of the sensors under the same environmental conditions. Two commercial sensors have been put on the platform: a relative humidity (RH) sensor (humidity sensor HIH-4000 series-Honeywell Sensing) and a temperature sensor (Thermistor NTC PCB 5K-Murata).
The developed graphene-based array works in a chemiresistor configuration (Figure 1b): the gas analytes are detected by measuring the resistance changes of the sensing layers induced by the interaction with the gas molecules. The electronic circuit of each sensor comprises a load resistor (R_L) in series with the sensor; by applying a constant voltage (V = 5 V) and monitoring the output voltage across the sample (V_OUT), it is possible to track the resistance R of the sensor. The response of the sensor is then defined as ∆R/R0 = (R − R0)/R0, where R0 is the baseline sensor resistance before the gas exposure, and ∆R = R − R0 is the resistance variation due to the interaction with the gas molecules.
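The read-out described above can be illustrated with a minimal Python sketch. It assumes an ideal series voltage divider with V_OUT measured across the sensor, so that V_OUT = V·R/(R + R_L); the load resistance and the voltage readings are hypothetical values chosen for illustration, not data from this work.

```python
def sensor_resistance(v_out: float, v_supply: float, r_load: float) -> float:
    """Sensor resistance from an ideal series voltage divider.

    Assumes V_OUT is measured across the sensor, so
    V_OUT = V * R / (R + R_L)  =>  R = R_L * V_OUT / (V - V_OUT).
    """
    if not 0.0 < v_out < v_supply:
        raise ValueError("v_out must lie strictly between 0 and v_supply")
    return r_load * v_out / (v_supply - v_out)


def response(r: float, r0: float) -> float:
    """Relative response dR/R0 = (R - R0) / R0."""
    return (r - r0) / r0


# Hypothetical values: R_L = 10 kOhm and both voltage readings are illustrative.
R_LOAD = 10_000.0
V_SUPPLY = 5.0  # constant applied voltage stated in the text

r0 = sensor_resistance(2.5, V_SUPPLY, R_LOAD)  # baseline before exposure
r = sensor_resistance(2.6, V_SUPPLY, R_LOAD)   # during gas exposure
print(f"R0 = {r0:.0f} ohm, R = {r:.0f} ohm, dR/R0 = {response(r, r0):.4f}")
```

A positive value of ∆R/R0 corresponds to a resistance increase upon exposure.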
Gas exposures have been performed at room temperature (T = (25 ± 4) °C) and with a relative humidity (RH) value around (50 ± 5)%, in order to investigate the sensing behaviour of the array in a working condition close to the application environment. Indeed, both the tested temperature and humidity ranges could be considered suitable for application in environmental monitoring, safety, and food quality tracking, where the array would work in STP conditions.
The recovery is always achieved in the same condition, fluxing humid air on the array. A scheme of the set up for the gas exposure is reported in Figure S1. The gas analyte, contained in a certified cylinder (S.I.A.D. S.p.A.), is fluxed into a sealed homemade chamber (V = 1 L) through a mass flow controller (MFC) (MKS Instruments) connected to the chamber. The MFC connected to the air cylinder has a maximum flow of 500 sccm, while the maximum flow of the MFC connected to the analyte cylinders is 200 sccm. In all the gas measurements, air has been used both to purge the chamber after the exposure and to dilute the analyte in order to obtain exposures at different concentrations. Exposure time has been set at about 3 min. Considering several exposures, calibration curves for each sensor could be obtained by plotting the sensor response ∆R/R0 versus the gas concentration.
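As a rough illustration of how dilution with air sets the exposure concentration, the sketch below assumes ideal mixing of the two streams, so that the delivered concentration is the cylinder concentration scaled by the analyte fraction of the total flow. The cylinder concentration and the flow set points are hypothetical; only the two MFC maximum flows come from the text.

```python
MAX_AIR_SCCM = 500.0      # maximum flow of the air MFC (from the text)
MAX_ANALYTE_SCCM = 200.0  # maximum flow of the analyte MFC (from the text)


def diluted_concentration(c_cylinder_ppm: float,
                          flow_analyte_sccm: float,
                          flow_air_sccm: float) -> float:
    """Analyte concentration after ideal mixing with air (assumption)."""
    if not 0.0 <= flow_analyte_sccm <= MAX_ANALYTE_SCCM:
        raise ValueError("analyte flow outside the MFC range")
    if not 0.0 <= flow_air_sccm <= MAX_AIR_SCCM:
        raise ValueError("air flow outside the MFC range")
    total = flow_analyte_sccm + flow_air_sccm
    if total == 0.0:
        raise ValueError("total flow must be positive")
    return c_cylinder_ppm * flow_analyte_sccm / total


# Hypothetical example: a 100 ppm cylinder diluted 1:9 with air.
c_ppm = diluted_concentration(100.0, 50.0, 450.0)
print(f"delivered concentration: {c_ppm:.1f} ppm")
```

In practice the delivered concentration would still be cross-checked with a reference sensor, since the real mixing in the chamber is not ideal.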
A commercial sensor (Figaro, Mod. TGS 2602) has been exploited to monitor and evaluate ammonia concentration. The reliability of this sensor calibration curve has been cross-checked by exposing the sensor in a testing chamber with calibrated mass flow controllers (MFCs) and certified ammonia and air cylinders.

Chemometric Analysis
To assess the discrimination capability of the array, the data collected from exposures to six different gases have been analyzed first with principal component analysis (PCA) and then with linear discriminant analysis (LDA) and Mahalanobis distance, all implemented in the R software.
PCA is a chemometric method, which is unsupervised and can reduce the problem dimensionality while maximizing the variance of the data; it reorganizes the data into a set of principal orthogonal components (PCs), where PC1 is the component that has the greatest variance, PC2 represents the component with the second greatest variance, and so on. The aim of this statistical tool is to obtain a 2-dimensional or 3-dimensional space where a clear discrimination of the tested gases could be identified [58,59].
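The PCA steps described above (column mean-centering, extraction of orthogonal components ordered by variance, projection of the data) can be sketched in plain Python. This is only an illustration: the analysis in this work was carried out in R, the power-iteration routine below stands in for a library eigen-solver, and the 6 × 5 response matrix is made up.

```python
import math


def mean_center(rows):
    """Subtract the column mean from every column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(mat, iters=500):
    """Largest eigenvalue and eigenvector by power iteration."""
    p = len(mat)
    v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-symmetric start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(p)) for i in range(p))
    return lam, v


def pca(rows, n_components=2):
    """Return scores, loadings, and explained-variance ratios."""
    centered = mean_center(rows)
    cov = covariance(centered)
    p = len(cov)
    total_var = sum(cov[i][i] for i in range(p))
    loadings, ratios = [], []
    for _ in range(n_components):
        lam, v = leading_eigenpair(cov)
        loadings.append(v)
        ratios.append(lam / total_var)
        # Deflation: remove the component just found before the next search.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in loadings]
              for row in centered]
    return scores, loadings, ratios


# Made-up responses: 6 exposures (rows) x 5 sensors (columns).
responses = [
    [0.10, 0.08, 0.12, 0.09, 0.11],
    [0.20, 0.15, 0.25, 0.18, 0.21],
    [0.02, 0.30, 0.03, 0.01, 0.02],
    [0.03, 0.45, 0.04, 0.02, 0.03],
    [0.01, 0.01, 0.02, 0.20, 0.01],
    [0.02, 0.02, 0.03, 0.33, 0.02],
]
scores, loadings, ratios = pca(responses, n_components=2)
print("explained variance (PC1, PC2):", [round(r, 3) for r in ratios])
```

The rows of `scores` are the coordinates of each exposure in the PC1-PC2 plane, while `loadings` gives the weight of each sensor in each component.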
LDA is a supervised chemometric tool, therefore it takes into account the class properties of each sample; it assumes that each class has an identical covariance, and unlike the PCA, it deals only with the covariance of the data matrix. The optimal clustering among different classes (i.e., in this case different gases) is obtained by maximizing the distance between classes and minimizing the scattering of the data inside each class, resulting in a reduction of the problem dimensionality, and obtaining a new space generated by new coordinates, called discriminant factors [60]. LDA gives a better separation among different classes than PCA, allowing for an easier classification. Furthermore, it also allows for training and testing analysis on the dataset.
The Mahalanobis distance has been selected to evaluate the distance between a gas contribution and each cluster in an LDA space, with the aim to present an alternative approach for the identification of unknown data. Indeed, the Mahalanobis distance is the lowest for the class which a datum belongs to. The algorithm has been written in R as in ref. [61].
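The classification rule stated above (assign a point to the class with the smallest Mahalanobis distance) can be sketched for a two-dimensional discriminant space. This is not the algorithm of ref. [61]: it assumes a pooled within-class covariance, in line with the identical-covariance assumption of LDA, and the cluster coordinates are invented for illustration.

```python
import math


def centroid(points):
    """Mean position of a 2D cluster."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def pooled_covariance(clusters):
    """Pooled within-class covariance (sxx, sxy, syy) of 2D clusters."""
    sxx = sxy = syy = 0.0
    dof = 0
    for pts in clusters.values():
        mx, my = centroid(pts)
        for x, y in pts:
            dx, dy = x - mx, y - my
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
        dof += len(pts) - 1
    return sxx / dof, sxy / dof, syy / dof


def mahalanobis(point, centre, cov):
    """Mahalanobis distance of a 2D point from a cluster centre."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix is not positive definite")
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


def classify(point, clusters):
    """Return the nearest class and the distance to every class."""
    cov = pooled_covariance(clusters)
    distances = {name: mahalanobis(point, centroid(pts), cov)
                 for name, pts in clusters.items()}
    return min(distances, key=distances.get), distances


# Invented coordinates in a 2D discriminant space (not data from this work).
clusters = {
    "ammonia": [(4.0, 0.5), (4.5, 0.2), (5.0, 0.8)],
    "water": [(-4.0, -0.5), (-4.6, 0.1), (-5.0, -0.9)],
    "acetone": [(0.2, 3.0), (0.5, 3.6), (-0.3, 3.3)],
}
label, distances = classify((4.4, 0.4), clusters)
print(label, {k: round(v, 2) for k, v in distances.items()})
```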
The data collected from the gas exposures have been pre-treated only by column mean-centering before feeding the PCA, LDA, and Mahalanobis distance algorithms, and a set of 32 exposures to six different gases has been considered.

Results and Discussion
Figure 2 shows the representative Raman spectra collected on the five samples. In all spectra the typical peaks related to graphene are visible: the G-band is located at 1590 cm⁻¹ and is related to the C-C stretching of the sp² atoms in the honeycomb lattice of graphene [62], and the 2D band is at around 2660 cm⁻¹, a second order band related to the breathing mode of carbon atoms in the plane of graphene [62]. Regarding the 2D band, it is worth pointing out that the peak is, as expected, not symmetric, because the graphene flakes are randomly distributed on the alumina substrate after the dropcasting and do not form a uniform monolayer. The D-band, located around 1340 cm⁻¹, can be ascribed to some disorder in the graphene structure [62], and it is present in all the samples, although its intensity is stronger in the Gr dispersion (red spectrum) and Gr_nanoplatelets (yellow spectrum) samples. A strong D-band is accompanied by the appearance of the D'-band around 1615 cm⁻¹, which is also related to the presence of defects [62].
Regarding the Gr_TiO2 spectrum (green curve), the contributions of the TiO2 NPs are visible at around 143 cm⁻¹, 445 cm⁻¹, and 615 cm⁻¹, and these peaks are related to the anatase and rutile phases [63,64]. The Fe3O4 contribution in the Gr_Fe3O4 sample (purple spectrum) can be found in the broad band at around 750 cm⁻¹ [65]. Finally, for the Gr_CoPt spectrum (blue curve), a weak peak is present at around 680 cm⁻¹, and it is related to the presence of the CoPt composites [66].
After the Raman characterization, the 5 samples have been set on a properly designed board, able to host all the 5 layers under investigation for the simultaneous detection of the sensor responses, starting with ammonia exposures.
As an example of the sensor array response, the resistance change measured upon exposure to three different ammonia concentrations (i.e., 14 ppm, 17 ppm, and 36 ppm) is shown in Figure 3 left panel. The exposure time has been set around 160 s. First of all, all the sensors increase their resistance upon ammonia exposure, disclosing an overall p-type behaviour. Considering the recovery, defined as the time required for the sensor to return to 80% of the original baseline resistance after the gas exposure, it has always been achieved by all sensors at room temperature in about 15 min, Gr_CoPt, Gr_Fe3O4, and Gr_TiO2 being even faster.
In order to draw the calibration curves, several exposures to ammonia at different concentrations (from 2 ppm to 36 ppm) have been carried out and the results are reported in Figure 3 right panel. A Freundlich isotherm (∆R/R0 = y + A·[NH3]^pow) has been selected to interpolate the data, and the fitting parameters, reported in Table S1, have been used to evaluate the limit of detection (LOD) according to the formula 5σ/R0 = y + A·[LOD]^pow [33,67], where σ is the fluctuation of the electrical signal. The evaluated limit of detection for all the samples is in the ppb range (Table S1) and in particular, the lowest values have been obtained for Gr_CoPt and Gr_Fe3O4: 0.1 ppb and 7.2 ppb, respectively. The higher values observed for Gr_TiO2, Gr dispersion, and Gr_nanoplatelets are mainly due to the larger fluctuations of the electrical signal (σ), which are noticeable in Figure 3 left panel.
The calibration curves present a sublinear behaviour, which is quite common in the field of gas sensors based on carbon nanomaterials [49,68,69].
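The limit of detection follows from inverting the Freundlich relation at the 5σ threshold, i.e., LOD = ((5σ/R0 − y)/A)^(1/pow). The short Python sketch below shows this inversion; the fit parameters, σ, and R0 are hypothetical placeholders and are not the values of Table S1.

```python
def freundlich(conc_ppm: float, y: float, a: float, power: float) -> float:
    """Freundlich-type calibration curve: dR/R0 = y + A * [conc]^pow."""
    return y + a * conc_ppm ** power


def limit_of_detection(sigma: float, r0: float,
                       y: float, a: float, power: float) -> float:
    """Concentration at which the fitted response equals 5*sigma/R0."""
    threshold = 5.0 * sigma / r0
    base = (threshold - y) / a
    if base <= 0.0:
        raise ValueError("threshold is not above the fit offset y")
    return base ** (1.0 / power)


# Hypothetical parameters, for illustration only.
Y, A, POW = 0.0, 0.02, 0.5   # assumed fit parameters
SIGMA, R0 = 0.5, 10_000.0    # assumed noise and baseline resistance (ohm)

lod_ppm = limit_of_detection(SIGMA, R0, Y, A, POW)
print(f"LOD = {lod_ppm * 1000:.3f} ppb")
```

Because pow < 1 for a sublinear curve, a small reduction of σ lowers the estimated LOD more than proportionally, which is consistent with the noisier sensors showing the higher LOD values.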
The stability and reproducibility of the sensor response have been tested for ammonia detection, and the results clearly demonstrate a good stability and reproducibility up to 3 months after the preparation of the samples (see supporting file, Figure S2). Moreover, the stability of the sensors' response to ammonia exposures in a working temperature range suitable for application in environmental monitoring, breathomics, and also food quality tracking (i.e., 21 °C < T < 29 °C) has been assessed (see supporting information file, Figure S3).
As for the sensing mechanisms, the resistance increase upon ammonia exposure indicates that all layers exhibit an overall p-type nature, consistent with the fact that the electron injection from ammonia reduces the density of holes in these layers.
Regarding Gr dispersion and Gr_nanoplatelets layers, the sensing mechanism is well known in literature: an electron transfer from ammonia to graphene occurs when the N atom of ammonia faces the graphene lattice, as shown in Figure S4a, resulting in an overlap between the HOMO of ammonia and the graphene orbitals [70,71]. Furthermore, since the graphene under investigation is not a perfect monolayer, it is reasonable to assume that the high number of defects, observed also by Raman spectroscopy, and of edges present in the layer favours the interaction with ammonia [71,72].
Regarding Gr_Fe3O4 and Gr_TiO2 layers, although both Fe3O4 and TiO2 nanoparticles act as catalytic active centers, while graphene provides the conductive pathway in addition to being itself a site for the interaction, the sensing mechanism is slightly different. In the case of Fe3O4, as reported in [73], the N lone pair in ammonia is supposed to donate electrons to the trivalent iron atom of Fe3O4, creating a pair of soliton electrons, while the H atoms in ammonia interact with the oxygen atoms in Fe3O4, forming a quite strong bond (see scheme in Figure S4b). Finally, as observed in other 2D materials functionalized with Fe3O4 NPs, these NPs allow for a fast electron transfer to the graphene layer [73].
Regarding the Gr_TiO2 sensor, two mechanisms have been suggested for ammonia sensing: (i) TiO2 is an n-type layer [74–77] and, when coupled with the p-type graphene, or in general with a p-type 2D layer [77], it forms a p-n junction, which results in a positively charged depletion region on the TiO2 surface at the interface with the graphene layer, due to an electron transfer, and in a consequent reduction of the activation energy for ammonia adsorption near the TiO2 surface (Figure S4c) [74,75]; (ii) TiO2 is considered a Lewis acid and, NH3 being a Lewis base, a strong bond can be formed upon their interaction [76]. In both mechanisms, the electron exchanged between ammonia and TiO2 is easily transferred to graphene, and it is a reversible process [75].
Finally, no work has been published to date on gas sensors based on Gr_CoPt nanocomposites or CoPt NPs alone, therefore a sensing mechanism has not been proposed so far. It is reasonable to assume that the interaction with ammonia occurs in a twofold manner, as for the Gr_TiO2 and Gr_Fe3O4 sensors: CoPt probably acts mainly as a catalytic centre for the ammonia interaction and then transfers the donated electron to the graphene layer (Figure S4d); at the same time, ammonia can also interact with the graphene active sites.
Finally, a benchmarking with literature data has been performed considering the sensitivity parameter, defined as S = 100 × (∆R/R0)/[NH3]. Of note, only papers based on graphene and clearly reporting gas concentration and sensor response/sensitivity, operating at room temperature and in a chemiresistor configuration, have been taken into account for this benchmarking.
As already mentioned, another important characteristic of single sensors is the selectivity; therefore, the prepared layers have been exposed to some of the most common interfering gases: acetone, ethanol, 2-propanol, sodium hypochlorite, and water vapour. Results are summarized in Figure 4, which reports the responses of the sensor array to the selected target molecules expressed as the sensor response ∆R/R0. First of all, the possibility to test the 5 sensors simultaneously allows for the identification of the best performing sensors: in this array, Gr dispersion shows a huge response to acetone as compared to the other sensors, while Gr_nanoplatelets appears to be a promising layer for 2-propanol detection.
Secondly, all the sensors change their resistance during exposure to all the tested gases, except Gr_nanoplatelets exposed to acetone, ethanol, and water and Gr_TiO2 exposed to 2-propanol; in these cases a resistance change has not been observed.
On the basis of these results, the present sensing layers do not display high selectivity: the response to ammonia (Figure 4a) is higher than the response to the other gases, as expected, since the response to alcohols is usually low for carbon-based sensors [38,46]; nevertheless, the latter responses are not completely negligible (Figure 4b).
As often reported in literature [17,18,41] and already mentioned in the introduction, a way to overcome the selectivity problem for single chemiresistor sensors is to deal with an array coupled with chemometric analysis on the whole dataset collected with all chemiresistor sensors upon exposure to the selected target gas molecules at different concentrations, i.e., to deal with an electronic nose.
In order to enrich the data set, exposures to different concentrations of the same gas have been considered with the aim to perform PCA and LDA on the sensors' responses.
In particular, a 5 × 32 response matrix has been used to perform chemometric analysis, with 5 sensors composing the array and 32 exposures performed at different concentrations of the six tested gases. The 32 exposures include 6 exposures to ammonia, 6 to water vapour, 5 to acetone, 5 to 2-propanol, 5 to sodium hypochlorite, and 5 to ethanol.
The concentration range is reported in Table S2.
The results of the PCA are shown in Figure 5. Considering the space generated by PC1 and PC2 (Figure 5a), as well as the space created by PC1 and PC3 (Figure 5b), all the tested gas molecules are clearly discriminated, since no overlap among different target gas clusters is observed. Furthermore, as previously found for arrays based on carbon nanomaterials [41,49], a concentration trend is established in each cluster, i.e., each gas concentration decreases going towards the center of the reference system, as indicated by the arrows in Figure 5. It is worth mentioning that although the developed sensors do not show a particularly high response to any gas except ammonia, when their responses are combined in the PCA, the whole array is able to completely discriminate each gas contribution.
PCA loading plots can provide key information on the importance of each sensor in the discrimination capability of the whole array.
As shown in Figure 5c, which reports the loading of each sensor for each of the three components, all sensors equally contribute to PC1; therefore, they are all clearly involved in the discrimination of ammonia and water vapour from all the other gas contributions along the PC1 axis of both the PC1-PC2 and PC1-PC3 spaces. The second component is mainly defined by Gr dispersion and Gr_TiO2, while the Gr_nanoplatelets layer is mainly responsible for the PC3 discrimination. In particular, PC2 clearly separates the acetone contribution from all the other gases, while along PC3 it is possible to discriminate sodium hypochlorite. It is worth noting that the Gr_nanoplatelets and Gr dispersion sensors are the ones showing the best responses to sodium hypochlorite and acetone, respectively, and they are also mainly responsible for their discrimination in a PCA space. To further support the claim on the role of each sensor in the discrimination of a specific target gas, PCA has been performed after removing from the dataset the response of the sensors that best define PC2 or PC3.
When removing the Gr dispersion and Gr_TiO2 responses from the dataset, acetone is not discriminated anymore and its contribution overlaps with ethanol (Figure 5e), whereas when removing the Gr_nanoplatelets responses the discrimination of sodium hypochlorite on PC3 worsens considerably, resulting in an overlap between the sodium hypochlorite and 2-propanol datasets (Figure 5f).
PCA alone is a chemometric tool that does not provide a probability on the output and, therefore, it is not possible to quantify its performance; indeed, PCA needs to be coupled to other algorithms, such as support vector machine (SVM), in order to become predictive on the probability of a point to belong to a specific cluster [68].
On the contrary, LDA is a chemometric analysis that can be considered predictive and quantitative. In general, a predictive algorithm should be trained with an initial dataset (i.e., training dataset), which should provide the optimal linear combination of sensor results and the best separation between different clusters. In this context, LDA can be used to better separate the clusters related to the different gases, since it works maximizing the distance between classes and minimizing the scattering of the data inside each class.
After the training, a test dataset can be projected on the newly built model; in this way, unknown data can be recognized.
Performing LDA on the 32 exposures dataset, we also perform internal cross validation (CV) of the model with a CV segment equal to 6; this means that, to build the model, the LDA algorithm randomly removes 6 data points and uses these data to test the newly built model, providing an accuracy of the model itself. Figure 6a shows the results of LDA performed on the whole dataset, which has been built with a cross validation accuracy of 100%. After using the whole dataset to investigate the capability of the array to identify unknown data through LDA, we randomly split the dataset into two subgroups, a training and a test dataset, both containing data from all the 6 tested gases, and we perform LDA on the training set only. The test set is then projected on the LDA directions to assess the capability of the array to correctly identify the nature of an unknown.
As an example, Figure 6b reports the results of the LDA performed on a training dataset that contains 26 data points, with a cross validation accuracy of 88% (see Tables S3 and S4 for the confusion matrix of cross validation and accuracy index); it is worth observing that the data point position in Figure 6b is not exactly the same as in Figure 6a, since the LDA has not been carried out on the same dataset.
After the LDA model has been drawn, the test dataset composed of 6 unidentified data (in this particular case one point from each gas class has been randomly selected to be part of the test dataset) has been used to validate the model. The model has been able to recognize and correctly assign each point to the correct class, as reported in Table S5, therefore the prediction accuracy is 100%.
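The 26-6 split used in this example (one randomly selected point per gas class in the test set) can be sketched in Python as follows. The class sizes are those of the 32-exposure dataset described in the text; the function names and the fixed random seed are illustrative choices, and the LDA model itself is not reproduced here.

```python
import random

# Number of exposures per gas in the 32-exposure dataset (from the text).
CLASS_COUNTS = {
    "ammonia": 6,
    "water vapour": 6,
    "acetone": 5,
    "2-propanol": 5,
    "sodium hypochlorite": 5,
    "ethanol": 5,
}


def build_labels(counts):
    """One class label per exposure, grouped by gas."""
    return [gas for gas, n in counts.items() for _ in range(n)]


def split_one_per_class(labels, rng):
    """Pick one random exposure per class as test; the rest is training."""
    test_idx = []
    for gas in sorted(set(labels)):
        candidates = [i for i, g in enumerate(labels) if g == gas]
        test_idx.append(rng.choice(candidates))
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    return train_idx, sorted(test_idx)


labels = build_labels(CLASS_COUNTS)
train_idx, test_idx = split_one_per_class(labels, random.Random(0))
print(len(train_idx), "training points,", len(test_idx), "test points")
```

Repeating the split with different seeds gives the random subsets needed to estimate how the prediction accuracy varies with the composition of the training set.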
Moreover, after external validation it is possible to investigate the distance of each point in the test dataset from the different gas clusters generated by the LDA model, and to recognize the class it belongs to as a function of the distance: the smallest distance identifies the class which the point belongs to. To carry out this task, the Mahalanobis distance algorithm, being the most exploited for classification, has been considered and the results are reported in Figure 7. In this example, it is clear that point 1 belongs to the ammonia cluster, point 2 to the acetone class, point 3 to the 2-propanol class, and point 6 to the water class. Regarding points 4 and 5, the smallest distances are for ethanol and sodium hypochlorite, respectively, but those points are also quite close to the sodium hypochlorite and ethanol groups, respectively. This result is in agreement with the LDA model presented in Figure 6b, where it is possible to notice that the ethanol and sodium hypochlorite contributions cluster very closely.
Of note is also that the highest distance values are obtained considering point 1 and water and point 6 and ammonia; also in this case, this result is quite expected since point 1 belongs to ammonia cluster and point 6 to water group and ammonia and water contributions are quite distant in the LDA model (Figure 6b).
Finally, Figure 8 shows the accuracy of internal validation and prediction for several combinations of training and test dataset sizes (i.e., 32-0, 30-2, 28-4, 26-6, 24-8, 21-11, 19-13, and 17-15), obtained by picking 10 random subsets for each case. The internal CV accuracy ranges from 100% to 67%, while the prediction accuracy ranges from 100% to 94%. These results support the application potential of this sensor array, even with a relatively small dataset such as the one reported in this study. While PCA performs remarkably well in discriminating each gas contribution, LDA should be preferred for practical applications where prediction with a specified accuracy is required.
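The repeated random splitting described above can be illustrated with the sketch below. It is only a schematic under stated assumptions: a simple nearest-centroid classifier stands in for LDA to keep the example free of external dependencies, the five-feature samples are synthetic, and the names (`fit_centroids`, `predict`, `holdout_accuracy`) are hypothetical. For each of `n_repeats` random splits, `n_test` samples are held out, the classifier is trained on the remainder, and the fraction of correctly predicted held-out labels is averaged.

```python
import random


def fit_centroids(samples):
    """samples: list of (features, label). Return {label: centroid vector}."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the label of the centroid closest to x (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def holdout_accuracy(samples, n_test, n_repeats=10, seed=0):
    """Mean prediction accuracy over n_repeats random training/test splits."""
    if not 1 <= n_test < len(samples):
        raise ValueError("n_test must be between 1 and len(samples) - 1")
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        centroids = fit_centroids(train)
        hits = sum(predict(centroids, x) == y for x, y in test)
        scores.append(hits / n_test)
    return sum(scores) / len(scores)


# Synthetic (hypothetical) responses of a five-sensor array to three gases.
centers = {
    "ammonia": [5.0, 4.0, 6.0, 5.0, 4.0],
    "acetone": [1.0, 0.5, 0.2, 1.5, 0.3],
    "water": [-2.0, -1.0, -3.0, -2.0, -1.0],
}
offsets = [-0.1, 0.0, 0.1, 0.2]
samples = [
    ([c + d for c in center], label)
    for label, center in centers.items()
    for d in offsets
]

print(holdout_accuracy(samples, n_test=2))
```

Averaging over several random splits, as in the sketch, reduces the dependence of the reported accuracy on any single choice of held-out points, which matters most when the dataset is small.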

Conclusions
The goal of the present work was to develop a graphene-based sensor array in a chemiresistor configuration and to explore, by means of a few chemometric tools, its capability to discriminate different gases related to specific applications, such as breathomics, environmental monitoring, and food quality tracking. The array has been prepared by dropcasting solutions of graphene nanostructures and graphene nanocomposites on interdigitated alumina/silver substrates and, after characterization, five different samples have been mounted on a properly designed platform. The goal has then been achieved in two steps: (i) ammonia exposures have been performed, the calibration curve of each sensor has been drawn, the limit of detection has been evaluated and found to be in the low ppb range for all sensors, a remarkable stability of all sensor responses up to 3 months has been found, and finally the results have been compared to the literature, disclosing a good sensitivity for the present chemiresistors, in line with what has been published so far; (ii) exposures to acetone, 2-propanol, ethanol, sodium hypochlorite, and water have been carried out with the aim of preparing a dataset for chemometric analyses. These gases have been selected because of their interfering nature in many applications, including those at which the presented array is aimed. In detail, it has first been shown that the array can clearly discriminate the tested gases in the PCA space, considering both the PC1-PC2 and PC1-PC3 2D subspaces. Moreover, the role of each sensor in the discrimination has been investigated and discussed, supporting the claims with experimental evidence. In particular, thanks to the loading plots, it has been found that all sensors are responsible for the discrimination of ammonia and water vapour, while Gr dispersion and Gr_TiO2 are required to discriminate acetone, and Gr_nanoplatelets is essential for sodium hypochlorite discrimination.
The capability of the array to recognize unknown data with high accuracy has been demonstrated by LDA, also investigating different training and test dataset sizes; specifically, the prediction accuracy is always above 94%, even when the training dataset is halved.
Finally, an approach to identify unknown data has been suggested and tested by exploiting the Mahalanobis distance algorithm.

Further improvements in the ∆R/R0 response could be expected from the optimization of each layer thickness, as observed for instance in CNT chemiresistors [67]. This is left for future studies, as it goes beyond the scope of the present work.
The low cost and simplicity of the preparation technique, as well as the ease of use and simplicity of the sensor array itself, coupled with its remarkable discrimination and predictive capability, make the presented array a promising candidate for environmental monitoring, food quality tracking, and breathomics applications.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s23020882/s1, Figure S1: Set up for gas exposures; Table S1: Freundlich fitting parameters and evaluated limit of detection; Figure S2: Stability of sensor response over time; Figure S3: Stability of the sensor response at different working temperatures; Figure S4: Sensing mechanism; Figure S5: Sensitivity benchmarking; Table S2: Concentration range for all tested gases; Tables S3 and S4: Confusion matrix for LDA cross validation with accuracy percentage evaluation; Table S5: Confusion matrix for LDA predictive capability.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.