Linear and Non-Linear Modelling Methods for a Gas Sensor Array Developed for Process Control Applications

New process developments linked to Power to X (energy storage or energy conversion to another form of energy) require tools to perform process monitoring. The main gases involved in these types of processes are H2, CO, CH4, and CO2. Because of the non-selectivity of the sensors, a multi-sensor matrix has been built in this work based on commercial sensors having very different transduction principles, and, therefore, providing richer information. To treat the data provided by the sensor array and extract gas mixture composition (nature and concentration), linear (Multi Linear Regression—Ordinary Least Square “MLR-OLS” and Multi Linear Regression—Partial Least Square “MLR-PLS”) and non-linear (Artificial Neural Network “ANN”) models have been built. The MLR-OLS model was disqualified during the training phase since it did not show good results even in the training phase, which could not lead to effective predictions during the validation phase. Then, the performances of MLR-PLS and ANN were evaluated with validation data. Good concentration predictions were obtained in both cases for all the involved analytes. However, in the case of methane, better prediction performances were obtained with ANN, which is consistent with the fact that the MOX sensor’s response to CH4 is logarithmic, whereas only linear sensor responses were obtained for the other analytes. Finally, prediction tests performed on one-year aged sensor platforms revealed that PLS model predictions on aged platforms mainly suffered from concentration offsets and that ANN predictions mainly suffered from a drop of sensitivity.


Introduction
The development of green energies brings with it the problem of their intermittency.One solution is to interconnect electricity, gas and heat networks.In this way, surplus electrical energy produced at times of low household consumption could be stored in chemical form by producing H 2 through electrolysis [1].H 2 thus produced can be used for mobility applications, inserted in limited quantities into the domestic gas network, or used to capture CO 2 (produced during natural gas combustion) that will be transformed into CH 4 via the methanation reaction [2].
Nowadays, there are various processes for producing dihydrogen and methane.These include biogas reforming, pyrolysis or pyro-gasification of biomass, methanation (chemical or biological), and electrolysis.The molecules that are mainly involved in those processes (if we do not consider the different hydrocarbonated molecules that belong to the biomass) are: CH 4 , CO, CO 2 , H 2 , and H 2 O.
As the development in this Power to X area intensifies, issues related to process safety (CO or H 2 leakage, possibly concomitantly) and process control (measurement of CO 2 or CH 4 concentrations, possibly in the presence of H 2 ) are emerging.
To monitor and control industrial processes, monitor stack emissions or detect leaks, some companies are equipped with analyzers installed as close as possible to the part Sensors 2024, 24, 3499 2 of 23 of the process to be characterized (reactors, pipes, unit operations).The nature of these analyzers depends on the gases to be detected.Among the analyzer technologies used, we can find infrared analyzers, chromatography, Raman spectroscopy-based analyzers, and photoacoustic analyzers [3][4][5].However, analyzers of those types cost several tens of thousands of euros per unit.As a result, equipping a production unit with several analyzers can be quite costly.An alternative solution would be to measure gaseous composition using a multi-sensor platform, the cost of which would be reduced by a factor of 10 to 50 compared with an analyzer.After automatic sampling (using mass flowmeters), the gaseous solution to be analyzed may have to be diluted to ensure that the concentrations to be analyzed are compatible with the sensors' detection ranges, and reduced to atmospheric pressure, as most sensors have a restricted operating range in terms of pressure.
The well-known problem of sensors is their lack of selectivity [6][7][8][9].To overcome this problem, several approaches exist which have given rise to specific research.One of the methods currently used consists of modifying the composition of the sensor.This can be achieved by adding a selective sensitive layer that responds to one target gas [10][11][12][13] or by integrating a filter that will block access to the sensor's reaction sites to certain gases, similarly to the work reported by Gao et al. [14].Another method used to achieve good selectivity consists of using sensor arrays.This technique may be used as an alternative to the first one.In this case, no modification of the sensor's composition is made but signal treatment based on multivariate analysis enables the identification of the analytes' nature and concentration due to the increased size of data collected by the different sensors of the array [15,16].These arrays can use different sensors based on one transducing principle, or can group sensors with different transducing principles (arrays of MOX sensors [17][18][19], arrays of electrochemical sensors [20,21], and so on).
To achieve good prediction of a gas mixture composition, a model has to be built based on the results obtained in a first phase ("training" phase) before being validated through an independent dataset.Prediction models can be multilinear.In this case, the model will consist of a matrix.Concerning the linear methods classically used for modeling purposes, PCA calculates matrices to project variables into a new space, using a new matrix to show the degree of similarity between variables.This method is classically used in the sensor field to classify sensor signals of electronic noses into odor types [22] which can be useful in the food industry or in testing indoor/outdoor air quality, for example.However, this method is not relevant for the identification of both natures and concentrations of gases as is required in the case of process control.For this purpose, Multi Linear Regression-Ordinary Least Square (MLR-OLS) presents the advantage of being quite simple to implement [23,24].Modeling performances are interesting for calibration and for concentration predictions when the explanatory variables (models inputs including the sensor signals) are not correlated with each other.In this case, the PLS model is more appropriate.Indeed, this method is very effective, especially when the sensor signals are linear in their detecting range [25].For example, in a study performed by Karami, Rasekh [26], an e-nose with MOS sensor was used to detect oil oxidation.The reliability of the PLS method in detecting this phenomenon was the most interesting among the tested methods, and was assessed at 100%.
On the other hand, it seems interesting to consider non-linear models which can be more effective in the case of strong non-linearity in the input sensor signals.More or less complex artificial neural network (ANN)-based models will, for example, allow gas mixture composition prediction [27] and also gave good results when used with e-noses to assess the quality of products.As we will explain in the following sections, preliminary tests consisting of exposing sensors to mono-analyte gas compositions gave, in almost every case, linear responses in the targeted detection range.Non-linear models like neural networks are suitable for linear behaviors but, due to the fact that the extraction of the model parameters is based on reaching local minimums [28], the extracted parameters will not always correspond to the most adequate model.
Predictions can be biased due to numerous problems: different types of transitory phenomena (temperature or pressure evolution), ageing of the electronics for signal treatment, and drift/ageing of the sensors constituting the platform.Different reversible or irreversible processes may cause short-or long-term sensor drift.Reversible damage, which results in short-term drifts, can result from condensation of chemical vapor on the active surface of the sensor, physical adsorption of chemical compounds, or evolution of ambient atmospheric conditions (temperature, humidity, influence) [29].Irreversible damage can result from a brutal phenomenon, for example, the poisoning of MOX or electrochemical sensors with sulfur compounds or from a continuous evolution over time.This last case can be due to the evolution of the electronic components' dedicated signal treatment or from an evolution of the sensor's active materials (electrodes, semiconductor oxides, and heating element) due to surface chemical reactions or degradations due to mechanical stresses [30].
In this paper, two linear models and several neural network-based models with and without a hidden layer will be compared in terms of predictive capabilities.Single analytes and binary mixtures will be considered for developing and testing the models.In addition, the article also includes the study of the pertinence of the concentration predictions after one year of continuous use of the platform under controlled environmental conditions.In this case, predictions will be affected by irreversible damage due to continuous ageing of electronics or sensors.Comparison of the prediction performances of linear (PLS) and non-linear (ANN) methods, initially and after one year of continuous platform use, will be carried out.Indeed, one of the major objectives of this work will be to evaluate the evolution over time of the performance of commonly used linear and non-linear models for predicting gas concentrations in the simple case of binary mixtures.

Sensor Choice
The choice of sensors for the multi-sensor platform was made according to welldefined specifications.The first one concerned the targeted gases: CO 2 , H 2 , CH 4 , and CO, which are the most common gases in Power to X-linked processes.Additional temperature and humidity sensors are also needed since those two parameters will vary during the detection phase and since they constitute potential influences.The desirable gas concentration detection ranges are both a function of the gas concentrations that can be measured at specific points of the processes and the possibilities offered by commercial sensors.In order to detect traces of CO in H 2 (methanization) or leakage of CO, the targeted detection range expected for CO runs from a few ppm to some hundreds of ppm.For H 2 , requirements relate to its monitoring in a process and the detection of H 2 leaks.For safety reasons and for reasons of sensor range limitations, the targeted detection range for H 2 has been limited from a few hundred ppm to 1%.For CO 2 and CH 4 gases, the requirements concern only the gas concentration monitoring at different steps of the process.Ideally, sensor detection ranges should be from a few hundred ppm up to 100% but, as will be shown, the sensors' upper limit for those gases does not exceed a few tens of %.
In order to address the specifications, 5 commercial sensors were selected for this project.These ones were deliberately chosen with very different operating principles in order to maximize the versatility of the associated responses.
Prior to selecting the sensors, a study was carried out to determine the sensor technologies that could be used to detect the gases of interest to the project (CO, CO 2 , CH 4 , and H 2 ).The results are summarized in Table 1.
Sensor technologies can be classified into two families: chemical and physical sensors.The first family is based on the change of an electrical output characteristic's value due to a chemical reaction.CO 2 is a weakly reactive molecule.Therefore, chemical sensors (MOX, catalytic, electrochemical) are not the most effective to detect it.Commercially, two physical sensor technologies exist for the detection of CO 2 : Non-Dispersive Infrared (NDIR) sensors and photo-acoustic sensors.The second technology is relatively recent and only a few constructors propose it.The photo-acoustic effect is based on the absorption (by target molecules) of a modulated (or pulsed) light beam.As the molecules de-energize through collisions, they generate sound waves that are detected by a condenser microphone.Yet, the technology that is mainly used for the detection of CO 2 remains the classical infrared absorption-based one.As for the larger IR analyzers, those sensors will be based on the detection of a change of luminous intensity due to absorption of NDIR rays by the gaseous analyte.In an MOX sensor, the variation in electrical conductivity of an oxide semiconductor layer is measured as a function of the presence of chemisorbed gaseous analytes (redox interface reactions).Concerning electrochemical sensors, redox reactions are also involved but at electrode/electrolyte/gas interfaces.Those reactions will affect the interface resistance of the working electrode (mainly) and change the electromotive force measured between the electrodes.For catalytic sensors, a specific combustion (redox) reaction will occur on two alumina beads: a reference bead and another bead (for measurement) covered with a catalyst which decreases the combustion reaction temperature.Those alumina beads are traversed by a platinum wire, which is an RTD material, i.e., a material which changes resistance with temperature.Resistance variation due to analyte combustion is measured thanks to a Wheatstone bridge involving both the reference and the measurement bead resistances.In those three technologies of sensors, electrochemical reactions are involved.Due to the variety of semiconductor oxides used in MOX sensors, commercial references were found for CO, H 2 and CH 4 compounds.For the electrochemical sensors, a lot of references were found for CO detection and references were found for H 2 detection also.Concerning the catalytic sensors, references found mainly concerned the detection of hydrocarbons (including CH 4 ) and H 2 .
Table 1.Technologies of sensors commercially available and corresponding to our specifications according to the target gas (X: Only a few commercial references found, XX: Many commercial references could be found but only a few corresponding to our requirements, XXX: many commercial references corresponding to our specifications found).

CO
CO 2 H 2 CH 4 MOX (semiconductors) X X X

XXX X
From the references identified, the next step was the selection of the sensors that will be used in this work.This selection was performed according to many criteria: respect of the process linked specification (presented earlier), the versatility of the sensor signal expected (it was important to avoid collinearity between the sensor responses), and low number of interferents (especially humidity and temperature interferents).
Table 2 lists the name, model, type, detection range, and target gases for each selected sensor.A digital NDIR sensor was chosen for CO 2 measurement.This one incorporates a temperature sensor as well as a humidity sensor, so that the CO 2 concentration signal delivered by the sensor incorporates compensation for temperature and humidity variations.Hence, our system will eventually have 7 sensors.The temperature and the humidity sensor will also be used in the sensor network as input parameters in the multi-linear models, but not in the ANN as we feared this could lead to overfitting.Unlike other sensors, the NDIR sensor is selective.Determining CO 2 concentration will therefore not require multivariate analysis as will be the case for the other gases.The platform also incorporates two electrochemical sensors (EC-H 2 and EC-CO) which are highly sensitive to H 2 and CO, but not selective.Finally, a catalytic sensor (CATA) and a Metal Oxide sensor (MOX) were chosen for their sensitivity to CO, H 2 , and CH 4 .

Experimental Setup
A gas bench equipped with a series of flowmeters was used to generate gas mixtures of specified compositions with a total gas flow of 30 L/h (Figure 1a).In order to expose the sensor platform to single analyte gases or binary mixtures, 2 sensor platforms of 7 sensors (if we include the temperature and humidity sensors) were introduced in sealed cells such as the one shown in Figure 1b and exposed to the gas mixtures.The sensor signals were conditioned using commercial or laboratory-developed analogic and digital electronics as can be seen in Figure 2. EC-CO and EC-H 2 sensors require a special conditioning step performed by analogic circuit boards.However, for the rest of the sensors, the conditioning is performed by the central "laboratory-made" circuit board.Finally, signals are digitized and computerized using an Arduino board for NDIR CO 2 sensors, and a National Instruments (NI) board for the others.Once the sensor data have been collected, behavioral modeling of the platform is carried out using Excel software for the MLR-OLS method (Multi Linear Regression-Ordinary Least Square), the Python algorithm using the "PartialLeastSquares" library for the MLR-PLS method (Multi Linear Regression-Partial Least Square) and Keras/Tensorflow in Python for the neural network method.These models will be used for the final tests to predict a gas composition from the response of the sensor array immediately in the weeks after the model building and also after an ageing period of one year.
NDIR sensor is selective.Determining CO2 concentration will therefore not require multivariate analysis as will be the case for the other gases.The platform also incorporates two electrochemical sensors (EC-H2 and EC-CO) which are highly sensitive to H2 and CO, but not selective.Finally, a catalytic sensor (CATA) and a Metal Oxide sensor (MOX) were chosen for their sensitivity to CO, H2, and CH4.

Experimental Setup
A gas bench equipped with a series of flowmeters was used to generate gas mixtures of specified compositions with a total gas flow of 30 L/h (Figure 1a).In order to expose the sensor platform to single analyte gases or binary mixtures, 2 sensor platforms of 7 sensors (if we include the temperature and humidity sensors) were introduced in sealed cells such as the one shown in Figure 1b and exposed to the gas mixtures.The sensor signals were conditioned using commercial or laboratory-developed analogic and digital electronics as can be seen in Figure 2. EC-CO and EC-H2 sensors require a special conditioning step performed by analogic circuit boards.However, for the rest of the sensors, the conditioning is performed by the central "laboratory-made" circuit board.Finally, signals are digitized and computerized using an Arduino board for NDIR CO2 sensors, and a National Instruments (NI) board for the others.Once the sensor data have been collected, behavioral modeling of the platform is carried out using Excel software for the MLR-OLS method (Multi Linear Regression-Ordinary Least Square), the Python algorithm using the "PartialLeastSquares" library for the MLR-PLS method (Multi Linear Regression-Partial Least Square) and Keras/Tensorflow in Python for the neural network method.These models will be used for the final tests to predict a gas composition from the response of the sensor array immediately in the weeks after the model building and also after an ageing period of one year.

Role of Mono-Analyte Tests
Before exposing the sensor array to complex binary mixtures, it was important to study the sensor responses to each analyte.The goal of these "mono-analyte tests" are of different natures: to verify the reproducibility of the sensor responses, -to check that the sensor drift is limited and close to zero, -to analyze the transfer function linking the gas concentration of the analyte and the sensor responses (linear or not), -to check that the sensor responses to the introduced analytes are sufficiently uncorrelated to have enough variability in the information collected.If two sensors respond the same way to all the analytes, they finally bring "collinear" information, which would be prejudicial for the models.Indeed, it can lead to overfitting so that the model will almost perfectly learn to match the training data but will be unable to capture the validation data.

Sensor Network Exposure to Both Mono-Analyte and Binary Mixtures
A LabVIEW program is used both to control the flowmeters and therefore the gas mixture composition in the gas line according to time, and also to collect the data from the different sensors and gather them in a specific file.The program can be fed with a file containing sequences of gas compositions at different times that will be applied to the different flowmeters to obtain the expected gas concentration evolution as a function of time (Figure 3).Each sequence lasts between 30 and 60 min and a set of sequences is structured in the following way: first sequence under "base gas": 12% O2/1% absolute humidity/N2, -several sequences including introduction of analytes alone or in binary analyte mixtures, -last sequence under "base gas": 12% O2/1% absolute humidity/N2 to verify the return of the sensor signals to the base line, i.e., verify that the signal corresponding to the  to check that the sensor responses to the introduced analytes are sufficiently uncorrelated to have enough variability in the information collected.If two sensors respond the same way to all the analytes, they finally bring "collinear" information, which would be prejudicial for the models.Indeed, it can lead to overfitting so that the model will almost perfectly learn to match the training data but will be unable to capture the validation data.

Sensor Network Exposure to Both Mono-Analyte and Binary Mixtures
A LabVIEW program is used both to control the flowmeters and therefore the gas mixture composition in the gas line according to time, and also to collect the data from the different sensors and gather them in a specific file.The program can be fed with a file containing sequences of gas compositions at different times that will be applied to the different flowmeters to obtain the expected gas concentration evolution as a function of time (Figure 3).Each sequence lasts between 30 and 60 min and a set of sequences is structured in the following way: first sequence under "base gas": 12% O first sequence is the same as the signal at this last sequence (no drift of the sensor signals).

Modelling Step: Behavior Model Construction
During this step, the signals collected during the phase of sensor network exposure to the different single and binary analyte mixtures will be used.They will constitute the input data from our models and will be gathered in a matrix X constituted of the elements Xij, where "i" corresponds to the sampled point number (which can be linked to time knowing the sampling frequency) and "j" corresponds to the sensor number: 1 to 14 (2 cells with 7 sensors each).The output matrix of the model is a table Y constituted of the elements Yik, containing the concentration evolution of the four targeted analytes according to time.Here, "i" also corresponds to the sampled point number and "k" corresponds to the analyte number (from 1 to 4).
Concerning linear models, MLR-OLS [31,32] and MLR-PLS [33,34] methods were chosen.The MLR-OLS model is a modelling method in which the empirical estimation of a calibration matrix, C, allows us to use experimental sensor data (matrix ) to get a prediction matrix  (which corresponds, here, to modelled gas concentration values).C is composed of the elements  and is determined by least squares minimization (parameter RMSE, Root Mean Square Error) between modelled values  and experimental values  :  = ∑ .The RMSE is calculated for each analyte "k".In the linear model, the relationship existing between the sensors' matrix signals and the analyte concentrations is the following: where  is a one-column vector composed of  elements (number of sampled points) whose components are all equal to 1.  is a one line vector composed of  elements (14 sensors here).It corresponds to the sensors' response vector when no analyte is introduced. and the calibration matrix  determined during the training phase will constitute the parameters of the model (′ being the transpose of ).
To predict the value of the concentration matrix in the model validation phase, linear algebra is used to extract the concentration matrix  : In the case where the number of predicted variables (matrix  ) is rather high and at the same time the amount of information from experimental data (matrix ) insufficient, the OLS method becomes unstable because the system is undetermined.Similarly, when the number of experimental variables (sensors as predictors) is large and the amount of data used in the model-building phase is insufficient, OLS models then suffer from multicollinearity and overfitting problems.
The MLR-PLS model will then seek to define a model that will maximize the covariance between  and  using latent variables.These variables replace explanatory varia-

Modelling Step: Behavior Model Construction
During this step, the signals collected during the phase of sensor network exposure to the different single and binary analyte mixtures will be used.They will constitute the input data from our models and will be gathered in a matrix X constituted of the elements X ij , where "i" corresponds to the sampled point number (which can be linked to time knowing the sampling frequency) and "j" corresponds to the sensor number: 1 to 14 (2 cells with 7 sensors each).The output matrix of the model is a table Y constituted of the elements Y ik , containing the concentration evolution of the four targeted analytes according to time.
Here, "i" also corresponds to the sampled point number and "k" corresponds to the analyte number (from 1 to 4).
Concerning linear models, MLR-OLS [31,32] and MLR-PLS [33,34] methods were chosen.The MLR-OLS model is a modelling method in which the empirical estimation of a calibration matrix, C, allows us to use experimental sensor data (matrix X) to get a prediction matrix Ŷ (which corresponds, here, to modelled gas concentration values).C is composed of the elements C kj and is determined by least squares minimization (parameter RMSE, Root Mean Square Error) between modelled values Ŷik and experimental values Y ik : . The RMSE is calculated for each analyte "k".In the linear model, the relationship existing between the sensors' matrix signals and the analyte concentrations is the following: where ones is a one-column vector composed of i max elements (number of sampled points) whose components are all equal to 1. R 0 is a one line vector composed of j max elements (14 sensors here).It corresponds to the sensors' response vector when no analyte is introduced.R 0 and the calibration matrix C determined during the training phase will constitute the parameters of the model (C ′ being the transpose of C).
To predict the value of the concentration matrix in the model validation phase, linear algebra is used to extract the concentration matrix Ŷ: In the case where the number of predicted variables (matrix Ŷ) is rather high and at the same time the amount of information from experimental data (matrix X) insufficient, the OLS method becomes unstable because the system is undetermined.Similarly, when the number of experimental variables (sensors as predictors) is large and the amount of data used in the model-building phase is insufficient, OLS models then suffer from multi-collinearity and overfitting problems.
The MLR-PLS model will then seek to define a model that will maximize the covariance between X and Y using latent variables.These variables replace explanatory variables (sensor signals in our case) with a more or less strong collinearity.Indeed, they constitute linear combinations of those in which the factor affected by each explanatory variable is chosen so as to maximize the covariance between the newly created latent vector and the concentration matrix.Then, the multilinear regression is not performed on the explanatory variables but on the latent variables.In this work, a version of the algorithm developed by Abdi et al. [35] in 2010 was used through Python code.
We also decided to model the relationship between the analyte concentrations and the sensor signals through a series of artificial neural networks (ANN) [36], as illustrated by Figure 4 for the case of H 2 concentration prediction.
When  = 0, there is no hidden layer and the concentration of the considered analyte (H2 in the following equation) can be determined by: where  is the value returned by the j-th sensor of the regressor combination at time i and the  are weights to be optimised based on the training data set. is the weight of the bias neuron.In the presence of a hidden layer, the pollutant concentration (for instance H2) can be calculated using the following equation: where the weights  , , and  , are to be optimised based on the training data set. , and  , , are the weights of the bias neurons for the output and the hidden layer, respectively.The weights are optimized through the minimization of RMSE given by: where ntime is the number of time points in the training data set,  , , is the considered experimental species concentration at the k-th time point of the training data set, and  , , is the prediction of this value by the ANN.The Gradient Descent Method with momentum [37] was employed for the optimization.After a random initialization, f act,1 = tanh and f act,2 = linear are the two activation functions we have chosen.When n hidden = 0, there is no hidden layer and the concentration of the considered analyte (H 2 in the following equation) can be determined by: where X ij is the value returned by the j-th sensor of the regressor combination at time i and the ω j are weights to be optimised based on the training data set.ω 0 is the weight of the bias neuron.In the presence of a hidden layer, the pollutant concentration (for instance H 2 ) can be calculated using the following equation: where the weights ω 1,k,j and ω 2,j are to be optimised based on the training data set.ω 2,0 and ω 1,0,j are the weights of the bias neurons for the output and the hidden layer, respectively.The weights are optimized through the minimization of RMSE given by: where n time is the number of time points in the training data set, Y species,k , is the considered experimental species concentration at the k-th time point of the training data set, and Y species,ANN,k is the prediction of this value by the ANN.The Gradient Descent Method with momentum [37] was employed for the optimization.After a random initialization, the weights of the neural network are iteratively adapted according to these coupled equations: Sensors 2024, 24, 3499 9 of 23 With: ε is the learning rate which was set to 0.2, whereas ρ is the momentum constant which was set to 0.9, and v 0 is set to 0 in Keras/Tensorflow.
If ∆ t RMSE = RMSE t − RMSE t−1 < 10 −4 for a duration equal to patience = 500 iterations, the optimisation is stopped as we consider that no further significant improvement can be made.The optimisation is always stopped after a maximum of 10,000 iterations.It is worth emphasizing that this is the RMSE of the training data set as the validation data set is only used after the end of the optimization to test the ability of the model to predict independent experiments.Both the regressors (sensors) and the target concentrations were normalized before the beginning of the training of the ANN so that they only take on values between 0 and 1.This is preferable if we want to use activation functions such as "tanh".

Mono-Analyte Tests
Mono-analyte test results presented in Figures 5-8 allowed us to answer all the questions raised in part 2.3 and to validate the potentialities of the sensor array to be used for predictions.The first point to validate was reproducibility.All the results presented in Figures 5-8 were reproduced two times and, in each case, less than 10% difference was observed between the sensor responses, which is, in the case of analogue sensors, quite reasonable.Then, it was important to check that no major drifts occurred in the sensor responses, i.e., that the base line before and after the introduction of the analytes is at the same level and that the slope of the response curve according to time finally tends towards a horizontal asymptote after the gas mixture composition has been changed.This also could be validated in the results shown in Figures 5-8.Another very important point was to check that the difference in the sensor array's responses from one gas to another was sufficient to discriminate the target gases.It can be seen from Figures 5 and 6 that sensor responses to H 2 and CO are relatively close even though the amplitudes of the responses of the different sensors are not comparable.One element that could permit to discriminate H 2 and CO when they are in binary mixtures with another gas is the fact that H 2 induces a significant response in the MOX sensor, whereas for CO this is not the case.Concerning CH 4 , only the MOX sensor responds to this analyte (Figure 7), in contrast to H 2 or CO.Mixtures of H 2 and CH 4 may be less convenient to identify for low concentrations of CH 4 since the correlation between H 2 concentration and MOX sensor is also very strong.For the three gases mentioned in this paragraph, multivariate analysis is required since two conditions are not fulfilled for those gases: induce a response in only one sensor; -this last sensor should only respond to this analyte.
However, this is the case for CO 2 .Indeed, CO 2 induces a response only in the NDIR sensor (Figure 8) and, at the same time, this last sensor does not respond to the other analytes (Figures 5-7-curve f).Thus, CO 2 will not require any multivariate analysis to be discriminated.The information from the NDIR sensor will be sufficient.Therefore, in the multivariate analysis performed in the following sections, CO 2 will not be involved.

Sensor Transfer Function
The sensor transfer function towards each analyte is a very important piece of information to collect to get indications of the type of models that may be the most suitable to perform concentration predictions.Indeed, if most of the sensor responses to the analytes were not linear, attempts to build a linear model to perform prediction would be unsuitable.From Figure 9, it can be noticed that most of the transfer functions are linear for all the considered sensors and analytes.Two exceptions to this statement exist.The first one concerns the response of the catalytic sensor to H2.In this case, the response is linear until

Sensor Transfer Function
The sensor transfer function towards each analyte is a very important piece of information to collect to get indications of the type of models that may be the most suitable to perform concentration predictions.Indeed, if most of the sensor responses to the analytes were not linear, attempts to build a linear model to perform prediction would be unsuitable.From Figure 9, it can be noticed that most of the transfer functions are linear for all the considered sensors and analytes.Two exceptions to this statement exist.The first one concerns the response of the catalytic sensor to H2.In this case, the response is linear until

Sensor Transfer Function
The sensor transfer function towards each analyte is a very important piece of information to collect to get indications of the type of models that may be the most suitable to perform concentration predictions.Indeed, if most of the sensor responses to the analytes were not linear, attempts to build a linear model to perform prediction would be unsuitable.From Figure 9, it can be noticed that most of the transfer functions are linear for all the considered sensors and analytes.Two exceptions to this statement exist.The first one concerns the response of the catalytic sensor to H 2 .In this case, the response is linear until a concentration of 600 ppm, after which there seems to be an asymptotic behavior (saturation of the sensor).The second exception concerns the MOX sensor's response to CH 4 , for which the sensor's response is purely logarithmic.
a concentration of 600 ppm, after which there seems to be an asymptotic behavior (saturation of the sensor).The second exception concerns the MOX sensor's response to CH4, for which the sensor's response is purely logarithmic.
Despite these two non-linear transfer functions, multi-linear models remain good candidates to perform prediction of analyte concentrations because of the measurement uncertainty (mainly composed of reproducibility errors) which could be less amplified by linear models.

Building Up Models from Training Data
To build up the different models (linear and non-linear), four specific test sequences were carried out twice each.To avoid overloading the article, these specific test sequences will be presented in Appendix A in graphical form.The sensor data from these tests were collected, formatted and used to build the models.On the raw sensor data, the transient parts correspond to both the response time of the multi-sensor array and the air renewal dynamics in the large 0.4 L cell volume (which is a function of the total flow rate of 30 L/h on the two cells).Those transient parts were removed in the data used to build up the model, leaving only the stationary parts.The data are composed of 14 columns for the 14 sensors (two cells of seven sensors) that will constitute the explanatory variables of the different models plus four columns containing the gas concentrations that will constitute the response variables.Those concentrations are accurately known since the LabVIEW program controls them through the flowmeters.From the model parameters estimated based on the training data, "modeled" analyte concentrations could be calculated using sensor responses and compared to the "real" experimental concentrations used during the training tests.The accuracy of these concentration predictions is evaluated based on the RMSE (Table 3).The only linear model enabling good concentration predictions on the training data is MLR-PLS.In Figure 10, analyte concentration predictions performed on the training dataset with this last model are represented as a function of time and compared to experimentally imposed values of H2, CO, and CH4 concentrations.It can be noticed that good prediction results are globally obtained for those three analytes.Yet, concerning the base line (concentration of 0 ppm of analytes), some false positive or negative values are obtained for CO and CH4.Those false positive or negative concentration prediction values are, then, expected to occur also on predictions performed on validation tests.
A series of artificial neural networks based on different sensor combinations as regressors and numbers of neurons in the hidden layer were trained and validated, as explained in Appendix B: Results of the Neural Networks.While the quality of the prediction of the validation data set for CO generally increases with the complexity of the ANN structures, this is not the case for CH4 where we can only see a decrease in the training RMSE.When it comes to CH4, the training RMSE sometimes increases as the complexity of the ANN is increased, which can only stem from the optimization method becoming stuck Despite these two non-linear transfer functions, multi-linear models remain good candidates to perform prediction of analyte concentrations because of the measurement uncertainty (mainly composed of reproducibility errors) which could be less amplified by linear models.

Building up Models from Training Data
To build up the different models (linear and non-linear), four specific test sequences were carried out twice each.To avoid overloading the article, these specific test sequences will be presented in Appendix A in graphical form.The sensor data from these tests were collected, formatted and used to build the models.On the raw sensor data, the transient parts correspond to both the response time of the multi-sensor array and the air renewal dynamics in the large 0.4 L cell volume (which is a function of the total flow rate of 30 L/h on the two cells).Those transient parts were removed in the data used to build up the model, leaving only the stationary parts.The data are composed of 14 columns for the 14 sensors (two cells of seven sensors) that will constitute the explanatory variables of the different models plus four columns containing the gas concentrations that will constitute the response variables.Those concentrations are accurately known since the LabVIEW program controls them through the flowmeters.From the model parameters estimated based on the training data, "modeled" analyte concentrations could be calculated using sensor responses and compared to the "real" experimental concentrations used during the training tests.The accuracy of these concentration predictions is evaluated based on the RMSE (Table 3).The only linear model enabling good concentration predictions on the training data is MLR-PLS.In Figure 10, analyte concentration predictions performed on the training dataset with this last model are represented as a function of time and compared to experimentally imposed values of H 2 , CO, and CH 4 concentrations.It can be noticed that good prediction results are globally obtained for those three analytes.Yet, concerning the base line (concentration of 0 ppm of analytes), some false positive or negative values are obtained for CO and CH 4 .Those false positive or negative concentration prediction values are, then, expected to occur also on predictions performed on validation tests.

Validation Tests and Comparison of Models
In this section on validation tests, the best linear model and best non-linear model are selected and used to perform prediction on data that have not been used to construct the model.The sensor data used for prediction were collected during gaseous exposure of the platform to the sequence shown in Figure 12 for the gases H2, CH4, and CO, alone or in binary mixtures.Gas concentration prediction results for PLS and ANN methods are shown in Figures 13 and 14, respectively.The predictions made for the three gases are relatively satisfactory in terms of determining the nature of the gases present.However, there are still some imperfections: over/underestimation of concentrations, and false positives/negatives.Prediction of H2 and CO gas concentrations seems quite comparable between PLS and ANN methods even if H2 prediction seems slightly better with the PLS method, as is confirmed in Table 4. Concerning CH4, prediction results are better with the ANN method as can be seen in Figures 13 and 14  A series of artificial neural networks based on different sensor combinations as regressors and numbers of neurons in the hidden layer were trained and validated, as explained in Appendix B: Results of the Neural Networks.While the quality of the prediction of the validation data set for CO generally increases with the complexity of the ANN structures, this is not the case for CH 4 where we can only see a decrease in the training RMSE.When it comes to CH 4 , the training RMSE sometimes increases as the complexity of the ANN is increased, which can only stem from the optimization method becoming stuck inside local minima, since the most complex structures include the less complex ones as a special case.
The results of the different models for the training phase can be seen in Table 3.The obtained RMSE are quite comparable with those obtained with the PLS method even if the prediction of H 2 concentration seems a little less accurate for the best ANN model compared to the PLS model.These results are confirmed in Figure 11 in which we can see that the prediction results are quite comparable to those obtained with the PLS method (Figure 10).
Sensors 2024, 24, x FOR PEER REVIEW 13 of 24 inside local minima, since the most complex structures include the less complex ones as a special case.The results of the different models for the training phase can be seen in Table 3.The obtained RMSE are quite comparable with those obtained with the PLS method even if the prediction of H2 concentration seems a little less accurate for the best ANN model compared to the PLS model.These results are confirmed in Figure 11 in which we can see that the prediction results are quite comparable to those obtained with the PLS method (Figure 10).

Validation Tests and Comparison of Models
In this section on validation tests, the best linear model and best non-linear model are selected and used to perform prediction on data that have not been used to construct the model.The sensor data used for prediction were collected during gaseous exposure of the platform to the sequence shown in Figure 12 for the gases H2, CH4, and CO, alone or in binary mixtures.Gas concentration prediction results for PLS and ANN methods are shown in Figures 13 and 14, respectively.The predictions made for the three gases are relatively satisfactory in terms of determining the nature of the gases present.However, there are still some imperfections: over/underestimation of concentrations, and false positives/negatives.Prediction of H2 and CO gas concentrations seems quite comparable between PLS and ANN methods even if H2 prediction seems slightly better with the PLS method, as is confirmed in Table 4. Concerning CH4, prediction results are better with the ANN method as can be seen in Figures 13 and 14

Validation Tests and Comparison of Models
In this section on validation tests, the best linear model and best non-linear model are selected and used to perform prediction on data that have not been used to construct the model.The sensor data used for prediction were collected during gaseous exposure of the platform to the sequence shown in Figure 12 for the gases H 2 , CH 4 , and CO, alone or in binary mixtures.Gas concentration prediction results for PLS and ANN methods are shown in Figures 13 and 14, respectively.The predictions made for the three gases are relatively satisfactory in terms of determining the nature of the gases present.However, there are still some imperfections: over/underestimation of concentrations, and false positives/negatives.Prediction of H 2 and CO gas concentrations seems quite comparable between PLS and ANN methods even if H 2 prediction seems slightly better with the PLS method, as is confirmed in Table 4. Concerning CH 4 , prediction results are better with the ANN method as can be seen in Figures 13 and 14 and confirmed by an RMSE result of 497 compared to 755 for the PLS method (Table 4).Less effective prediction of CH 4 with the PLS method is consistent with the fact that CH 4 only induces a significant response on the MOX sensor, compared to 755 for the PLS method (Table 4).Less effective prediction of CH4 with the PLS method is consistent with the fact that CH4 only induces a significant response on the MOX sensor, which is logarithmic.Thereby, the limits of the linear PLS model compared to ANN are shown here, even if CH4 concentration prediction results are consistent.

Data Post-Treatment
For the data predicted by the best linear and non-linear models, there are still problems that seem avoidable, particularly when the predicted concentration values are negative or positive while the actual concentrations seen by the sensor network are zero.To overcome these false negatives and false positives, a post-processing algorithm has been developed.This is based on the knowledge of sensor signal values when analyte concentrations are zero, and will be limited to concentration predictions in the latter case only, in order to avoid introducing bias.It consists of setting the analyte concentration to 0 when PLS method is consistent with the fact that CH4 only induces a significant response on the MOX sensor, which is logarithmic.Thereby, the limits of the linear PLS model compared to ANN are shown here, even if CH4 concentration prediction results are consistent.

Data Post-Treatment
For the data predicted by the best linear and non-linear models, there are still problems that seem avoidable, particularly when the predicted concentration values are negative or positive while the actual concentrations seen by the sensor network are zero.To overcome these false negatives and false positives, a post-processing algorithm has been developed.This is based on the knowledge of sensor signal values when analyte concentrations are zero, and will be limited to concentration predictions in the latter case only, in order to avoid introducing bias.It consists of setting the analyte concentration to 0 when  compared to 755 for the PLS method (Table 4).Less effective prediction of CH4 with the PLS method is consistent with the fact that CH4 only induces a significant response on the MOX sensor, which is logarithmic.Thereby, the limits of the linear PLS model compared to ANN are shown here, even if CH4 concentration prediction results are consistent.

Data Post-Treatment
For the data predicted by the best linear and non-linear models, there are still problems that seem avoidable, particularly when the predicted concentration values are negative or positive while the actual concentrations seen by the sensor network are zero.To overcome these false negatives and false positives, a post-processing algorithm has been developed.This is based on the knowledge of sensor signal values when analyte concentrations are zero, and will be limited to concentration predictions in the latter case only, in order to avoid introducing bias.It consists of setting the analyte concentration to 0 when

Data Post-Treatment
For the data predicted by the best linear and non-linear models, there are still problems that seem avoidable, particularly when the predicted concentration values are negative or positive while the actual concentrations seen by the sensor network are zero.To overcome these false negatives and false positives, a post-processing algorithm has been developed.This is based on the knowledge of sensor signal values when analyte concentrations are zero, and will be limited to concentration predictions in the latter case only, in order to avoid introducing bias.It consists of setting the analyte concentration to 0 when the signals of the different sensors constituting the platform remain under thresholds specific to each sensor.Post-treatment results are presented in Figures 15 and 16 and Table 5 for MLR-PLS and ANN models, respectively.The corrections are very effective for the H 2 analyte.In this case, both false negative and positive could be removed.In the case of the CH 4 analyte, results are very good too, even if the algorithm seems less effective at removing the few false positives obtained.Finally, post-treatment results obtained in the case of the CO analyte are less good, especially when CH 4 is in binary mixture with H 2 .In the latter case, in spite of the absence of CO, the proximity of the sensor responses to H 2 and CO makes it very difficult to distinguish H 2 from CO and CO predicted concentration is, in this case, not equal to 0.
the signals of the different sensors constituting the platform remain under thresholds specific to each sensor.Post-treatment results are presented in Figures 15 and 16 and Table 5 for MLR-PLS and ANN models, respectively.The corrections are very effective for the H2 analyte.In this case, both false negative and positive could be removed.In the case of the CH4 analyte, results are very good too, even if the algorithm seems less effective at removing the few false positives obtained.Finally, post-treatment results obtained in the case of the CO analyte are less good, especially when CH4 is in binary mixture with H2.In the latter case, in spite of the absence of CO, the proximity of the sensor responses to H2 and CO makes it very difficult to distinguish H2 from CO and CO predicted concentration is, in this case, not equal to 0.

Ageing of Sensors
Next, the multi-sensor platforms were maintained for a year under electrical power, atmospheric pressure, and air.At the end of the year, the gas sequence to which the sensors had been exposed for the prediction testing a year earlier was used again, to check the relevance of the prediction model.Figure 17 shows a comparison between sensor responses before ageing and after a one-year ageing period when subjected to the gas sequence shown in Figure 17a.For the EC-H2 (Figure 17b) and EC-CO (Figure 17c) sensors, an overall decrease in sensitivity is observed.This is more pronounced for the EC-H2 sensor than for the EC-CO sensor.The same trend is observed for the catalytic sensor (Figure 17d), except at higher H2 concentrations, where the response of the aged sensor is greater.5 for MLR-PLS and ANN models, respectively.The corrections are very effective for the H2 analyte.In this case, both false negative and positive could be removed.In the case of the CH4 analyte, results are very good too, even if the algorithm seems less effective at removing the few false positives obtained.Finally, post-treatment results obtained in the case of the CO analyte are less good, especially when CH4 is in binary mixture with H2.In the latter case, in spite of the absence of CO, the proximity of the sensor responses to H2 and CO makes it very difficult to distinguish H2 from CO and CO predicted concentration is, in this case, not equal to 0.

Ageing of Sensors
Next, the multi-sensor platforms were maintained for a year under electrical power, atmospheric pressure, and air.At the end of the year, the gas sequence to which the sensors had been exposed for the prediction testing a year earlier was used again, to check the relevance of the prediction model.Figure 17 shows a comparison between sensor responses before ageing and after a one-year ageing period when subjected to the gas sequence shown in Figure 17a.For the EC-H2 (Figure 17b) and EC-CO (Figure 17c) sensors, an overall decrease in sensitivity is observed.This is more pronounced for the EC-H2 sensor than for the EC-CO sensor.The same trend is observed for the catalytic sensor (Figure 17d), except at higher H2 concentrations, where the response of the aged sensor is greater.

Ageing of Sensors
Next, the multi-sensor platforms were maintained for a year under electrical power, atmospheric pressure, and air.At the end of the year, the gas sequence to which the sensors had been exposed for the prediction testing a year earlier was used again, to check the relevance of the prediction model.Figure 17 shows a comparison between sensor responses before ageing and after a one-year ageing period when subjected to the gas sequence shown in Figure 17a.For the EC-H 2 (Figure 17b) and EC-CO (Figure 17c) sensors, an overall decrease in sensitivity is observed.This is more pronounced for the EC-H 2 sensor than for the EC-CO sensor.The same trend is observed for the catalytic sensor (Figure 17d), except at higher H 2 concentrations, where the response of the aged sensor is greater.Finally, for the MOX sensor (Figure 17e), a greater sensitivity is observed for the sensor after one year of ageing as long as H 2 is present is the gas mixture.
Sensors 2024, 24, x FOR PEER REVIEW 16 of 24 Finally, for the MOX sensor (Figure 17e), a greater sensitivity is observed for the sensor after one year of ageing as long as H2 is present is the gas mixture.Based on the sensor platform's responses and applying the MLR-PLS and ANN models previously developed (without post-treatment), predictions of the concentration of H2, CO, and CH4 have been performed.Concerning the predictions made using the ANN model (Figure 18), different conclusions can be drawn according to the gas considered.Indeed, for H2, a global underestimation of the concentration is observed.When H2 concentration seen by the sensors is too low, predicted H2 concentration becomes 0. Concerning CH4, it seems that when this analyte is present alone in the base gas mixture, predicted CH4 concentration is close to 0. However, as long as H2 is present in sufficiently high concentration (even without the presence of CH4), predicted CH4 concentration raises.This indicates that, when trying to predict CH4 concentration, the ANN model will actually be more representative of the H2 concentration in the mixture.Finally, the best prediction results of ANN on aged sensors is obtained for CO.The concentration evolution is globally well reproduced.However, false positives in CO concentration predictions can be observed when H2 is present in a mixture that does not contain CO.Based on the sensor platform's responses and applying the MLR-PLS and ANN models previously developed (without post-treatment), predictions of the concentration of H 2 , CO, and CH 4 have been performed.Concerning the predictions made using the ANN model (Figure 18), different conclusions can be drawn according to the gas considered.Indeed, for H 2 , a global underestimation of the concentration is observed.When H 2 concentration seen by the sensors is too low, predicted H 2 concentration becomes 0. Concerning CH 4 , it seems that when this analyte is present alone in the base gas mixture, predicted CH 4 concentration is close to 0. However, as long as H 2 is present in sufficiently high concentration (even without the presence of CH 4 ), predicted CH 4 concentration raises.This indicates that, when trying to predict CH 4 concentration, the ANN model will actually be more representative of the H 2 concentration in the mixture.Finally, the best prediction results of ANN on aged sensors is obtained for CO.The concentration evolution is globally well reproduced.However, false positives in CO concentration predictions can be observed when H 2 is present in a mixture that does not contain CO.Finally, for the MOX sensor (Figure 17e), a greater sensitivity is observed for the sensor after one year of ageing as long as H2 is present is the gas mixture.Based on the sensor platform's responses and applying the MLR-PLS and ANN models previously developed (without post-treatment), predictions of the concentration of H2, CO, and CH4 have been performed.Concerning the predictions made using the ANN model (Figure 18), different conclusions can be drawn according to the gas considered.Indeed, for H2, a global underestimation of the concentration is observed.When H2 concentration seen by the sensors is too low, predicted H2 concentration becomes 0. Concerning CH4, it seems that when this analyte is present alone in the base gas mixture, predicted CH4 concentration is close to 0. However, as long as H2 is present in sufficiently high concentration (even without the presence of CH4), predicted CH4 concentration raises.This indicates that, when trying to predict CH4 concentration, the ANN model will actually be more representative of the H2 concentration in the mixture.Finally, the best prediction results of ANN on aged sensors is obtained for CO.The concentration evolution is globally well reproduced.However, false positives in CO concentration predictions can be observed when H2 is present in a mixture that does not contain CO.Then, concerning the predictions made using the MLR-PLS model (Figure 19), for the different gases, a prediction offset is observed in each case.For H 2 and CO, a positive offset is noticed while a negative prediction offset is observed for CH 4 .Even if the prediction response amplitude seems in correlation with experimental gas concentration, the presence of this offset completely ruins prediction quality for the different gases.Application of the post-treatment algorithm described earlier is not efficient in this case because of the important offsets observed.However, those offsets could be easily removed by periodic zero calibration of the platform.In Figure 20, prediction results including offset compensation are shown.Offset-compensated H 2 predictions reveal global underestimation of H 2 concentration.In the case of CH 4 , concentration overestimation is reported.When H 2 is present without CH 4 , false positives are also present.For the PLS model, best (offsetcompensated) prediction results are obtained for CO, for which concentrations are quite accurately predicted.False positive CO concentration prediction is also obtained when a high concentration of H 2 is present (without CO) in the gas mixture.Then, concerning the predictions made using the MLR-PLS model (Figure 19), for the different gases, a prediction offset is observed in each case.For H2 and CO, a positive offset is noticed while a negative prediction offset is observed for CH4.Even if the prediction response amplitude seems in correlation with experimental gas concentration, the presence of this offset completely ruins prediction quality for the different gases.Application of the post-treatment algorithm described earlier is not efficient in this case because of the important offsets observed.However, those offsets could be easily removed by periodic zero calibration of the platform.In Figure 20, prediction results including offset compensation are shown.Offset-compensated H2 predictions reveal global underestimation of H2 concentration.In the case of CH4, concentration overestimation is reported.When H2 is present without CH4, false positives are also present.For the PLS model, best (offset-compensated) prediction results are obtained for CO, for which concentrations are quite accurately predicted.False positive CO concentration prediction is also obtained when a high concentration of H2 is present (without CO) in the gas mixture.

Conclusions
In a context of process developments linked to Power to X, the first goal of this work was to select commercial sensors to build a multi-sensor platform able to detect H2, CO, CH4, and CO2 concentrations when gases are alone or in binary mixtures while respecting specifications linked to the applications.With the sensors selected, the processing chain enabling signal treatment and digitalization was developed.Finally, the main task was to build linear (MLR-OLS and MLR-PLS) and non-linear (ANN) models capable of detecting gases in binary mixtures.The first step was training.This allowed us to train the models by estimating their coefficients and then evaluate them by their ability to reproduce the data they were trained with.This step allowed us to disqualify the MLR-OLS model, which was clearly not able to predict analyte concentrations based on the sensor signals.
Another set of experimental sensor data was then used to compare the prediction performances of MLR-PLS and ANN in the case of "fresh sensor data", i.e., data not used to Sensors 2024, 24, x FOR PEER REVIEW 17 of 24 Then, concerning the predictions made using the MLR-PLS model (Figure 19), for the different gases, a prediction offset is observed in each case.For H2 and CO, a positive offset is noticed while a negative prediction offset is observed for CH4.Even if the prediction response amplitude seems in correlation with experimental gas concentration, the presence of this offset completely ruins prediction quality for the different gases.Application of the post-treatment algorithm described earlier is not efficient in this case because of the important offsets observed.However, those offsets could be easily removed by periodic zero calibration of the platform.In Figure 20, prediction results including offset compensation are shown.Offset-compensated H2 predictions reveal global underestimation of H2 concentration.In the case of CH4, concentration overestimation is reported.When H2 is present without CH4, false positives are also present.For the PLS model, best (offset-compensated) prediction results are obtained for CO, for which concentrations are quite accurately predicted.False positive CO concentration prediction is also obtained when a high concentration of H2 is present (without CO) in the gas mixture.

Conclusions
In a context of process developments linked to Power to X, the first goal of this work was to select commercial sensors to build a multi-sensor platform able to detect H2, CO, CH4, and CO2 concentrations when gases are alone or in binary mixtures while respecting specifications linked to the applications.With the sensors selected, the processing chain enabling signal treatment and digitalization was developed.Finally, the main task was to build linear (MLR-OLS and MLR-PLS) and non-linear (ANN) models capable of detecting gases in binary mixtures.The first step was training.This allowed us to train the models by estimating their coefficients and then evaluate them by their ability to reproduce the data they were trained with.This step allowed us to disqualify the MLR-OLS model, which was clearly not able to predict analyte concentrations based on the sensor signals.
Another set of experimental sensor data was then used to compare the prediction performances of MLR-PLS and ANN in the case of "fresh sensor data", i.e., data not used to

Figure 1 .
Figure 1.(a) schematic of the gas bench.(b) Cell containing the multi-sensor platform.

Figure 1 .
Figure 1.(a) schematic of the gas bench.(b) Cell containing the multi-sensor platform.

Figure 2 .
Figure 2. Sensors and the signal treatment chain.

Figure 2 .
Figure 2. Sensors and the signal treatment chain.

2. 3 .
Test Procedure 2.3.1.Role of Mono-Analyte Tests Before exposing the sensor array to complex binary mixtures, it was important to study the sensor responses to each analyte.The goal of these "mono-analyte tests" are of different natures: -to verify the reproducibility of the sensor responses, -to check that the sensor drift is limited and close to zero, -to analyze the transfer function linking the gas concentration of the analyte and the sensor responses (linear or not), -

Figure 3 .
Figure 3. Example of gas composition sequence and gas sensor response.

Figure 3 .
Figure 3. Example of gas composition sequence and gas sensor response.

Figure 4 .
Figure 4. Example architecture of the artificial neural network (ANN) used for H2 concentration prediction.

Figure 4 .
Figure 4. Example architecture of the artificial neural network (ANN) used for H 2 concentration prediction.

Figure 9 .
Figure 9. Transfer function of the sensors submitted to different concentrations of: (a) H 2 , (b) CO, (c) CH 4 .

Figure 12 .
Figure 12.Sequence used to predict H 2 /CH 4 /CO analyte concentrations alone or binary mixtures.

Figure 17 .
Figure 17.Comparison between the sensor responses before ageing and after one year of ageing: (a) Gas sequence used for the test, (b) Electrochemical EC-H2 sensor, (c) Electrochemical EC-CO sensor, (d) Catalytic CATA sensor, (e) Metal-Oxyde MOx sensor.

Figure 18 .
Figure 18.Gas concentration predictions (_Pred in the legend) on aged sensor platform for: (a) H2, (b) CH4, (c) CO using the previously developed ANN model-comparison to experimental used concentrations (_Exp in the legend).

Figure 17 .
Figure 17.Comparison between the sensor responses before ageing and after one year of ageing: (a) Gas sequence used for the test, (b) Electrochemical EC-H 2 sensor, (c) Electrochemical EC-CO sensor, (d) Catalytic CATA sensor, (e) Metal-Oxyde MOx sensor.

Figure 17 .
Figure 17.Comparison between the sensor responses before ageing and after one year of ageing: (a) Gas sequence used for the test, (b) Electrochemical EC-H2 sensor, (c) Electrochemical EC-CO sensor, (d) Catalytic CATA sensor, (e) Metal-Oxyde MOx sensor.

Figure 18 .
Figure 18.Gas concentration predictions (_Pred in the legend) on aged sensor platform for: (a) H2, (b) CH4, (c) CO using the previously developed ANN model-comparison to experimental used concentrations (_Exp in the legend).

Figure 18 .
Figure 18.Gas concentration predictions (_Pred in the legend) on aged sensor platform for: (a) H 2 , (b) CH 4 , (c) CO using the previously developed ANN model-comparison to experimental used concentrations (_Exp in the legend).

Figure 19 .
Figure 19.Gas concentration predictions (_Pred in the legend) on aged sensor platform for: (a) H2, (b) CH4, (c) CO using the previously developed MLR-PLS model-comparison to experimental concentrations (_Exp in the legend).

Figure 20 .
Figure 20.Offset-corrected gas concentration predictions (_offset shift in the legend) on aged sensor platform for: (a) H2, (b) CH4, (c) CO using the previously developed MLR-PLS model-comparison to experimental concentrations (_Exp in the legend).

Figure 19 .
Figure 19.Gas concentration predictions (_Pred in the legend) on aged sensor platform for: (a) H 2 , (b) CH 4 , (c) CO using the previously developed MLR-PLS model-comparison to experimental concentrations (_Exp in the legend).

Figure 19 .
Figure 19.Gas concentration predictions (_Pred in the legend) on aged sensor platform for: (a) H2, (b) CH4, (c) CO using the previously developed MLR-PLS model-comparison to experimental concentrations (_Exp in the legend).

Figure 20 .
Figure 20.Offset-corrected gas concentration predictions (_offset shift in the legend) on aged sensor platform for: (a) H2, (b) CH4, (c) CO using the previously developed MLR-PLS model-comparison to experimental concentrations (_Exp in the legend).

Figure 20 .
Figure 20.Offset-corrected gas concentration predictions (_offset shift in the legend) on aged sensor platform for: (a) H 2 , (b) CH 4 , (c) CO using the previously developed MLR-PLS model-comparison to experimental concentrations (_Exp in the legend).

Table 2 .
List of selected sensors to be used in the sensor network cell.LEL stands for Lower Explosive Limit (which is the lowest concentration of a gas or vapor that will burn in air-about 4%, 5%, and 12.5% respectively for H 2 , CH 4 , and CO).** according to the sensors' datasheets. *

Table 2 .
List of selected sensors to be used in the sensor network cell.

Table 3 .
RMSE obtained with concentrations predicted from training data used to build the models.

Table 3 .
RMSE obtained with concentrations predicted from training data used to build the models.
Thereby, the limits of the linear PLS model compared to ANN are shown here, even if CH 4 concentration prediction results are consistent.

Table 4 .
RMSE obtained with concentrations predicted from validation data.

Table 4 .
RMSE obtained with concentrations predicted from validation data.

Table 4 .
RMSE obtained with concentrations predicted from validation data.

Table 4 .
RMSE obtained with concentrations predicted from validation data.

Table 5 .
RMSE obtained with concentrations predicted from validation data and post-treated.

Table 5 .
RMSE obtained with concentrations predicted from validation data and post-treated.

Table 5 .
RMSE obtained with concentrations predicted from validation data and post-treated.