pH Colorimetric Sensor Arrays: Role of the Color Space Adopted for the Calculation of the Prediction Error

A pH colorimetric sensor array was prepared and characterized by combining tetrabromophenol blue (TBB) and bromothymol blue (BB) embedded in organically modified silicate (OrMoSil) spots supported on polyvinylidene fluoride (PVDF). The signal was based on the Hue profile (H). The individual calibrations of TBB and BB showed best precisions of 0.012 pH units at pH = 2.196 for TBB and 0.018 pH units at pH = 6.692 for BB. In the pH range 1.000–8.000, the overall precision of 10 spots of the TBB/BB mixture improved to 0.009 pH units at pH = 2.196 and 0.012 pH units at pH = 6.692, with the worst value of 0.279 pH units at pH = 4.101. Producing an array with many more than 10 spots would allow the precision to be improved further. The analytical performance of H was compared to that of coordinates from other color spaces such as RGB, Lab, and XYZ. H was the best one, with a prediction error in the range of 0.016 to 0.021 pH units, at least three times lower than the second-best coordinate (the x coordinate), with 0.064 pH units. These results were also confirmed by the calculation of the main experimental contributions to the pH prediction error, demonstrating the consistency of the proposed calculation approach.


Introduction
Colorimetric sensor arrays (CSAs) [1] are chemical sensors using suitable dyes to detect a specific analyte [2-9]. Color variations are usually recorded with CCD cameras or scanners [10]. The red, green, blue (RGB) color space is widely used in colorimetric sensing processes, but the R, G, and B components do not change monotonically with spectral wavelength and intensity [11]. In 1931, the Commission Internationale de l'Éclairage (CIE) defined the concept of the tristimulus values X, Y, and Z based on the three-component theory of color vision. The receptors of the human eye are responsible for the perception of three primary colors (red, green, and blue), and all colors are mixtures of them. The XYZ tristimulus values are obtained by using suitable color matching functions. The Lab model, indirectly obtained from the CIE-XYZ color space, is made of two chromatic components (a and b) and a lightness component (L). The two models express a wider gamut than RGB. Ideally, they can reproduce an infinite number of chromatic mixtures [12]. Other color spaces are characterized by a specific tone or hue, a saturation level, and a lightness component [12]. The H component of the HSV (hue, saturation, value) model is more stable and robust than the coordinates of the other color spaces, as the illumination is enclosed in the V component of the model [11,13-15]. Nevertheless, the HSV model has some issues. The first occurs when the maximum and minimum values for RGB are the same, which corresponds to the gray tones (undefined value for hue). This causes some incorrect color interpretations. The second issue

Reagents and Instrumentation
Dodecyltriethoxysilane, tetraethyl orthosilicate (TEOS, ≥99%), HCl 37%, tetrabromophenol blue (TBB, 85%), bromothymol blue (BB, 95%), hexadecyltrimethylammonium p-toluenesulfonate (CTApTs), acetic acid, and NaOH (≥97%) were purchased from Sigma Aldrich, whilst KCl was purchased from Prolabo. Sodium hydrogen carbonate (99.8%), sodium dihydrogen phosphate, and absolute ethanol were provided by Carlo Erba. The cell used for pH measurements is illustrated in our recent paper [13]. A Crison MM 40 pH-meter and a combined glass electrode (calibrated with two Mettler Toledo standard solutions, pH = 6.865 and 4.006) were used for the reference pH measurements. Analytical (AS 220 R2 Radwa) and technical (EU-C500 Gibertini) balances were used for weight measurements. The pH buffers had a total concentration of 0.1 M. The color of the wet spots was sampled in the most homogeneous portion of the spot (≈120 pixels). The background was sampled in an external area near the spot. Dedicated programs written in MATLAB were used to compute the color coordinates. The regressions were obtained by using the iterative Levenberg-Marquardt algorithm [31].
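As an illustration of how a Hue value can be obtained from the pixels sampled in a spot, the following minimal Python sketch averages the RGB values and converts the mean to H with the standard-library colorsys module. It is not the authors' MATLAB code: the function names mean_rgb and hue_degrees and the pixel values are hypothetical, and no background correction is attempted. The sketch also handles the gray-tone case mentioned in the Introduction, where the hue is undefined.

```python
import colorsys


def mean_rgb(pixels):
    """Average a list of (R, G, B) tuples with channels in the 0-255 range."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def hue_degrees(rgb):
    """Return the HSV hue in degrees, or None for gray tones.

    For gray tones max(R, G, B) equals min(R, G, B), so the hue is undefined.
    """
    r, g, b = (c / 255.0 for c in rgb)
    if max(r, g, b) == min(r, g, b):
        return None
    h, _s, _v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360.0


# Hypothetical pixels sampled from the homogeneous portion of one spot.
spot_pixels = [(214, 190, 40), (210, 188, 44), (216, 193, 38)]
spot_hue = hue_degrees(mean_rgb(spot_pixels))
```

Averaging the pixels before the conversion is one possible choice; converting each pixel and averaging the hues is another, and the source does not say which was used.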

Preparation of the CSA
The OrMoSil sol was prepared by mixing 4.03 g of TEOS, 0.65 g of dodecyl-TEOS, 1.58 g of Milli-Q water, and 0.55 g of 0.03 M HCl. To prepare the 10-spot sensor, CTApTs was then added (1.75 g) together with TBB/BB in the following molar ratios: 0.024, 0.061, 0.098, 0.147, 0.184, 0.233, 0.331 and 0.478, respectively. The tenth spot (first row in Figure 1) contained only TBB. The spots were aged at 20 ± 2 °C for three days before use. After a prior conditioning cycle, the pH CSA was immersed consecutively, for 100 s, in each buffer solution (28 pH values) from the acidic pH interval to the basic one.
Figure 1. Pictures of the 10-spot CSA sensor, deposited by hand, with various molar ratios of tetrabromophenol blue/bromothymol blue (TBB/BB). Molar ratio and pH increase in the direction of the arrows. The colors come from the immersion of the sensor in 28 pH buffers from pH 1 to pH 10.

Main Error Contributions Affecting a Generic X Color Coordinate
The pH value measured with a CSA requires a suitable camera able to read the color. The color space usually adopted is sRGB. Nevertheless, this color space is not the best in terms of stability, robustness, and precision of the signal [11]. In the following sections, the analytical performance of the H coordinate from the HSV color space will be compared to that of coordinates from other color spaces such as RGB, Lab, and XYZ. The best performance of H has already been reported by other authors, although no one, to our knowledge, has rationalized its behavior [11,13,14]. The quantitative rationalization will be based on the pH prediction errors. Since the variance of the color coordinate affects the overall prediction error, the choice of the color space plays an important role. For this reason, a ranking of the best performing coordinates will be determined. If X is a generic experimental color coordinate and μ is its theoretical value, we can write:

$$X = \mu + \beta + \delta + \varepsilon \qquad (1)$$

The parameters β, δ, and ε are error sources defined as follows:
• β is the background level due to the lighting conditions and to the CSA support (associated with the spot);
• δ is the error due to the image acquisition conditions (associated with the spot);
• ε is the instrumental error of the camera (associated with the detected color).
In particular, it will be demonstrated that H is affected only by the ε contribution.

Linearization of the Sigmoidal Calibration Model
An acid-base indicator changes its color around its pKa value. The color transition is usually sigmoidal and can be followed with a color coordinate X. The calibration function that describes the X vs. pH profile of a single pH indicator is given by the usual Boltzmann equation:

$$X = X_{HIn} + \frac{X_{In} - X_{HIn}}{1 + e^{-4(pH - pH_i)/\Delta pH}} \qquad (2)$$

where $X_{HIn}$ and $X_{In}$ are the X color values of the HIn and In forms, respectively. ΔpH is the pH working interval of the indicator (the interval in which it is possible to observe a variation of the color coordinate). This parameter is a function of the indicator but also (as we will see below) of the chosen color coordinate. The $pH_i$ parameter is the pH value of the inflection point. $\Delta X = |X_{In} - X_{HIn}|$ is the maximum variation of X. The sensitivity is given by the ratio $SL_X = \Delta X/\Delta pH$. Mixtures of two indicators required a bi-sigmoidal model:

$$X = X_0 + \Delta X \left[ \frac{p}{1 + e^{-4(pH - pH_{i,1})/\Delta pH_1}} + \frac{1 - p}{1 + e^{-4(pH - pH_{i,2})/\Delta pH_2}} \right] \qquad (3)$$

where $X_0$ is the initial X value. The parameters p and 1 − p represent the contributions of the two indicators to the X value; $pH_{i,1}$ and $pH_{i,2}$ are the pH values of the first and the second inflection points; $\Delta pH_1$ and $\Delta pH_2$ are the working intervals around the first and the second inflection points of the bi-sigmoid. Since the variance of X for some indicators is not homoscedastic in the transition zone [19,30], the sigmoidal regression must be weighted. For this reason, it was convenient to linearize the Boltzmann sigmoidal equation to obtain a homoscedastic calibration interval, simplifying the calculation of the discriminated pH accuracy [30]. The linearization is the following:

$$\ln\frac{X - X_{HIn}}{X_{In} - X} = a + b\,pH$$

where a and b are the intercept and slope, so that the working interval of the indicator is ΔpH = 4/b and the inflection point is $pH_i = -a/b$. The error of the discriminated pH is given by Equation (4), where $s_{y/x}$ is the regression standard deviation.
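The relations ΔpH = 4/b and pH_i = −a/b can be checked numerically. The sketch below is a minimal Python illustration, not the authors' procedure: it assumes a Boltzmann sigmoid with slope factor 4/ΔpH (the parameterization for which the slope at the inflection point equals ΔX/ΔpH), generates noise-free data with hypothetical plateau values, applies the logit linearization, and recovers the parameters with an ordinary least-squares line. The function names and all numerical values are assumptions chosen for illustration.

```python
import math


def boltzmann(pH, x_hin, x_in, pH_i, dpH):
    """Single-indicator sigmoid; slope at the inflection point is (x_in - x_hin) / dpH."""
    return x_hin + (x_in - x_hin) / (1.0 + math.exp(-4.0 * (pH - pH_i) / dpH))


def linearize(x, x_hin, x_in):
    """Logit transform turning the sigmoid into a straight line a + b * pH."""
    return math.log((x - x_hin) / (x_in - x))


def fit_line(xs, ys):
    """Ordinary least-squares intercept a and slope b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


# Hypothetical indicator: plateaus at 60 and 200, inflection at pH 6.7, working interval 0.9.
X_HIN, X_IN, PH_I, DPH = 60.0, 200.0, 6.7, 0.9
pH_values = [6.0 + 0.1 * k for k in range(15)]
signal = [boltzmann(p, X_HIN, X_IN, PH_I, DPH) for p in pH_values]

a, b = fit_line(pH_values, [linearize(x, X_HIN, X_IN) for x in signal])
dpH_fit = 4.0 / b      # working interval recovered from the slope
pH_i_fit = -a / b      # inflection point recovered from intercept and slope
```

With real data the plateau values must be known or estimated beforehand, and points too close to the plateaus make the logarithm unstable, which is one reason to restrict the linear fit to the transition zone.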

Results and Discussion
Before starting the discussion, we wish to point out that, since the glass electrode was used to calibrate our CSA, our devices cannot give better results than the potentiometric technique. The pH errors calculated for our sensors are, in some cases, of the order of a few thousandths of a pH unit. These values are extreme even for potentiometric measurements; therefore, the reported results demonstrate only the comparability of our CSA with the glass electrode, although the CSA can have better precision.

Sensors 2020, 20, x

Experimental Analytical Performance of Various Color Spaces
In this section, the experimental analytical performance of the H coordinate was compared to that of coordinates from other color spaces such as RGB, Lab, and XYZ. We focused our attention on five repeated BB spots which are nominally identical. One of them showed a light reflection area caused by a non-optimal cell geometry (see Figure 2, spot 5). This situation was chosen on purpose to evaluate the influence of anomalous signals on the overall result. Table 1 summarizes the results obtained from the sigmoidal profiles of the color coordinates (X) vs. pH in terms of $s_{pH_i}$, calculated with Equation (4) at the inflection point.
Data were ordered by increasing prediction error ($s_{pH_i}$). The performance of H was the best in all spots. In particular, $s_{pH_i}$ was in the range 0.016-0.021 pH units, at least three times lower than that of the second-best coordinate (the x coordinate), whose pH error was in the interval 0.064-0.109 pH units. The H coordinate exhibited the smallest regression variance and the lowest ΔpH (0.9 pH units). None of the other coordinates gave comparable results. The fifth spot, affected by reflection phenomena (see Figure 2), could be fitted only by using the H coordinate; with the other coordinates, the spot was unusable. On the other hand, the other coordinates are less sensitive but work in a wider pH range, sometimes larger than two logarithmic units. By using the H coordinate, the spot response was less affected by the spot shape, concentration, and optical inhomogeneity, and it maintained the same value across spots (standard deviations of $s_{pH_i}$: $s_{s_{pH_i},H}$ = 0.002; $s_{s_{pH_i},B}$ = 0.005; $s_{s_{pH_i},G}$ = 0.011; $s_{s_{pH_i},R}$ = 0.012; $s_{s_{pH_i},L}$ = 0.013; $s_{s_{pH_i},y}$ = 0.013; $s_{s_{pH_i},x}$ = 0.019). Moreover, leaching or re-arrangements of the pH indicator in the spot did not alter the H value. A significant consequence is that only the calibration obtained with H remains unchanged over time.

Quantification of the Error Contributions on pH Discrimination
In this section, the precision of the pH value obtained with H is evaluated and compared to that obtained with the other color coordinates. H is defined as in [19]. Its expression contains D, the difference between two normalized coordinates, r − g, g − b, or r − b (r = R/255, g = G/255, b = B/255), and Δ, the product of luminance and saturation [19,30]. As H contains a difference function, on the basis of Equation (1), we can write:

$$D = (\mu_1 + \beta + \delta + \varepsilon_1) - (\mu_2 + \beta + \delta + \varepsilon_2) = \mu_1 - \mu_2 + \varepsilon'$$

where the subscripts 1 and 2 denote the two rgb coordinates considered and $\varepsilon' = \varepsilon_1 - \varepsilon_2$ represents the error associated with that couple of coordinates. The sum β + δ cancels in D, as its value is identical for the rgb coordinates of the same spot, but it does not cancel for the other X coordinates. To calculate the error contributions β, δ, and ε, the overall variances of D (i) and of a generic X coordinate (ii) were calculated. In particular, (i) the experimental $s_D^2 = 2s_\varepsilon^2$ is obtained from the average variance of the experimental values r − g, g − b, and r − b, where β + δ cancels; (ii) $s_X^2 = 2s_{\beta\delta}^2 + 2s_\varepsilon^2$ is obtained when all the contributions to the variances of the rgb coordinates are considered, so that β + δ does not cancel. Figure 3a reports D vs. pH (r − g, g − b (•), and r − b) for five independent spots of TBB. The inset of Figure 3a reports the average standard deviations (i) of the same spots, $s_D$. They were constant with pH and similar: $s_{r-g}$ = 0.0031, $s_{g-b}$ = 0.0035, and $s_{r-b}$ = 0.0047 (p value < 0.001), so that $s_D \approx s_\Delta$. On the other hand, from (ii), the variance of X, with the couple r − g taken as an example, was $s_X^2 = s_r^2 + s_g^2$. These errors are larger, as expected: 0.0090, 0.0060, and 0.0085 for $s_{r,g}$, $s_{g,b}$, and $s_{r,b}$, respectively.
Two error values were estimated for the five repeated spots of TBB. The ratio between these two values is very close to the one calculated from the data in Table 1 between the H and x coordinates (0.23 ≈ 0.24), indicating the correctness of the error calculation. Figure 3b reports the experimental standard deviations, $s_H$ (○) and $s_{pH}$ (•), vs. pH for the same spots. The continuous line, in good agreement with the experiment, is the theoretical $s_H$ profile (Equation (5) in ref. [19]).
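The cancellation argument of this section can be illustrated with a small simulation. The Python sketch below is only a numerical illustration under stated assumptions: two normalized coordinates of the same spot share one β + δ term and carry independent camera errors ε, all drawn from Gaussian distributions with hypothetical standard deviations (the constants S_BETA_DELTA and S_EPS are not values from the source). The variance of the difference should then approach 2s²_ε, whereas the sum of the single-coordinate variances should approach 2s²_βδ + 2s²_ε.

```python
import random
import statistics

random.seed(0)

S_BETA_DELTA = 0.006   # hypothetical spread of the shared beta + delta term
S_EPS = 0.003          # hypothetical camera error of each coordinate
MU_R, MU_G = 0.80, 0.55  # hypothetical theoretical values of r and g

r_vals, g_vals = [], []
for _ in range(20000):
    shared = random.gauss(0.0, S_BETA_DELTA)  # same beta + delta for both coordinates
    r_vals.append(MU_R + shared + random.gauss(0.0, S_EPS))
    g_vals.append(MU_G + shared + random.gauss(0.0, S_EPS))

# (i) difference coordinate D = r - g: the shared term cancels
var_d = statistics.variance([r - g for r, g in zip(r_vals, g_vals)])

# (ii) single coordinates: the shared term contributes to each variance
var_x = statistics.variance(r_vals) + statistics.variance(g_vals)

expected_var_d = 2 * S_EPS ** 2
expected_var_x = 2 * S_BETA_DELTA ** 2 + 2 * S_EPS ** 2
```

Comparing var_d with var_x shows how much of the scatter of a single coordinate comes from the shared background and acquisition terms rather than from the camera.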

Calculation of the Discriminated pH Precision for a CSA
To estimate the overall precision of a CSA, $s_{pH_w}$, a weight $w_j$ was assigned to each spot. By using the same estimate of $s_{H_p}$ for all the spots, the weighted $pH_w$ was calculated from $\Delta_j$, the product of the saturation and the luminance of the jth spot, and $S_{H_j}$, the sensitivity ΔH/ΔpH of the H sigmoidal profile of the jth spot. The weights sum to T, and a corresponding error was calculated for $pH_w$. The parameter $w_j$ was set to 0 when the calibration sensitivity is less than 0.1, so that the pH measurement is centered within the most sensitive calibration zone.
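A weighted combination of this kind can be sketched with generic inverse-variance weights. The Python example below is an assumption, not the expression used in the source: each spot is weighted by 1/s² of its own pH error, the weight is set to 0 when the spot sensitivity is below the 0.1 threshold mentioned above, and the error of the weighted pH is taken as 1/√T, as is usual for inverse-variance weighting. The function name weighted_ph and all input values are hypothetical.

```python
import math


def weighted_ph(ph_estimates, ph_errors, sensitivities, min_sensitivity=0.1):
    """Combine per-spot pH estimates with inverse-variance weights.

    Spots whose calibration sensitivity is below min_sensitivity get weight 0.
    Returns the weighted pH and its standard error 1 / sqrt(sum of weights).
    """
    weights = [
        1.0 / s ** 2 if sens >= min_sensitivity else 0.0
        for s, sens in zip(ph_errors, sensitivities)
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no spot is sensitive enough at this pH")
    ph_w = sum(w * p for w, p in zip(weights, ph_estimates)) / total
    return ph_w, 1.0 / math.sqrt(total)


# Three hypothetical spots; the third one is outside its sensitive zone.
ph_w, s_ph_w = weighted_ph(
    ph_estimates=[4.10, 4.14, 4.30],
    ph_errors=[0.02, 0.02, 0.50],
    sensitivities=[50.0, 50.0, 0.05],
)
```

Under this scheme, adding more spots with usable sensitivity at a given pH increases T and therefore lowers the combined error, which is consistent with the expectation that larger arrays improve precision.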

The Behavior of the Mixture TBB/BB in the CSA
The circles in Figure 4a describe 10 H calibration profiles (28 pH values each, from pH 1 to 10) obtained with TBB and BB at different molar ratios, and the continuous lines are the corresponding bi-sigmoidal curve fittings. The shift from TBB to BB is evident. Both the acidic and alkaline plateaus of TBB alone (curve 10) and BB alone (curve 1) coincide, as both have complementary color transitions (yellow-blue). The use of the TBB/BB mixture widens the pH interval, as the two indicators have different pKa values. Figure 4b reports the experimental profiles of the $s_{pH}$ precisions for curves 1 ($s_{pH_{BB}}$), 5 ($s_{pH_{TBB/BB}}$), and 10 ($s_{pH_{TBB}}$), and the overall one ($s_{pH_w}$). Prediction errors of 0.012 pH units at pH = 2.196 and 0.018 pH units at pH = 6.692 were obtained with the data of curves 10 (TBB) and 1 (BB), respectively. The spot relative to curve 5 (both indicators present) had worse precision at pH 2.2 and 6.7 than the single indicators TBB (curve 10) and BB (curve 1), since the bi-sigmoidal plot has a lower slope at the inflection points. On the other hand, the zone of pH 2.600-6.100 was improved. The minimum TBB precision error obtained with the calibration of a single spot was comparable with the value of 0.014 reported in Figure 3b for five independent spots, indicating a repeatable deposition procedure.
By considering all the 240 pH data in Figure 4a in the pH range 1-8, an improvement of the overall precision is evident: $s_{pH_w}$ = 0.009 pH units at pH = 2.196 and $s_{pH_w}$ = 0.012 pH units at pH = 6.692. The most critical pH region, close to pH 4.101, was also significantly improved: $s_{pH_w}$ = 0.279 pH units at pH = 4.101. The production of arrays with more than 10 spots would produce a further improvement of the precision. Table 2 reports the calibration parameters of curves 1-10. The $H_0$ and ΔH values were constant for all the mixtures.
Table 2. Fitting parameters $H_0$, ΔH, p, $pH_1$, $pH_2$, $R^2$, $\Delta pH_1$, $\Delta pH_2$ obtained with a bi-sigmoidal regression (Equation (3)) of the experimental data of Figure 4a.
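The loss of slope at the inflection points of a mixture can be illustrated with a bi-sigmoidal function. The Python sketch below assumes a model built from two Boltzmann terms weighted by p and 1 − p, each with slope factor 4/ΔpH; this parameterization and all numerical values (plateau, ΔH, inflection points, working intervals) are hypothetical and are not the fitted parameters of Table 2. It compares the numerical slope at the first inflection point for a single indicator (p = 1) and for an equimolar-like mixture (p = 0.5).

```python
import math


def sigmoid(pH, pH_i, dpH):
    """Unit Boltzmann step centered at pH_i with working interval dpH."""
    return 1.0 / (1.0 + math.exp(-4.0 * (pH - pH_i) / dpH))


def bisigmoid(pH, x0, dx, p, pH_i1, dpH1, pH_i2, dpH2):
    """Two-indicator profile: fraction p of dx from the first step, 1 - p from the second."""
    return x0 + dx * (p * sigmoid(pH, pH_i1, dpH1) + (1.0 - p) * sigmoid(pH, pH_i2, dpH2))


def slope_at(pH, p, h=1e-4):
    """Central-difference slope of the hypothetical profile at a given pH."""
    args = (60.0, 140.0, p, 2.2, 0.9, 6.7, 0.9)  # x0, dx, p, pH_i1, dpH1, pH_i2, dpH2
    return (bisigmoid(pH + h, *args) - bisigmoid(pH - h, *args)) / (2.0 * h)


slope_single = slope_at(2.2, p=1.0)   # single indicator: about dx / dpH1
slope_mixture = slope_at(2.2, p=0.5)  # mixture: about p * dx / dpH1
```

Because the pH error scales inversely with the calibration slope, halving the slope at an inflection point roughly doubles the error there, while the second step provides sensitivity in a pH zone where a single indicator would have none.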

Conclusions
A colorimetric sensor array (CSA) to detect pH based on tetrabromophenol blue (TBB) and bromothymol blue (BB) was used to study the relevant error behavior. The analytical performance of the H coordinate was compared to that of coordinates from other color spaces, such as RGB, Lab, and XYZ, by quantifying the pH prediction error. The pH prediction error, $s_{pH_i}$, obtained with H was in the range 0.016 to 0.021 pH units, at least three times lower than the best of the other coordinates (the x coordinate from the CIE-XYZ color space was characterized by an error of 0.064 pH units). The ratio between the variances of the difference coordinate D and a generic X color coordinate is the same as that emerging from the pH prediction errors calculated on the H and x coordinates, indicating the correctness of the error calculation. In particular, the use of H eliminated the error contributions coming from the spot preparation and background anomalies (β) and the lighting conditions (δ). The performance of H was, therefore, the best one, since it is affected only by the instrumental error due to the camera characteristics, ε. This kind of CSA is characterized by errors of the same order as those of the potentiometric technique in the indicator working interval. The overall precision $s_{pH_w}$ of only 10 spots of the TBB/BB mixture in the pH range 1.000-8.000 has minimum error values of $s_{pH_w}$ = 0.009 at pH = 2.196 and $s_{pH_w}$ = 0.012 at pH = 6.692, with the worst precision, $s_{pH_w}$ = 0.279, at pH = 4.101. In any case, the possibility of producing arrays with many more than 10 spots allows the precision to be improved.