Prediction of Carbonate Selectivity of PVC-Plasticized Sensor Membranes with Newly Synthesized Ionophores through QSPR Modeling

Developing a potentiometric sensor with required target properties is a challenging task. This work explores the potential of quantitative structure-property relationship (QSPR) modeling in the prediction of potentiometric selectivity for plasticized polymeric membrane sensors based on newly synthesized ligands. As a case study, we have addressed sensors with selectivity towards carbonate—an important topic for environmental and biomedical studies. Using the logKsel(HCO3−/Cl−) selectivity data on 40 ionophores available in literature and their substructural molecular fragments as descriptors, we have constructed a QSPR model, which has demonstrated reasonable precision in predicting selectivities for newly synthesized ligands sharing similar molecular fragments with those employed for modeling.


Introduction
Ionophore-based ion-selective sensors are widely employed for quantification of ionic composition in a variety of samples [1,2]. The key element of such sensors is a plasticized polymeric membrane with incorporated ionophore-a lipophilic ligand providing the selectivity towards the target analyte [3]. Currently, the number of ionophores proposed for determination of various inorganic ions is rather large and growing. The search for new ligands is dictated by the needs of their practical application where different types of samples may require quite different selectivity patterns of sensors.
It is noteworthy that the number of ionophores proposed for cations is much higher than that for anions. This relates to the difficulty of developing selective ligands for anion binding: inorganic anions, as compared to cations, are characterized by lower chargeto-radius ratios and a large variety of geometries, and the forms of anion existence in the solution strongly depend on pH [4]. The sensor membrane phase formed by the plasticized polymer is very hydrophobic, so it is difficult for hydrophilic ions to cross the phase boundary, and thus, ensuring the selectivity of polymer sensors that differ from the Hoffmeister series is a major challenge.
The common experimental protocol for studying novel potential ligands for ionselective sensors requires chemical synthesis of candidate substances, their purification and characterization, sensor membrane preparation (typically in a broad variety of compositions including different solvent-plasticizers and different ratios between a ligand and an ion-exchanger), and potentiometric measurements to assess the sensitivity and selectivity parameters of the novel membranes. The overall process is quite tedious and time-consuming and the success is not guaranteed, i.e., the studied candidate substances may have no required selectivity for further use.
In these circumstances, it would be good to have an instrument that would allow an initial screening of candidate ionophore substances and, based on the chemical structure of the substance, would be able to reliably predict if the particular ligand is worth the whole experimental study. This in silico approach would save a considerable amount of time and resources in developing new sensors.
Quantitative Structure-Property Relationship (QSPR) is a computational method that aims to build a mathematical model relating the chemical structure of the substance to some of its properties [5]. In order to do so, the QSPR approach formalizes a chemical structure in a set of molecular descriptors (numbers that describe the structure as a whole [6]) and tries to find a mathematical relation (regression model), connecting these descriptors with a property of interest. The model construction requires a training set of molecules where the target property under study is experimentally determined. QSPR modelling is a mature field of research that has contributed to many different areas of chemistry such as, e.g., drug design [7,8], material science [9,10], and toxicity evaluation [11,12]. QSPR can effectively consider various complex effects in multicomponent systems. For example, QSPR was successfully applied for prediction of equilibrium constants in the complexation of metals with organic ligands and in liquid extraction processes [13,14].
Recently, the QSPR approach was suggested for the prediction of analytical performance parameters for ionophore-based potentiometric sensors [15,16]. The potentiometric sensitivity of the sensors based on a variety of nitrogen-containing ligands (mainly diamides of pyridine and bipyridine acids) towards heavy metal ions (Cu 2+ , Zn 2+ , Cd 2+ , Pb 2+ ) was predicted with root mean squared errors around 5 mV/dec based on the set of descriptors derived from substructural molecular fragments [15]. This model was based only on the experimental data obtained by the authors. The study [16] developed this concept further and aimed at Mg 2+ /Ca 2+ potentiometric selectivity prediction using QSPR based on the literature data. While the attained precision of the model in prediction was not very high (±0.5 logK sel ), the model was able to distinguish reliably between the ligands with high, medium, and low Mg 2+ selectivity.
The present work is devoted to study whether it will be possible to predict potentiometric selectivity of the sensors based on newly synthesized ligands using a QSPR model based on literature data. As a challenging case for modeling, we have focused on carbonateselective sensors, where the number of available ligands with appropriate selectivity is rather limited. The selectivity values logK sel (HCO 3 − /Cl − ) were used as a target parameter for modeling.

The Dataset for Modelling
The following flow diagram ( Figure 1) illustrates the logic of the experiment. The dataset for QSPR modeling contained the carbonate ionophores described in the literature (40 substances in total). The structures of all ionophores obtained from the literature sources are given in Table S1 (Supplementary Materials). Considering the small number of papers reporting on the successful development of carbonate potentiometric sensors and in order to extend the dataset with additional entries, we have also included several ionophores showing poor carbonate selectivity. In these cases, the reported values logK sel (Cl − /HCO 3 − ) were converted into logK sel (HCO 3 − /Cl − ) according to Nikolsky-Eisenman equation: where a I, a J , z I , z J are the activities and charges of the primary and interfering ions, respectively, and K IJ is a selectivity coefficient. The value of the selectivity coefficient can be found according to: and it is possible to calculate the logarithm of the selectivity coefficient K JI given that the charges of the ions I and J are equal. Thus, in our case, Chemosensors 2022, 10, x FOR PEER REVIEW 3 of 12

The Dataset for Modelling
The following flow diagram ( Figure 1) illustrates the logic of the experiment. The dataset for QSPR modeling contained the carbonate ionophores described in the literature (40 substances in total). The structures of all ionophores obtained from the literature sources are given in Table S1 (Supplementary Materials). Considering the small number of papers reporting on the successful development of carbonate potentiometric sensors and in order to extend the dataset with additional entries, we have also included several ionophores showing poor carbonate selectivity. In these cases, the reported values logK sel (Cl − /HCO3 − ) were converted into logK sel (HCO3 − /Cl − ) according to Nikolsky-Eisenman equation: where a , a , zI, zJ are the activities and charges of the primary and interfering ions, respectively, and K is a selectivity coefficient. The value of the selectivity coefficient can be found according to: and it is possible to calculate the logarithm of the selectivity coefficient K given that the charges of the ions I and J are equal. Thus, in our case, logK sel (Cl − /HCO3 − ) = −logK sel (HCO3 − /Cl − ). While compiling the literature dataset we have limited ourselves with the papers where poly(vinylchloride) (PVC) was employed as a polymer for membrane matrix. Since the solvent-plasticizers varied from study to study, we have only considered the papers employing o-nitrophenyl octyl ether (NPOE), dioctyl adipate (DOA), or bis(2-ethylhexyl) sebacate (DOS) as plasticizers. In order to take into account the differences in membrane properties induced by the plasticizer, its dielectric permittivity was added to the dataset as one of the descriptors for modelling. The anion-exchanger employed in all the literature sources was tridodecylmethylammonium (TDMA). In order to ensure that HCO 3 − is a dominant ionic form in solutions, we only considered the literature data obtained in the pH range 7.0-8.6. All logK sel (HCO 3 − /Cl − ) were obtained using a separate solution method, except for eight substances from [17] where the matched potential method was employed. Since these two methods usually yield comparable results, it was decided to keep these eight entries in the dataset. The uniformity of experimental conditions is very important to warrant a reliable dataset for QSPR modelling. The selectivities logK sel (HCO 3 − /Cl − ) in the literature dataset varied from −5.8 to 6.2. The mean value was −1.4 and the median value was −2.6.

of 11
The structures of the new ligands are given in Table 1. The motivation for synthesis of these particular substances was the following: most of the carbonate ionophores developed so far are based on trifluoroacetophenone and its derivatives. The working fragment responsible for carbonate ion binding is the carbonyl group with lowered electron density on oxygen due to the presence of a strong acceptor-the CF 3 group. The substance 1 is commercially available and was acquired from Merck (Darmstadt, Germany) and used as received. The ligands 2-4 ( Table 1) were newly synthesized. All four substances contain different acceptor substituents in the vicinity of carbonyl. We have hypothesized that these ligands may also have considerable carbonate binding ability.
The synthesis and characterization of newly synthesized substances are provided in the Supplementary Materials along with their NMR spectra. sources was tridodecylmethylammonium (TDMA). In order to ensure that HCO3 − is dominant ionic form in solutions, we only considered the literature data obtained in t pH range 7.0-8.6. All logK sel (HCO3 − /Cl − ) were obtained using a separate solution metho except for eight substances from [17] where the matched potential method was employe Since these two methods usually yield comparable results, it was decided to keep the eight entries in the dataset. The uniformity of experimental conditions is very importa to warrant a reliable dataset for QSPR modelling. The selectivities logK sel (HCO3 − /Cl − ) the literature dataset varied from −5.8 to 6.2. The mean value was −1.4 and the medi value was −2.6. The structures of the new ligands are given in Table 1. The motivation for synthe of these particular substances was the following: most of the carbonate ionophores dev oped so far are based on trifluoroacetophenone and its derivatives. The working fragme responsible for carbonate ion binding is the carbonyl group with lowered electron dens on oxygen due to the presence of a strong acceptor-the CF3 group. The substance 1 commercially available and was acquired from Merck (Darmstadt, Germany) and used received. The ligands 2-4 ( Table 1) were newly synthesized. All four substances conta different acceptor substituents in the vicinity of carbonyl. We have hypothesized th these ligands may also have considerable carbonate binding ability.

#
Ligand Structure IUPAC Name Phenanthrenequinone S1 2 sources was tridodecylmethylammonium (TDMA). In order to ensure that HCO3 − is dominant ionic form in solutions, we only considered the literature data obtained in th pH range 7.0-8.6. All logK sel (HCO3 − /Cl − ) were obtained using a separate solution metho except for eight substances from [17] where the matched potential method was employe Since these two methods usually yield comparable results, it was decided to keep the eight entries in the dataset. The uniformity of experimental conditions is very importa to warrant a reliable dataset for QSPR modelling. The selectivities logK sel (HCO3 − /Cl − ) the literature dataset varied from −5.8 to 6.2. The mean value was −1.4 and the media value was −2.6. The structures of the new ligands are given in Table 1. The motivation for synthes of these particular substances was the following: most of the carbonate ionophores deve oped so far are based on trifluoroacetophenone and its derivatives. The working fragme responsible for carbonate ion binding is the carbonyl group with lowered electron densi on oxygen due to the presence of a strong acceptor-the CF3 group. The substance 1 commercially available and was acquired from Merck (Darmstadt, Germany) and used received. The ligands 2-4 ( Table 1) were newly synthesized. All four substances conta different acceptor substituents in the vicinity of carbonyl. We have hypothesized th these ligands may also have considerable carbonate binding ability.

#
Ligand Structure IUPAC Name The synthesis and characterization of newly synthesized substances are provided the Supplementary Materials along with their NMR spectra.
The synthesis and characterization of newly synthesized substances are provided the Supplementary Materials along with their NMR spectra.

Potentiometric Sensors Based on New Ligands
Four new ligands were employed for preparation of PVC-plasticized sensor mem branes. The compositions of the membranes are given in Table 2. Each membrane co tained 50 mmol/kg of ionophore and 10 mmol/kg of TDMA-NO3 as the anion-exchange The ratio between PVC and a solvent-plasticizer was 1:2 and the total weight of membran was 300 mg. o-nitrophenyloctyl ether (NPOE) was employed as the solvent-plasticizers Polymeric sensor membranes were prepared according to the standard protocol. Th weighted amounts of all membrane components were dissolved in 3 mL freshly distille tetrahydrofuran (THF) in a glass beaker using a magnetic stirrer. The resulting solutio were poured into teflon beakers 20 mm in diameter and left to dry for 48 h. The disks mm in diameter were cut from the parent membranes and then attached to the PVC sens bodies with a mixture of PVC and cyclohexanone. The thickness of the prepared mem branes was 0.4 mm. After drying the glue, the inner parts of the resulted electrodes we filled with a mixture of 0.01 M NaHCO3 and 0.001 M NaCl. The chloride anion is require for Ag/AgCl electrode functioning, while bicarbonate is needed to ensure its constant co tent in the membrane phase required by Nikolsky formalism. The water for making a aqueous solutions was obtained from distiller GFL 2102 (GFL Burgwedel, Germany). Th

Potentiometric Sensors Based on New Ligands
Four new ligands were employed for preparation of PVC-plasticized sensor membranes. The compositions of the membranes are given in Table 2. Each membrane contained 50 mmol/kg of ionophore and 10 mmol/kg of TDMA-NO 3 as the anion-exchanger. The ratio between PVC and a solvent-plasticizer was 1:2 and the total weight of membrane was 300 mg. o-nitrophenyloctyl ether (NPOE) was employed as the solvent-plasticizers. Polymeric sensor membranes were prepared according to the standard protocol. The weighted amounts of all membrane components were dissolved in 3 mL freshly distilled tetrahydrofuran (THF) in a glass beaker using a magnetic stirrer. The resulting solutions were poured into teflon beakers 20 mm in diameter and left to dry for 48 h. The disks 8 mm in diameter were cut from the parent membranes and then attached to the PVC sensor bodies with a mixture of PVC and cyclohexanone. The thickness of the prepared membranes was 0.4 mm. After drying the glue, the inner parts of the resulted electrodes were filled with a mixture of 0.01 M NaHCO 3 and 0.001 M NaCl. The chloride anion is required for Ag/AgCl electrode functioning, while bicarbonate is needed to ensure its constant content in the membrane phase required by Nikolsky formalism. The water for making all aqueous solutions was obtained from distiller GFL 2102 (GFL Burgwedel, Germany). The conductivity of the mono-distillate is 2.2 µs/cm at 25 • C. Finally, internal Ag/AgCl electrodes were embedded in the sensors. Three replicate sensors were prepared from each membrane composition.
Potentiometric measurements were performed with a multichannel digital mV-meter KHAN-32 (Sensor Systems LLC, St. Petersburg, Russia) connected to a PC for data acquisition through USB port. The measurements were done against the standard reference Ag/AgCl electrode ESr-10101 (Izmeritel'naya Tekhnika, Moscow, Russia). The glass pH sensor PY-41 (GOnDO Electronic Co., Ltd., Taipei, Taiwan) was employed to control pH values in sample solutions.
Sensor sensitivities were studied in aqueous solutions of inorganic salts (sodium sulphate, sodium chloride, sodium nitrate, sodium bicarbonate, and monosodium phosphate) in the concentration range from 10 −6 to 10 −2 M. The sensitivity values were calculated as the slopes of the linear parts of sensor response curves (10 −4 -10 −2 M). The sensors were washed with several portions of distilled water before, after, and between the measurements until the constant values of potential was achieved.
Selectivity coefficients were obtained by separate-solution method, which is also known as bi-ionic potentials method [18]: E I and E J were registered in 10 −3 M solutions of primary and interfering ions, respectively.

QSPR Modelling
Substructural Molecular Fragments (SMF) were employed as descriptors for encoding molecular structures of ligands. This method interprets a molecular structure as a graph so that all possible molecular fragments are subgraphs. These fragments with given length are found and counted and their numbers are placed in a resulted descriptor matrix.
"MolFrag", which is a part of "ISIDA QSPR" software suite was used to obtain SMF [19]. SMF in ISIDA can be described in two ways: either as sequences of atoms and/or bonds (topological path) or as a selected ("augmented") atom (atom-centred fragments) with its environment, which can be atoms, bonds, or both of them. Both ways are illustrated in Figure 2.
sensor PY-41 (GOnDO Electronic Co., Ltd., Taipei, Taiwan) was employed to control pH values in sample solutions.
Sensor sensitivities were studied in aqueous solutions of inorganic salts (sodium sulphate, sodium chloride, sodium nitrate, sodium bicarbonate, and monosodium phosphate) in the concentration range from 10 −6 to 10 −2 M. The sensitivity values were calculated as the slopes of the linear parts of sensor response curves (10 −4 -10 −2 M). The sensors were washed with several portions of distilled water before, after, and between the measurements until the constant values of potential was achieved.
Selectivity coefficients were obtained by separate-solution method, which is also known as bi-ionic potentials method [18]: log K IJ = z I F(E J -E I ) RT ln 10 + (1z I z J ) log a I ⁄ . (3) E and E were registered in 10 −3 M solutions of primary and interfering ions, respectively.

QSPR Modelling
Substructural Molecular Fragments (SMF) were employed as descriptors for encoding molecular structures of ligands. This method interprets a molecular structure as a graph so that all possible molecular fragments are subgraphs. These fragments with given length are found and counted and their numbers are placed in a resulted descriptor matrix.
"MolFrag", which is a part of "ISIDA QSPR" software suite was used to obtain SMF [19]. SMF in ISIDA can be described in two ways: either as sequences of atoms and/or bonds (topological path) or as a selected ("augmented") atom (atom-centred fragments) with its environment, which can be atoms, bonds, or both of them. Both ways are illustrated in Figure 2.  In this research we chose atom and bond sequences for molecular description. Only the shortest paths from one atom to the other were considered. The minimal and the maximal lengths of atom-bonds chains in this research were 4 and 10, respectively.
The generation of descriptors resulted in the matrix 40 [number of literature compounds] × 1103 [calculated structural descriptors], which was extended with one more column containing dielectric constants of the solvent-plasticizers employed in the sensor membrane preparation. The resulting matrix was used to establish the regression model relating these descriptors with logK sel (HCO 3 − /Cl − ) selectivity values of the corresponding potentiometric sensors.
Partial Least Squares (PLS) algorithm was applied for constructing multivariate regressions. PLS is a very popular chemometric tool for such applications and it is widely applied in QSPR studies [20][21][22]. Briefly, PLS seeks to calculate the coefficients B in the equation: where Y is a vector containing target property values (logK sel (HCO 3 − /Cl − ) values in this study) for each of the cases in the training set and X is the matrix containing descriptive variables for each of the cases (molecular descriptors in this study). The PLS algorithm is looking for the variables in descriptors that would be highly correlated to the variance in the Y vector of the training data. These correlated variables will get the highest values of regression coefficients. The detailed description of PLS can be found elsewhere [23].
PLS models were calculated in The Unscrambler 9.7 software package (CAMO, Norway). Full cross-validation was applied to optimize the number of latent variables and the number of descriptors in the model using RMSECV value (root mean square error of cross-validation) as a criterion: where y pred and y real are the predicted and real values of the target property in each run of full cross-validation and n is the number of samples. Prior to modelling, the values of descriptors in the X matrix were autoscaled (the column wise average was subtracted from each element and the result was divided by column wise standard deviation).
The resulted model was used to predict the logK sel (HCO 3 − /Cl − ) values for the newly synthesized ligands.

Results and Discussion
The matrix of molecular descriptors acquired for 40 ligands was related to their logK sel (HCO 3 − /Cl − ) values using PLS regression. The initially obtained model was optimized with respect to the number of variables using regression coefficients values in the model. All variables having regression coefficients in the range [−5 × 10 −3 ; 5 × 10 −3 ] were excluded from consideration, thus yielding 585 variables in the model. The resulting QSPR model is given in Figure 3. Based on the model statistics (RMSECV and R 2 values) it can be concluded that the derived relation is suitable for semi-quantitative prediction of potentiometric selectivity of ionophores. The logK sel (HCO 3 − /Cl − ) values varied from −6 to +6 in the modeled dataset and RMSECV value is 1.5, thus, the model allows distinction between poorly, medium, and highly selective carbonate ligands. This appears to be a promising result considering a wide chemical diversity of the ligands, comparatively small (in QSPR scale) dataset, employment of literature data obtained in obviously non-identical conditions, and overall simplicity of the approach.
The analysis of the regression coefficients of the PLS model allows judging on the importance of particular descriptors and their contribution to the selectivity values. The largest contribution is made by the fragments having the highest absolute values of regression coefficients. Calculated fragments (model variables) were filtered by the presence of at least five ligand structures. Then, the remaining fragments were analysed for repeatability in other fragments, and the smallest fragments were chosen from the row of equivalent fragments. In this way, the fragment C-C-C-O-C=C contains the shorter fragment C-O-C=C, and this shorter fragment was kept for analysis. Finally, the fragments were sorted by the value of the corresponding regression coefficients and only the fragments with absolute values of b < 0.01 were considered. The resulting variable importance graph is shown in Figure 4.  The analysis of the regression coefficients of the PLS model allows judging on the importance of particular descriptors and their contribution to the selectivity values. The largest contribution is made by the fragments having the highest absolute values of regression coefficients. Calculated fragments (model variables) were filtered by the presence of at least five ligand structures. Then, the remaining fragments were analysed for repeatability in other fragments, and the smallest fragments were chosen from the row of equivalent fragments. In this way, the fragment C-C-C-O-C=C contains the shorter fragment C-O-C=C, and this shorter fragment was kept for analysis. Finally, the fragments were sorted by the value of the corresponding regression coefficients and only the fragments with absolute values of b < 0.01 were considered. The resulting variable importance graph is shown in Figure 4.   The analysis of the regression coefficients of the PLS model allows judging on the importance of particular descriptors and their contribution to the selectivity values. The largest contribution is made by the fragments having the highest absolute values of regression coefficients. Calculated fragments (model variables) were filtered by the presence of at least five ligand structures. Then, the remaining fragments were analysed for repeatability in other fragments, and the smallest fragments were chosen from the row of equivalent fragments. In this way, the fragment C-C-C-O-C=C contains the shorter fragment C-O-C=C, and this shorter fragment was kept for analysis. Finally, the fragments were sorted by the value of the corresponding regression coefficients and only the fragments with absolute values of b < 0.01 were considered. The resulting variable importance graph is shown in Figure 4.  It can be seen that the largest negative contribution corresponds to the fragment C=C-C=C-C-C-F. Since the modelled value was logK sel (HCO 3 − /Cl − ), and the more negative it is the higher is the selectivity towards HCO 3 − , this means that such C=C-C=C-C-C-F fragments contribute to greater selectivity towards the hydrocarbonate anion. It should be noted that amongst the 17 fragments with the negative contribution, there are only 2 that have no fluorine (F − ) or C=O groups. These fragments are C=C-C=C-C-O and C-C-N-C and they also have significant negative contributions amongst the chosen fragments (15th and 16th place out of 17).
Ten fragments with the largest negative contributions occur in the trifluoroacetophenone (TFA) group. The chemical structure of this group is shown in Figure 5. The TFA group is part of a significant amount of existing carbonate ionophores. The great exam-ple of ionophore with this group and the great selectivity (logK sel (HCO 3 − /Cl − ) = −3.1) is N, N-Dioctyl-3α, 12α-bis(4-trifluoroacetylbenzoyloxy)-5β-cholan-24-amide (carbonate ionophore VII from Merck catalogue [24]). The structure of this compound can be found in Table S1 in the Supplementary Materials. The TFA group is known to promote carbonate anion bonding through the formation of hydrogen bonds [25].
noted that amongst the 17 fragments with the negative contribution, there are only 2 that have no fluorine (F − ) or C=O groups. These fragments are C=C-C=C-C-O and C-C-N-C and they also have significant negative contributions amongst the chosen fragments (15th and 16th place out of 17).
Ten fragments with the largest negative contributions occur in the trifluoroacetophenone (TFA) group. The chemical structure of this group is shown in Figure 5. The TFA group is part of a significant amount of existing carbonate ionophores. The great example of ionophore with this group and the great selectivity (logK sel (HCO3 − /Cl − ) = −3.1) is N, N-Dioctyl-3α, 12α-bis(4-trifluoroacetylbenzoyloxy)-5β-cholan-24-amide (carbonate ionophore VII from Merck catalogue [24]). The structure of this compound can be found in Table S1 in the Supplementary Materials. The TFA group is known to promote carbonate anion bonding through the formation of hydrogen bonds [25].  [26]. An example of these ionophores is {µ-[4,5-Dimethyl-3,6-bis(octyloxy)-1,2-phenylene]} bis(trifluoroacetato-0) dimercury, which has all the fragments with positive contribution in its structure, which results in logK sel (HCO3 − /Cl − ) = 5.5. The structure of this ligand can be found in Table S1 (Supplementary Materials).
Thus, it can be stated that the constructed QSPR model agrees well with the chemical considerations on ligand structures and the importance of various fragments corresponds to the chemical intuition.
The polymeric sensor membranes prepared with four new ligands were studied with respect to their sensitivity towards inorganic anions and their HCO3 − /Cl − selectivity. The typical response curves of the new sensors are given in Figure S1 (Supplementary Materials) and the calculated sensitivity values are shown in Table S2 (Supplementary Materials). It must be pointed out that the S2 and S3 sensors did not demonstrate sufficient sensitivity towards carbonate.
Using the QSPR model derived from literature data, the selectivities of these four new sensor membrane compositions were predicted. In order to do so, each of the ligands were described with the same set of SMF and the resulting row-vectors were filtered to put them into the correspondence with the filtered set of 585 variables. The resulting data were plugged into the QSPR model, which returned the predicted logK sel (HCO3 − /Cl − ) values. The results of this prediction are given in Table 3 along with the data on selectivity  [26]. An example of these ionophores is {µ-[4,5-Dimethyl-3,6bis(octyloxy)-1,2-phenylene]} bis(trifluoroacetato-0) dimercury, which has all the fragments with positive contribution in its structure, which results in logK sel (HCO 3 − /Cl − ) = 5.5. The structure of this ligand can be found in Table S1 (Supplementary Materials).
Thus, it can be stated that the constructed QSPR model agrees well with the chemical considerations on ligand structures and the importance of various fragments corresponds to the chemical intuition.
The polymeric sensor membranes prepared with four new ligands were studied with respect to their sensitivity towards inorganic anions and their HCO 3 − /Cl − selectivity. The typical response curves of the new sensors are given in Figure S1 (Supplementary Materials) and the calculated sensitivity values are shown in Table S2 (Supplementary Materials). It must be pointed out that the S2 and S3 sensors did not demonstrate sufficient sensitivity towards carbonate.
Using the QSPR model derived from literature data, the selectivities of these four new sensor membrane compositions were predicted. In order to do so, each of the ligands were described with the same set of SMF and the resulting row-vectors were filtered to put them into the correspondence with the filtered set of 585 variables. The resulting data were plugged into the QSPR model, which returned the predicted logK sel (HCO 3 − /Cl − ) values. The results of this prediction are given in Table 3 along with the data on selectivity that were obtained in a traditional potentiometric experiment using a separate solution method. Taking into account the RMSECV value of the QSPR model of 1.4 logK sel (HCO 3 − /Cl − ), the correspondence between the predicted and the experimental selectivities appears to be good in three out of five cases (S1, S4, S5). The S1 compound has C=C-C=C-C=O, C-C-C=C, C=C-C-C=O, and C=C-C=O, C-C-C=O fragments with negative contribution and a C=C-C=C fragment with a positive contribution. The model predicted the logarithm of the selectivity coefficient with a value of −2.5 while the experimentally found selectivity coefficient is −3.6. The S2 compound contains a C-C-N-C fragment and the rest of the fragments are the same fragments with the first compound. All of these fragments, except C=C-C=C, make a negative contribution to the logarithm of the selectivity coefficient. Thus, there is a noticeable deviation between the predicted logK sel (HCO 3 − /Cl − ) − 3.4 and the experimental value of −0.7. The S3 compound has only a few fragments with a significant model impact. These fragments are C-C-C=C with the negative contribution and C=C-C=C with a positive contribution. As the model has only two structural fragments with a significant impact for this compound, the difference between the prediction of the model which (−3.0) and the experimental data (−1.0) is also high. A rather high discrepancy between prediction and experiment for S2 and S3 can also be explained by the fact that these sensors in fact did not have reasonable response towards carbonate (Table S2).
The S4 compound has fragments such as C=C-C=C-C=O, C=C-C-C=O, C=C-C=O, C-C-C=O, C-C-C=C, O-C-C=C-C=C-C=O, and C-C-N-C with a negative contribution and C-O-C=C, C=C-C=C-O, C-C=C-O, and C=C-C=C with a positive contribution in its composition. Since the number of important fragments is rather high, the agreement between the experiment and the calculation is also rather good (−2.7 vs. −2.5, correspondingly).
Thus, proper prediction requires that the ligands employed for QSPR model construction would share the same important structural fragments with newly proposed ligands. This issue is somehow obvious from the chemical and mathematical intuition as the diversity of the training set determines the applicability domain of the model.

Conclusions
The QSPR model relating the structure of carbonate ionophores to the selectivity (logK sel (HCO 3 − /Cl − ) of the corresponding potentiometric sensors with plasticized polymeric membrane was developed based on the literature data on 40 ligands. The important structural fragments in ligands were identified using the regression coefficients values in the PLS model. The applicability of the model for prediction of the selectivity of four new ligands with different acceptor substituents at carbonyl group was explored. For two ligands that have demonstrated potentiometric sensitivity to hydrocarbonate ions, reasonable agreement in predicted and experimental selectivities was observed. It is noteworthy that the match with experimental data was better for the ligands sharing the same important molecular fragments with those employed for QSPR model construction. We believe these results demonstrate the potential of QSPR approach in the development of novel anion-selective sensors.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/chemosensors10020043/s1, Figure S1: Typical potentiometric response curves of the sensor S2; Table S1: Structure and selectivity of the ionophores employed for modeling; Table S2: Sensitivity values of the sensors in the solution of the studied anions, mV/dec. Funding: This research was supported by RFBR project #20-33-70272.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.