Solvatochromic Sensitivity of BODIPY Probes: A New Tool for Selecting Fluorophores and Polarity Mapping

This research work is devoted to collecting a high-quality dataset of BODIPYs in a series of 10–30 solvents. In total, 115 individual compounds in 71 solvents are represented by 1698 arrays of the spectral and photophysical properties of the fluorophore. Each dye for a series of solvents is characterized by a calculated value of solvatochromic sensitivity according to a semiempirical approach applied to a series of solvents. The whole dataset is classified into 6 and 24 clusters of solvatochromic sensitivity, from high negative to high positive solvatochromism. The results of the analysis are visualized by the polarity mapping plots depicting, in terms of wavenumbers, the absorption versus emission, stokes shift versus − (absorption maxima + emission maxima), and quantum yield versus stokes shift. An analysis of the clusters combining several dyes in an individual series of solvents shows that dyes of a high solvatochromic sensitivity demonstrate regular behaviour of the corresponding plots suitable for polarity and viscosity mapping. The fluorophores collected in this study represent a high quality dataset of pattern dyes for analytical and bioanalytical applications. The developed tools could be applied for the analysis of the applicability domain of the fluorescent sensors.

Routine information regarding the spectral and photophysical properties of dyes important for the further development of BODIPY fluorophores and discussions regarding their structure-property relationships are dispersed in numerous papers and reviews that have been published within the last two decades. In the original research, several libraries of BODIPY for intracellular cell imaging have been reported for individual solvents, namely: dimethylsulfoxide [35][36][37][38][39][40], methanol [41,42], and HEPES buffer solution [43].
A general overview of the spectral properties, Stokes shift, and quantum yield for the whole dataset is demonstrated by the frequency diagrams in Figure 1. The properties correspond to the whole range of 71 solvents depicted in Table S3

Cluster Analysis of the Solvatochromic Sensitivity 2.2.1. Six-Cluster Dataset of High Solvatochromic Sensitivity
The total dataset of 115 BODIPY dyes and 1698 combinations of dye and a solvent was divided into six clusters according to the solvatochromic sensitivity parameter A. The solvatochromic sensitivity (or polarity) of each compound in the whole dataset of BODIPY sensors varied in the range of −1.3 to +1.2. Each cluster combined several series of dyes, and a series of solvents for each dye appeared only once, according to the arrangements of the data.
Parameter A, representing solvatochromic sensitivity, was defined according to a method described in Section 3.3, through the use of the absorption and emission maxima of the fluorophore in a series of solvents.
The chemical structures of the BODIPY sensors classified according to the above procedure of clustering are provided in the six chats below (Schemes 1-6).  The spectral and photophysical properties of the dyes in the clusters are presented below in several diagrams. Figure 2 shows the relationship of the absorption maxima versus the emission maxima according to Equation (4) in terms of nm. Figure 3 demonstrates the relationship of the Stokes shift versus − (Absorption maxima + Emission maxima), which corresponds to the basic equation of Equation (1) and is expressed in 1000/cm.   The clustering of the dyes according to solvatochromic sensitivity parameter A helps understand the general concept of the solvatochromic classification of the dyes. For instance, the plot for absorption maxima versus emission maxima for cluster 1 in Figures 2 and 3 exhibiting high positive solvatochromism depicts several vertically oriented asterisms with an approximately constant level of absorption in the solvents of a different nature, while the emission properties are varied in a wide range, indicating the high polarity of the molecule in the excited state. On the contrary, the absorption versus emission plot for cluster 6 demonstrates a wide variation in the absorption maxima and a low variation of emission characterizing the low polarity of the molecules in the excited state. Such behavior for the fluorophore is not typical. The increased values of the Stokes shift in cluster 1 in Figures 4 and 5 are shifted to the positive direction of the x-axis (less negative values), while those in cluster 6 to the opposite side follow the values of parameter A. If we take into consideration the other two pairs of symmetric clusters 2, 5, and 3, 4, shown in Figures 4 and 5, respectively, the characteristic behavior of the less polar dyes is similar to clusters 1 and 6, but not much more pronounced.     The frequency diagram shown in Figure 6 demonstrates a high quantum yield for a large number of dyes bearing near-zero solvatochromism (clusters 4 and 5), indicating the role of the dye polarity in the quantum yield. The plot of toe quantum yield vs Stokes shift, shown in Figure 7, is convenient for practical use. The frequency diagram shown in Figure 6 demonstrates a high quantum yield for a large number of dyes bearing near-zero solvatochromism (clusters 4 and 5), indicating the role of the dye polarity in the quantum yield. The plot of toe quantum yield vs Stokes shift, shown in Figure 7, is convenient for practical use.    Figure 1d. For the key for the clusters, see Schemes 1-6.

24-Cluster Dataset of High Solvatochromic Sensitivity
More detailed clustering of the dataset up to count 24 helps to identify the more regular behavior of the dyes in several clusters. The first one, the absorption maxima versus the emission maxima, is shown in Figure 8, for an approximate visualization of the absorption and the emission spectral range of the BODIPY fluorophores. Figure 8 illustrates the general semiempirical rule for the relationships between the absorption and emission wavenumbers approximated by Equation (4). Figure 9 clarifies the role of the relationship between the Stokes shift and − (Absorption maxima + Emission maxima) for polar sensors according to Equation (1). The plot of the quantum yield versus Stokes shift in Figure 10 supports understanding a common statement for polarity sensitive fluorophores-the decrease in quantum yield is accompanied by an increase in the Stokes shift.

24-Cluster Dataset of High Solvatochromic Sensitivity
More detailed clustering of the dataset up to count 24 helps to identify the more regular behavior of the dyes in several clusters. The first one, the absorption maxima versus the emission maxima, is shown in Figure 8, for an approximate visualization of the absorption and the emission spectral range of the BODIPY fluorophores. Figure 8 illustrates the general semiempirical rule for the relationships between the absorption and emission wavenumbers approximated by Equation (4). Figure 9 clarifies the role of the relationship between the Stokes shift and − (Absorption maxima + Emission maxima) for polar sensors according to Equation (1). The plot of the quantum yield versus Stokes shift in Figure 10 supports understanding a common statement for polarity sensitive fluorophores-the decrease in quantum yield is accompanied by an increase in the Stokes shift.       It is noteworthy that the diagram for quantum yield versus Stokes shift in Figure 10 for several dyes in clusters 1, 2, and 23, 24 could be approximated by one curve, respectively, while the plot for the Stokes shift versus − (absorption maxima + emission maxima) for the correspondent clusters in Figure 9 could be approximated by two lines. As an exception, a slight increase in the quantum yield versus Stokes shift is exhibited for parent aza-BODIPY, 2011 Banuelos(3437)-BTAA [69], which is provided in the dataset for comparison with the dipyrrin dyes. A similar behavior is demonstrated by 2018 Leen-BODIPY [93] in cluster 22/24, which could be attributed to the keto-enol tautomerism of the diketone fragment of the molecule. ular behavior for dyes bearing a high positive and high negative solvatochromic sensitivity, respectively. Those fluorophores exhibiting a charge transfer in polar solvents are suitable for polarity mapping due to the high deviation in the absorption and emission wavelength, as well as the quantum yield and Stokes shift. The chemical structures of the corresponding dyes are shown below in Schemes 7 and 8. Several compounds in this multitude coincide with those estimated in chemoinformatics research based on the submolecular fragments approach [57]. It is noteworthy that the diagram for quantum yield versus Stokes shift in Figure 10 for several dyes in clusters 1, 2, and 23, 24 could be approximated by one curve, respectively, while the plot for the Stokes shift versus − (absorption maxima + emission maxima) for the correspondent clusters in Figure 9 could be approximated by two lines. As an exception, a slight increase in the quantum yield versus Stokes shift is exhibited for parent aza-BODIPY, 2011 Banuelos(3437)-BTAA [69], which is provided in the dataset for comparison with the dipyrrin dyes. A similar behavior is demonstrated by 2018 Leen-BODIPY Scheme 8.
Chemical structures of dyes in clusters 23/24 and 24/24, bearing a high negative solvatochromism.

BODIPYs Bearing Near-Zero Solvatochromism
The diagrams in Figure 10  The individual dyes covering a wide range of quantum yields are exemplified by following compounds shown in Scheme 10.
Several dyes mentioned above in this section exhibited molecular rotor effects example, those in clusters 7a, 9a, 9b, and 19. The dye in cluster 7 is known to exhibit a h quantum yield in solvents with a high polarity. This observation demonstrates the plicability of the developed tool for selecting efficient fluorophores demonstrating a spectral sensitivity and high sensitivity of quantum yield to the polar solvents.

Analysis of Duplicates and Evaluation of Deviations of Spectral Data
A new tool for selecting the fluorescent probes proposed above is convenient for analysis of deviation of experimental data for identical compounds. In essence, the p lem of duplicates in chemoinformatics is solved by special algorithms based on, for ex ple, SMILES or IUPAC InChI formats [107]. Normally, duplicate chemical structures deleted from the dataset. However, spectral data for the essayed repeated compound well as their solvatochromic sensitivity, could be used for evaluating the deviation of correspondent properties in various experiments.
The chemical structures of the duplicates of the dataset collected in the curren search are represented in Scheme 11. The deviation of the solvatochromic sensitivity rameter A for each duplicate is shown in Figures 11 and 12. The solvent count is sho above each column. All of the cases with a solvent count equal to or above 10 corresp The individual dyes covering a wide range of quantum yields are exemplified by the following compounds shown in Scheme 10.
Several dyes mentioned above in this section exhibited molecular rotor effects, for example, those in clusters 7a, 9a, 9b, and 19. The dye in cluster 7 is known to exhibit a high quantum yield in solvents with a high polarity. This observation demonstrates the applicability of the developed tool for selecting efficient fluorophores demonstrating a low spectral sensitivity and high sensitivity of quantum yield to the polar solvents.

Analysis of Duplicates and Evaluation of Deviations of Spectral Data
A new tool for selecting the fluorescent probes proposed above is convenient for the analysis of deviation of experimental data for identical compounds. In essence, the problem of duplicates in chemoinformatics is solved by special algorithms based on, for example, SMILES or IUPAC InChI formats [107]. Normally, duplicate chemical structures are deleted from the dataset. However, spectral data for the essayed repeated compounds, as well as their solvatochromic sensitivity, could be used for evaluating the deviation of the correspondent properties in various experiments. The chemical structures of the duplicates of the dataset collected in the current research are represented in Scheme 11. The deviation of the solvatochromic sensitivity parameter A for each duplicate is shown in Figures 11 and 12. The solvent count is shown above each column. All of the cases with a solvent count equal to or above 10 correspond to the dataset in Table 1, while a solvent count below 10 corresponds to the extra dataset of duplicates shown in Table 2, which were collected from the literature for a comparison of the spectral data.

4, x FOR PEER REVIEW 19 of 31
Scheme 11. Chemical structure of the duplicates of the target dataset studied in the current research. Figure 11. Solvatochromic sensitivity parameter A for the duplicates of the studied dataset, solvent count ≥ 10 for a target dataset, and solvent count ≤ 9 for an extra dataset.

List of Dyes 1999 Lopez Arbeloa(315)-PM546 [108]
The analysis of the solvatochromic sensitivity parameter A for all duplicates is a compact form of comparison of the data for solvatochromic behaviour compared with the whole spectral dataset. The solvatochromic sensitivity of the compounds of the groups G, M, N, O, P, Q, and R lies within the limits of the experimental error shown in the dataset represented in files of SI.csv and SI.sdf. However, several other compounds depict a high

List of Dyes
The analysis of the solvatochromic sensitivity parameter A for all duplicates is a compact form of comparison of the data for solvatochromic behaviour compared with the whole spectral dataset. The solvatochromic sensitivity of the compounds of the groups G, M, N, O, P, Q, and R lies within the limits of the experimental error shown in the dataset represented in files of SI.csv and SI.sdf. However, several other compounds depict a high deviation, for example, a parent compound 2009 Arroyo-1 in group A, 2015 Waddell-1, 2015 Waddell-2, and the compounds in groups I and J.
It is noteworthy that several duplicate compounds have been studied in a number of experiments for quite different solvents. For example, the dyes in group F demonstrate a similar solvatochromic sensitivity with a positive solvatochromism (A > 0) for a series of 16, 22, and 6 solvents.
According to the data in the diagram shown in Figure 1, the mean standard deviation SD and standard deviation of the mean SD/ √ n for the solvatochromic sensitivity are estimated as follows, respectively:

Dataset of BODIPYs
The list of dyes included a total dataset of 115 BODIPYs and the corresponding references are shown above in Table 1 (Section 2.1). The dataset of BODIPY fluorophores collected in this study are listed in SI.csv and SI.sdf files with a correspondent count for solvents; references and IDs generated from the year of publication, first author, and title of the compound in the paper. The chemical structure in SMILES format and the direct code of JChem [107] was used for arranging the data. Each series of spectral and photophysical data are represented by 10-30 solvents for the analysis of the solvatochromic effect.
The chemical structures of the fluorophores are depicted in Schemes 1-6 in Section 2.2.1.

Evaluation of Statistical Deviations in Spectral and Photophysical Data
The dispersion of spectral data was evaluated through the use of duplicates in the dataset. For this purpose, additional data for spectral behaviour of BODIPYs covering six to nine solvents in each series, shown in Table 2 (Section 2.3) (see also Table S2 in Supplementary Information), were used along with software for the analysis and modelling of the molecular properties, such as DataWarrior [135] and ChemMine Tools [136].

Calculation of Solvatochromic Sensitivity of BODIPY Fluorophores
In previous research, a novel methodology was reported for studying the solvatochromism of BODIPYs [56,57]. This approach explores the empirical generalization of solvents' functions in Liptay's theory [137] of solvatochromism yielding an easy approach for treating a wide variety of spectral data published in the literature. The corresponding semiempirical equations are expressed as follows: where ν(Abs), ν(Em) is the wavenumbers that correspond to the absorption and emission maxima; A, B are the solvatochromic sensitivity coefficients, evaluated as regression coefficients of the linear approximation ν(Abs) − ν(Em) vs −[ν(Abs) + ν(Em)]; and µ(g) and µ(e) are the vectors of the dipole moments of the fluorophore in the ground and excited states. For simplicity, the vector symbol is omitted. Parameter A, described by Equation (5), is written in a more general form than in previous research [56,57] (see Supplementary Information). It helps to use dipole moments in vector form with no additional assumptions because the square of a vector function has a scalar value.
As far as parameter A is a function of dipole moments controlling the solvation of the fluorophore and dye spectral behavior in the ground and excited states, it is reasonable to use it as a quantitative measure of the solvatochromic sensitivity.
Coefficient A can be experimentally evaluated from the regression coefficients of Equations (1) In the case of high positive and negative solvatochromism (Figures 13 and 14), the Stokes shift exhibits a low statistical deviation from the linear correlation suitable for the estimation of parameter A; in the case of near-zero solvatochromism (Figure 15), the absorption or emission wavenumbers are more suitable for this purpose.
The experimentally evaluated parameters of Equations (1a)-(3a), (7), and (8), as well as their statistical characteristics for a dataset of 115 BODIPYs, are provided in the Supplementary Information.   In the case of high positive and negative solvatochromism (Figures 13 and 14), the Stokes shift exhibits a low statistical deviation from the linear correlation suitable for the estimation of parameter A; in the case of near-zero solvatochromism (Figure 15), the absorption or emission wavenumbers are more suitable for this purpose.
The experimentally evaluated parameters of Equations (1a)-(3a), (7), and (8), as well as their statistical characteristics for a dataset of 115 BODIPYs, are provided in the Supplementary Information.

Clustering the Properties of the Fluorophores and the Graphical Analysis
The clustering of the experimental data for a whole family of BODIPYs was performed by the parameters of solvatochromic sensitivity parameter A. This method of presentation helps to highlight each dye in a series of solvents, as well as to compare different series with a similar solvatochromic sensitivity. A graphical analysis of the spectral and photophysical properties of the collected BODIPYs was performed with the software Scimago Graphica [138].

Conclusions
Based on published results, a dataset of 115 BODIPY sensors in a series of 10-30 solvents with correspondent spectral and photophysical characteristics was collected. The solvatochromic sensitivity of each dye was evaluated by a recently developed semiempirical approach for the analysis of the solvatochromism. The polarity of the dyes was divided into 6 and 24 clusters from high negative to high positive levels, and was applied for the classification of fluorophores and an analysis of their properties.
The analysis of clusters combining several dyes in an individual series of solvents shows that dyes with a high polarity, with either a positive or negative solvatochromism, demonstrated regular changes in the absorption maxima versus emission maxima, Stokes shift versus -(absorption maxima + emission maxima), and quantum yield versus Stokes shift in a wide range of properties. Non-polar probes were arranged into characteristic clusters with quantum yields of 0.02-1 and Stokes shifts of 170-860 cm −1 , 1270-1400 cm −1 , and 2190-2800 cm −1 .
The developed tool based on the classification of solvatochromic sensitivity is suitable for the efficient selection of fluorescent probes, for both the polarity and viscosity sensing of biological tissue through the use of individual dyes and a set of dyes. The selected BODIPY probes could serve as pattern samples for the testing and design of the fluorophores.
In the case of coplanarity of the vectors of the dipole moments µ(g) and µ(e), they can be used as scalar parameters, thus Expressions (5) and (8) are simplified for a clear understanding of the physicochemical meaning of parameter A defining the solvatochromic sensitivity: An illustration of the calculation method is provided in the papers [56,57]. Some additional examples are given below in Figures 13-15 for the fluorophores discussed above. In the case of high positive and negative solvatochromism (Figures 13 and 14), the Stokes shift exhibits a low statistical deviation from the linear correlation suitable for the estimation of parameter A; in the case of near-zero solvatochromism (Figure 15), the absorption or emission wavenumbers are more suitable for this purpose.

Clustering the Properties of the Fluorophores and the Graphical Analysis
The clustering of the experimental data for a whole family of BODIPYs was performed by the parameters of solvatochromic sensitivity parameter A. This method of presentation helps to highlight each dye in a series of solvents, as well as to compare different series with a similar solvatochromic sensitivity. A graphical analysis of the spectral and photophysical properties of the collected BODIPYs was performed with the software Scimago Graphica [138].

Conclusions
Based on published results, a dataset of 115 BODIPY sensors in a series of 10-30 solvents with correspondent spectral and photophysical characteristics was collected. The solvatochromic sensitivity of each dye was evaluated by a recently developed semiempirical approach for the analysis of the solvatochromism. The polarity of the dyes was divided into 6 and 24 clusters from high negative to high positive levels, and was applied for the classification of fluorophores and an analysis of their properties.
The analysis of clusters combining several dyes in an individual series of solvents shows that dyes with a high polarity, with either a positive or negative solvatochromism, demonstrated regular changes in the absorption maxima versus emission maxima, Stokes shift versus − (absorption maxima + emission maxima), and quantum yield versus Stokes shift in a wide range of properties. Non-polar probes were arranged into characteristic clusters with quantum yields of 0.02-1 and Stokes shifts of 170-860 cm −1 , 1270-1400 cm −1 , and 2190-2800 cm −1 .
The developed tool based on the classification of solvatochromic sensitivity is suitable for the efficient selection of fluorescent probes, for both the polarity and viscosity sensing of biological tissue through the use of individual dyes and a set of dyes. The selected BODIPY probes could serve as pattern samples for the testing and design of the fluorophores.