The TDDFT Excitation Energies of the BODIPYs; The DFT and TDDFT Challenge Continues

The derivatives of 4,4-difluoro-4-bora-3a,4a-diaza-s-indacene (BODIPY) are pivotal ingredients for a large number of functional, stimuli-responsive materials and therapeutic molecules based on their photophysical properties, and there is a urgent need to understand and predict their optical traits prior to investing a large amount of resources in preparing them. Density functional theory (DFT) and time-dependent DFT (TDDFT) computations were performed to calculate the excitation energies of the lowest-energy singlet excited state of a large series of common BODIPY derivatives employing various functional aiming at the best possible combination providing the least deviations from the experimental values. Using the common “fudge” correction, a series of combinations was investigated, and a methodology is proposed offering equal or better performances than what is reported in the literature.

From all these previous investigations, the main conclusion is that the methodology that requires the least resources in material science and biomedical research is the application of an empirical correction: "fudge" [18,38,49]. Fudge methods generally consist in applying an empirical correction to calculated data to make them fit with experimental results. Usually, this method is used for TD-DFT computations where the shape of the simulated spectra compare favorably with the experiments but exhibit large offsets in terms of wavelength position.
This work proposes a revisit where new basis sets and computational methods are applied to find the best combination in order to provide a better agreement between computations and experiments. Figure 1 depicts the structure of the BODIPY core and its classical numbering scheme, for the purpose of this paper, the alpha, beta, and meso designation will be used.

Optimisation of the BODIPYs Structures and Orbitals Parameters
Optimization processes were carried out with different functionals, respectively the one used for the TD-DFT calculation. No imaginary frequencies were observed for any of the investigated molecules nor for any functionals used assessing the correct energy minimization of the structure. Regarding structural features, most of the BODIPY core are planar or quasi planar with small deviation up to 30 degree in extreme cases. This issue has already been addressed by Orte and coworkers [27] and, in our case, did not explicitly interfere with the TD-DFT calculations for the estimation of the 0-0 transition.
The frontier molecular orbitals (HOMO (Highest Occupied Molecular Orbital) and LUMO (Lowest Unoccupied Molecular Orbital)) were generated and checked in order to verify that the correct modeling of the S 1 state was achieved. Figure 2 depicts the HOMO-LUMO levels and MO contours for B1, B17, and B23. The HOMO-LUMO gaps are in good agreement with the BODIPY family standard values [50]. Generally, the HOMOs are built upon on the π-systems located on the pyrrole rings. Concurrently, the LUMOs are partially located on the meso-position accompanied by a depletion of the pyrrole π-systems relative to that found for the HOMOs. Minor contributions are also computed on the two fluorides of the BF 2 group. When bulky groups, aromatics or conjugated systems are present at the meso-position, the LUMOs tend to extend towards these groups directly linked or through a heteroatom (i.e., oxygen or nitrogen). This interaction could be described as an intramolecular partial charge transfer.

TD-DFT Assessing the First Excited State
The lowest energy spin-allowed electronic transitions calculated from TD-DFT computations have been correlated with the maximum 0-0 peak in the absorption experimental spectra. Figure 3 shows these correlations compared with the ideal case where these values are identical. The computed values are systematically lower than the experimental, which is consistent with the literature [35,36].  Tables 1 and 2, and the appropriate solvent field has been applied.
To pin down what computational method appears the best, least square regressions have been performed on each dataset. Each of them consists of the lowest energy spinallowed electronic transitions obtained from TD-DFT (as described in the computational details) and the positions of the maximum intensity peak (i.e., 0-0) in UV-vis spectra. Two parameters have been extracted: the R 2 correlation and the slope. A R 2 value approaching one implies a perfect or quasi-perfect correlation between computed and experimental wavelengths. Concurrently, a slope approaching one means that the deltas between each molecular species are the same numerical values. The "sensitivity" would be ideal in this case. Figure 3h displays these correlation parameters after a linear regression for each method for all 30 species. ωB97X-D seems to be the best suitable functional with the BODIPY dyes along with CAM-B3LYP as well as RHF methods. Figure 4 regroups the correlation parameters R 2 separated by BODIPY groups as defined in "Investigated Structures" section. The CAM-B3LYP and ωB97X-D functionals as well as the RHF method appear to give results with good correlation with experiments (blue bars). The presence of an aromatic group (at the meso position in these series) has the greatest impact on the correlation mainly for the B3LYP, PBE, and PBE0. This, with the fact that ωB97X-D functional is giving the best results, indicates that the long-distance electronic correlation is important for this purpose. Indeed, this functional includes a 100% longrange exact exchange in its definition. For the investigated dyes, this interaction seems to be primordial during the interaction between light and a conjugated system and the formation of the charge transfer S 1 state. In comparison with a previous study on generic organic molecules, a more exact long-range term is needed (~25% vs. 100%) [58]. In the cases of the B3LYP, PBE, PBE0, and TPSSh and RHF methods, low R 2 values are obtained for molecules containing aromatic groups placed at the meso and α and β positions and substituents at various positions on the aromatic groups (B10-B13). However, an investigation as whether the inclusion or exclusion of specific groups on the BODIPY skeleton have any effect, has been examined. However, this endeavor was inconclusive. Adding or removing dyes has a neglectable effect on the resulting fits (R 2 and slope) for each series. These tests have been performed by removing one dye at the time and examining what the effect on R 2 and slope are (Table A2).

Model Validation
In order to validate this computational model obtained with the training series (dark dots; (B1-B30), a test group (P1-P10) represented in Figure 3 by empty circles were added to the graphs. The test group is composed of analogous ring-fused and larger cyclic pyrrole-based dyes such as BOPHYs (bis(difluoroboron)1,2-bis((1H-pyrrol-2-yl)methylene)hydrazine (P1-P4)), free base porphyrinoides (tetraphenyl porphyrin P5, tetrabenzonporphyrin P6, phtalocyanine P7, and corrole P8) and diketo-pyrrolopyrroles (P9-P10) with common substituents to ensure a high degree of diversity. With little to no surprise, the best fits were again observed for the CAM-B3LYP and the ωB97X-D functionals as well as the RHF method. For B3LYP, PBE, and PBE0, the results are spread too widely to consider them as adequate functionals for this purpose.

The "Fudge Factor" Approach
Regarding the deviation of the slopes from the theoretical perfect theoretical match (black line in Figure 3), similarly to many other research groups, a "fudge factor" correction is also applied. This approach is generally accepted and is motivated on the ratio computation time vs quality of the method permitting to achieve a correlation slope close to 1 with a maximal R 2 . Based on the large computational investigation above, ωB97X-D with 6.311g (d,p) basis set appear the most appropriate for the largest set of different substrates ( Figure 4) and were selected to test the fudge factor ( Figure 5). The calculated data set was plotted on Figure 5 with a R 2 of 0.97 and a slope of 0.917. After a mathematical correction of the data set, a simple linear equation was established in order to achieve a reasonable precision and accuracy. This simple correction procedure is a common method used throughout the literature and allows to estimate absorption 0-0 peak position with quick and low cost calculations [18,38,49]. The resulting sought equation is placed inside Figure 5.

Materials and Methods
The computations have been carried out with the Gaussian16 package [59] and ORCA 4.2.0 [60]. The calculations consisted in a simple three step-procedure. The geometries of all 30 BODIPYs and related pyrroliques dyes were preoptimized in their ground state with the B3LYP functional in conjunction with the 6-311g(d,p) basis set, as they are generally robust parameters for organic molecules in the literature. The optimized geometries were taken as starting points for optimization with solvent model (CPCM [61]) with each functional described below. Finally, TD-DFT computations were performed using the basis set 6-311g(d,p) in conjugation with def2-TZVP (Valence triple-zeta basis set) for heavy atoms. More precisely, B3LYP [62,63], CAM-B3LYP [64], PBE [65], PBE0 [66], TPSSh [67], and ωB97X-D [68] in addition of the RHF method [69] were used. Geometries were kept fixed during the TD-DFT computations and the excited state of interest (first excitation state, N = 1) was retrieved for each molecule. DFT integration grid was set to 4 and the final grid was set to 5. All other parameters were kept at their default values. Table A1 regroups all numerical data calculated for each method. The fudge correction was performed in two steps: first the calculated parameters were plotted against the experimental ones and fitted with a linear regression. Slope, R 2 and intercept were obtained. Second, the equation of the linear regression was then equalized with the x = y diagonal to obtain a translation equation.

Conclusions
An improved computational methodology for the prediction of the low-energy absorption peak of the BODIPY dyes has been developed. This method appears as a promising method with a lot of potential for applications because it is simple, cost effective and relatively accurate. As stated in the Introduction, many studies have addressed this important problem, but a good all-around method was not yet available. A wide benchmark series of TD-DFT methods, namely, to examine a large spread of possible wavelength values in the absorption spectra of a series of 30 BODIPYs, was employed and then validated with a test group of 10 related structures. The investigated BODIPYs included anchored groups such aromatics, electro-acceptor/donor substituents and saturated carbons chains to ensure that the design methodology could be applied in a more general manner. The best results were obtained using the ωB97X-D functional (with the basis set 6-311g(d,p)) giving the best correlation parameters (R 2 > 0.97). Concurrently, this study also pointed out that the electronic correlation at long distance is important to describe the BODIPYs first excited state. However, in the classic situation for a need of anticipated absorption maximum of the S 0 → S 1 transition for the BODIPY dyes, the "fudge factor" approach is a relatively affordable (i.e., best ratio accuracy vs computational resources) method (meaning relatively acceptable predicted values for relatively short computational time).