Unravelling the Role of Uncommon Hydrogen Bonds in Cyclodextrin Ferrociphenol Supramolecular Complexes: A Computational Modelling and Experimental Study

We sought to determine the cyclodextrins (CDs) best suited to solubilize a patented succinimido-ferrocidiphenol (SuccFerr), a compound from the ferrociphenol family having powerful anticancer activity but low water solubility. Phase solubility experiments and computational modelling were carried out on various CDs. For the latter, several CD-SuccFerr complexes were built starting from combinations of one or two CD(s) where the methylation of CD oxygen atoms was systematically changed to end up with a database of ca. 13 k models. Modelling and phase solubility experiments seem to indicate the predominance of supramolecular assemblies of SuccFerr with two CDs and the superiority of randomly methylated β-cyclodextrins (RAMEβCDs). In addition, modelling shows that there are several competing combinations of inserted moieties of SuccFerr. Furthermore, the models show that ferrocene can contribute to high stabilization by making atypical hydrogen bonds between Fe and the hydroxyl groups of CDs (single bond with one OH or clamp with two OH of the same glucose unit).


Cyclodextrins
Cyclodextrins (CDs) are cyclic oligosaccharides formed by glucose units bound together. Nowadays, they are used as excipients to improve the solubility of lipophilic compounds. Typical cyclodextrins contain six to eight glucose units in a ring and are classified as α-cyclodextrin with six glucose subunits, β-cyclodextrin with seven glucose subunits (Figure 1) and γ-cyclodextrin with eight glucose subunits. Cyclodextrins are frequently transformed by chemical modification of the hydroxyl groups, for example, to form alkyloxy derivatives (O-Me, O-Et, etc.) [1].
The most commercially viable strategies lead to the synthesis of partially and randomly alkylated cyclodextrins [2,3]. Chemically modified cyclodextrin derivatives available on the market are mixtures with different degrees of substitution [4]. These cyclodextrins can be used as excipients in many pharmaceutical formulations (e.g., tablets, aqueous parenteral solutions, eye drop solutions and nasal sprays) [5][6][7][8]. Although these randomly substituted cyclodextrins are widely used as solubilizers, they can form a wide range of complexes, making the characterization of these mixtures quite challenging. Models of these mixtures are usually focused on one or very few well-defined structures.
The term "well-defined" implies that all glucose units are identical. Figure 1 shows an example, 2-Me-βCD, in which all 2-positions (glucose numbering is not indicated) are methylated, and all 3- and 6-positions are unmethylated. CDs having at least two different glucose units are designated as undefined CDs in this work.
Figure 1. Representation of a β-cyclodextrin (CD, here 2-Me-βCD, belonging to the category of well-defined CDs) with the oxygen atom numbering used for our modelling and database (glucose numbering is not displayed).
Although chemical modification of cyclodextrins might seem quite simple, it is in fact particularly challenging due to the presence of -OH groups in large numbers. More complex methods for selective modifications of cyclodextrins have been reviewed [9] and are currently used to prepare very specific varieties of them, such as the heptakis(2,3,6-tri-O-methyl)-β-cyclodextrin or the heptakis-(2,6-di-O-methyl)-β-cyclodextrin.
These cyclodextrins are much more expensive than the commercially available randomly methylated cyclodextrins (RAMEβCDs), making their use very restricted. In fact, researchers are often led to focus on cheaper, randomly substituted cyclodextrins for animal studies and for clinical trials. A non-trivial question arises when modelling cyclodextrins: what is the difference between solutions prepared with a well-defined cyclodextrin and with a mixture of randomly substituted cyclodextrins? We already know that the association constant of randomly methylated cyclodextrins depends greatly on their degree of methylation [10].

Ferrociphenol SuccFerr
In this paper, we report a series of formulation experiments with CDs on a ferrocidiphenol derivative (a ferrociphenol with an additional p-hydroxyphenyl group: N-{4-ferrocenyl-5,5-bis-(4-hydroxy-phenyl)-pent-4-enyl}succinimide; SuccFerr (or P722), Figure 2) recently developed in our research group. SuccFerr is the result of an optimization of the ferrociphenol family [11] whose members act by apoptosis or senescence [12], where the pharmacophore is the trans (ferrocene-double bond-p-phenol) motif [13] (in blue in Figure 2), and the main metabolite is a quinone-methide [14,15] able to attack proteins [16,17]. The attachment of a succinimidyl group decreased the IC50 to around 40 nM on MDA-MB-231, A2780, A2780-Cis and K562 cell lines [18] thanks to an intramolecular stabilization [19]. Moreover, recent research on glioblastomas has shown that SuccFerr has a surprising selectivity on patient-derived cell lines (PDCLs) [20]. This patented compound [21] is currently being tested for its anticancer activity, but a suitable formulation was needed because of its lipophilicity and poor water solubility. SuccFerr was recently formulated with various nanocapsules in its diphenol form or, to increase its lipophilicity and the load in lipid nanocapsules, as monoacetate [22] or diacetate [22][23][24], with or without the unexpected formation of a gel [22]. It was also recently formulated with randomly methylated cyclodextrins (RAMEβCDs) and tested in vivo without noticeable toxicity [25], but this last complexation has yet to be analyzed in finer detail by modelling and laboratory experiments.

Experiments Prior to Modelling (Proof of Concept)
In our study, it was important to compare the modelling results with the laboratory solubilization tests. The complexing capacity is known to be very variable depending on the type of cyclodextrin, and it was therefore necessary to compare the results of solubilization of SuccFerr with various CDs.
The phase-solubility study was first described by Higuchi and Connors [26] to test the dissolving properties of the complexation.
In our experiment (Figure 3A), α-CD did not improve the solubility of SuccFerr, probably due to its small cavity size (internal diameter: 0.57 nm). In contrast, the methylated β-cyclodextrin (RAMEβCD) showed significant solubilization properties, as previously described for the phthalimide compound [27]. The internal diameter of this CD is 0.68 nm, which is well suited to complexing aromatic and heterocyclic moieties [28].
The phase solubility diagram of SuccFerr in the presence of RAMEβCD (see Figure 3A) exhibits a positive curvature, described as Ap-type by Higuchi and Connors [26]. This is observed if more than one CD can complex the drug, corresponding to 1:2, 1:3, 1:4 (or higher) stoichiometries. Our results confirm that RAMEβCD can be used to prepare a solution of SuccFerr.
It is noteworthy that DMβCD does not improve the solubility to the same extent as RAMEβCD.
A second study was carried out to better understand the complexation mechanism. This experiment was performed to confirm the inclusion of the succinimidyl group in RAMEβCD. The spectrum of pure SuccFerr exhibits an absorption maximum at 300 nm corresponding to the succinimidyl moiety (Figure 3B). The ferrocenyl moiety is known to exhibit a λmax at approximately 200 nm, whereas the λmax of phenols is found at approximately 270 nm. The addition of RAMEβCD induces a hyperchromic effect at 300 nm (Figure 3B); in contrast, the inclusion of the phenols and the ferrocenyl moiety did not modify the spectrum at 254 nm and 270 nm.
The specific effect observed at 300 nm can be used to study the inclusion of the succinimidyl group using the method previously described for the phthalimide derivative [27] and derived from the Benesi-Hildebrand method. Consequently, we studied the variations in the absorption of SuccFerr as a function of the concentration of RAMEβCD using the spectroscopic data. Each absorption value was taken from our spectra at 300 nm.
The corresponding theoretical equation is:

$$\frac{[\mathrm{SuccFerr}]}{\Delta A} = \frac{1}{K_a\,\varepsilon_c\,l}\cdot\frac{1}{[\mathrm{CD}_0]} + \frac{1}{\varepsilon_c\,l} \qquad (1)$$

where [SuccFerr] is the concentration of SuccFerr, ∆A is the change in absorbance, [CD0] is the initial concentration of free cyclodextrin, εc is the change in the molar absorption coefficient after complexation, and l is the path length.
This equation is similar to the Benesi-Hildebrand equations and is linear for [SuccFerr]/∆A as a function of 1/[CD0] (Figure 3C). Consequently, we can confirm that the linear curves obtained for the double reciprocal Benesi-Hildebrand plots correspond to a 1:1 apparent complexation with the succinimidyl group. In this equation, the absorbance coefficient and Ka can be defined as the harmonic mean of all the complexes if there is an inclusion of the succinimidyl moiety (see [27] for more details).
The curve obtained from experimental data (i.e., [SuccFerr]/∆A as a function of 1/[CD0]) is presented in Figure 3C, and Ka was calculated (Ka = 50.618 ± 5.280 M−1).
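As an illustration of how Ka can be extracted from such a double reciprocal plot, the following minimal sketch fits a straight line to [SuccFerr]/∆A versus 1/[CD0] and takes Ka as intercept/slope. It assumes the standard double reciprocal form y = 1/(εc·l) + x/(Ka·εc·l); the function names and the synthetic data generator are hypothetical and are not part of the programs used in this work.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def benesi_hildebrand_ka(cd0, delta_a, succferr_conc):
    """Estimate Ka (M^-1) from CD concentrations (M) and absorbance changes.

    Assumed model: [SuccFerr]/dA = 1/(eps*l) + (1/(Ka*eps*l)) * (1/[CD0]),
    so that Ka = intercept / slope of the double reciprocal line.
    """
    xs = [1.0 / c for c in cd0]
    ys = [succferr_conc / da for da in delta_a]
    slope, intercept = linear_fit(xs, ys)
    return intercept / slope


def synthetic_delta_a(cd0, succferr_conc, ka, eps_l):
    """Generate noise-free absorbance changes obeying the assumed model
    (illustrative data only, not measurements)."""
    return [succferr_conc / (1.0 / eps_l + 1.0 / (ka * eps_l * c)) for c in cd0]
```

With noise-free synthetic data, the fit returns the Ka used to generate them; with real spectra, the uncertainty on the slope and intercept would also have to be propagated.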

Modelling and Web Application
We used modelling (semiempirical PM3 quantum-mechanical method) to determine the type of association SuccFerr could make with CDs. Two main types of series were produced. For the inclusion of SuccFerr into a single CD, eight possible systems were created (four moieties (Fc, Ph1, Ph2 and Succ, Figure 2) inserted by the narrow or the wide side of the CD, denoted 1-CD series S1 to S8). For the models with two CDs, only four series were made (2-CD series S1 to S4). These 12 trees were all treated in the same way. First, taking 1-CD models as an example, we calculated all the possible combinations of insertion (four moieties for SuccFerr × two insertion sides per CD × six possible well-defined CDs, see Tables S1 and S2 for 2-CD). Then, for each of the eight (moiety-CD side) combinations, we selected the model with the most negative ∆E (the energy difference between the supramolecular assembly and the separated molecules, while keeping the same geometry). The aim of this selection was to start with the lowest possible energy for each of these eight models, which served as the G0 (generation 0) model for creating series/trees. Then, the methylation state of each oxygen atom (21 for 1-CD systems) was inverted, one atom at a time, creating a G1 level with 21 models in the tree. The best model (lowest ∆E) was selected to create the G2 level of the tree in the same way, and so on.
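The generation-by-generation procedure described above can be sketched as a greedy search. This is only an illustration: in this work each ∆E comes from a PM3 calculation entered into the web application, whereas here `toy_delta_e` is a hypothetical stand-in energy function, and the stopping rule (stop when no child improves on its parent) is an assumption.

```python
from typing import Callable, List, Tuple

# One boolean per CD oxygen atom: True = methylated (O-Me), False = hydroxyl (OH).
Pattern = Tuple[bool, ...]


def flip(pattern: Pattern, index: int) -> Pattern:
    """Invert the methylation state of a single oxygen atom."""
    return pattern[:index] + (not pattern[index],) + pattern[index + 1:]


def main_route(g0: Pattern,
               delta_e: Callable[[Pattern], float]) -> List[Tuple[Pattern, float]]:
    """Follow the best child at each generation, starting from G0.

    Each generation contains one child per oxygen atom (21 for a 1-CD system).
    Assumed stopping rule: stop as soon as no child is lower in energy
    than its parent.
    """
    route = [(g0, delta_e(g0))]
    while True:
        parent, parent_e = route[-1]
        children = [flip(parent, i) for i in range(len(parent))]
        best_child = min(children, key=delta_e)
        best_e = delta_e(best_child)
        if best_e >= parent_e:
            return route
        route.append((best_child, best_e))


def toy_delta_e(pattern: Pattern) -> float:
    """Hypothetical energy: 5 kJ/mol penalty per oxygen differing from an
    arbitrary 'ideal' pattern. Purely illustrative, not a PM3 result."""
    target = (True, False) * 10 + (True,)
    return -100.0 + 5.0 * sum(a != b for a, b in zip(pattern, target))
```

Starting from a fully methylated 21-oxygen pattern, `main_route((True,) * 21, toy_delta_e)` returns the successive (pattern, ∆E) pairs of the route, one entry per generation.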
We created a web application to handle the data of these 12 trees, which also provided statistics, but a need arose to obtain more data on our calculations. A second program was therefore written in the C programming language. Its purpose was to verify the models (detecting building errors that would invalidate this work), to search for certain hydrogen bonds of interest, to count the average number of methyl groups around Fc and to generate static webpages, which are copies of the dynamic webpages created by the web application. The goal of generating these static webpages was to dispense with the web application for simplified read-only consultation (to view a particular model or to navigate the trees). These webpages can be downloaded as a compressed archive [29].
To facilitate a better understanding of this concept, the code of the PHP program, the code of the C program and the database (scheme and data) are provided in a data repository [30] (including the static webpages cited above). The web application (CDModelTree) was also referenced on Software Heritage via Hal [31] and is executable and testable (even with any other inserted molecule that readers of this article would like to test, with up to two CDs for inclusion, and without any need to adapt the code) using the XAMPP pack [32]. More details can be found in the supporting information.
The results of the calculations of these 12 series/trees are reported in Table 1. These data were extracted from the webpages generated by the PHP program and from the report files generated by the C program, which can be consulted in the data repository [30].

Discussion
These calculations were carried out to determine which CDs should be the best fit for each moiety of SuccFerr. More particularly, these calculations were meant to serve, among other things, to determine the best suitable average methylation rate of these CDs.
Data collected in Table 1 and the representation of the variation in ∆E for the unique route going from G0 to the best model (called "main route", Figure S1) for each of the 12 series/trees (Figure 4) show the superiority of the 2-CD systems, even at the G0 level (i.e., with well-defined CDs), which is predictable because there are more stabilizing interactions. Among the G0 systems of 1-CD, series S7 (insertion of Succ via the narrow side), which was the best, is overtaken in these mutation trees at G1 by series S8 (insertion of Succ from the wide side) and even at G6 by series S2 (insertion of Fc from the wide side). This shows that no conclusion can be drawn from well-defined CDs alone, but also that undefined CDs can sometimes perform better, even though the RAMEβCD mixture also contains CDs that are less efficient than well-defined CDs. At any rate, this behavior of 1-CD S8 and S2 merits further study.
Figure 4. For the eight 1-CD series (S1 to S8) and the four 2-CD series (S1 to S4), unique route in the trees, starting from G0 and ending when reaching the best model of the series, with display of the variation in ∆E (kJ/mol) at each generation. Some values can be found in Table 1: the starting ∆E (∆E G0), the ending ∆E (∆E of the best model: best ∆E), and the generation of the best model (final generation: Gbest). The PHP code to retrieve the data needed to fill this graphic from the database is explained in SI.

Observed Contribution of the Iron Atom to Stability
Although Table 1 and Figure 4 show a superiority of 2-CD over 1-CD (lower ∆E for G0, lower average ∆E, lower best ∆E of series), the 1-CD series S8 shows ∆E values very different from those of the other 1-CD series and competes with the ∆E of the 2-CD series. The webpage displaying the G0 model of 1-CD S8 and its first-generation descendants (G1) can be consulted to see the data (using the web application by typing its ID number (6990) into the form, or more simply by opening the static saved version, the "1-CD-webpage-ID-6990.html" file available inside the repository [30]). In the list of the G1 descendants, experiment ID 7010 shows the strongest variation in ∆E (between itself and its parent, ∆∆E = −139 kJ/mol) of all the 8892 calculated 1-CD experiments, upon demethylation of the oxygen atom O20. This abrupt variation in ∆E can be seen in Figure 4 at G1. In addition, the generations derived from this experiment all show a significant destabilization upon re-methylation of this atom (as in model ID 7169) but also of O19 (for example, ID 7017). These anomalies in the data can be easily detected by SQL queries ordering the experiments by their 'deltadelta' field (representing ∆∆E in the database, Note A12). Thanks to these anomalies, two atypical hydrogen bonds between the hydrogen atoms borne by O19 and O20 and the central ferrocene iron atom were detected. They form a clamp (-O19-H...Fe...H-O20-), as can be seen in the best model of this series (Figure S5a). This type of bond, but intramolecular and without a clamp, has been reported with NH groups by an XRD analysis of a single crystal [33]. However, to the best of our knowledge, this is the first time that it has been described by modelling in supramolecular systems involving ferrocene and a cyclodextrin forming intermolecular Fe-H hydrogen bonds.
This phenomenon is responsible for this decrease in energy, but also for its increase when the clamp is broken by methylating O19 […]. A clamp (Figure S6) was also discovered in the 1-CD series S2, again associated with a strong positive value of 'deltadelta' (∆∆E) when O10 or O11 is methylated (for example, ID 3743 for O10 and ID 3744 for O11, seen when displaying the webpage of their common parent, model ID 3738). Unlike 1-CD S8, where two Fe-H bonds (clamp) are formed at the same time, which explains the steep decrease in ∆E, the clamp of 1-CD S2 forms gradually, as can be seen in Figure 4 (slower variation of ∆E, through successive stages involving weaker Fe-H bonds). Even though both O10 and O11 were non-methylated since G1, this clamp is only formed at G9 due to the favorable influence of the modification of methylations managed by the web application. Relatedly, the greatest increases in ∆E are not all due to the direct breaking of the clamp, but often also to its indirect cleavage caused by steric hindrance when an oxygen atom close to it is methylated, such as O1 for series S8 (ID 7011 and ID 7030, having ∆∆E > 176 kJ/mol). In 2-CD systems, this phenomenon also exists, but the bond is single, or of an asymmetrical clamp type (one strong/short and one weak/long), probably because of the greater steric effect. Analyzing the geometries of all the 2-CD models with the C program (which does the same for 1-CD models), reading each of the XYZ files, we can confirm that no strong symmetrical clamp exists with Fe in any of these models, even though some single bonds were found and are important for stability (Note A13).
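The detection of anomalies by ordering experiments on their 'deltadelta' field can be sketched as follows. This is a minimal illustration using an in-memory SQLite database as a stand-in for the MariaDB database of this work; the table name `experiment` and its reduced two-column layout are assumptions, and only the field name 'deltadelta' comes from the actual database.

```python
import sqlite3


def rank_by_deltadelta(rows, limit=3):
    """Return the most stabilizing and most destabilizing experiments.

    rows: iterable of (id, deltadelta) pairs, deltadelta in kJ/mol.
    The table name 'experiment' and its columns are hypothetical,
    apart from 'deltadelta'.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE experiment (id INTEGER PRIMARY KEY, deltadelta REAL)"
    )
    conn.executemany(
        "INSERT INTO experiment (id, deltadelta) VALUES (?, ?)", rows
    )
    # Most negative values first: largest stabilization relative to the parent.
    most_stabilizing = conn.execute(
        "SELECT id, deltadelta FROM experiment ORDER BY deltadelta ASC LIMIT ?",
        (limit,),
    ).fetchall()
    # Most positive values first: largest destabilization relative to the parent.
    most_destabilizing = conn.execute(
        "SELECT id, deltadelta FROM experiment ORDER BY deltadelta DESC LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return most_stabilizing, most_destabilizing
```

For example, `rank_by_deltadelta([(1, -3.0), (2, -120.0), (3, 15.0), (4, 150.0)], limit=1)` isolates experiment 2 as the strongest stabilization and experiment 4 as the strongest destabilization (toy values, not taken from the dataset).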
Analyzing the geometries of the full dataset (XYZ files, using the C program) proved to be extremely useful in understanding the effect of the atypical and classic hydrogen bonds on the stability of supramolecular assemblies. We systematically checked, for all the models, the existence of seven possible intermolecular hydrogen bonds. The maximum distance was set to 2 Å, and the results can be found in the provided Hbonds-1CD.txt [34] and Hbonds-2CD.txt [35] report files for 1-CD and 2-CD models, respectively. For SuccFerr, the atoms involved in these bonds are the iron atom, the two phenols (-O- and -H independently) and the two carbonyls (=O) of the imide group. For CDs, these are the hydroxyls (-O- and -H independently), the ethers and the anomeric group (-O-). For each series, the models were separated into two groups (worse models or best models, Note 8). The comparison of the number of each type of hydrogen bond found in these two groups showed that the effect is important especially when the iron atom is involved, even more so when it is involved in a clamp (asymmetrical clamps (one bond > 2 Å and one bond < 2 Å) have been ignored). For phenols, the clamps only exist for the 2-CD series, whereas clamps involving the C=O groups were detected for both the 1-CD and 2-CD series. However, these two types of clamps do not carry as much weight in stability as those involving Fe (Note A13).
To explain why no strong clamp involving the iron atom was detected in the 2-CD models, we can compare the 1-CD S8 (Figure 5) and 2-CD S1 series (the latter corresponds to the 1-CD S8 series with an additional CD in which Fc is inserted, Figure 6). The models show that this second CD impedes the approach of iron to two OH groups of the CD including Succ, as occurs in the 1-CD S8 series, and instead makes a single bond with the iron atom (Figure 6).
Figure 6. Molecular model of the best 2-CD series S1 system (ID 586, Succ entering by the wide side of CD1 and Fc entering by the narrow side of CD2) computed at the semiempirical PM3 level of theory. SuccFerr is represented with a "ball-and-stick" model, and CD1 and CD2 are represented with a "licorice" model. The orange dashed line is the hydrogen bond between the iron atom and O36-H of CD2.
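The geometric search for Fe-H contacts in the XYZ files can be illustrated by the following minimal sketch. It is not the C program of this work: it only applies the 2 Å distance criterion to every hydrogen atom and does not verify that the hydrogen belongs to a CD hydroxyl group, which the real analysis must do. The function names are hypothetical, and a standard XYZ layout (atom count, comment line, then one "symbol x y z" line per atom) is assumed.

```python
import math


def parse_xyz(text):
    """Parse a standard XYZ block into a list of (symbol, (x, y, z))."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def fe_h_contacts(atoms, cutoff=2.0):
    """List (fe_index, h_index, distance) for every H within `cutoff` Å of Fe.

    Simplification: any hydrogen is accepted; the membership of H in a
    hydroxyl group of the CD is not checked here.
    """
    contacts = []
    for fe_i, (fe_sym, fe_pos) in enumerate(atoms):
        if fe_sym != "Fe":
            continue
        for h_i, (sym, pos) in enumerate(atoms):
            if sym == "H":
                distance = math.dist(fe_pos, pos)
                if distance <= cutoff:
                    contacts.append((fe_i, h_i, distance))
    return contacts


def has_symmetrical_clamp(atoms, cutoff=2.0):
    """True if at least two H atoms lie within the cutoff of the same Fe."""
    return len(fe_h_contacts(atoms, cutoff)) >= 2
```

Under this criterion, a model with one contact below 2 Å and one above it is counted as a single bond, which matches the choice made here of ignoring asymmetrical clamps.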

Figure 5. [Caption only partially recovered: "... O19,Fe and O20,Fe."]

The calculation of the ferrocene + methanol models in water confirms the presence of Fe-H bonds in DFT in aqueous medium (see the compressed archive DFT.zip [36]). A further Natural Bond Orbitals (NBO) calculation has allowed us to estimate the stabilizing energy (E(2), see Table 2) of the atypical H-bonds found between the Fe of SuccFerr and the alcohols of the CD. Adding all the NBO (donor-acceptor) energy contributions (see Table S4 in SI for further details) results in a total stabilisation energy of 86.45 kJ/mol (20.66 kcal/mol) and 119.24 kJ/mol (28.…

Table 2. Second-order perturbation stabilization energy contributions to the atypical H-bond Fe-HO interactions. NBO labels: LP = lone pair, LP* = antibonding lone pair, CR = core, σ* = antibonding sigma bond.

The PM3 and DFT methods gave comparable results (see Table 3 for the lengths and angles of some hydrogen bonds for the best model ID 7759).

Contribution of Methyl Groups to Stability
One might think that since the ferrocenyl group is nonpolar, as are the phenyl groups, it could interact better with methyl groups and prefer highly methylated CDs, or at least have many proximate methyl groups, as postulated by Buriez [37]. However, this is not reflected in our modelling for Fc, because the number of neighboring methyl groups is generally rather low, as shown in Figure 5. To understand the effect of the Fc-Me proximity on the stability, a systematic analysis was performed by the C program (see code 'NbMeFc' in Table 1 and notes A7, A8, A9). We can see that the average number of these proximities is quite limited, and that there is no clear difference between best and worse models (except for the 1-CD S8 and S2 series, where some methyl groups must be removed to form the beneficial clamp with the iron).
Finding the best blend would not be of interest, as statistically, it only exists in small amounts in the RAMEβCD mix and is closely followed by many other assemblies in terms of stability; moreover, it would be unrealistic to seek to synthesize the corresponding complex CDs merely to dissolve SuccFerr better. Thus, it is better to consider macroscopic values, such as the average degree of methylation of the ideal mixture. By "ideal mix", we mean that one should include in this calculation, for each series, only CDs participating in improved assemblies, that is, those for which ∆E < ∆EG0. These average states of methylation are calculated by the PHP program and are given in Table 1. However, relying only on these average methylations is restrictive, as the ideal range for each run (lowest and highest methylation rate of all improved CDs in any given run) is found to be large. We can therefore deduce that SuccFerr can adapt to RAMEβCDs with widely varying methylation rates. The most important parameter is not this rate, but the distribution of CDs (in particular, the isomers), a parameter that is impossible to control during the synthesis of RAMEβCD and difficult to assess by analysis. In conclusion, this work shows that the solubility of SuccFerr with methylated CDs depends on many variables, and that determining the best methylation rate for the CDs is no simple task.
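The "ideal mix" statistics discussed above can be illustrated with a short sketch. It is not the PHP code of this work: the function name and the representation of each model as a (number of methyl groups, ∆E) pair are assumptions; only the selection criterion ∆E < ∆EG0 comes from the text.

```python
def ideal_mix_stats(models, delta_e_g0):
    """Average and range of methylation over the improved models of a series.

    models: iterable of (n_methyl, delta_e) pairs, delta_e in kJ/mol
            (hypothetical representation of one series/tree).
    delta_e_g0: energy difference of the G0 model of the series.
    Only models with delta_e < delta_e_g0 ("improved assemblies") are kept.
    Returns None if no model improves on G0.
    """
    improved = [n_methyl for n_methyl, delta_e in models if delta_e < delta_e_g0]
    if not improved:
        return None
    return {
        "count": len(improved),
        "mean": sum(improved) / len(improved),
        "min": min(improved),
        "max": max(improved),
    }
```

Dividing the returned values by the number of oxygen atoms (21 for a 1-CD system) converts them into methylation rates; a wide gap between "min" and "max" corresponds to the large ideal range noted above.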

Phase Solubility Studies of SuccFerr Complexation
SuccFerr complexation with various CDs was evaluated using the phase-solubility method [26]. A suspension of a large excess of SuccFerr (30 mg) in 2 mL of aqueous solutions of the appropriate CD (concentrations ranging from 0.125 to 160 mM, pH adjusted to 7) was stirred in screw-capped amber vials for 24 h on a rock-and-roller agitator at 25 °C. Preliminary experiments indicated that equilibrium was reached after this 24 h stirring period. Each suspension was then centrifuged at 9000× g for 10 min and diluted from 1/10 to 1/1000 with acetonitrile, and the amount of dissolved SuccFerr was assessed by HPLC at 286 nm. Phase solubility curves (i.e., solubility of SuccFerr as a function of the CD concentration) were drawn for each CD.

UV-Vis Experiments-Benesi-Hildebrand Method
First, UV-vis spectra (Cary 50 spectrophotometer, Varian, Les Ulis, France) were obtained for a concentration of 2.5 × 10⁻⁵ M of SuccFerr and various concentrations of CDs (0 to 1.1 × 10⁻⁵ M). RAMEβCD or HPβCD was added, and the SuccFerr concentration was kept at 2.5 × 10⁻⁵ M. All experiments were performed in a 1% (v/v) DMSO/water mixture (the first dissolution of SuccFerr was performed in pure DMSO) to avoid precipitation, and the measurement was taken after 10 min to account for the kinetics of complex formation. Then, the mole-ratio titration method was used to calculate the values of the apparent binding constant (Ka) [38][39][40]. These experiments were performed at various wavelengths obtained from the spectra (low concentration of CD). Another series of experiments was performed with higher CD concentrations (2 × 10⁻⁶ to 25 × 10⁻⁵ M, i.e., high concentration of CD) to confirm the binding constants.

Software
Our experiments generated a large amount of data. We chose to create our own web application using the XAMPP pack (web server + database + PHP, version 8.2.0 was used) [32], which allowed us to design the very specific scheme required to store the data generated by our algorithms. We used this system to store our 13,261 models of inclusion of SuccFerr into one or two CD(s). The web browser serves as an interface between the scientist and our web application (entering data into a web form or displaying it). The PHP application acts as an intermediary with the MariaDB database, using SQL queries (see some examples in Appendix A), to store or access the data or even to generate statistics automatically.
The C program (available from the data repository [30]) was edited, compiled and executed using Code::Blocks version 20.03 (64 bit) [41]. It analyzes the XYZ files (checking for errors), creates the static webpages and computes additional statistics. More details can be found in the SI.
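The error checking of XYZ files can be illustrated with a minimal sketch. It is written in Python rather than C and is not the authors' program. It assumes the common XYZ layout (atom count on line 1, a comment on line 2, then one "element x y z" record per atom), and the function name and the checks chosen are hypothetical.

```python
def check_xyz(text):
    """Return a list of error messages for an XYZ-format string (empty if none found)."""
    lines = text.splitlines()
    if len(lines) < 2:
        return ["file too short: missing atom count or comment line"]
    try:
        n_atoms = int(lines[0].strip())
    except ValueError:
        return [f"line 1: atom count is not an integer: {lines[0]!r}"]

    errors = []
    # Atom records start on line 3; blank trailing lines are ignored.
    records = [ln for ln in lines[2:] if ln.strip()]
    if len(records) != n_atoms:
        errors.append(f"expected {n_atoms} atoms, found {len(records)}")
    for k, record in enumerate(records, start=1):
        fields = record.split()
        if len(fields) != 4:
            errors.append(f"atom {k}: expected 4 fields, found {len(fields)}")
            continue
        try:
            [float(value) for value in fields[1:]]
        except ValueError:
            errors.append(f"atom {k}: non-numeric coordinate in {record!r}")
    return errors


# Small hand-written examples (a water molecule), not taken from the model database.
GOOD_XYZ = "3\nwater\nO 0.0 0.0 0.0\nH 0.76 0.59 0.0\nH -0.76 0.59 0.0\n"
BAD_XYZ = "3\nbroken\nO 0.0 0.0 0.0\nH 0.76 x 0.0\n"
GOOD_ERRORS = check_xyz(GOOD_XYZ)
BAD_ERRORS = check_xyz(BAD_XYZ)
```

Returning all messages rather than stopping at the first one makes it easier to screen thousands of model files in a single pass.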

Computational Details
All SuccFerr-β-cyclodextrin complexes were computed using the program Spartan14 (Wavefunction, Irvine, CA, USA), performing a molecular mechanics optimization with the Merck molecular force field (MMFF) and a subsequent minimization with the semiempirical PM3 quantum-mechanical method. Starting from the best PM3 optimized geometry, a minimization was carried out using the Polarizable Continuum Model (PCM) in water. Density Functional Theory (DFT) was employed to perform this calculation using the ωB97XD functional with the diffuse-augmented polarization valence-triple-ζ (6-311G(d,p)) basis set including a set of p polarization functions for the hydrogen atoms and a set of d polarization functions for the second-row elements. For the iron atom, the uncontracted triple-ζ quality LANL08 basis set with an effective core potential (including 10 core electrons) was used in these calculations. This calculation was performed with the Gaussian 16 quantum package (Gaussian Inc., Wallingford, CT, USA). On top of these DFT calculations, further NBO analyses were performed using NBO version 3 as integrated in Gaussian [42].
The PM3 method was used to determine the affinity of SuccFerr for the CD (∆E, Equation (2) for 1-CD or Equation (3) for 2-CD). The energy of each element was calculated (Spartan energy method, ground state) without modifying its geometry, which should remain the same as in the assemblage. The heats of formation (in kJ/mol) of each element and of the assemblage were copied from Spartan and pasted into the web form of the web application, which calculated ∆E and saved it in the database.
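The ∆E calculation performed by the web application can be sketched as follows. This is a minimal sketch that assumes ∆E is the usual difference between the heat of formation of the assemblage and the sum of the heats of formation of its isolated elements (SuccFerr plus one or two CDs), all at the geometry they have in the assemblage. The function name and the numerical values are hypothetical placeholders.

```python
def interaction_energy(hf_assembly, hf_guest, hf_hosts):
    """Delta E (kJ/mol) = Hf(assemblage) - [Hf(SuccFerr) + sum of Hf(CD)].

    hf_hosts holds one value for a 1-CD assemblage or two for a 2-CD one.
    The form of the expression is an assumption, not copied from the article.
    """
    if len(hf_hosts) not in (1, 2):
        raise ValueError("expected the heat of formation of one or two CDs")
    return hf_assembly - (hf_guest + sum(hf_hosts))


# Placeholder heats of formation in kJ/mol (not results from this work).
DELTA_E_2CD = interaction_energy(-7000.0, -200.0, [-3300.0, -3350.0])
DELTA_E_1CD = interaction_energy(-3600.0, -200.0, [-3300.0])
```

Under this sign convention, a more negative ∆E corresponds to a more stabilized assemblage.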

Conclusions
This work has shown that the RAMEβCD mixture seems better suited than well-defined CDs to form stable associations with SuccFerr, and that the most probable assemblages are SuccFerr systems with two CDs, probably with a certain proportion of 1-CD series S8. The quantification results agree with the modelling, which shows that there are 2-CD assemblies. The modelling also revealed the special behavior of ferrocene, which forms atypical hydrogen bonds (one, or two in the form of a clamp) between its iron atom and the hydrogen atoms of the hydroxyl groups of CDs. This result is particularly important because, to the best of our knowledge, it is the first time that the atypical hydrogen bonds between CDs and an anticancer molecule based on nonpolar ferrocene have been modelled. These interactions are, in fact, responsible for the high affinity observed between the CDs and this molecule. Methyl groups also provide stability or instability depending on their position. For example, having two OH groups at positions 2 and 3 of the same glucose unit is not sufficient to allow a clamp with Fe, since Me and OH groups must also be suitably placed elsewhere (6-Me-βCD in Table S1 does not have this clamp despite having seven glucose units with OH at positions 2 and 3). Our method, which uses trees of methylation modifications, allowed us to reach these particular configurations, and the discovery of these atypical bonds is due to the proper functioning of the experimental web application and to the automatic analysis of the model files by the C program.
Funding: This research was funded by Agence Nationale de la Recherche, ANR PRCE NaTeMOc no. ANR-19-CE18-0022-01. The APC was funded by the startup FEROSCAN (G. Jaouen is a member of this startup).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: A part of the data supporting this article is available within the article and Supplementary Materials (Tables S1-S4 and more detailed descriptions of the program and examples of SQL queries adapted to our database). The main part of the data is publicly accessible in the 'Recherche Data Gouv' repository (space of Sorbonne Université) at https://doi.org/10.57745/CBUPP3 (accessed on 22 April 2023) [30]. This data repository contains the database in various formats (CSV, SQL (preferred to import the database [43]), JSON and XML), 13361 models in XYZ format, 28 original Spartan files, the S8 DFT geometry in XYZ format, the files created by analysis of the models by the C program, the C program itself and the PHP program. See the README.txt file for details [44]. The PHP program was also referenced on Software Heritage via HAL [31].