A Family of Externally-Functionalised Coordination Cages

New synthetic routes are presented to derivatives of a (known) M 8 L 12 cubic coordination cage in which a range of different substituents are attached at the C 4 position of the pyridyl rings at either end of the bis(pyrazolyl-pyridine) bridging ligands. The substituents are (i) –CN groups (new ligand L CN ); (ii) –CH 2 OCH 2 –CCH (containing a terminal alkyne) groups (new ligand L CC ); and (iii) –(CH 2 OCH 2 ) 3 CH 2 OMe (tri-ethyleneglycol monomethyl ether) groups (new ligand L PEG ). The resulting functionalised ligands combine with M 2+ ions (particularly Co 2+ , Ni 2+ , Cd 2+ ) to give isostructural [M 8 L 12 ] 16+ cage cores bearing 24 external functional groups; the cages based on L CN (with M 2+ = Cd 2+ ) and L CC (with M 2+ = Ni 2+ ) have been crystallographically characterised. The value of these is twofold: (i) exterior nitrile or alkyne substituents can provide a basis for further synthetic opportunities via ‘Click’ reactions, allowing in principle a diverse range of functionalisation of the cage exterior surface; (ii) the exterior –(CH 2 OCH 2 ) 3 CH 2 OMe groups substantially increase cage solubility in both water and organic solvents, allowing binding constants of cavity-binding guests to be measured under an increased range of conditions.

Accordingly, the main focus on coordination cage chemistry has been guest binding in the central cavity which is now a very well-developed area. Rather less attention however has been given to the functionalisation of the exterior surfaces which can be an equally important aspect of cage chemistry. Carefully chosen substituents attached to cage exterior surfaces can control solubility [29,30]; provide functional groups which can interact with surfaces [31] or proteins [32]; and provide functionality such as redox [33] or photophysical [34] properties that supplement the properties of the cage/guest assembly.
In this paper, we report a series of synthetic studies on our octanuclear cubic coordination cage assembly H (Figure 1) [35,36] which are focused on the exterior surface. The parent cage H (and structurally equivalent complexes with other metal ions at the vertices) as its fluoroborate salt is soluble in polar organic solvents, in which early studies of guest binding were performed [29,37]; the addition of hydroxymethyl substituents to the exterior surface to give H w (Figure 1) improved water solubility [29], allowing for a much stronger binding of a wide range of neutral organic guests because of the magnitude of the hydrophobic effect which dominates guest binding in aqueous solution [24]. However, the inclusion of the hydroxymethyl substituent at the C 4 position of each pyridine ring substantially complicated the ligand synthesis, limiting our ability to apply the same methodology to other external functional groups. Accordingly, we have improved the synthetic procedure associated with the ligands which form the basis of cages of the H family, to provide a more general route to a wider variety of externally-functionalised cages. As well as allowing substantially improved solubility of the cage family by allowing the straightforward inclusion of external solubilising substituents, we have incorporated as external functional groups both alkyne and nitrile substituents, which provide a useful basis for elaboration into a wide range of other functional groups; structurally characterised examples of these alkyne-substituted and nitrile-substituted analogues of H are presented. In short, this synthetic study greatly improves the possibilities for external functionalisation of the cage family with a wide variety of different substituents which are of value for a range of purposes.

General Details
The ligand L w was prepared as previously reported by us [29]; 2-acetyl-4-cyanopyridine was prepared according to the literature method [38]. Other organic reagents and metal salts were purchased from Sigma-Aldrich and used as received. Instrumentation used for routine spectroscopic measurements was as follows: 1 H-NMR spectroscopy, Bruker Avance 300, 400 or 500 MHz instruments; ES mass spectrometry, a Bruker Compact ESI-Q-TOF in positive ion mode; fluorescence spectra, an Agilent Cary Eclipse instrument. Details of the synthesis and characterisation of new ligands and complexes are in the Supporting Information.

X-ray Crystallography
The X-ray crystallographic data for the structure determination of H CC were collected in-house using a Rigaku Oxford Diffraction Synergy S with a HyPix-6000HE detector; data for the structure determination of H CN were collected in Experiment Hutch 1 of beamline I-19 at the UK Diamond Light Source synchrotron facility [39]. Details of software used for structure solution and refinement have been previously reported [40]. Crystallographic, data collection and refinement parameters are summarised in Table 1. CCDC deposition numbers: 2110709-2110710. In both cases, as is usual for large supramolecular assemblies of this type, weak scattering, associated principally with the disorder of anions and solvent molecules, required extensive use of geometric restraints to ensure a physically meaningful and stable refinement. Diffuse electron density that could not be modelled satisfactorily was removed from the refinement using the solvent mask feature in OLEX; details are in the individual CIFs.

Reactions to Introduce 2-Acetyl Group onto Pyridine Nucleus
The original simple synthesis of the ligand L used for assembly of H [35] is based on the easy availability of 2-acetylpyridine, whose acetyl group is trivially converted via two simple steps to a pyrazole group [41,42], furnishing the 3-(2-pyridyl)-pyrazole unit which is the essential chelating unit of all ligands in this cage family. The adaptation of this synthesis to incorporate a substituent X at the pyridine C 4 position immediately presents a greater challenge as the desired 4-X-2-acetylpyridines are rarely commercially available. To make the hydroxymethyl-substituted ligand L w (used as the basis of the water-soluble cage H w ) [29] we started with 4-hydroxymethylpyridine, protected the HO group by silylation, and then introduced the 2-acetyl group in a 3-step sequence involving conversion of the pyridine to a pyridine-N-oxide, reaction with a cyanide source to give the 2-cyanopyridine, and then a Grignard reaction with MeMgI to give the necessary 2-acetylpyridine bearing a protected hydroxymethyl substituent. This worked but is rather cumbersome and the final Grignard reaction to generate the acetyl group is low-yielding [29].
A one-step procedure to introduce the 2-acetyl group to the pyridine nucleus, which is much quicker and higher-yielding and which is also tolerant of a range of substituents at the C 4 position of the pyridine ring, is the reaction shown in Scheme 1, reported in 1991 by Fontana and co-workers, which involves the in situ generation of an acyl radical by decarboxylation of pyruvic acid using persulfate and an Ag(I) catalyst [38]. This route is clearly both simpler and more versatile than the three-step sequence, and it has become our default route to synthesise ligands of this type. It opens up the chemistry of externally-substituted cages to a wide range of functional groups, as we show below for the preparation of L PEG and L CN and their associated cage complexes.
Scheme 1. One-step introduction of a 2-acetyl group onto a 4-substituted pyridine according to ref. [38]: (i) pyruvic acid, (NH 4 ) 2 S 2 O 8 , catalytic AgNO 3 . The subsequent sequence of steps for conversion of the 2-acetyl group to a pyrazole, and thence to the completed ligands L R , has been reported previously [29,35].

Preparation of L PEG and the Associated Cage H PEG
A particularly useful set of external substituents on the cage core is the poly(ethyleneglycol) (PEG) chains, giving the ligand L PEG which in turn forms the cage H PEG . The starting material 4-hydroxymethylpyridine was first alkylated with the tosylate of tri-ethyleneglycol monomethyl ether (Supplementary Materials, p. 2) to give the PEG-ylated pyridine which could then be converted to the 2-acetyl-4-PEG-pyridine using the method of Scheme 1. The remaining steps to convert the acetyl group to a pyrazole [41,42], and then connect two of the PEG-ylated 3-(2-pyridyl)pyrazole units to the central naphthalene-1,5-diyl core to give L PEG [29,35], were carried out following the general methodology reported earlier (Supplementary Materials, p. 3-6): variations in workup associated with the presence of the PEG chains (which require, for example, different conditions for chromatography) are required.
The reaction of L PEG with Co(BF 4 ) 2 or Cd(NO 3 ) 2 in the required 3:2 molar ratio in methanolic solution at 60 °C afforded the PEG-ylated cages Co•H PEG and Cd•H PEG , as confirmed by 1 H NMR spectroscopy and ES mass spectrometry (Supplementary Materials, p. 7-8), with the ES mass spectra showing a characteristic sequence of peaks for the intact cage cation associated with varying numbers of counter-ions. The addition of the PEG substituents results in complexes that clearly tumble slowly in solution, giving particularly broad 1 H NMR signals (in addition to the inherent paramagnetic broadening associated with Co•H PEG ), so individual peaks are not assignable; however, the number of signals is consistent with expectations based on the symmetry of the cages with two independent ligand environments [29,35]. These PEG-ylated cages are much more soluble in water than H w , and also much more soluble in organic solvents (including low-polarity ones like CHCl 3 ) than unsubstituted H. This apparent dichotomy arises from the flexibility of PEG chains, which have both polar and non-polar conformations according to how they are folded [43,44].
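The stoichiometric bookkeeping behind the 3:2 ligand:metal ratio, the 16+ charge and the 24 external groups, together with the form of the ES mass spectral peak sequence, can be sketched as follows. This is only an illustrative sketch: the cation mass passed to `es_ms_series` is a hypothetical placeholder rather than a measured or calculated value for any cage in this paper, the function names are our own, and the electron mass is neglected.

```python
from fractions import Fraction

# Composition taken from the text: an M8L12 cube built from M2+ ions.
N_METALS = 8                  # one metal ion at each vertex of the cube
N_LIGANDS = 12                # one bridging ligand along each edge
METAL_CHARGE = 2              # Co2+, Ni2+ or Cd2+
SUBSTITUENTS_PER_LIGAND = 2   # one at C4 of each terminal pyridyl ring

CAGE_CHARGE = N_METALS * METAL_CHARGE  # neutral ligands, so 16+


def cage_bookkeeping():
    """Return (ligand:metal ratio, cage charge, number of external groups)."""
    ratio = Fraction(N_LIGANDS, N_METALS)
    groups = N_LIGANDS * SUBSTITUENTS_PER_LIGAND
    return ratio, CAGE_CHARGE, groups


def es_ms_series(cation_mass, anion_mass, charges=range(3, 9)):
    """m/z values for {[M8L12]X(16-n)}n+ after loss of n mono-anions X-.

    cation_mass: mass of the bare [M8L12]16+ core (hypothetical input here).
    anion_mass:  mass of one counter-ion, e.g. 86.80 for BF4-.
    """
    series = {}
    for n in charges:
        anions_retained = CAGE_CHARGE - n
        series[n] = (cation_mass + anions_retained * anion_mass) / n
    return series


if __name__ == "__main__":
    print(cage_bookkeeping())
    # 10000.0 is a placeholder mass chosen only to show the peak pattern.
    for n, mz in es_ms_series(10000.0, 86.80).items():
        print(f"{n}+ : m/z = {mz:.1f}")
```

Each successive loss of an anion raises the charge by one and lowers m/z, which is why the intact cage appears as a regular sequence of peaks rather than a single signal.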
The substantially improved solubility of the host in CHCl 3 has immediate benefits in terms of improved binding in the cavity of H-bond accepting guests such as members of the coumarin family, which interact with an H-bond donor site on the cage interior surface; this cage/guest H-bonding interaction has been analysed in detail before [29,37], and the nature of the interaction has been confirmed by X-ray crystallography of cage/guest complexes [40]. Several 1:1 binding constants of substituted coumarins inside the cavity of H were previously measured in MeCN and observed to have K values of the order of 10 2 M −1 based on this interaction [29,37]. We can immediately see the benefit of a solvent such as CHCl 3 that is a poorer competitor for hydrogen-bonding sites: for example, the binding constant of 4-methyl-7-aminocoumarin (MAC) in the cavity of Co•H PEG in CHCl 3 , measured by the progressive quenching of MAC fluorescence on the addition of portions of Co•H PEG (Figure 2), is 3.8 × 10 4 M −1 , an increase in K of ca. 2 orders of magnitude compared to what we have previously observed for coumarin derivatives in MeCN [37]. Given how much our previous studies of assemblies based on H and H w , from the measurement of guest binding constants [29,36] to the study of photoinduced electron transfer in cage/guest assemblies [45,46], have been limited by solubility issues, the combination of (i) greatly increased solubility in both water and organic solvents and (ii) stronger binding of hydrogen-bonding guests in organic solvents provided by the H PEG complexes will greatly facilitate future studies on properties of cage/guest assemblies.
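To illustrate what a change in K from ca. 10 2 to 3.8 × 10 4 M −1 means in a quenching titration, the following minimal sketch evaluates the standard 1:1 binding isotherm. It is not the fitting procedure used for Figure 2: the concentrations are hypothetical, the function names are our own, and the bound guest is assumed to be completely quenched unless a residual emission is supplied.

```python
import math


def complex_conc(h0, g0, k):
    """[HG] for H + G <=> HG with association constant k (M^-1).

    h0 and g0 are total host and guest concentrations in M; the result is
    the physically meaningful root of the mass-action quadratic.
    """
    b = h0 + g0 + 1.0 / k
    return (b - math.sqrt(b * b - 4.0 * h0 * g0)) / 2.0


def fraction_bound(h0, g0, k):
    """Fraction of the guest that sits inside the host cavity."""
    return complex_conc(h0, g0, k) / g0


def relative_fluorescence(h0, g0, k, residual=0.0):
    """I/I0, assuming bound guest emits `residual` times the free-guest value."""
    f = fraction_bound(h0, g0, k)
    return 1.0 - f * (1.0 - residual)


if __name__ == "__main__":
    g0 = 1e-5  # hypothetical guest concentration, M
    for h0 in (0.0, 2e-5, 5e-5, 1e-4):  # hypothetical host additions, M
        strong = relative_fluorescence(h0, g0, 3.8e4)  # K quoted for CHCl3
        weak = relative_fluorescence(h0, g0, 1e2)      # order of K in MeCN
        print(f"[host] = {h0:.1e} M  I/I0 = {strong:.3f} (CHCl3)  {weak:.3f} (MeCN)")
    print(f"log10 ratio of K values: {math.log10(3.8e4 / 1e2):.2f}")
```

At these dilute, hypothetical concentrations the weaker K leaves the guest almost entirely unbound, whereas the larger K gives a clearly measurable decrease in emission, which is the practical benefit of working in the less competitive solvent.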

Preparation of L CN and the Associated Cage Cd•H CN ; Crystal Structure of Cd•H CN
The same new synthetic methodology has allowed the preparation of the nitrile-functionalised ligand L CN , with the substituents again at the pyridyl C 4 position, and its use to prepare the octanuclear cage Cd•H CN which contains 24 externally-directed nitrile groups. For this, the starting material was the known compound 4-cyanopyridine (Scheme 1) [38]. Apart from the convenience of introducing the 2-acetyl group in one step, the previous three-step synthetic route [29] would not have worked here as the final step was a Grignard reaction of 2-cyanopyridine with MeMgI to generate a 2-acetyl group, which would clearly be a challenge with another nitrile group present at the 4-position. 2-Acetyl-4-cyanopyridine could be converted by the standard sequence of reactions (Supplementary Materials, p. 9-11) to L CN , which formed the octanuclear cage Cd•H CN by reaction with Cd(BF 4 ) 2 (Supplementary Materials, p. 12); Cd•H CN is soluble only in polar organic solvents such as MeCN and DMF. Again, high-resolution ES mass spectrometry confirmed the formulation of the cage. The potential value of the nitrile functional groups is that they provide a site for facile further functionalisation in many ways, such as reduction to an amine and then conjugation to a peptide; coordination to additional metal ions to crosslink cages; or a Huisgen-type 'click' cycloaddition reaction with an azide to give a set of 24 units of any desired external substituent around the cage connected via tetrazole spacers [47–49].
Views of the crystal structure of the Cd(II) cage Cd•H CN are shown in Figure 3. The basic structure of the cage core is not affected by the presence of the nitrile substituents, and consequently, it has the same basic architecture as the parent cage H with a Cd(II) ion at each vertex of an approximate cube, and a bridging ligand spanning each edge [35]. The whole assembly is centrosymmetric, with one half of the complex (four metal ions and six ligands) in the asymmetric unit, and one complete molecule in the P-1 unit cell.
A diagonally opposite pair of Cd(II) ions [Cd(2) and its symmetry equivalent] have a fac tris-chelate coordination geometry, with the remaining six Cd(II) ions all having a mer tris-chelate geometry, such that there is a (non-crystallographic) threefold axis passing through Cd(6) and Cd(6′) in addition to the inversion centre, meaning that the complex as a whole has S 6 molecular symmetry [35]. The flexibility of the ligands allows them to bend at the CH 2 groups which act like hinges, meaning that the ligands can adopt conformations in which inter-ligand stacking interactions between electron-rich naphthyl groups and electron-deficient pyrazolyl-pyridine units (coordinated to Cd 2+ metal ions) generate multi-component π-stacks around the cage exterior. This occurs also in H [35] and H w [29], but we can see here how some of the nitrile groups (those which are not externally directed, away from the cage core) participate in the stacking interactions (a detail of one of the 5-component π-stacks is in Figure 4a), in which the separation between mean planes of adjacent overlapping aromatic ligand fragments is in the typical 3.3-3.5 Å range. A view of the cage emphasising how the nitrile groups form an externally directed pseudo-spherical array is included in Figure 3c. Extensive disorder of the lattice solvent molecules meant that the diffuse electron density associated with them could not be modelled and had to be removed from the refinement using a solvent mask facility. The location of the anions is also of interest, as it is the accumulation of anions around the cationic cage surface in solution that drives the catalysis that we have observed, in particular the cage-catalysed Kemp elimination in which cavity-bound benzisoxazole reacts extremely fast with adjacent surface-bound hydroxide ions [36].
We have previously shown that the windows in the centres of the faces of the cubic assembly provide recognition and binding sites for a wide range of anions via a combination of charge-assisted hydrogen-bonding and hydrophobic effects [50]. We can see that some tetrafluoroborate anions in the structure of Cd•H CN are associated with the cage surface in this way, though disorder of some of them precludes detailed structural analysis. Two of the cage faces (a crystallographically equivalent pair, opposite each other) contain a fluoroborate anion in a single position with a site occupancy of 1; this is the anion containing B(21X), coloured blue in Figure 4b. Another pair of faces bind the anion containing B(31X), coloured green in Figure 4b, which has a site occupancy of 0.5; the remaining two faces bind the anion containing B(51X) (dark red in the figure), which also has a site occupancy of 0.5 in this position. There is also a position close to these anions based on B(51Y), with a site occupancy of 0.25, which lies closer to the centre of the cavity than the B(51X) tetrafluoroborate anion. In all cases, multiple CH•••F interactions occur with the ligands in the cage superstructure, with H•••F contacts around 2.5 Å.
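The occupancies quoted above can be tallied to estimate how much of the anion content is located in the face windows. The sketch below is simple arithmetic on the values in the text; it assumes (our assumption, not a statement from the refinement) that the B(51Y) position is an alternative site associated with the same pair of faces as B(51X), and it takes 16 mono-anions per cage from the 16+ charge.

```python
# (boron label, number of symmetry-equivalent faces, site occupancy per face)
PORTAL_ANIONS = [
    ("B(21X)", 2, 1.00),
    ("B(31X)", 2, 0.50),
    ("B(51X)", 2, 0.50),
    ("B(51Y)", 2, 0.25),  # assumption: shares the pair of faces with B(51X)
]

ANIONS_PER_CAGE = 16  # mono-anions needed to balance the [M8L12]16+ core


def portal_anion_equivalents(sites):
    """Sum of occupancy-weighted anions located in or near the face windows."""
    return sum(n_faces * occupancy for _, n_faces, occupancy in sites)


if __name__ == "__main__":
    located = portal_anion_equivalents(PORTAL_ANIONS)
    print(f"{located} of {ANIONS_PER_CAGE} anions per cage are portal-associated")
```

On these assumptions roughly 4.5 of the 16 anions per cage are accounted for by the portal sites, the remainder being elsewhere in the lattice or too disordered to locate.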

Preparation of L CC and the Associated Cage H CC ; Crystal Structure of Ni•H CC
In addition to the new synthetic methodology (the one-step introduction of the 2-acetyl group onto the pyridine nucleus, outlined above), which has allowed the preparation of ligands L PEG and L CN and their associated cages, we can also functionalise the ligands by simple modifications of existing external functional groups. As part of a general strategy for the external functionalisation of cages, of which the work in this paper is a part, we investigated alkylation of the hydroxyl groups of L w [29] as a means to introduce more versatile and generally synthetically useful substituents, viz. alkynyl groups which can also be used for further functionalisation via the well-known CuAAC "click" coupling reaction with azides [51,52]. To this end, reaction of pre-formed L w with propargyl bromide in the presence of base allowed the straightforward attachment of two terminal alkyne groups to the pyridine rings of the ligand core, each attached by a short flexible chain, to give the new ligand L CC (Scheme 2; Supplementary Materials, p. 13). To confirm that attachment of these external substituents does not impede cage assembly we used L CC to prepare the Ni 2+ cage Ni•H CC following the usual methodology (Supplementary Materials, p. 14); it is sparingly soluble in MeCN and soluble in DMF. Mass spectrometric evidence confirmed the formulation of the complex. Note that the paramagnetism of octahedral Ni(II) complexes broadens their 1 H NMR spectra to the point of being of no value, so a 1 H NMR spectrum for Ni•H CC is not included in the Supplementary Materials. However, crystallographic analysis confirms that the Ni(II) cage Ni•H CC has a core structure similar to what we observed with Cd•H CN , but bearing 24 alkyne groups on the external surface.
Scheme 2. Alkylation of the pendant HO groups of L w (propargyl bromide, NaH, THF) to give L CC .
The crystal structure of Ni•H CC is shown in Figure 5. The main features of the M 8 L 12 cage core (arrangement of metal ions and ligands, symmetry, inter-ligand π-stacking, the interaction of anions with the surface portals and so on) are essentially the same as in Cd•H CN , reported above, and do not need re-explaining. The main point, however, is that the 24 alkyne groups with which the exterior surface is decorated do not interfere with cage assembly and clearly provide a platform for further attachment of a very wide range of substituents.

Discussion
Overall, the ability to provide a range of externally-directed functional groups around the cage surface is an important development as this strongly influences how the cage hosts interact with the outside world: it therefore provides quite a different research focus from the inward-looking aspects of host/guest complex formation based on the central cavity. At a simple level, external substituents control solubility, which is of fundamental importance for optimising host/guest complex formation which is highly solvent-dependent, and we can see for Co•H PEG how the high solubility in CHCl 3 allows for much stronger guest binding (by hydrogen-bonding to the interior surface) than occurs in the more polar solvents which were previously necessary to dissolve members of this cage family. At a more sophisticated level, the attachment of appropriate external groups could allow binding to surfaces or recognition by biomolecules, and the nitrile or alkyne functional groups which we have appended to the cage exterior surface will be particularly versatile in this respect. A tempting target, for example, is the attachment of glycans to facilitate recognition by lectins: an array of appropriate glycans on the cage exterior could result in strong binding to, for example, pathogenic proteins such as the cholera toxin or could allow cell ingress via recognition routes involving glycans. The previously-demonstrated ability of this cage to bind drug molecules in its cavity [53] could thereby provide a mechanism for delivery of a drug molecule to a specific target which is recognised via interactions with the cage exterior surface, combining different recognition processes associated with the cage interior and the exterior functional groups.
Data Availability Statement: Data underpinning the work in this paper that are not already included in Supplementary Materials are available on request from the corresponding author.