Structure of the Receptor-Binding Carboxy-Terminal Domain of the Bacteriophage T5 L-Shaped Tail Fibre with and without Its Intra-Molecular Chaperone

Bacteriophage T5, a Siphovirus belonging to the order Caudovirales, has a flexible, three-fold symmetric tail, to which three L-shaped fibres are attached. These fibres recognize oligo-mannose units on the bacterial cell surface prior to infection and are composed of homotrimers of the pb1 protein. Pb1 has 1396 amino acids, of which the carboxy-terminal 133 residues form a trimeric intra-molecular chaperone that is auto-proteolyzed after correct folding. The structure of a trimer of residues 970–1263 was determined by single anomalous dispersion phasing using incorporated selenomethionine residues and refined at 2.3 Å resolution using crystals grown from native, methionine-containing, protein. The protein inhibits phage infection by competition. The phage-distal receptor-binding domain resembles a bullet, with the walls formed by partially intertwined beta-sheets, conferring stability to the structure. The fold of the domain is novel and the topology unique to the pb1 structure. A site-directed mutant (Ser1264 to Ala), in which auto-proteolysis is impeded, was also produced, crystallized and its 2.5 Å structure solved by molecular replacement. The additional chaperone domain (residues 1263–1396) consists of a central trimeric alpha-helical coiled-coil flanked by a mixed alpha-beta domain. Three long beta-hairpin tentacles, one from each chaperone monomer, extend into long curved grooves of the bullet-shaped domain. The chaperone-containing mutant did not inhibit infection by competition.


Introduction
Bacteriophages are ubiquitous viruses of bacteria. They are important biological model systems and have been applied in detection and control of pathogenic bacteria [1,2]. More than 95% of all phages belong to the Caudovirales (tailed phages) [3]. This order is divided into three families, according to phage tail morphology: Myoviridae, Podoviridae and Siphoviridae. Around half of all known bacteriophages are Siphoviruses and have long, flexible, non-contractile tails. The well-studied Siphovirus T5 has a three-fold symmetric tail-tube, to the end of which three long, L-shaped fibres are attached [4,5]. These fibres are made up of a homotrimer of protein pb1 (pb stands for "protein band"). The gene encoding pb1 specifies 1396 amino acids, although the carboxy-terminal 133 residues are auto-proteolytically removed after correct trimerization and folding. It is thought that the trimer of these carboxy-terminal 133 residues functions as an intra-molecular chaperone [6,7].
The L-shaped tail fibres are responsible for the initial recognition of certain Escherichia coli (E. coli) host strains, binding to the O8-or O9-type O-antigen of the bacterial lipopolysaccharide (LPS) [6]. The structures of the O8 and O9 antigens have been characterized before [8,9]; the minimal region in common is a tri-mannose with alpha(1,2)-linkages. The pb1-LPS interaction is reversible, and, in fact, the presence of the fibres on the phage is not absolutely necessary, although they increase the adsorption rate by a factor of 15 [6,10]. In a secondary, irreversible, interaction, the central straight tail fibre interacts with the specific receptor of T5, the bacterial iron transporter protein FhuA. This interaction is mediated by the receptor binding protein pb5, which is located at the end of the straight fibre [11], and leads to release of the phage DNA into the host cell [12,13].
Here, we present the structure of the carboxy-terminal domain of the bacteriophage T5 L-shaped tail fibres, solved in the absence and presence of the intra-molecular chaperone domain and show that only the processed, mature protein can inhibit phage infection by competition.

Protein Expression, Purification, Crystallization and Stability Measurement
Production and crystallization of pb1(970-1263) was described before [14]. Upon expression and protein folding, the intra-molecular chaperone domain is auto-proteolytically cleaved off before purification and crystallization. To prevent auto-proteolytic cleavage and obtain a carboxy-terminal domain including the chaperone, the Ser1264-codon was mutated to code for Ala using the QuikChange Site-Directed Mutagenesis Kit (Agilent, Las Rozas, Madrid, Spain). The resulting expression plasmid was named pET28-pb1(970-1396)S1264A and yields pb1(970-1396), which was expressed and purified as described for the wild-type protein [14]. Crystallization trials for pb1(970-1396) were performed by vapor diffusion in sitting drop Compact Clover plates (Jena Biosciences, Jena, Germany) with 0.15 mL reservoirs and drops of 2 µL of protein solution mixed with 2 µL of reservoir solution. Lens-shaped crystals up to 0.15 mm long were obtained in drops where the reservoir contained 0.1 M sodium citrate pH 4.0 and 12.5% (w/v) PEG 2000 and were briefly soaked with the same solution including 20% glycerol before data collection.
Crystallization trials for native pb1(970-1263) in the presence of mannose were also performed (although no ligand was identified in the electron density). The purified protein (at 15 mg/mL in 20 mM Tris-HCl pH 8.5) was incubated in the presence of 2 mM mannose for 2 h at room temperature. Crystals were obtained in a drop where the reservoir contained 0.1 M Tris-HCl pH 8.5, 8% (w/v) PEG 4000, 10 mM iron(III) chloride and 20% (v/v) glycerol and were rapidly cryo-cooled directly in liquid nitrogen.

Phage Infection Inhibition
The O-antigen dependent bacteriophage T5oad mutant [15] was used to test for inhibition of phage T5 adsorption by the C-terminal domain of pb1. Due to a mutation in the gene encoding the receptor binding protein pb5, this mutant binds the outer membrane protein receptor FhuA with a low affinity and, as a consequence, successful infection is strictly dependent on the interaction between the L-shaped tail fibers and polymannose O-antigen of E. coli F. T5oad was amplified on the E. coli F strain and purified by centrifugation on cesium chloride gradients [16]. E. coli F bacteria were harvested in the mid-log phase, washed in phosphate-buffered saline and resuspended at a cell density of 1ˆ10 9 /mL in T5-TBS buffer (10 mM Tris-HCl pH 7.2, 100 mM sodium chloride, 1 mM calcium chloride and 1 mM magnesium chloride). Aliquots (0.05 mL) of cell suspension were incubated for 20 min at 0˝C with concentrations (0.015 to 1.5 µM) of purified pb1(970-1263) or pb1(970-1396). Estimating that a bacterial cell contains about 10 6 LPS molecules [17], this corresponds to a ratio of trimeric pb1 to LPS of 0.01 to 1. Phage T5oad mutant was then added at a multiplicity of infection of 1 and incubation was continued for 15 min at 37˝C. The mixture was then diluted 500-fold into ice cold T5-TBS buffer and centrifuged at 6000ˆg for 6 min. The amount of unadsorbed phage particles remaining in the supernatant was determined by titration on E. coli F. In these experimental conditions, we confirmed that adsorption of T5oad to E. coli B, which has a semi-rough LPS, lacking the O-specific chain, was less than 1%. The presence of 8.5 µM of bovine serum albumin did not affect the adsorption of T5oad to E. coli F.
Image processing was performed using XMIPP [34] and defocus was determined with CTFFIND3 [35]. For bacteriophage T5 tail samples, we selected 1129 images of the distal part of the fibre; classification was used to only select straight distal parts, as described [14]. The XMIPP manual particle picking interface was used to select 2006 and 1375 individual images of pb1(970-1263) and pb1(970-1396) assemblies, respectively. After alignment and classification using maximum likelihood routines [36], we selected 444 straight rod images of the tail fibre to obtain a two-dimensional average image. The complete datasets of manually picked particles of pb1(970-1293) and pb1(970-1396) were used to obtain their respective average images (no classification was used). The resulting images were rotationally averaged around their longitudinal axis to render averaged three-dimensional volumes. The UCSF CHIMERA fit routine was used to computationally dock the X-ray structures of pb1(970-1293) in the fibre and in pb1(970-1293) averaged volumes, and that of pb1(970-1396) in the pb1(970-1396) averaged map, with correlation coefficients of 93%, 95% and 96%, respectively.

Protein Expression, Purification and Crystallization
The proteins pb1(794-1263) and pb1(970-1263), corresponding to two carboxy-terminal (distal) fragments of the L-shaped tail fibres of different size, were expressed, purified and crystallized as described before [14]. A single point mutation (T to G), mutating the codon of the corresponding serine in pb1 (Ser1264) to code for alanine, was introduced into the plasmid pET-28-pb1(970-1396). Expression of the mutated gene in E. coli cells led to production of the non-proteolyzed version of the protein, which we call pb1(970-1396). Substitution of a conserved serine to alanine was previously described to inhibit the auto-proteolytic cleavage of a similar intra-molecular chaperone domain [7,37]. Pb1(970-1396) was purified by metal affinity chromatography and anion exchange chromatography and crystallized from poly-ethylene glycol solutions at pH 4.

Phage Infection Inhibition
We expected the carboxy-terminal end of the L-shaped tail fibres to be the specific domain that binds to the O-antigen of the LPS. To confirm this, we measured the inhibition of phage T5oad adsorption by the carboxy-terminal domain of pb1. Unlike wild-type T5, the T5oad mutant is strictly dependent on the binding of the L-shaped fibers to polymannose O antigen of E. coli F to successfully infect its host [15]. Thus, addition of the carboxy-terminal domain of pb1 to E. coli F bacteria is expected to inhibit infection with T5oad. The inhibitory effects of purified pb1(970-1263) and pb1(970-1396) are shown in Figure 1. More than 90% inhibition was measured when T5oad was incubated with pb1(970-1263) at concentrations ranging from 0.015 to 1.5 µM of the trimeric proteins, while pb1(970-1396) does not significantly inhibit T5oad adsorption at the same concentrations. These data indicate that the pb1(970-1263) domain likely binds to the O-antigen receptor. As the presence of the intra-molecular chaperone domain prevents inhibition of T5 adsorption, we propose that the binding site for the O-antigen is located at the terminal end the L-shaped tail fibre and is unmasked after cleavage of the chaperone domain.
Viruses 2015, 7, 6424-6440 4 the protein, which we call pb1(970-1396). Substitution of a conserved serine to alanine was previously described to inhibit the auto-proteolytic cleavage of a similar intra-molecular chaperone domain [7,37]. Pb1(970-1396) was purified by metal affinity chromatography and anion exchange chromatography and crystallized from poly-ethylene glycol solutions at pH 4. We expected the carboxy-terminal end of the L-shaped tail fibres to be the specific domain that binds to the O-antigen of the LPS. To confirm this, we measured the inhibition of phage T5oad adsorption by the carboxy-terminal domain of pb1. Unlike wild-type T5, the T5oad mutant is strictly dependent on the binding of the L-shaped fibers to polymannose O antigen of E. coli F to successfully infect its host [15]. Thus, addition of the carboxy-terminal domain of pb1 to E. coli F bacteria is expected to inhibit infection with T5oad. The inhibitory effects of purified pb1(970-1263) and pb1(970-1396) are shown in Figure 1. More than 90% inhibition was measured when T5oad was incubated with pb1(970-1263) at concentrations ranging from 0.015 to 1.5 µM of the trimeric

Structure Solution
A dataset to 2.5 Å resolution of the seleno-methionine derivative of pb1(970-1263) [14] collected at the selenium peak was used as input for the phasing procedure. Fifteen selenium atoms were located as expected; the asymmetric unit of the crystal contains one trimer, and each pb1(970-1263) monomer has five methionine residues. After solvent flattening, an initial model was produced by automatic tracing, which was improved manually. The resulting model was used to phase the structure of pb1(970-1396) by molecular replacement. Three trimers were located in the asymmetric unit, and the intra-molecular chaperone region could be constructed by a combination of automated and manual building in the calculated electron density. Density for residues 1297-1303 is missing in each of the nine monomers; these residues likely form a disordered loop. The structures of pb1(970-1263) and pb1(970-1396) ( Figure 2) were refined to R-factors of 21.3% and 18.5% and free R-factors of 26.6% and 21.3%, respectively, with good geometry and few residues in unlikely regions of the Ramachandran plot (Table 1). Interpretable density was present neither for residues 970-987, nor for the amino-terminal purification tag. The data collected by Garcia-Doval et al. [14] for native pb1(794-1263) and pb1(970-1263) was also used in molecular replacement experiments, but in the case of pb1(794-1263), no density for extra N-terminal residues was observed and the remaining part could not be refined to satisfactory R-factors. In the case of pb1(970-1263), no satisfactory solution could be found by molecular replacement. However, we did collect data from a new crystal form of native methionine-containing protein, and refined this structure at 2.3 Å resolution (Table 1).

6
al. [14] for native pb1(794-1263) and pb1(970-1263) was also used in molecular replacement experiments, but in the case of pb1(794-1263), no density for extra N-terminal residues was observed and the remaining part could not be refined to satisfactory R-factors. In the case of pb1(970-1263), no satisfactory solution could be found by molecular replacement. However, we did collect data from a new crystal form of native methionine-containing protein, and refined this structure at 2.3 Å resolution (Table 1). Overall side stereo-view of pb1(970-1263) in cartoon representation. Monomers are coloured differently (chain A green, chain B blue, chain C magenta). The first interdigitated region (ir1), first triangular domain (td1), second interdigitated region (ir2), second triangular domain (td2) and third interdigitated region (ir3) are indicated (interdigitated regions and triangular domains have been described in phage tailspike structures before, see text). The amino-terminal (N) and carboxyterminal ends of the magenta subunits are indicated; (B) Overall side stereo-view of pb1(970-1396) in the same orientation as part A. Pb1(970-1263) is shown in gray, the monomers of the intramolecular chaperone domain (ic) coloured differently (chain A green, chain B blue, chain C magenta); (C) Space-filling diagram in the same orientation as in part B. The carboxy-terminal part of the protein is coloured green, the intra-molecular chaperone domain in magenta. Monomers are coloured differently (chain A green, chain B blue, chain C magenta). The first interdigitated region (ir1), first triangular domain (td1), second interdigitated region (ir2), second triangular domain (td2) and third interdigitated region (ir3) are indicated (interdigitated regions and triangular domains have been described in phage tailspike structures before, see text). The amino-terminal (N) and carboxy-terminal ends of the magenta subunits are indicated; (B) Overall side stereo-view of pb1(970-1396) in the same orientation as part A. Pb1(970-1263) is shown in gray, the monomers of the intra-molecular chaperone domain (ic) coloured differently (chain A green, chain B blue, chain C magenta); (C) Space-filling diagram in the same orientation as in part B. The carboxy-terminal part of the protein is coloured green, the intra-molecular chaperone domain in magenta.

The Carboxy-Terminal Domain of the Mature Fibre
The structure of pb1(970-1263) resembles a bullet (Figure 2A). We divided the structure in five different regions and have named the secondary structure elements as shown in Figure 3 (based on the podovirus P22 tailspike (PDB entry 1TSP) [38]). The amino-terminal part (amino acids 989-1009, beta-strands I and J) forms a first "interdigitated" region (ir1). In this region, the beta-strands of the three monomers wrap around each other to form a triple beta-helix [38,39]. Next, a first "triangular domain" (residues 1010-1129, beta-strands K through P) is present. Here, three concave beta-sheets form a beta-prism (td1). It is composed of the anti-parallel beta strands K through O, with strand P oriented parallel to strand O and connected to it by a long loop containing an alpha-helix. 7 the podovirus P22 tailspike (PDB entry 1TSP) [38]). The amino-terminal part (amino acids 989-1009, beta-strands I and J) forms a first "interdigitated" region (ir1). In this region, the beta-strands of the three monomers wrap around each other to form a triple beta-helix [38,39]. Next, a first "triangular domain" (residues 1010-1129, beta-strands K through P) is present. Here, three concave beta-sheets form a beta-prism (td1). It is composed of the anti-parallel beta strands K through O, with strand P oriented parallel to strand O and connected to it by a long loop containing an alpha-helix.  There are no proteins in the protein data bank (PDB) with the same topology as the pb1 bullet-shaped domain, although the topologies of some bacteriophage tailspikes, such as the podovirus P22 tailspike (PDB entry 1TSP) [38], partially overlap. Phage P22 tailspike has a beta-helical domain, an interdigitated region, a triangular beta-prism domain and a second interdigitated region (called caudal fin). Of these, the triangular beta-prism is the most similar to the second triangular domain of pb1 and has the same topology ( Figure 4A). The P22 tailspike has endo-rhamnosidase activity; however, pb1 has no known enzymatic activity. Consequently, the catalytic residues of the P22 tailspike are not conserved in pb1.
P22 tailspike (PDB entry 1TSP) [38], partially overlap. Phage P22 tailspike has a beta-helical domain, an interdigitated region, a triangular beta-prism domain and a second interdigitated region (called caudal fin). Of these, the triangular beta-prism is the most similar to the second triangular domain of pb1 and has the same topology ( Figure 4A). The P22 tailspike has endo-rhamnosidase activity; however, pb1 has no known enzymatic activity. Consequently, the catalytic residues of the P22 tailspike are not conserved in pb1. The second interdigitated region ir2 (residues 1130-1160, beta-strands Q and R) again forms a short triple beta-helix, with the beta strands in parallel disposition. This is followed by a second triangular domain td2 (residues 1161-1238). Like the first triangular domain, this domain is also a beta-prism formed by three concave beta-sheets, each containing five anti-parallel beta-strands (S through W). At the distal end of the mature protein, a third interdigitated region (ir3) is located, made up of residues 1239-1263, which form a tapered triple-helical structure. The taper makes the end of the structure pointed. In the mature, chaperone-free structure, the third interdigitated region does not contain any regular beta-structure, although there are many inter-chain main-chain hydrogen bonds. In the structure with the chaperone, two short beta-strands are present in the third interdigitated region, forming a four-stranded beta-sheet with the C′ and F′ beta-strands of the chaperone domain ( Figures 2B and 3). Generally, the loops connecting the beta-strands in the structure are long and elaborate, especially the aforementioned OP-loop and the RS-loop, which is bent back as a hairpin towards the amino-terminus. The second interdigitated region ir2 (residues 1130-1160, beta-strands Q and R) again forms a short triple beta-helix, with the beta strands in parallel disposition. This is followed by a second triangular domain td2 (residues 1161-1238). Like the first triangular domain, this domain is also a beta-prism formed by three concave beta-sheets, each containing five anti-parallel beta-strands (S through W). At the distal end of the mature protein, a third interdigitated region (ir3) is located, made up of residues 1239-1263, which form a tapered triple-helical structure. The taper makes the end of the structure pointed. In the mature, chaperone-free structure, the third interdigitated region does not contain any regular beta-structure, although there are many inter-chain main-chain hydrogen bonds. In the structure with the chaperone, two short beta-strands are present in the third interdigitated region, forming a four-stranded beta-sheet with the C 1 and F 1 beta-strands of the chaperone domain ( Figures 2B and 3). Generally, the loops connecting the beta-strands in the structure are long and elaborate, especially the aforementioned OP-loop and the RS-loop, which is bent back as a hairpin towards the amino-terminus.

The Intra-Molecular Chaperone Domain
Serine 1264, the residue responsible for the auto-proteolytic cleavage of the carboxy-terminal domain from the bullet-shaped domain, is located between the bullet-shaped and intra-molecular chaperone domains. In the structure where Ser1264 has been substituted with alanine, the chain continues with a loop (residues 1265-1276) winding around the other three, subsequently Viruses 2015, 7, 6424-6440 forming a hairpin (residues 1277-1322). This first hairpin contains two alpha-helices and two beta-strands (A' and B') and is oriented side-ways, nestled under the aforementioned loop of amino acids 1265-1276. Three 50 Å-long hairpins (residues 1332-1371 from each monomer, containing beta-strands C' through F', protrude upwards and embrace the preceding bullet-shaped domain). The intra-molecular chaperone domain (ic) ends in a central triple-helical coiled-coil, to which each monomer contributes their carboxy-terminal alpha-helix (formed by residues 1371-1394).
The structure of the intra-molecular chaperone domain shows similarity to the equivalent domain of the E. coli phage K1F endosialidase (PDB entry 3GW6) [37] and can be superposed to an r.m.s.d. of 2.4 Å with 122 superposed Cα atoms, despite a sequence identity of only 18%. When monomers of the two structures are superposed ( Figure 4B), it is clear that they have the same fold and topology, with all the secondary structure elements conserved. The beta-hairpin tentacles in our structure are about 20 Å shorter than in the K1F endosialidase, while the A'B'-loop, that is partially disordered in our structure, contains an alpha-helix in the structure of the K1F endosialidase chaperone domain. The similarity between pb1 and the K1F endosialidase extends into the triple-helical tapered domain and the last strand of the second triangular domain (strand W), down to residue 1232 ( Figure 4B; r.m.s.d. of 3.3 Å with 188 Cα-atoms superposed). However, where in pb1 the VW-loop folds back, in the K1F endosialidase the triple beta-helix continues for one more strand. When the quaternary structures are compared, they are still similar, with the parallel beta-sheet permuted onto another side of the trimer. The structure of the intra-molecular chaperone domain of the neck appendage protein of Bacillus phage GA-1 (PDB entry 3GUD) [37] is also very similar to the pb1 chaperone, while the structure of the neck appendage protein chaperone domain from Bacillus phage phi29 (PDB entry 3SUC) [40] reveals more differences: its body has a beta-stranded fold, is not elongated as in the other chaperones and the beta-hairpin tentacles are much shorter.
In these experiments, a sample of the protein is incubated in the presence of a hydrophobic fluorescent dye and the temperature is slowly raised. When the protein unfolds, the dye increasingly binds to hydrophobic residues of the protein and the fluorescence increases. The experiments showed that pb1(970-1263) unfolds at 62 to 64˝C, while pb1(970-1396) unfolds between 64 and 65˝C, indicative of stable proteins.
The extensive intertwining of the monomers in the trimer and close association in other regions leads to the formation of numerous hydrogen bonds, van der Waals interactions and salt bridges (there are 16 putative inter-monomer salt bridges for pb1(970-1263) and 25 for pb1 (970-1396)). The central, three-fold symmetric, longitudinal cavity between the three monomers is surrounded by hydrophobic and aromatic residues, for example Ile1134, Ile1149, Phe1044, Phe1077, Phe1174 and Trp1226 ( Figure 5A,B). The interdigitated regions may be involved in the stabilization of the structure, keeping the three chains linked together as a molecular clamp [41].
In the chaperone, the bottom part of the centre of the chaperone coiled-coil is also hydrophobic, with Phe1383, Leu1387, Ile1390 and Leu1394 contributing to the core ( Figure 5C). The top part is more hydrophilic, with Glu1373 and Asn1376 also pointing towards the centre, interacting with its equivalents from the two other monomers. The three Arg1386 residues point outwards from the coiled-coil but project their guanidine groups to between Glu1384 and Glu1391 of a neighbouring monomer. These hydrophilic interactions and salt bridges, apart from contributing to the stability of the trimer, may ensure correct registration (i.e., avoid staggered interactions between the three Viruses 2015, 7, 6424-6440 monomers). The interactions also suggest that the fibre-distal part of the chaperone may be more rigid, and the fibre-proximal part more flexible, facilitating interaction with the fibre.
Journal 2015, volume, page-page 10 interacting with its equivalents from the two other monomers. The three Arg1386 residues point outwards from the coiled-coil but project their guanidine groups to between Glu1384 and Glu1391 of a neighbouring monomer. These hydrophilic interactions and salt bridges, apart from contributing to the stability of the trimer, may ensure correct registration (i.e., avoid staggered interactions between the three monomers). The interactions also suggest that the fibre-distal part of the chaperone may be more rigid, and the fibre-proximal part more flexible, facilitating interaction with the fibre. Although the interface between the bullet-shaped domain and the intra-molecular chaperone domain appears extensive ( Figure 2C), it is only 3 × 10 3 Å 2 large and the association is accompanied by a free energy gain of just −12 kcal/mol. The interface contains almost exclusively hydrophilic interactions. Salt bridges are formed by the Glu1355 residues from the top of the tentacle with Arg1150 and Arg1222 projecting from the sides of the bullet-shaped domain. These residues are from three different monomers. After cleavage of the three Thr1263-Ser1264 peptide bonds, the chaperone must be released; therefore, the interaction between the bullet-shaped domain and intramolecular chaperone domain must not be overly strong.

The Catalytic Site
For the intra-molecular chaperone of the endosialidase of E. coli phage K1F, a mechanism has been proposed for the auto-proteolytic reaction [37,42]. All the residues involved in this proposed mechanism are conserved in the phage T5 L-shaped tail fibre sequence and in very similar positions in the structure, making it likely the reaction mechanism is the same. Transposing the mechanism to the L-shaped tail fibre, Arg1250 of a neighbouring subunit of the third interdigitated region acts as a sensor for correct folding, positioning its positively charged guanidine group between Asp1239 and the side chain carbonyl of Thr1263, forming the necessary oxy-anion hole for stabilizing the Although the interface between the bullet-shaped domain and the intra-molecular chaperone domain appears extensive ( Figure 2C), it is only 3ˆ10 3 Å 2 large and the association is accompanied by a free energy gain of just´12 kcal/mol. The interface contains almost exclusively hydrophilic interactions. Salt bridges are formed by the Glu1355 residues from the top of the tentacle with Arg1150 and Arg1222 projecting from the sides of the bullet-shaped domain. These residues are from three different monomers. After cleavage of the three Thr1263-Ser1264 peptide bonds, the chaperone must be released; therefore, the interaction between the bullet-shaped domain and intra-molecular chaperone domain must not be overly strong.

The Catalytic Site
For the intra-molecular chaperone of the endosialidase of E. coli phage K1F, a mechanism has been proposed for the auto-proteolytic reaction [37,42]. All the residues involved in this proposed mechanism are conserved in the phage T5 L-shaped tail fibre sequence and in very similar positions in the structure, making it likely the reaction mechanism is the same. Transposing the mechanism to the L-shaped tail fibre, Arg1250 of a neighbouring subunit of the third interdigitated region acts as a sensor for correct folding, positioning its positively charged guanidine group between Asp1239 and the side chain carbonyl of Thr1263, forming the necessary oxy-anion hole for stabilizing the transition state ( Figure 6). A kink in the main chain at the position of Thr1263 places the hydroxyl oxygen of the scissile Ser1264 near the amine group of Lys1269, activating it for the cleavage reaction. The involvement of Arg1250 of a neighbouring monomer makes the correct formation of the triple beta-helix fold a requirement for the auto-proteolytic cleavage to proceed. In the mature fibre, we do not observe density for the catalytic Ser1264, suggesting that this residue is removed with the intramolecular chaperone domain upon maturation of the fibre. The same was observed for the K1F tailspike [37,42], while in the phi29 appendage the serine was modelled as the last residue of the mature protein, albeit with less convincing density [40].
transition state (Figure 6). A kink in the main chain at the position of Thr1263 places the hydroxyl oxygen of the scissile Ser1264 near the amine group of Lys1269, activating it for the cleavage reaction. The involvement of Arg1250 of a neighbouring monomer makes the correct formation of the triple beta-helix fold a requirement for the auto-proteolytic cleavage to proceed. In the mature fibre, we do not observe density for the catalytic Ser1264, suggesting that this residue is removed with the intramolecular chaperone domain upon maturation of the fibre. The same was observed for the K1F tailspike [37,42], while in the phi29 appendage the serine was modelled as the last residue of the mature protein, albeit with less convincing density [40].

Electron Microscopy of Phage T5 Tail and Distal Pb1 Assemblies
Purified T5 tails were analyzed by negative stain electron microscopy ( Figure 7A) as reported before [14]; pb1(970-1263) and pb1(970-1396) assemblies were analyzed in the same way ( Figure  7B,C). Averaged side views ( Figure 7A-C, insets) were used to calculate a three-dimensional model by applying rotational symmetry over the longitudinal axes, as described [14]. The volumes obtained for pb1(970-1263) and pb1(970-1396), about 18 and 25 nm long, respectively, correspond to dimers of the trimeric protein fragments, interacting tail-to-tail through their amino-termini. Notably, this interaction was also observed in the crystal packing of both pb1(970-1263) and pb1(970-1396). Superimposition of the truncated pb1 assembly volumes onto the intact L-shaped tail distal domain volume showed that these structures matched well, indicating that the truncation did not alter the T5 tail fibre tip structures ( Figure 7D-F); the additional volume of the intramolecular chaperone flower bud-shaped domain was located at the tips of the dimeric pb1(970-1396) ( Figure 7C, arrows). Docking analysis of pb1(970-1263) and pb1(970-1396) into their electron microscopy maps, as well as that of the T5 tail fibre, confirmed good matching (correlation coefficients of 93%-96%), with no notable structural alterations ( Figure 7G-I). It also confirmed the tail-to-tail dimeric nature of pb1(970-1263) and pb1(970-1396).

Electron Microscopy of Phage T5 Tail and Distal Pb1 Assemblies
Purified T5 tails were analyzed by negative stain electron microscopy ( Figure 7A) as reported before [14]; pb1(970-1263) and pb1(970-1396) assemblies were analyzed in the same way ( Figure 7B,C). Averaged side views ( Figure 7A-C, insets) were used to calculate a three-dimensional model by applying rotational symmetry over the longitudinal axes, as described [14]. The volumes obtained for pb1(970-1263) and pb1(970-1396), about 18 and 25 nm long, respectively, correspond to dimers of the trimeric protein fragments, interacting tail-to-tail through their amino-termini. Notably, this interaction was also observed in the crystal packing of both pb1(970-1263) and pb1(970-1396). Superimposition of the truncated pb1 assembly volumes onto the intact L-shaped tail distal domain volume showed that these structures matched well, indicating that the truncation did not alter the T5 tail fibre tip structures ( Figure 7D-F); the additional volume of the intra-molecular chaperone flower bud-shaped domain was located at the tips of the dimeric pb1(970-1396) ( Figure 7C, arrows). Docking analysis of pb1(970-1263) and pb1(970-1396) into their electron microscopy maps, as well as that of the T5 tail fibre, confirmed good matching (correlation coefficients of 93%-96%), with no notable structural alterations ( Figure 7G-I). It also confirmed the tail-to-tail dimeric nature of pb1(970-1263) and pb1(970-1396).

Discussion
Like other bacteriophage fibres and tailspikes [38][39][40]43,44], the bullet-shaped distal domain of pb1 is mainly beta-structured. It is likely the structure down to the kink (i.e., between the white and black asterisks in Figure 7A) also has beta-structure, while the thin proximal domain of pb1 may have a collagen fold [14].
In phage T5 pb1, the folding may start from the distal part of the chaperone with the association of the three helices of the short triple coiled-coil. The chaperone domains fold and the upward-pointing tentacles are formed. These tentacles then interact with the bullet-shaped domain of the fibre, confining the chains of the bullet-shaped domain while they fold. Once the carboxy-terminal triple-helix is correctly formed, proteolysis can take place as described above [37,42], leading to release of the chaperone domain.
Bacteriophage T5 binds to the polymannose O-antigens of Escherichia coli LPS. The minimal unit described to bind to T5 fibre is a tri-mannoside [6]. Many proteins bind to carbohydrates through aromatic residues [53], but aromatic residues do not cluster on the surface of our pb1(970-1263) structure; therefore, we can currently not draw any conclusions on where the oligo-mannose units might bind to pb1. An attractive hypothesis would be a receptor-binding site in the groove that becomes exposed upon intra-molecular chaperone cleavage, and this is consistent with the results of the phage infection inhibition assays by competition with pb1(970-1263) but not by the chaperone-containing pb1(970-1396). Future site-directed mutagenesis and co-crystallization studies should lead to more insight into the exact location and nature of the receptor-binding site.

Conclusions
We have solved the structure of the carboxy-terminal domain of the Siphoviridae phage T5 tail fibre, with and without its carboxy-terminal intra-molecular chaperone domain. The structures reveal an extensively intertwined, beta-structured receptor-binding domain and an alpha/beta-structured intra-molecular chaperone domain, with long beta-hairpin fingers enveloping its folded substrate. Once receptor-binding sites have been identified, the structure may form the basis of rational design of phage-based fibres for specific recognition of target bacteria, together with the structures of the receptor-binding domains of fibres of Myoviridae phages [43,54] and Podoviridae phages [44].