Structural Conservation and Functional Diversity of the Poxvirus Immune Evasion (PIE) Domain Superfamily

Poxviruses encode a broad array of proteins that serve to undermine host immune defenses. Structural analysis of four of these seemingly unrelated proteins revealed the recurrent use of a conserved beta-sandwich fold that has not been observed in any eukaryotic or prokaryotic protein. Herein we propose to call this unique structural scaffolding the PIE (Poxvirus Immune Evasion) domain. PIE domain containing proteins are abundant in chordopoxvirinae, with our analysis identifying 20 likely PIE subfamilies among 33 representative genomes spanning 7 genera. For example, cowpox strain Brighton Red appears to encode 10 different PIEs: vCCI, A41, C8, M2, T4 (CPVX203), and the SECRET proteins CrmB, CrmD, SCP-1, SCP-2, and SCP-3. Characterized PIE proteins all appear to be nonessential for virus replication, and all contain signal peptides for targeting to the secretory pathway. The PIE subfamilies differ primarily in the number, size, and location of structural embellishments to the beta-sandwich core that confer unique functional specificities. Reported ligands include chemokines, GM-CSF, IL-2, MHC class I, and glycosaminoglycans. We expect that the list of ligands and receptors engaged by the PIE domain will grow as we come to better understand how this versatile structural architecture can be tailored to manipulate host responses to infection.


Introduction
Poxviridae comprise a diverse family of large double-stranded DNA viruses that replicate exclusively in the host-cell cytoplasm. Poxvirus virions are easily identified by their characteristic brick-shaped appearance in electron micrographs. Each virion contains a single linear genome that varies in length (130-360 kb) depending on the virus strain [1]. The genomes are compact, with closely spaced, non-overlapping open reading frames (ORFs) and no evidence of mRNA splicing. Although individual strains may contain more than 200 ORFs [1], only 50 are thought to encode proteins essential for viral transcription, DNA replication, or the formation of new virions [2]. These ORFs cluster in the central region of the genome and are well conserved. We searched for additional members of this family within the published genomes of poxviruses, especially among ORFs of still unknown function, using bioinformatics tools and an analysis of the published literature. We examined 33 representative chordopoxvirus genomes to find putative PIE-domain-containing proteins. These potential PIE proteins are extremely diverse in sequence, dividing into 20 separate families across seven genera. All appear to contain a β-sandwich core domain, but each family is decorated by a unique set of insertions that encode secondary structural elements. Although the first members of the PIE family were identified as chemokine-binding proteins, the family is clearly functionally diverse, and this diversity is likely to grow as more roles for these proteins are revealed. Finally, we explore the origins of the PIE domain by examining the distribution of PIE sequences across the chordopoxvirus subfamily.

vCCI
Probably the most extensively studied member of the poxvirus PIE domain family is vCCI, a protein secreted from infected cells by nearly all orthopoxviruses and leporipoxviruses. Members of this family have been given different names depending on their species of origin (vCCI, EVM1, T1, 35 kDa, vCKBP, or CBP-II). The presence of the vCCI protein in infected cell supernatants was noted long before its function was identified [29]. The vCCI protein binds chemokines in solution, preventing them from reaching their cognate receptors on target cells, and so interferes with their capacity to establish leukocyte migration. It appears to act as a competitive inhibitor of chemokine function, binding the same determinants used to engage cellular chemokine receptors [30-32]. The chemokine superfamily can be divided into subfamilies (C, CC, CXC, and CX3C) based on the spacing of conserved N-terminal cysteine residues in each cytokine [33,34]. Members of the vCCI family generally bind with high affinity to most human and mouse CC-chemokines, but not C-, CXC-, or CX3C-chemokines [19,32,35-38]. Consistent with their binding capacity, members of the vCCI family have been shown to block CC-chemokine-induced calcium flux and cell migration in vitro [32,37,39], and cell migration in vivo [32,39-44]. This combination of potency and specificity has drawn considerable interest for their potential use as anti-inflammatory agents [43,45,46].
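The subfamily names encode the cysteine-spacing rule directly, so the classification can be sketched in a few lines. What follows is a minimal illustration rather than a validated classifier: the function name `classify_chemokine`, the 20-residue N-terminal window, and the toy sequences are all assumptions introduced here, and a real sequence would need its signal peptide removed first.

```python
def classify_chemokine(mature_seq: str, window: int = 20) -> str:
    """Assign a chemokine subfamily from the spacing of the first two
    cysteines in an N-terminal window (the window size is an assumption)."""
    n_term = mature_seq[:window].upper()
    first = n_term.find("C")
    if first == -1:
        return "unclassified"
    second = n_term.find("C", first + 1)
    if second == -1:
        # Only one cysteine in the N-terminal window: C subfamily.
        return "C"
    gap = second - first - 1  # residues separating the two cysteines
    return {0: "CC", 1: "CXC", 3: "CX3C"}.get(gap, "unclassified")


# Toy sequences, hypothetical and for illustration only.
toy_sequences = {
    "toy_C": "AAAAACAAAAAAAAA",
    "toy_CC": "AAAAACCAAAAAAAA",
    "toy_CXC": "AAAAACACAAAAAAA",
    "toy_CX3C": "AAAAACAAACAAAAA",
}

for name, seq in toy_sequences.items():
    print(name, classify_chemokine(seq))
```

A window is used so that cysteines further along the chain, which form the second halves of the conserved disulfides, are not mistaken for the N-terminal pair.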
The structure of vCCI has now been determined for three different species: cowpox (alone, PDB code 1CQ3) [18], ectromelia (alone, 2GRK) [19], and rabbitpox (with MIP-1β/hCCL4, 2FFK) [20]. Each vCCI structure shares the characteristic PIE domain with very similar decorations. RMSDs between the three structures, after removing loops that adopt different conformations, are remarkably low, ranging from 0.45 Å to 1.33 Å. As shown in Figure 1, the core structure of the PIE domain consists of a compact globular β-sandwich formed from two nearly parallel β-sheets connected by loops that frequently contain short β-strands and α-helices.
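For readers unfamiliar with the RMSD comparison quoted above, the calculation can be sketched as follows. This is a simplified illustration, assuming the two coordinate sets are already superposed and paired residue by residue; the superposition step itself is not shown. The function name `rmsd`, the `exclude` parameter (standing in for loops removed before comparison), and the toy coordinates are hypothetical.

```python
import math
from typing import Collection, Sequence


def rmsd(coords_a: Sequence[Sequence[float]],
         coords_b: Sequence[Sequence[float]],
         exclude: Collection[int] = ()) -> float:
    """Root-mean-square deviation over paired 3D coordinates.

    Assumes the structures are already superposed. Indices listed in
    `exclude` (for example, variable loops) are skipped.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be paired one-to-one")
    kept = [(p, q) for i, (p, q) in enumerate(zip(coords_a, coords_b))
            if i not in exclude]
    if not kept:
        raise ValueError("no atom pairs left after exclusion")
    total = sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2
                for p, q in kept)
    return math.sqrt(total / len(kept))


# Toy coordinates in angstroms (hypothetical): the third position
# mimics a loop that adopts a different conformation.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 4.0)]

print(rmsd(structure_a, structure_b))               # all positions
print(rmsd(structure_a, structure_b, exclude={2}))  # loop removed
```

Excluding the divergent position lowers the value sharply, which is why the loops to be removed need to be stated whenever RMSDs are compared.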
For vCCI, β-sheet I consists of five large anti-parallel strands, here numbered β5, β6, β1, β10 and β11 (Figure 1). Likewise β-sheet II also consists of five strands, numbered β2, β3, β4, β7, and β9. β-sheet II divides along strands β7 and β9, which are the only two core strands that run parallel to each other. Four highly conserved disulfide bonds hold the β-sheets of vCCI together. These disulfides are labeled "A" through "D" in the connectivity diagram (see Figure 2). The A disulfide connects the N-terminus of the protein to the end of strand β10, the B disulfide connects the C-terminus to the start of strand β2, while C and D occur at the beginning and end of strand β7, respectively.
Viruses 2015, 7, 4878-4898
The β6-β7 loop crosses between sheets I and II and contains a large α-helix. The face of vCCI β-sheet I is largely solvent inaccessible because two large loops, the β7-β9 loop and the C-terminus, wrap around and occlude that side of the molecule. One of the most distinctive features of vCCI is the prominently extended β2-β3 loop that projects from β-sheet II. Both the length and sequence of this loop vary greatly between species (Figure 3 and Supplementary Figure S1), but the loop is always highly acidic, with 50% of the residues being Glu or Asp. This loop is adjacent to a highly sequence-conserved patch of acidic residues on the solvent-exposed face of β-sheet II (Figure 4a,b). Structure-based mutational analysis was used to demonstrate that conserved residues within this patch are important to support high-affinity chemokine binding [19].
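The acidity of a loop such as β2-β3 can be expressed as the fraction of its residues that are Asp or Glu. The snippet below is a minimal sketch of that calculation; the function name `acidic_fraction` and the toy loop sequence are assumptions made for illustration and are not taken from any vCCI sequence.

```python
def acidic_fraction(seq: str) -> float:
    """Fraction of residues that are Asp (D) or Glu (E), one-letter code."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("D") + seq.count("E")) / len(seq)


# Hypothetical 8-residue loop with four acidic residues.
toy_loop = "DEEDAGSK"
print(f"{acidic_fraction(toy_loop):.0%} acidic")
```

Applied to each species' loop in an alignment, the same function would allow the "always highly acidic" observation to be checked family-wide.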
The specific molecular details of ligand binding were revealed by the NMR solution studies of Zhang et al. [20], who determined the structure of rabbitpox vCCI bound to human MIP-1β (hCCL4). Consistent with previous studies [18-20,30], the vCCI/chemokine complex formed with 1:1 stoichiometry. As predicted, the large negatively charged, sequence-conserved patch on β-sheet II was found to accommodate the positively charged chemokine (dashed oval in Figure 4). The binding by vCCI covers regions of the chemokine that are important for homodimerization, receptor binding, and GAG interactions. The negative charge and flexibility of the extended β2-β3 loop appear to allow it to interact well with different chemokines, favoring those with positive residues at certain positions, yet not excluding chemokines if large hydrophobics occupy those positions. The chemokine-binding profile of different vCCI proteins appears species independent [37]. One noteworthy difference is that the CC-chemokine inhibitor of the leporipoxvirus myxoma virus, M-T1, has been shown to interact with GAGs [48] using clusters of basic amino acid residues on the face opposite from the chemokine binding site. These basic residues are not found in orthopoxvirus versions of vCCI.

A41
Like vCCI, A41 is secreted from infected cells [49]. It was noted early on that, although the two proteins share little sequence identity (~22% over the entire sequence in vaccinia virus strain Lister), they are of similar size and all eight cysteines of A41 align with those of vCCI. A vaccinia mutant lacking the A41 ORF replicates normally in cell culture, yet in two different models of dermal infection it displayed an altered inflammatory response. The average lesion size for mice infected with the knockout virus was larger, and the influx of inflammatory cells greater, than for the wild-type or revertant control viruses [49]. Deletion of the A41 ORF both enhances vaccinia virus immunogenicity and increases its efficacy when used as a vaccine to immunize mice [50]. Vaccinia A41 binds a subset of CC-chemokines: CCL21, CCL25, CCL26, and CCL28. However, even the tightest of these bind with two orders of magnitude lower affinity than does vCCI to a wide range of CC-chemokines. Further, vaccinia A41 does not inhibit chemokine receptor binding, but instead blocks the GAG-binding domain on chemokines [22]. Addition of GAGs such as heparin or dextran at high concentration can disrupt the A41-chemokine interaction. Recombinant chemokine analogs with alterations in their GAG-binding domain fail to bind ectromelia-encoded A41 (E163) [51]. In addition to the CC-chemokines bound by A41, ectromelia E163 has been shown to bind a limited set of CXC-chemokines with high affinity and also to bind GAGs directly [51]. The current working model is that A41 prevents the establishment of the chemokine concentration gradients required for leukocyte migration, employing a mechanism that is different but complementary to that used by vCCI.
A crystal structure for A41 has been reported [22] showing a globular β-sandwich domain strikingly similar in fold to vCCI (Figure 1). Like vCCI, the β6-β7 loop of A41 contains a large α-helix. Also like vCCI, A41 has a large negatively charged patch in sheet II (Figure 4a). Although the sequence of this charged patch is not conserved with vCCI, the sequence of the nearby area is, leading to the suggestion that this region may contain the chemokine binding site [22]. There are a few notable differences between these structures. A41 has a much shorter β2-β3 loop than vCCI (Figure 3). Transfer of the extended loop from vCCI to A41 does not confer any additional ability to bind chemokine [49]. Also, the large β7-β9 loop that passes across the face of β-sheet I in vCCI adopts a different position in A41. Here the loop makes a unique decoration, and along with residues in the C-terminus forms a small anti-parallel β-sheet. Ectromelia E163 has been shown to bind GAGs [51]. The myxoma virus vCCI (M-T1) is known to interact with GAGs through a basic patch on β-sheet I in the face opposite the chemokine binding site [48]. It is likely that all members of the A41 family conserve a large positively charged patch in that region (Figure 4a).
Figure 1. (a) Electrostatic potential surfaces calculated using APBS [52]. Negative charge in red and positive charge in blue from −3kT/e to +3kT/e. Crystallographically observed contact surfaces for ligand are circled; (b) Sequence conservation within individual families was mapped to the molecular surface and colored magenta for highly conserved and green for variable. Because so few CrmD exist and CrmB and CrmD are closely related, sequences for CrmB and CrmD SECRET domains were aligned and conservation mapped to the CrmD molecular surface.
[20], CPV203 (T4) uses the edge of the β-sandwich plus part of sheet I to bind MHC class I/peptide complexes (4HKJ) [28], and CrmD appears to use sheet I for the binding of a low-affinity chemokine (3ON9) [25]. All ribbon diagrams are shown with the PIE domain in the same orientation.

CrmD C-Terminal Domain
Poxviruses encode a family of secreted immune evasion proteins with N-terminal sequence similarity to host tumor necrosis factor (TNF) receptors. The first was discovered in Shope fibroma virus, and many orthologues have since been identified in both leporipoxviruses and orthopoxviruses; all have been shown to be sufficient for TNF binding [24,53-56]. Referred to as cytokine response modifiers (Crms), the family now includes four different proteins called CrmB, CrmC, CrmD, and CrmE. Deletion of CrmD from ectromelia virus results in a severely attenuated virus in a mouse model, with the median lethal dose increasing by six orders of magnitude. Interestingly, the mice given WT virus show no signs of inflammation at the site of infection, while mice given the CrmD-KO virus display a vigorous inflammatory response. These data indicate that CrmD is a potent anti-inflammatory factor [57].
While all Crm proteins display the characteristic cysteine-rich N-terminal domains common to TNF receptors, both CrmB and CrmD contain an approximately 160 amino acid C-terminal domain that is quite distinct from the TNF-binding region. It was later discovered that the C-terminal extension confers the ability to bind to a distinct set of chemokines [24]. Using a sequence alignment of the CrmB and CrmD C-terminal domains from VACV, CPXV, and ECTV, Alejo and coworkers identified three additional proteins containing similar domain sequences, which they termed the smallpox virus-encoded chemokine receptor or SECRET domain. The proteins were named SECRET-containing proteins SCP-1, SCP-2, and SCP-3. These SCPs bind the same set of chemokines (human and mouse CCL28, CCL25, CXCL12b, CXCL13, CXCL14, and mouse CCL27 and CXCL11) as the C-terminal domains from VACV CrmB and ECTV CrmD. Their data indicate that the chemokine-binding specificity profile of the SECRET domain may be similar for all members of the family [24]. Although no crystal structure has yet been reported for SCP-1, -2, or -3, the fact that all bind the same chemokines despite their relatively low sequence similarity suggests a structural similarity.
Based on de novo modeling, it was predicted that the SECRET domain of CrmB would have structural similarity to vCCI and A41 [23]. This was confirmed in a report describing the crystal structure of ECTV CrmD both alone and in complex with the chemokine CX3CL1 [25]. Like vCCI and A41, the outside of the β-sheet II surface is completely solvent exposed (Figure 1). On the opposite face, one half of β-sheet I is covered by a long C-terminal loop that follows from strand β11. However, the long β7-β9 loop that spans the center of β-sheet I in vCCI stays on the β-sheet II side of CrmD, where it becomes a new strand anti-parallel to β9 (called β8). Also, the length of the β2-β3 loop is much shorter than in vCCI, closer in length to that of A41. Likewise, the β6-β7 loop that crosses between sheets I and II, and forms the distinctive α-helix found in both vCCI and A41, is much shorter, resulting in the helix being absent. Together, these differences make the CrmD structure appear more compact. CrmD has three of the four disulfides found in vCCI and A41, but is missing the one that normally connects the N-terminus to the end of β10. CrmD shares 47% sequence identity with CrmB, suggesting that the respective PIE domains in CrmD and CrmB will be very similar in structure. The SCPs may also be similar in structure to CrmD. For example, even SCP-1, which is the most divergent of the SCPs in sequence from ectromelia virus CrmD, is predicted by the Phyre2 server [58] to resemble CrmD in structure, yielding a 93% confidence score with 79% coverage and 25% sequence identity over ectromelia virus CrmD (PDB 3ON9).
The structure of CrmD with a chemokine [25] indicates the chemokine-binding site is located on the face opposite that used by vCCI and A41 (on β-sheet I, see Figures 4 and 5). This study was performed with human CX3CL1, a chemokine that binds CrmD with lower affinity (KD = 0.68 µM) than previously characterized chemokine ligands for the SECRET domain. Chemokines such as CCL28, CCL25, CXCL12b, CXCL13, CXCL14, XCL1 and CCL20 bind in the low nM range [24]. However, mutating three negatively charged binding site residues to alanine by site-directed mutagenesis appeared to disrupt CrmD binding of several CC- and CXC-chemokines, indicating that the SECRET domain may bind different chemokines in a similar manner. Still, the residues that contact CX3CL1 in the CrmD structure are not highly conserved in SCP-1, SCP-2, and SCP-3 despite their uniform chemokine binding profiles. Additional studies will be required to determine if all SECRET domains engage chemokine in the same way. Further, due to the apparent redundancy, it is worth considering that chemokine binding by SECRET domains may be a vestigial property and no longer their primary function.
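To give a feel for what the affinity difference means, the fraction of CrmD occupied at equilibrium can be estimated with the simple 1:1 binding relation, fraction bound = [L] / ([L] + KD). The sketch below is a rough illustration under stated assumptions: free chemokine concentration is treated as equal to total concentration, the 1 nM value stands in for "low nM" and is hypothetical, and the function name `fraction_bound` is introduced here.

```python
def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Equilibrium occupancy for 1:1 binding, assuming free ligand
    concentration is approximately the total ligand concentration."""
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


KD_CX3CL1_NM = 680.0      # 0.68 µM, as reported for CX3CL1 in the text
KD_ILLUSTRATIVE_NM = 1.0  # hypothetical stand-in for a "low nM" ligand

for conc_nM in (1.0, 10.0, 100.0):
    weak = fraction_bound(conc_nM, KD_CX3CL1_NM)
    tight = fraction_bound(conc_nM, KD_ILLUSTRATIVE_NM)
    print(f"{conc_nM:6.1f} nM chemokine: "
          f"CX3CL1-like {weak:.3f}, low-nM ligand {tight:.3f}")
```

Under these assumptions, a ligand with nanomolar affinity approaches saturation at concentrations where a 0.68 µM ligand occupies only a small fraction of the binding sites.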

CPXV203
Another PIE-domain-containing protein, CPXV203, is encoded by cowpox ORF CPXV_BR_203 and shares roughly 25% sequence identity with cowpox vCCI over 69 of 231 residues. CPXV203 downregulates MHC class I in both murine and human cells during normal poxvirus infection [26,27]. CPXV203 works to prevent T-cell killing of infected cells in concert with another cowpox protein, CPXV12, a protein that effectively blocks the TAP-mediated transport of cytosolic peptides for MHC class I loading [59-61]. CPXV203 binds a wide array of both classical and non-classical MHC class I proteins and prevents them from trafficking to the plasma membrane by a mechanism dependent upon its C-terminal "KTEL" motif (recognized by the KDEL receptor) [27,62]. The KDEL-receptor recycling pathway normally functions as an ER-retrieval system, and is employed to capture defective chaperone-complexed MHC class I proteins in the Golgi and return them to the ER for new attempts at peptide-loading [63]. CPXV203 binds fully assembled MHC proteins in a highly pH-dependent manner, with tighter complexes formed at the more acidic pH associated with the Golgi compartment. CPXV203 engages the underside of the MHC class I peptide-binding platform, contacting both the heavy chain α2 and α3 domains as well as β2m. These surfaces are extremely well conserved among MHC family members. In fact, elements of the MHC class I interface contacted by CPXV203 are required for tapasin, CD8, and natural killer (NK)-receptor engagement. Once back in the ER, CPXV203 releases MHC class I proteins due to the higher pH of that compartment in a process controlled by at least two His residues in CPXV203.
Our crystallographic analysis revealed that CPXV203 is structurally related to the poxvirus chemokine binding proteins vCCI and A41 (Figure 1) [28]. In contrast to vCCI and A41, which bind chemokines through β-sheet II, CPXV203 uses β-sheet I to bind MHC (see Figure 5). It divides the interface almost equally among the peptide-binding platform, β2m, and α3 domain. As in CrmD, the β7-β9 loop does not block accessibility to β-sheet I but remains in β-sheet II where it forms a new edge strand (β8). In addition to the four disulfides found in vCCI and A41, CPXV203 has one additional disulfide linking the β1 strand to a decoration of two α-helices located in the C-terminus. This decoration is used by CPXV203 to contact the MHC α2-domain. The β5-β6 loop is the source of nearly all α3 domain contacts. The β2m contacts come from the edge of the β-sandwich contributed by strands β8 and β10. Like CPXV203, other members of the orthopoxvirus T4 family contain a C-terminal KTEL motif, and so presumably interact with the KDEL-receptor. Members of the T4 family are also found in leporipoxvirus, cervidpoxvirus, and capripoxvirus, and these share 45% sequence identity with CPXV203. No more than eight of the 22 CPXV203 residues known to make close contact with MHC class I in the co-complex structure (Figure 5) are conserved in any other member of the T4 family. Also missing are the two histidine residues that control the pH-dependent variation in MHC-binding affinity displayed by CPXV203. Therefore it appears unlikely that all T4 family members bind MHC. Further, it is also unclear if all members of the T4 family interact with the KDEL receptor. For example, cervidpoxvirus (DPXV W83-004) encodes a C-terminal YDEL, the capripoxviruses a C-terminal HNEL (LSDV_2490_003, SPPV_A_002, GTPV_G20_002, GTP PEL_002), while the leporipoxvirus contains an RDEL (MYXV LAU_M-T4). Although wild-type myxoma virus M-T4 is retained in the ER, deletion of the RDEL sequence did not alter its intracellular localization. However, increased inflammation and edema at the site of injection was observed in rabbits infected with the deletion virus versus the parental control [64]. Although the T4 proteins are likely to prove functionally distinct, they do appear evolutionarily related and so are presented here as a single family.
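The C-terminal motifs discussed above can be compared against the canonical KDEL signal with a simple position-by-position count. The sketch below only tabulates how far each reported tetrapeptide departs from KDEL; it makes no prediction about KDEL-receptor recognition. The function names and the placeholder sequence in the usage line are hypothetical, and only the four-residue motifs themselves come from the text.

```python
CANONICAL_MOTIF = "KDEL"


def c_terminal_motif(seq: str, length: int = 4) -> str:
    """Return the last `length` residues of a protein sequence."""
    if len(seq) < length:
        raise ValueError("sequence is shorter than the motif length")
    return seq[-length:].upper()


def mismatches(motif: str, reference: str = CANONICAL_MOTIF) -> int:
    """Count positions at which `motif` differs from `reference`."""
    if len(motif) != len(reference):
        raise ValueError("motif and reference must be the same length")
    return sum(a != b for a, b in zip(motif.upper(), reference))


# Motifs as reported in the text; full sequences are not reproduced here.
reported_motifs = {
    "orthopoxvirus T4 (CPXV203)": "KTEL",
    "cervidpoxvirus T4": "YDEL",
    "capripoxvirus T4": "HNEL",
    "leporipoxvirus M-T4": "RDEL",
}

for family, motif in reported_motifs.items():
    print(f"{family}: {motif} differs from KDEL at {mismatches(motif)} position(s)")

# Usage with a full-length sequence (placeholder sequence, not a real ORF).
print(c_terminal_motif("MAAAAAAAKTEL"))
```

All four motifs retain the terminal "EL", while the first two positions vary, which is where any difference in receptor recognition would have to arise.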

ORF-GIF Family
The parapoxvirus orf virus causes a contagious pustular dermatitis, primarily in ruminants. Cells infected with orf virus secrete a 28-kDa GM-CSF inhibitory factor (GIF) displaying low sequence identity with cowpox A41 (28% over 88 of 202 residues), suggesting that these proteins may be related [49,65]. Orf virus GIF from strain NZ2 binds and inhibits ovine GM-CSF and IL-2, and although these cytokines share little primary sequence similarity, they are both short-chain four-helical bundle cytokines [66]. GIF binds ovine GM-CSF with a KD of 0.4 nM and ovine IL-2 with a KD of 1.0 nM, but does not bind human GM-CSF or IL-2, despite the fact that orf virus can infect humans [65]. The protein is highly glycosylated, and believed to form dimers and tetramers in solution as assessed by size exclusion chromatography [65]. The GIF protein contains seven cysteine residues, six of which align well with vCCI and A41. These likely correspond to disulfides A, B, and D (Figure 2). The same six cysteine residues are conserved in the GIF proteins of pseudocowpox virus (PCPVgp121) and parapoxvirus red deer (SB87gp117), which are respectively 89% and 40% identical to orf GIF over the entire sequence. Although pseudocowpox GIF from strain BO74 has been shown to bind GM-CSF and IL-2 [67], the parapoxvirus red deer GIF remains to be tested. No GIF protein has been reported to bind chemokine.
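Identity figures of the form "28% over 88 of 202 residues" combine two quantities: the percent identity within the aligned region and the share of the protein that the region covers. The sketch below shows one way to compute both from a gapped pairwise alignment. It is an illustration only: the function name `identity_stats`, the convention of ignoring any column that contains a gap, and the toy alignment are assumptions, and published values may follow a different convention.

```python
from typing import Tuple


def identity_stats(aln_a: str, aln_b: str, full_length: int) -> Tuple[float, int, float]:
    """Return (percent identity, aligned columns, percent coverage).

    Columns with a gap ('-') in either sequence are ignored. Coverage is
    the number of aligned columns relative to `full_length`.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aln_a.upper(), aln_b.upper())
             if x != "-" and y != "-"]
    if not pairs or full_length <= 0:
        raise ValueError("nothing to compare")
    matches = sum(x == y for x, y in pairs)
    return (100.0 * matches / len(pairs),
            len(pairs),
            100.0 * len(pairs) / full_length)


# Toy alignment (hypothetical sequences).
identity, columns, coverage = identity_stats("ACDE-FGH", "ACNEK-GH", full_length=12)
print(f"{identity:.1f}% identity over {columns} of 12 residues ({coverage:.0f}% coverage)")

# Coverage implied by the figure quoted in the text: 88 of 202 residues.
print(f"{100.0 * 88 / 202:.1f}% of the protein aligned")
```

Reporting coverage alongside identity matters for proteins like these, where the conserved core may align well while the decorations do not align at all.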

ORF-CBP Family
Members of the genus parapoxvirus secrete chemokine-binding proteins (CBP) that are functionally similar to members of the vCCI family in their ability to bind with high affinity and inhibit many CC-chemokines. In addition, parapoxvirus CBPs can also bind C- [68] and some CXC-chemokines [69]. Although bovine papular stomatitis virus CBP (BPSVgORF112) and orf virus CBP (ORFVgORF112) share only 40% sequence identity, both have been reported to inhibit these three classes of chemokines. Functionally, orf virus CBP has been shown to inhibit the recruitment of pro-inflammatory monocytes into skin using a mouse model of lipopolysaccharide-induced inflammation [70], as well as dendritic cell trafficking and subsequent activation of T-cells [71]. Orf virus CBP shares 26% sequence identity with orf GIF, and conserves the positioning of the six cysteine residues within GIF that form three of the four disulfides found in vCCI and A41. Site-directed mutagenesis identified four residues of CCL2 that, when changed to alanine, alter orf CBP binding [68]. These residues lie within a region contacted by the CCL2 receptor CCR2 [72], and the same residues were previously shown to be required for high-affinity interaction of CCL2 with vCCI [30,31]. Together, these data indicate that orf CBP inhibits chemokine activity by blocking the receptor-binding site on chemokines, in a manner similar to that employed by the vCCI family.

Search for Additional PIE Domains
The PIE and putative PIE proteins described above share some important attributes. All are soluble proteins having signal peptides for targeting to the secretory pathway. So far, all PIEs are noteworthy in that they lack obvious sequence relationships with any other eukaryotic or prokaryotic protein and all appear to be nonessential for virus replication. They are small in size, ranging between about 17 and 35 kDa. Additional information can be taken from a structure-based alignment of the four known PIE domains (Figure 3). The β-sandwich scaffold allows large insertions at only a few positions. The most prominent insertions occur in the β6-β7 loop, the β7-β9 loop, and at the C-terminus. These contain decorations of short β-strands or α-helices, but in all cases are anchored to the scaffold by a pattern of disulfide bonds (labeled A-E in Figures 2 and 3). The main structural differences between the PIE domain families occur in the length and placement of the decorations. Correspondingly, they may contain as few as a single disulfide bond, but three or four are more common. As the contact sites for the different ligands map mostly to the decorations (Figures 3 and 4), it is clear their effect is to alter ligand specificity.
With this information in hand and a review of the literature, we set out to find other PIE-containing proteins in chordopoxvirus. A total of 33 genomes spanning 10 genera were examined (Table 1). A combination of methods was used to detect possible PIE domains. First, a hidden Markov model of the PIE domain core was constructed by removing the decorations from the alignment in Figure 4 (HMMER program, available on the web at http://hmmer.janelia.org/) [73]. This Markov model was used to screen the genomic sequences. The genomes were also screened by Blastp [74] analysis using existing PIE amino acid sequences. Candidate ORFs were included as PIEs if they had a signal peptide and threaded to any of the known PIE structures using Phyre2 [75,76]. Additional Markov models were constructed as the set expanded, and the screening process was repeated. The resulting sequences were distributed into 20 families based on primary sequence similarity and available functional data (see Table 2). Table 2 provides a framework in which to begin the discussion of PIE domain variants. The list is likely incomplete as there may be PIE-encoding ORFs that were too divergent in sequence to be detected by these methods. Also, the family assignments should be considered tentative as functional and structural information is still scarce. The first 10 families across the top of the table are arranged in order of their position within the cowpox genome (vCCI-CrmD). For each strain, the number of ORFs in each PIE family is entered in the table. A number 2 in the table indicates that the virus contains two identical copies of that ORF, with one exception. The number 2 in the ORF-CBP column represents two closely related but distinct genes, SB87-111 and SB87-112. A number 0 indicates that less than half of that ORF is present, or that the ORF is reported to lack expression.
Further investigation at the nucleotide level would detect additional ORF remnants [77] but was considered beyond the scope of this study. Figure 6 contains a sequence alignment showing a representative member from each PIE family.
When possible, the sequence from cowpox strain Brighton Red is shown in Figure 6; for families not found in Brighton Red, a sequence from another strain or virus is shown instead. The multiple sequence alignment was constructed in ClustalX [78] from the structure-based alignment of Figure 4 by matching all other sequences to that profile one at a time and merging the results. Hand editing was kept to a minimum. A summary of predicted physical properties for proteins in the representative sequence alignment is given in Table 3.
Alignments of individual PIE families can be found in the supplementary online materials (Figures S1-S15). Figure 7 contains a midpoint-rooted phylogenetic tree representing the sequence relationships among the 20 families constructed using the PIE domain alignment of Figure 6 after removal of the signal peptide sequences (default settings at ViPR tools, http://www.viprbrc.org). PIE proteins share a core fold, and therefore patterns of secondary structure elements, hydrophobic packing residues, and disulfide bonds, but these properties do not always translate into similarities at the level of primary sequence. Consequently, many of the branch lengths between families in the dendrogram are large. A more detailed phylogram, employing all the PIE sequences from Table 2, is given in the supplement (Figure S15). The family assignments presented in Table 2 were taken from the alignment in Figure 6 and lettered as in Figure 2. The disulfides shown in bold typeface were determined from the structures in Figure 1.
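The inclusion rule used in the screen described above can be stated compactly in code. The following is only an illustrative sketch, not the pipeline actually used: the class `OrfEvidence`, its field names, and the example ORF names are hypothetical, and each boolean stands in for the outcome of an external tool (HMMER or Blastp for detection, a signal peptide predictor, and Phyre2 for threading). The iterative rebuilding of Markov models as the set expanded is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrfEvidence:
    """Evidence gathered for one candidate ORF (all field names are hypothetical)."""
    name: str
    hmm_hit: bool             # matched a PIE-core hidden Markov model
    blast_hit: bool           # matched a known PIE sequence by Blastp
    has_signal_peptide: bool  # predicted secretory signal peptide
    threads_to_pie: bool      # threaded onto a known PIE structure


def is_candidate(orf: OrfEvidence) -> bool:
    """An ORF enters the candidate pool if either sequence screen detected it."""
    return orf.hmm_hit or orf.blast_hit


def is_putative_pie(orf: OrfEvidence) -> bool:
    """Inclusion rule from the text: signal peptide AND successful threading."""
    return is_candidate(orf) and orf.has_signal_peptide and orf.threads_to_pie


def screen(orfs: list[OrfEvidence]) -> list[str]:
    """Return the names of ORFs that pass the inclusion rule, in input order."""
    return [orf.name for orf in orfs if is_putative_pie(orf)]


if __name__ == "__main__":
    examples = [
        OrfEvidence("orf_a", True, False, True, True),   # passes
        OrfEvidence("orf_b", False, True, False, True),  # no signal peptide
        OrfEvidence("orf_c", False, False, True, True),  # never detected
    ]
    print(screen(examples))
```

Keeping detection (`is_candidate`) separate from confirmation (`is_putative_pie`) mirrors the two-stage logic of the text: sequence searches nominate ORFs, and the signal peptide and threading criteria decide inclusion.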

PIEs of Unknown Function
A survey of the published genomes revealed 10 potential PIE families of unknown function. The largest of these families is M2, found in almost all orthopoxviruses examined, and also in leporipoxvirus, yoka poxvirus, and cotia poxvirus. M2 is predicted to have an unusual disulfide-bonding pattern (Table 3). It appears to have four disulfides like vCCI (ABCD). However, the second cysteine of bond A is predicted to occur 11 residues earlier than in vCCI, implying a difference near the C-terminus in the β10 strand. Also, the first cysteine of bond C is missing and looks to have been replaced by a new cysteine about four residues from the C-terminus. Both new cysteine positions are conserved in nearly all members of the M2 family. While one study suggested M2 expression interferes with NF-κB activation [82], it appears likely the primary function of M2 remains unreported. Yoka poxvirus also contains a protein that appears to be a member of the M2 family (~69% identity to cowpox M2) and a second ORF with substantially lower similarity to M2 (~34% identity). Presumably the two proteins, being in the same virus, serve different functions. We refer to the second protein as M2-like. Yaba monkey tumor virus and deerpox virus also encode M2-like proteins.

A second putative PIE family, C8, is found in orthopoxvirus. C8 bears a strong resemblance to SCP-3 (28% identity in cowpox). C8 and SCP-3 are two of the smallest PIE proteins, and the only two predicted to have two disulfides.
The remaining families have only one or two members. The single ORF encoding CPXV-007 is unique to strain Germany 91-003 of cowpox, not being found in the strains Brighton Red or GRI-90. It shares 17% identity with cowpox SCP-1 and is nearly identical in size. Both CPXV-007 and SCP-1 are predicted to contain a single disulfide. The CPXV-007 protein is also predicted to have one of the highest pI values (9.1) of all the PIE domains examined (Table 3). Cotia virus contains a protein, COTV030, that is very similar in size, disulfide-bonding pattern, and pI (9.2) to CPXV-007, but the two share very little sequence similarity. Parapoxvirus of red deer strain HL953 contains two proteins of unknown function, SB87-112 and SB87-113. These are similar in size, sequence, and cysteine placement to orf GIF and orf CBP. By sequence similarity they have been placed in the ORF-CBP family. Cotia virus encodes two families with duplicate ORFs. The first of these, the SCP-like family, contains COTV007 and COTV179. The second, the A41-like family, contains COTV011 and COTV175. Four other ORFs from Cotia virus may encode PIE domains, but the signal peptides do not appear to be functional. These include COTV001, COTV004, COTV182, and COTV185. The cervidpoxvirus protein DVXV-016 looks very similar to the SECRET domains of orthopoxvirus. The BPSV-GIF protein was cloned from bovine papular stomatitis virus because it shared 37% identity with orf GIF. It conserves the six cysteines and the WSXWX-like motif required for orf GIF function, but does not bind GM-CSF or IL-2 [67]. RANTES (CCL5) binding has been reported in the supernatant of PCPV-infected cells [67]. The ORF-CBP protein PCPV-116 is a likely candidate for this activity.
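The identity values quoted in this section (for example, 17% between CPXV-007 and SCP-1) depend on how identity is counted, and the text does not state the convention used. The sketch below shows one common definition, assumed here for illustration only: identical residues divided by the number of alignment columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Both inputs must be rows taken from the same alignment, so they must
    have equal length. Columns containing a gap in either row are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match

    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a.upper() == res_b.upper():
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy alignment: 5 gap-free columns, 4 of them identical.
    print(percent_identity("ACDE-F", "ACDQGF"))
```

Other denominators (full alignment length, or the length of the shorter sequence) give lower values for the same pair, which is worth keeping in mind when comparing low-identity families such as those described here.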

Conclusions
Our sequence analysis suggests that many poxvirus ORFs of unknown function likely encode members of the PIE domain superfamily. We predict that these proteins share a core structural scaffold, one that has been modified repeatedly to create different binding specificities for host molecules. Although the PIE domain was initially discovered in proteins that bind chemokines, it is now clear that it is functionally diverse and that decorations to the domain core confer unique binding specificities. This situation is reminiscent of the immunoglobulin (Ig) domain superfamily. Each Ig domain consists of two β-sheets, assembled as a β-sandwich, that are held together by an inner layer of buried hydrophobic residues. The binding specificity of each Ig domain is determined largely by the loops that connect its β-strands. These loops can vary in size, containing insertions ranging from a few residues to whole domains. Consequently, within the Ig domain superfamily, structure is often more conserved than primary sequence [83]. It is this ability to accommodate shifts in surface decoration that makes the Ig fold a useful scaffold for generating new binding specificities. Ig domains are extremely versatile. They are found in antibodies, cell surface receptors, extracellular matrix proteins, bacterial chaperones, intracellular regulatory proteins, and poxviral proteins, to name just a few [84,85]. The PIE fold is distinct from the Ig fold, both in strand organization and connectivity, but it similarly accommodates shifts in surface decoration. Judging from the extensive sequence diversity of the PIE domains surveyed here, we predict cytokine and MHC binding to be just the tip of the PIE function iceberg.