Compartmentalization of aquaporins in the human intestine.

Improper localization of the water channel proteins called aquaporins (AQPs) induces mucosal injury, which is implicated in Crohn’s disease and ulcerative colitis. AQP3 and AQP10 belong to the mammalian aquaglyceroporin subfamily, and their amino acid sequences are 79% similar. AQP10 is localized to the apical compartment of the intestinal epithelium, called the glycocalyx, while AQP3 is selectively targeted to the basolateral membrane. Despite this high sequence similarity and evolutionary relatedness, the molecular mechanism underlying the polarity, selective targeting and function of AQP3 and AQP10 in the intestine is largely unknown. Our hypothesis is that the differential polarity and selective targeting of AQP3 and AQP10 in intestinal epithelial cells are influenced by amino acid signal motifs. We performed sequence and structural alignments to determine differences in signals for localization and post-translational glycosylation. The basolateral sorting motif “YRLL” is present in AQP3 but absent in AQP10, while N-glycosylation signals are present in AQP10 but absent in AQP3. Furthermore, the C-terminal region of AQP3 is longer than that of AQP10. The sequence and structural differences between AQP3 and AQP10 provide insights into the differential compartmentalization and function of these two aquaporins, which are commonly expressed in the human intestine.


Introduction
Aquaporins (AQPs) are cell surface proteins that provide selective and rapid transport of water across the cell membrane [1]. There are 13 known mammalian aquaporins (AQP0-AQP12). Improper localization of AQPs has been reported to affect fluid permeability [2][3][4][5]. Alterations in fluid flux can cause mucosal injury and thereby result in disease conditions such as ulcerative colitis and Crohn's disease [3,6]. In epithelial cells, AQPs are localized in either the apical or the basolateral membrane [3]. Most cell surface proteins contain signals within their cytoplasmic termini that permit their recruitment into endocytic vesicles, which in turn facilitates their selective compartmentalization in the apical or basolateral membrane [7,8].
There are four established classes (I-IV) of endocytosis recruitment/localization signals [9], and a fifth has been proposed more recently. Class I: Tyrosine-based signals with the YXXØ sequence (where X can be any amino acid and Ø is an amino acid with a bulky hydrophobic group) [10]. Many of these signals are constitutively active and are independent of ligand binding [10]. Class II: The di-leucine motif (LL) is another well-known endocytosis motif that is present in many transmembrane cell surface proteins [11]. Class III: Ligand-induced phosphorylation of serine residues in G protein-coupled receptors (GPCRs) serves as a signal for receptor endocytosis. Class IV: A phenylalanine-isoleucine motif together with the acidic EEDE cluster has been reported to regulate intracellular traffic of the endoprotease furin [12]. Class V: Ubiquitination has recently been identified as another regulator of the endocytosis of several membrane receptors. Ubiquitination is a post-translational modification (PTM) that likely affects all proteins at some point in their life cycle. The most common role of ubiquitination is in protein degradation [13].
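The sequence-encoded signals of Classes I and II lend themselves to a simple pattern search. The following is a minimal sketch, not the procedure used in this study: it assumes that Ø can be approximated by the residues L, I, M, F and V, and the function name `find_signals` and the test peptide are hypothetical. A match only indicates a candidate signal; whether it is functional depends on its location and context in the protein.

```python
import re

# Assumption: the bulky hydrophobic position (Ø) is approximated as L, I, M, F or V.
# Lookaheads are used so that overlapping matches are all reported.
TYROSINE_SIGNAL = re.compile(r"(?=(Y[A-Z]{2}[LIMFV]))")   # Class I, YXXØ
DILEUCINE_SIGNAL = re.compile(r"(?=(LL))")                # Class II, LL


def find_signals(sequence):
    """Return (class label, 1-based start, matched residues) for each candidate signal."""
    seq = sequence.upper()
    hits = []
    for label, pattern in (("class I", TYROSINE_SIGNAL), ("class II", DILEUCINE_SIGNAL)):
        for match in pattern.finditer(seq):
            hits.append((label, match.start() + 1, match.group(1)))
    # Order the hits by their position along the sequence.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical peptide containing the YRLL motif discussed in this paper.
    for hit in find_signals("AAYRLLAA"):
        print(hit)
```

In this sketch the YRLL tetrapeptide is reported twice, once as a tyrosine-based signal and once as a di-leucine signal, which reflects the overlap of the two classes within that motif.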
In many cases, more than one endocytosis signal acts in tandem, as in the T-cell receptor CD3 gamma, where the di-leucine motif cooperates with a phosphorylated serine residue to mediate endocytosis. The low-density lipoprotein receptor-related proteins contain an NPXY motif in addition to the YXXL motif. However, recent reports suggest that the YXXL motif alone is sufficient for endocytic trafficking [14]. Besides phosphorylation, glycosylation is another PTM that can facilitate compartmentalization of cell surface proteins. Almost all cell surface proteins are glycosylated. Proper folding of the translated protein is facilitated by glycosylation events, and thus protein function is often dependent on or refined by the carbohydrate moieties attached to the polypeptide. Heterogeneity often exists among the multiple oligosaccharide chains attached to a single protein. The structures of these oligosaccharides are regulated by very specific glycosyltransferases and glycosidases [15].
AQP3 and AQP10 belong to the mammalian aquaglyceroporin subfamily, and their amino acid sequences are 79% similar. AQP10 is localized to the apical compartment of the intestinal epithelium, called the glycocalyx, while AQP3 is selectively targeted to the basolateral membrane [16]. Despite this high sequence similarity and evolutionary relatedness, the molecular mechanism underlying the polarity, selective targeting and function of AQP3 and AQP10 in the intestine is largely unknown. Our hypothesis is that the differential polarity and selective targeting of AQP3 and AQP10 in intestinal epithelial cells are influenced by amino acid signal motifs. We performed sequence and structural alignments to determine differences in signals for localization and post-translational glycosylation. We observed that the YRLL motif is present in AQP3 but absent in AQP10, while the N-glycosylation signal is present in AQP10 but absent in AQP3. Finally, the C-terminal region of AQP3 is longer than that of AQP10. It is therefore of interest to decipher the significance of the structural and sequence differences between AQP3 and AQP10.

Methods
Examination of AQP3 and AQP10 for localization and glycosylation signals: The SwissProt protein sequences for human aquaporins AQP3 (Q92482) and AQP10 (Q96PS8) were retrieved and aligned using ClustalX [17]. N-glycosylation sites in AQP3 and AQP10 were predicted with the NetNGlyc 1.0 Server (www.cbs.dtu.dk/services/NetNGlyc/). Structural models of AQP3 and AQP10 were built with MODELLER (http://www.salilab.org/modeller/) using the high-resolution X-ray crystal structure of the E. coli glycerol facilitator protein (1LDF) as a template (Protein Data Bank; http://www.rcsb.org/). The resultant models were subjected to molecular dynamics simulations and energy minimization using the DISCOVER module of Insight II (Accelrys, Inc). Molecular dynamics simulations consisted of an initial equilibration of 5 picoseconds (ps), followed by 100 ps of dynamics at 300 K and then 10,000 steps of steepest descent and conjugate gradient energy minimization. For all the above calculations, a distance-dependent dielectric constant and a non-bonded distance cutoff of 20 Å were used. Molecular graphics images were produced using SYBYL 7.0 and the UCSF Chimera package from the Computer Graphics Laboratory, University of California, San Francisco.
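Once two sequences have been aligned, a pairwise score can be read directly from the gapped alignment. The sketch below computes percent identity over the aligned, non-gap columns. It is an illustration only: the function name `percent_identity` and the example fragments are hypothetical, and identity is a stricter quantity than the similarity quoted in this paper, which would additionally require a grouping of conservative substitutions that is not specified here.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither aligned sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # Assumption: gapped columns are excluded from the denominator.
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from AQP3 or AQP10.
    print(percent_identity("ACD-EF", "ACDGEY"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so the denominator should always be stated alongside the reported value.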

Results and Discussion
Membrane trafficking proteins, especially the aquaporins found in the intestine, are sorted proteins that vary in function and specificity. To understand the function of aquaporins, it is essential to understand their modular three-dimensional architecture. In cellular membranes, four AQP monomers associate into a tetramer. To date only AQP1 has been crystallized, and its biological unit (tetramer) has been reported [18]. In addition to the four pores for bidirectional water transport, the biological unit forms an additional pore that transports ions (Fig. 1). Owing to the high homology among AQPs, the structure of water-specific aquaporins can be predicted using the AQP1 crystal structure as a model system. Several theoretical models of AQP3 and AQP10 were generated using the X-ray crystal structure of the E. coli glycerol facilitator protein (1LDF) as a template. The final homology model was aligned with mammalian AQP1 crystal structures to predict and compare residue interactions and the overall structure-function relationship. The AQP1, AQP3 and AQP10 monomers contain 269, 292 and 301 amino acid residues, respectively. Each AQP comprises two tandem repeats that fold in a similar manner, each with three transmembrane (TM) helices and one short connecting loop helix. Both the N- and C-termini of AQPs lie on the cytoplasmic side of the membrane.
There are six TM helices, plus two half-helices, that form the overall helical bundle in the membrane. A single aqueous pathway is formed through the centre of the protein. There is a site at R41 of AQP1, within the extracellular loop A, for attachment of a polylactosaminoglycan moiety; loop D, the cytoplasmic counterpart of loop A, does not have such a site. Loops B and E, each containing an asparagine-proline-alanine (NPA) motif and the short helices B and E, bend into the six-helix bundle to form the channel. Sequence analysis of all human AQPs shows that the inter-motif distance is 113 residues in functional water-only AQPs and 129 residues in aquaglyceroporins. The analysis also revealed a conserved [KR]xxxY motif at a distance of 14 residues from the first NPA. After structural superposition of the AQP3 and AQP10 theoretical models, a differential display of hydrophobic residues from the side (Fig. 2A) and the top (Fig. 2B) reveals a strikingly similar orientation of corresponding conserved amino acids and motifs (Fig. 2); a similar display of polar residues is shown in Fig. 3. The three-dimensional molecular surface of the aquaglyceroporins from the side (Fig. 4A; Fig. 4D [transparent surface]) reveals the pore architecture and the volume available for fluid transport. A residue-level analysis shows that the AQP loops are held together by interactions between P77 and P193 in AQP1, P84 and P216 in AQP3, and P83 and P215 in AQP10. In addition, positively charged arginine residues cap the N-terminal ends of helix B and helix E via hydrogen-bonding interactions. The positions of helices B and E are further stabilized by the ion pairs H74-E17 in AQP1, H81-E28 in AQP3 and H80-E27 in AQP10, and by the salt bridge R195-E142 in AQP1; the corresponding arginine residues in AQP3 and AQP10 form hydrogen bonds with glutamines Q164 and Q163, respectively.
In addition to providing comparative structural analyses of residue-level interactions, the theoretical models offer an opportunity to assess the volume available for fluid transport and to identify critical peripheral residues for future protein-protein or protein-small molecule interaction studies.
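The inter-motif distance reported above can be measured on any sequence with a short script. The sketch below assumes that "inter-motif distance" refers to the two NPA motifs and counts residues from the start of the first NPA to the start of the second; the counting convention used in the original analysis is not stated, so this is an assumption. The function name `npa_spacing` and the test sequence are hypothetical, and sequences carrying variant motifs in place of an exact NPA would not be matched.

```python
def npa_spacing(sequence, motif="NPA"):
    """Start-to-start distance between the first two occurrences of the motif.

    Returns None when fewer than two non-overlapping occurrences are found.
    """
    seq = sequence.upper()
    first = seq.find(motif)
    if first == -1:
        return None
    # Search for the second motif only after the end of the first one.
    second = seq.find(motif, first + len(motif))
    if second == -1:
        return None
    return second - first


if __name__ == "__main__":
    # Hypothetical sequence: two NPA motifs separated by ten glycines.
    toy = "AANPA" + "G" * 10 + "NPAAA"
    print(npa_spacing(toy))
```

Applied to a set of full-length sequences, the returned values could be tabulated to see whether water-only AQPs and aquaglyceroporins separate by spacing, as described in the text.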

Sorting Signals
The N-terminus of AQP3 contains a four-amino-acid motif, YRLL, which includes the crucial tyrosine (Y) residue and the di-leucine (LL) motif typical of basolaterally targeted proteins. Alignment of AQP3 with AQP10 shows that AQP10 lacks this basolateral sorting motif (Fig. 5). To check whether the motif is conserved across species, we examined all the AQP3 sequences deposited in GenBank. Of the mammalian AQP3 sequences (Fig. 6), only that of the rhesus monkey (Macaca mulatta) lacks the YRLL motif, underscoring the significance of the sorting signal (highlighted in Fig. 6). The sorting motif was absent in all the non-mammalian sequences examined. The absence of the YRLL motif in AQP10 may be responsible for the differential compartmentalization.
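A conservation survey of this kind can be summarized by recording, for each sequence, where the motif occurs or that it is missing. The following sketch assumes the sequences have already been retrieved as plain strings; the function name `motif_conservation`, the optional N-terminal window and the toy sequences are hypothetical and do not reproduce the GenBank entries examined here.

```python
def motif_conservation(sequences, motif="YRLL", n_terminal_window=None):
    """Map each sequence name to the 1-based motif position, or None if absent.

    If n_terminal_window is given, only that many N-terminal residues are searched.
    """
    report = {}
    for name, sequence in sequences.items():
        region = sequence.upper()
        if n_terminal_window is not None:
            region = region[:n_terminal_window]
        position = region.find(motif)
        report[name] = position + 1 if position != -1 else None
    return report


if __name__ == "__main__":
    # Hypothetical sequences for illustration only.
    toy_sequences = {
        "toy_with_motif": "MGAYRLLKE",
        "toy_without_motif": "MGAHRSLKE",
    }
    for name, position in motif_conservation(toy_sequences).items():
        print(name, position)
```

Restricting the search to an N-terminal window avoids counting a chance occurrence of the tetrapeptide elsewhere in the protein as a conserved sorting signal.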

Glycocalyx and Glycosylated Proteins
The glycocalyx is the site for absorption of water, minerals, simple sugars and amino acids, while the tight junctions regulate the permeability of water and electrolytes across the intestinal epithelium and prevent leakage of macromolecules from the gut lumen. AQP3 has only one glycosylation site, 141-NGT-143 in loop C, while AQP10 has three glycosylation sites, 75-NVS-77, 128-NYT-130 and 133-NLT-135, in loops B and C. The additional glycosylation sites in AQP10 may provide the signal needed to sort this aquaporin to the apical region.
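The sites listed above all follow the consensus N-glycosylation sequon, N-X-S/T with X being any residue except proline. A minimal sketch of a sequon scan is given below. This is a simplification swapped in for illustration and is not the NetNGlyc method, which scores each sequon with a trained predictor rather than merely listing it. The function name `find_sequons` and the test peptide are hypothetical.

```python
import re

# Consensus sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of Asn, tripeptide) for each candidate sequon."""
    return [(match.start() + 1, match.group(1))
            for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # The tripeptides reported in the text for AQP3 and AQP10.
    for tripeptide in ("NGT", "NVS", "NYT", "NLT"):
        print(tripeptide, find_sequons(tripeptide))
    # Hypothetical peptide: NVS is a candidate, NPS is excluded by the proline.
    print(find_sequons("AANVSAANPSA"))
```

A sequon is necessary but not sufficient for glycosylation, since the asparagine must also be accessible on the luminal or extracellular side, so hits from such a scan are candidates to be checked against the predicted loop topology.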

Conclusion
The fluid balance in the intestinal epithelium is controlled by the sorting of AQP3 and AQP10. Misrouting of aquaporins results in fluid imbalance, which may induce Crohn's disease and ulcerative colitis. The YRLL sorting signal in AQP3 and the additional N-glycosylation sites in AQP10 may underlie their differential sorting. Our analysis provides valuable insight into the membrane sorting of these functionally related AQPs and a structural basis for drug discovery.