Reclassification of SLC22 Transporters: Analysis of OAT, OCT, OCTN, and other Family Members Reveals 8 Functional Subgroups

Among transporters, the SLC22 family is emerging as a central hub of endogenous physiology. The family consists of organic anion transporters (OATs), organic cation transporters (OCTs) and zwitterion transporters (OCTNs). Despite being known as “drug” transporters, these multi-specific, oligo-specific, and relatively mono-specific transporters facilitate the movement of metabolites and key signaling molecules. An in-depth reanalysis supports a reassignment of these proteins into eight functional subgroups with four new subgroups arising from the previously defined OAT subclade. These OAT subgroups are: OATS1 (SLC22A6, SLC22A8, and SLC22A20), OATS2 (SLC22A7), OATS3 (SLC22A11, SLC22A12, and Slc22a22), and OATS4 (SLC22A9, SLC22A10, SLC22A24, and SLC22A25). We propose merging the OCTN (SLC22A4, SLC22A5, and Slc22a21) and OCT-related (SLC22A15 and SLC22A16) subclades into the OCTN/OCTN-related subgroup. Functional support for the eight subgroups comes from network analysis of data from GWAS, in vivo models, and in vitro assays. These data emphasize shared substrate specificity of SLC22 transporters for characteristic metabolites such as prostaglandins, uric acid, carnitine, creatinine, and estrone sulfate. Some important subgroup associations include: OATS1 with metabolites, signaling molecules, uremic toxins and odorants, OATS2 with cyclic nucleotides, OATS3 with uric acid, OATS4 with conjugated sex hormones, particularly etiocholanolone glucuronide, OCT with monoamine neurotransmitters, and OCTN/OCTN-related with ergothioneine and carnitine derivatives. The OAT-like and OAT-related subgroups remain understudied and therefore do not have assigned functionality. Relatedness within subgroups is supported by multiple sequence alignments, evolutionarily conserved protein motifs, genomic localization, and tissue expression. We also highlight low level sequence similarity of SLC22 members with other non-transport proteins. 
Our data suggest that SLC22 family members can work with one another, as well as with other transporters and enzymes, to optimize levels of numerous metabolites and signaling molecules, as proposed by the Remote Sensing and Signaling Theory.


Introduction
The SLC (solute carrier) gene family includes 65 families with over 400 transporter genes. In humans, 52 of these families are expressed, encompassing more than 395 genes, and it has been estimated that ~2000 human genes (10% of the genome) are transporter-related (1).
Various solute carrier 22 (SLC22) members are expressed on both the apical and basolateral surfaces of epithelial cells where they direct small molecule transport between body fluids and vital organs, such as the kidney, liver, heart, and brain (2). SLC22 transporters are also found in circulating cell types such as erythrocytes (e.g. SLC22A7), monocytes, and macrophages (e.g. SLC22A3, SLC22A4, SLC22A15, and SLC22A16) (3,4). With recent calls for research on solute carriers, there has been a large influx of data over the past five years, including novel roles in remote sensing and signaling, leading to the need for a more comprehensive understanding of the functional importance of transporters (5).
The SLC22 family comprises at least 31 transporters and is found in species ranging from Arabidopsis thaliana of the plant kingdom to modern day humans (6,7). Knowledge surrounding this family of proteins has expanded greatly since its proposed formation in 1997, when SLC22A6 (OAT1, originally known as novel kidney transporter or NKT) was first cloned (8). Its homology to SLC22A1 (OCT1) and SLC22A7 (OAT2/NLT) led to the establishment of a new family (SLC22, TC# 2.A.1.19) of transport proteins within the Major Facilitator Superfamily (TC# 2.A.1, MFS) as classified by the IUBMB-approved Transporter Classification (TC) system (8,9). These proteins all share 12 α-helical transmembrane domains (TMD), a large extracellular domain (ECD) between TMD1 and TMD2, and a large intracellular domain (ICD) between TMD6 and TMD7 (10). Research has shown these transporters to be integral participants in the movement of drugs, toxins, and endogenous metabolites and signaling molecules, such as prostaglandins, urate, α-ketoglutarate, carnitine, and cyclic nucleotides, across the cell membrane (11).
As key players in small molecule transport, SLC22 members are hypothesized to play a role in the Remote Sensing and Signaling Theory (12)(13)(14)(15). The Remote Sensing and Signaling Theory posits that several drug related genes, like transporters and enzymes, aid in maintaining homeostasis through remote communication between organs via small molecule substrates in the blood that may also serve as signaling molecules that regulate gene expression (16). This communication is supported by the example of serum uric acid levels. In the setting of compromised kidney function, the increase in serum uric acid seems to be partly mitigated through a compensatory increase in the expression and/or function of ABCG2 in the intestine, which allows the excretion of uric acid in the feces rather than the urine (17,18). Current research is focused on determining the ways in which these transporters collaborate to regulate metabolite levels throughout the body (19).
These subclades consist, on average, of three to four members with the exception of the OAT subclade which claims more than half of the 31 known members of SLC22 (10). Although these subclades are phylogenetically sound, the endogenous functions of many SLC22 members within the six subclades remain ill-defined or unknown. With the emergence of new data, we performed a re-analysis of the SLC22 family to better characterize the functional, endogenous grouping of these transporters. Our re-analysis shows eight apparent subgroups, with four of these subgroups arising out of the previously defined OAT subclade. Because these groupings are more closely related to well-known OATs rather than OCTs, OCTNs or other subclades, we refer to these as OAT subgroups (OATS1, OATS2, OATS3, OATS4).
We considered many factors in our re-analysis of SLC22 and subsequent designation of functionally based subgroups. To better describe the subgroups while still highlighting the nuances of each individual transporter, we utilized data from genomic loci, tissue expression, sequence similarity searches, proteomic motif searches, and functional transporter-metabolite data from GWAS, in vitro assays, and in vivo models. In place of phylogenetic studies, we performed multiple sequence alignments (MSA) and generated guide-trees that are based on sequence similarity or homology and thus provide more insight into function than solely phylogenetic studies. While the SLC22 family is composed of putative transporters, some members, like Slc22a20 and Slc22a17, have proposed mechanisms that differ from those of classic transporters (20,21). To that effect, we explored the sequence similarities between SLC22 transporters and non-transport related proteins. This work elucidates the diversity of the endogenous functions of SLC22 transporters in various tissues and provides an updated functional framework for assigning each transporter to a subgroup. Considering the importance of SLC22 transporters, forming functional groups that incorporate endogenous substrates and tissue expression patterns can help better define their roles in intra-organ, inter-organ, and interorganismal communication.

Data collection
SLC22 human and mouse sequences were collected from the National Center for  (40).

Results
Emerging data continue to indicate the centrality of the SLC22 family (particularly OATs, OCTs, and OCTNs) in endogenous physiology (5,16 datasets were used to build functional networks that support the subgroups (Figure 1). In addition to these functional data, subgroups were also supported by structural, genomic, and other analyses explained below. Because some SLC22 members remain understudied, we also investigated low level sequence identity with non-transport proteins to better characterize these "orphaned" transporters.

Analysis of Substrate Specificity and Selectivity Helps Categorize Mono-, Oligo-, and Multi-Specificity of SLC22 Members
The concept of multi-, oligo-, and mono-specific SLC22 transporters was supported in part by the number of unique drugs that are known to interact with each SLC22 member (Table 1). We used these data to validate our initial specificity assignments, and found that, for the most part, the metabolite data were in agreement with the drug data. A transporter linked to many unique drugs was often linked to many unique metabolites. For example, OATS1 members SLC22A6 and SLC22A8 are linked to 99 and 126 drugs, respectively. This is reflected in the metabolite data, as each transporter was associated with at least 50 unique metabolites, confirming their multi-specific nature. OATS4 members SLC22A9, SLC22A10, SLC22A24, and SLC22A25 are understudied with respect to drugs. As a group, they are only associated with three drugs, making it difficult to predict their substrate selectivity. Endogenously, the group appears to have relatively mono-specific members that are dedicated to conjugated sex steroids, and oligo-specific members which are linked to conjugated sex hormones, short chain fatty acids, and bile acids.

Construction of Functional Networks from Metabolite-Transporter Interaction Data Support the Eight Subgroups
To visualize these transporter-metabolite interactions, which were acquired from a combination of GWAS, in vivo, and in vitro studies, we created networks using Cytoscape (40).
These networks allowed us to see the extent of unique and overlapping substrate specificity between transporters in the SLC22 family and within the proposed subgroups (Supplementary Figures 1 and 2). In these networks, all edges are undirected and represent a statistically significant result linking an SLC22 member to a metabolite. To give an example, the OATS1 network uses the members (SLC22A6, SLC22A8, SLC22A20) as central nodes. Each associated metabolite is connected to the member, and the networks are then combined to represent the entire subgroup and demonstrate how a metabolite may be linked to multiple transporters (Supplementary Figure 1A). Functional data were available for 21 of 31 known SLC22 transporters. The trimmed SLC22 network is displayed in Figure 1, the individual subgroup networks are in Supplementary Figures 1 and 2, and the total SLC22 network is in Supplementary Figure 3. The compiled data with transporter, metabolite, study, quantitative metric, and citation are present in Supplementary Table 1. While there is no single metabolite that is associated with all SLC22 transporters, some are linked to multiple family members, and thus may be characteristic for the family as a whole. These metabolites are prostaglandin E2, prostaglandin F2, estrone sulfate, uric acid, carnitine and creatinine, each of which is linked to at least five different SLC22 members (Supplementary Figure 3). This result demonstrates that SLC22, as a group, is involved in regulating several metabolic processes, ranging from blood vessel dilation through prostaglandins to cellular energy production through carnitine (41,42). This also implies that the particular structural features of the SLC22 family in general (12 TMD, large ECD between TMD1 and TMD2, and large ICD between TMD6 and TMD7) lend themselves well to interacting with these compounds. This is further supported by the subgroup-specific network analyses and motif analysis we performed (Figure 1).
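The network construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the Cytoscape workflow used for the figures: the edge list is a small subset of transporter-metabolite links stated in the text (not the full Supplementary Table 1 data), and the function names `build_network`, `shared_metabolites`, and `subgroup_overlap` are hypothetical helpers introduced only for this example.

```python
from collections import defaultdict

# Illustrative undirected edges (transporter, metabolite), taken from
# associations stated in the text. This is NOT the full Supplementary Table 1.
EDGES = [
    ("SLC22A6", "uric acid"), ("SLC22A8", "uric acid"), ("SLC22A7", "uric acid"),
    ("SLC22A11", "uric acid"), ("SLC22A12", "uric acid"), ("SLC22A13", "uric acid"),
    ("SLC22A6", "creatinine"), ("SLC22A8", "creatinine"), ("SLC22A7", "creatinine"),
    ("SLC22A7", "carnitine"), ("SLC22A4", "carnitine"), ("SLC22A5", "carnitine"),
    ("SLC22A16", "carnitine"),
    ("SLC22A11", "succinate"), ("SLC22A12", "succinate"),
]


def build_network(edges):
    """Return two adjacency maps: transporter -> metabolites, metabolite -> transporters."""
    by_transporter = defaultdict(set)
    by_metabolite = defaultdict(set)
    for transporter, metabolite in edges:
        by_transporter[transporter].add(metabolite)
        by_metabolite[metabolite].add(transporter)
    return by_transporter, by_metabolite


def shared_metabolites(by_metabolite, min_transporters):
    """Metabolites linked to at least `min_transporters` distinct transporters."""
    return sorted(m for m, ts in by_metabolite.items() if len(ts) >= min_transporters)


def subgroup_overlap(by_transporter, members):
    """Metabolites linked to at least two members of one proposed subgroup."""
    counts = defaultdict(int)
    for transporter in members:
        for metabolite in by_transporter.get(transporter, ()):
            counts[metabolite] += 1
    return sorted(m for m, n in counts.items() if n >= 2)


by_transporter, by_metabolite = build_network(EDGES)
print(shared_metabolites(by_metabolite, 5))
print(subgroup_overlap(by_transporter, ["SLC22A11", "SLC22A12"]))
```

With the full compiled dataset in place of the toy edge list, the same degree threshold would recover the family-wide metabolites discussed above, and the per-subgroup overlap would correspond to the shared nodes in each subgroup network.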

OATS1 (SLC22A6, SLC22A8, SLC22A20): Metabolites, Signaling Molecules, Uremic Toxins, and Odorants
Several metabolites have been identified as substrates of SLC22A6 (OAT1) and SLC22A8 (OAT3). While many are unique, there is notable overlap. Both OAT1 and OAT3 interact with uremic toxins (indoxyl sulfate, p-cresol sulfate, uric acid) and gut microbiome derived products (indolelactate, 4-hydroxyphenylacetate), as well as many of the more general SLC22 metabolites, like prostaglandin E2, prostaglandin F2, uric acid, and creatinine (43)(44)(45)(46)(47). SLC22A20 (OAT6), while not as well-studied, has affinity for several odorants and short chain fatty acids that are also associated with OAT1 (48). OAT1 and OAT3 are clearly multi-specific, and OAT6 appears to be oligo-specific, as it handles both odorants and some short chain fatty acids. With respect to remote signaling, the shared metabolites among these transporters (Supplementary Figure 1A) are noteworthy because of their tissue localization (Table 4). OAT1 and OAT3 are primarily expressed in the kidney proximal tubule, with some expression in other tissues, like the choroid plexus and retina (Table 4). OAT6, however, is expressed in the olfactory mucosa of mice, presumably reflecting its affinity for odorants (21,48). In the kidney, OAT1 and OAT3, along with many other SLC22 transport proteins, help regulate the urine levels of many metabolites and signaling molecules which may potentially facilitate inter-organismal communication. For example, a volatile compound in one organism may be excreted into the urine through OAT1 and then somehow sensed by another organism through a mechanism involving OAT6 in the olfactory mucosa (12).

OATS2 (SLC22A7) is a Systemically-Expressed Transporter of Organic Anions and Cyclic Nucleotides
SLC22A7 (OAT2) is the only member of the OATS2 subgroup and is associated with prototypical SLC22 substrates, such as prostaglandins, carnitine, creatinine, and uric acid (38,46,49,50). OAT2 is also linked to cyclic nucleotides and dicarboxylic acids, which, when taken with the previous metabolites, creates a unique profile worthy of its own subgroup (Supplementary Figure 1B) (51). Another distinguishing feature of OAT2 is its tissue expression pattern (Table 4). While its expression in the liver and kidney is common to many SLC22 members, it has also been localized to circulating red blood cells, where it may function in cyclic nucleotide transport (3). Its expression in a mobile cell type and transport of cyclic nucleotides raise the possibility that it may act as an avenue for signaling.

OATS3 (SLC22A11, SLC22A12, Slc22a22) Functions to Balance Uric Acid and Prostaglandins
In humans, SLC22A11 (OAT4) and SLC22A12 (URAT1) share only two substrates, uric acid and succinate (Supplementary Figure 1C) (52,53). Uric acid is a beneficial metabolite in the serum as it is thought to be responsible for more than half of human antioxidant activity in the blood (54). However, high levels of uric acid can be harmful and are associated with gout (55).
URAT1 is associated with very few metabolites and is best understood for its role in uric acid reabsorption in the kidney proximal tubule, making it relatively mono-specific (52). OAT4, on the other hand, has been shown to transport prostaglandins and conjugated sex hormones in addition to uric acid, making it oligo-specific (56)(57)(58). URAT1 is almost exclusively expressed in the kidney, and OAT4 is expressed in the kidney, placenta, and epididymis (Table 4). The more diverse tissue expression of SLC22A11 seems consistent with its wider range of substrates.
The subgroup differs in rodents because mice do not express Oat4. Instead, the rodent subgroup is composed of Slc22a12, known as renal-specific transporter (Rst) in mice, and Slc22a22, known as prostaglandin-specific organic anion transporter (Oat-pg). While Rst and Oat-pg do not share substrate specificity, together they play the role of URAT1 and OAT4 by handling uric acid and prostaglandins (59).

Figure 1D) (60). In terms of tissue expression, there is a distinct correlation between patterns and shared function amongst human OATS4 members (Table 4). We predict that all four members are conjugated sex steroid transporters, with SLC22A9, A10, and A25 showing high expression in the liver, where conjugation of glucuronides and sulfates to androgens and other gonadal steroids occurs (61).
SLC22A24 has low expression levels in the liver but is highly expressed in the proximal tubule, where it is predicted to reabsorb these conjugated steroids (61). This subgroup also includes a large rodent-specific expansion, consisting of Slc22a19 and Slc22a26-30. Although the rodent-specific expansion is greatly understudied, transport data for rat Slc22a9/a24 show shared substrate specificity for estrone sulfate with SLC22A24, but not for bile acids or glucuronidated steroids, which is consistent with the lack of glucuronides in rat urine and serum (61). While sulfatases are highly conserved amongst humans, rats, and mice, the separation of rodent- and non-rodent-specific OATS4 groups is likely due to the species differences in expression and function of glucuronidases (63). Despite their distinct differences from human OATS4 members in sequence similarity studies and minimal functional data, the rodent-specific transporters are also highly expressed in both liver and kidney (64).

OAT-Like (SLC22A13, SLC22A14) has Potentially Physiologically Important Roles
Very little functional data is available for the OAT-like subgroup. SLC22A13 (OAT10/ORCTL3) has been well characterized as a transporter of both urate and nicotinate, but SLC22A14 has no available transport data (65). However, N-methyl nicotinate is increased in the plasma of self-reported smokers, and GWAS studies have associated SNPs in the SLC22A14 gene with success in smoking cessation (66,67). Although these data do not directly relate SLC22A14 to nicotinate, they suggest a possible route of investigation into the functional role of this transporter, one that may, in some ways, overlap with that of OAT10.
SLC22A13 is primarily expressed in the kidney, and although we found no human protein expression data for SLC22A14, transcripts for this gene are found at low levels in the kidney and notably high levels in the testis (Table 4), which is in concordance with its critical role in sperm motility and fertility in male mice (68). Future studies are required to determine the functional classification of this subgroup; however, our genomic localization and sequence-based analyses provide enough data to support the notion that these two belong in their own individual subgroup.

Members but has Interesting Functional Mechanisms and Disease Associations
The OAT-related subgroup is an outlier within the SLC22 family, consisting of the orphan transporters SLC22A17, SLC22A18, SLC22A23, and SLC22A31. SLC22A17 and SLC22A23 are strongly related, with greater than 30% shared amino acid identity. When these two transporters were initially identified together as BOCT1 (SLC22A17) and BOCT2 (SLC22A23), it was noted that they both show high expression levels in the brain, as well as a nonconserved amino terminus that may negate prototypical SLC22 function (69). SLC22A17 is known as LCN2-R (lipocalin-2 receptor) and mediates iron homeostasis through binding and endocytosis of iron-bound lipocalin, as well as exhaustive protein clearance from the urine, as shown by high affinities for proteins such as calbindin (20,70). SLC22A23 has no confirmed substrates, but SNPs and mutations within this gene have medically relevant phenotypic associations such as QT elongation, inflammatory bowel disease, endometriosis-related infertility, and the clearance of antipsychotic drugs (71)(72)(73). SLC22A31 is the most understudied transporter of the SLC22 family but has been associated with right-side colon cancer (74).

Figure 2A) (38,75)(76)(77)(78)(79). All three members of this subgroup are expressed in the liver, kidney, and brain (Table 4). When taken with the transport of neurotransmitters, this subgroup serves as an example of inter-organ communication between the brain and the kidney-liver axis via transporters. The systemic levels of these neurotransmitters, and thus their availability to the brain, can be regulated by the expression of OCT subgroup members in the liver, where the metabolites can be enzymatically modified, and by expression in the kidney, which may serve as an excretory route (7).
GWAS data show that SLC22A4 (OCTN1), SLC22A5 (OCTN2/CT1), and SLC22A16 (CT2/FLIPT2) are heavily linked to carnitine and its derivatives (38). This is consistent with in vitro data showing that OCTN2 and CT2 are carnitine transporters (80,81). Although OCTN1 has lower affinity for carnitine than OCTN2 and CT2, it has high affinity for the endogenous antioxidant ergothioneine, which GWAS data suggest may be a shared metabolite with both SLC22A15 (FLIPT1) and CT2 (Supplementary Figure 2B) (38,82). SLC22A15 is associated with many complex lipids that are not characteristic of any other SLC22 transporter (60). This anomalous SLC22 member may only share one substrate with this subgroup, but its inclusion is  (Table 4) (4,7). This broad tissue expression pattern, in conjunction with our network analysis, supports the belief that these transporters' main task is transporting carnitine derivatives, as carnitine metabolism is an energy producing mechanism in nearly every cell. The subgroup may also play a role in regulating levels of the antioxidant ergothioneine, which is uniquely a substrate of this subgroup (41,83).

Multiple Sequence Alignment Further Supports the Classification of Subgroups
Our new subgroupings are primarily based on the endogenous function of the transporters, but they are also supported by additional analyses. These analyses are necessary, as structural and evolutionary similarities can predict functional traits that have yet to be discovered. Though the previously established phylogenetic subclades remain sound, our reanalysis includes new and updated amino acid sequences that support the proposed subgroups with more confidence, especially when investigating similarities within functional regions (10).
MSA programs were favored over phylogenetics because MSA searches are based upon structural similarities rather than evolutionary relatedness (84). These structural similarities, especially in the large ECD and large ICD of SLC22 proteins, may indicate shared function. The branching pattern of OATS3 member Oat-pg (Slc22a22) differs between tree variations. These analyses consistently indicate a similar relationship between Oat-pg and OATS3, as well as OATS4. However, in an analysis of the SLC22 ECDs, it is most closely associated with OATS3 over any other subgroup. This, in conjunction with shared substrate specificity with both SLC22A12 and SLC22A11, and not OATS4 members, supports its membership within the OATS3 subgroup (46,52,53,59).
In full-length sequence alignments, the grouping of SLC22A4, SLC22A5, and Slc22a21 is consistently conserved, while the topology of both SLC22A15 and SLC22A16 is irregular.
Despite this, analysis of the large ECD shows similarity between all OCTN/OCTN-related members other than SLC22A15. Previous analyses have noted the large difference between the ECD of SLC22A15 and all other SLC22 members, which is supported by our analysis in Figure 2B (10). Interestingly, there appears to be some similarity between the large intracellular domains of SLC22A16 and SLC22A15. Although much of the support for the establishment of the OCTN/OCTN-related subgroup comes from functional data (Supplementary Figure 2B), the described MSA analyses highlight shared structural, and possibly functional, regions.
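To make the idea of similarity-based grouping concrete, the sketch below computes pairwise percent identity between sequences that have already been aligned, which is the kind of quantity a guide tree is built from. It is only an illustration under stated assumptions: the three short sequences are invented placeholders (not SLC22 domains), columns containing a gap in either sequence are skipped (one of several common conventions), and the helper names are hypothetical. It does not reproduce the MSA programs used in this study.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Map each unordered pair of sequence names to its percent identity."""
    return {
        (n1, n2): percent_identity(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


def closest_pair(matrix):
    """Return the pair of names with the highest identity (first merge of a guide tree)."""
    return max(matrix, key=matrix.get)


# Hypothetical aligned fragments, for illustration only.
ALIGNED = {
    "seqA": "MKT-LLVA",
    "seqB": "MKTALLVA",
    "seqC": "MRSALIGA",
}

matrix = identity_matrix(ALIGNED)
for pair, pct in sorted(matrix.items()):
    print(pair, round(pct, 1))
print("closest:", closest_pair(matrix))
```

Restricting the input to a single region, such as the large ECD or large ICD, rather than the full-length sequence is what allows domain-level relationships like those described above to differ from the full-length topology.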

Analysis of Genomic Localization Highlights Evolutionary Relatedness of Subgroup Members and Suggests Basis of Coregulation
Genomic clustering within the SLC22 family has been previously described (10).  (Table 4) (85,86). It has been proposed that genes within clusters are coordinately regulated and thus are predicted to have similar overall tissue expression patterns. Support for shared regulatory mechanisms of subgroup members within genomic clusters can be inferred from similar patterns of tissue expression or by expression of subgroup members along a common axis of metabolite transport such as the gut-kidney-liver axis.
Genomic localization from the UCSC Genome browser and resultant tissue expression patterns for all SLC22 members are shown in Table 4.

Analysis of OAT Subgroup Specific Motifs Highlight Patterns Potentially Involved in Specificity
Motif analyses revealed subgroup specific motifs within functionally important regions, such as the large ICD, large ECD, and the region spanning TMD9 and TMD10, for all novel OAT subgroups (10,87). However, the number of unique residues appears to be correlated to the range of substrate specificity.
Of the newly proposed OAT subgroups, OATS2 claims the smallest number of subgroup-specific amino acid motifs and is the only subgroup without a specific motif in TMD9 (Figure 3B). The lack of multiple subgroup-specific regions is interesting not only because this subgroup consists of a single transporter but also because this may be indicative of a more promiscuous transporter with a wide range of substrates, which is substantiated by the functional data. This pattern is also seen in OATS1, which consists of multi- and oligo-specific transporters OAT1, OAT3, and OAT6. In addition to having few subgroup-specific motifs, the multi/oligo-specific nature of this subgroup is reflected by the shared evolutionary conservation of the large extracellular domain with other OAT subclade members (Figure 3A).
To further clarify the membership of Oat-pg in OATS3, evolutionarily conserved motifs were determined among all three members, as well as between just Slc22a11 and Slc22a12. This analysis revealed a total of ten evolutionarily conserved amino acid motifs among all three members, eight of which are present in the analysis of only OAT4 and URAT1. Specifically, both analyses exhibited a notably large motif in the large intracellular loop found at D313-Q332 on URAT1 and Q312-G331 on Oat-pg (Figure 3C, Figure 3D). This larger number of conserved regions correlates with a more limited range of substrates (e.g., uric acid and prostaglandins) (56).
Motif analysis was performed separately on the OATS4 rodent and non-rodent specific subgroups and the entirety of the OATS4 subgroup members. In all analyses, OATS4 claims the largest number of evolutionarily conserved and subgroup-specific amino acid residues amongst the OAT subgroups, indicating selective substrate specificity, possibly for conjugated sex steroids (Figure 3E, Figure 3F). In the case of non-rodent transporters, a unique motif spans the sixth extracellular domain and TMD12. This region is predicted to govern substrate specificity of transporters of the MFS, to which the SLC22 family belongs (87). Recent publications defining the substrate specificity of SLC22A24 point to a narrower range of substrates, and conservation of this specific region amongst OATS4 members may explain the association of conjugated steroid hormones with SLC22A9, SLC22A10, SLC22A24, and SLC22A25 in GWAS studies (60,61). Although further analysis is required to fully understand the relationship between structure and substrate specificity in SLC22 transporters, we provide a basis for investigation into specific regions that may determine functional patterns. The sequences and p-values for each motif are in Supplementary Tables 2-7.
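As a rough illustration of what "conserved" and "subgroup-specific" mean here, the sketch below scans an alignment of subgroup members for runs of fully conserved columns and then discards runs that also occur unchanged in sequences outside the subgroup. This is a deliberately simplified stand-in and not the statistical motif-discovery method behind the reported p-values; the sequences are invented placeholders, and `conserved_runs` and `subgroup_specific` are hypothetical names.

```python
def conserved_runs(aligned, min_len=3, gap="-"):
    """Return (start_column, residues) for each run of identical, ungapped columns."""
    seqs = list(aligned)
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None
    # Iterate one position past the end so a run touching the last column is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and seqs[0][i] != gap
            and all(s[i] == seqs[0][i] for s in seqs)
        )
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, seqs[0][start:i]))
            start = None
    return runs


def subgroup_specific(runs, outgroup):
    """Keep runs that no outgroup sequence matches exactly at the same columns."""
    return [
        (start, motif)
        for start, motif in runs
        if all(seq[start:start + len(motif)] != motif for seq in outgroup)
    ]


# Hypothetical aligned fragments, for illustration only.
SUBGROUP = ["AKDLQGWT", "AKDLQSWT", "TKDLQGWT"]
OUTGROUP = ["ARELQGWT"]

runs = conserved_runs(SUBGROUP, min_len=3)
print(runs)
print(subgroup_specific(runs, OUTGROUP))
```

Under this simplified view, a subgroup with many long specific runs (as reported for OATS4) has more positions that distinguish it from the rest of the family than a subgroup with few (as reported for OATS2).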

Sequence Similarity Study Suggests Novel Potential Functions and Possible Tertiary Structure of SLC22
Each SLC22 member is a putative transporter, but there is evidence that some members may have alternative mechanisms of action (48,70). To further explore this possibility and to potentially find sequence similarity to other proteins, the specific amino acid sequences for the extracellular and intracellular loops of each SLC22 member were compared to all proteins in the ICM-Pro v3.8-7c database. The extracellular loop of mouse Slc22a16 shares 26% sequence identity (pP=5.4) with chicken beta-crystallin B3 (CRBB3). Beta-crystallin is a structural protein mainly comprised of beta sheets (88). The similarity between the ECD of mouse Slc22a16 and CRBB3 could point to potential for a beta-sheet-like configuration. Because none of the SLC22 family members have been crystallized, any insight into tertiary structure is of interest.
SLC22A31, a member of the divergent OAT-Related subclade, is the most ambiguous member of the SLC22 family with no functional data available. An investigation of the human SLC22A31 large ECD shows at least 30% shared sequence identity with RNA-binding protein 42 (RBM42) in mouse, rat, cow and human. This analysis also showed a 37% sequence identity (pP=5.5) shared between the ECD of hSLC22A31 and human heterochromatinization factor BAHD1. These and other interesting sequence similarities are noted in Table 5.

Discussion
In the years following the establishment of the previous SLC22 subclades, there has been a notable increase in functional data concerning these transporters and their substrates (10). With these data, we are now in a position to better characterize these transporters, which play important physiological roles. However, our newly proposed subgroups are not entirely dependent on functional data, as we have considered multiple approaches including phylogenetics, multiple sequence alignments, evolutionarily conserved motifs, sequence homology, and both tissue and genomic localization. Each of these approaches has individual value in that they reveal unique characteristics of each transporter, but it is the combination of multiple approaches that ensures nearly all available data (though still incomplete) for these transporters is considered when forming functional subgroups. We support the subgroups with a thorough literature search of metabolites associated with SLC22 proteins. Although the functional data is inherently biased due to the high level of interest in some SLC22 members, particularly the "drug" transporters OAT1, OAT3, OCT1, and OCT2, for the majority of the transporters, there is enough data to create functional subgroups that play distinct and overlapping roles in metabolism ( Figure 1, Supplementary Table 1). Genomic localization reveals evolutionary information and provides insight on how genes may arise from duplication events. Phylogenetic analysis determines the evolutionary relatedness of these proteins, while MSA, motif analysis, and sequence homology focus on structural similarities, which can be indicative of function. We often see that members of a subgroup are expressed in the same tissues or along functional axes. For example, substrates transported from the liver via SLC22 transporters (e.g., SLC22A1, OCT1) can be either excreted into or retrieved from the urine by other SLC22 members (SLC22A2, OCT2) of the same subgroup. 
Establishment of these functional subgroups may also inform future virtual screenings for metabolites of understudied transporters.
Protein families are established based on shared ancestry and structural similarity, which is commonly considered grounds for shared functionality. This is exemplified amongst SLC22 members with the generally shared structural characteristics of 12 TMDs, a large extracellular loop between TMD1 and TMD2, and a smaller intracellular loop between TMD6 and TMD7.
Despite these shared features, we show here that there are many functional differences between these transporters. Although our analyses mostly align with previous evolutionary studies when considering ancestry, here, we show that phylogenetic grouping is not always reflective of similar structure and function. For example, although the previously established OCTN subclade of SLC22A4, SLC22A5 and Slc22a21 does not share common ancestry with Slc22a16, the newly proposed group shares functional similarity and ECD homology. Thus, by expanding our investigation beyond phylogenetic relationships, we can now more appropriately group proteins from the same family and better understand their roles in endogenous physiology.
An important concept in the Remote Sensing and Signaling Network is that of multi-specific, oligo-specific, and relatively mono-specific transporters working in a coordinated fashion (16). Multi-specific transporters are able to interact with a wide variety of structurally different compounds, oligo-specific transporters with a smaller variety, and relatively mono-specific transporters are thought to interact with only one or a few substrates. Existing functional data suggest that it is unlikely that any truly mono-specific transporters exist within the SLC22 family, yet the subgroups we have formed imply that multi-specific, oligo-specific, and relatively mono-specific transporters each tend to group with transporters that share their substrate specificity. Multi-specific transporters, like those in the OATS1 and OCT subgroups, handle a diverse set of drugs, toxins, endogenous metabolites, and signaling molecules (14,75). Conversely, the OATS4 subgroup appears to be a collection of relatively mono-specific transporters with an affinity for conjugated sex steroid hormones, specifically etiocholanolone glucuronide (61). Previous evolutionary studies have suggested that multi-specific transporters arose before the mono-specific transporters (10). As evolution has progressed, more specific transporters have developed to handle the burden of changing metabolism. The multi-specific transporters have been more extensively characterized because of their importance in pharmaceuticals, but in the case of endogenous metabolic diseases, the oligo- and mono-specific transporters may be more appropriate targets for drugs or therapies.
One of the best examples of multi-specific transporters working in concert with oligo- and mono-specific transporters is the regulation of uric acid (17,18). Handling of uric acid mainly occurs in the kidney, but when renal function is compromised, multi-specific transporters regulate their expression to compensate. Two proteins, SLC22A12 and SLC2A9, are expressed in the proximal tubule and are nearly exclusively associated with uric acid. The multi-specific transporters SLC22A6 and SLC22A8 are also present in the proximal tubule and are able to transport uric acid. When the kidney is damaged, one would expect serum uric acid levels to increase because most of the proteins involved in its elimination are in the kidney. However, this increase is mitigated by increased expression and/or functional activity of ABCG2 in the intestine (17,18). SLC2A9 is a relatively mono-specific transporter and ABCG2 (BCRP) is a multi-specific ABC transporter. The example of uric acid serves to illustrate how, when certain mono-, oligo-, and multi-specific transporters are unable to perform their primary function, multi-specific transporters of the same or different function (even of the ABC superfamily) can use their shared substrate specificity to mitigate the consequences.
It is generally assumed that all SLC22 family members are transporters. However, Slc22a17, a member of the outlier OAT-related subclade, functions as an endocytosed iron-bound lipocalin receptor, and some SLC22 members have been suggested to function as "transceptors" due to homology with GPCR odorant receptors and shared odorant substrates (20,21). Thus, to better understand the SLC22
In the past, the majority of functional data has come from hypothesis-driven transport assays using cells overexpressing a specific SLC22 transporter and a single metabolite of interest. These assays lack uniformity and, as the OAT knockouts have shown, are not necessarily reflective of endogenous physiology (43,45,47).
Recently, GWAS have linked many metabolites to polymorphisms in SLC22 genes, and in vivo metabolomic studies using knockout models have also identified several metabolites that may be substrates of transporters (43,45,47,60). In upcoming years, the integration of multiple types of omics data related to SLC22 family members with functional studies of transporters and evolutionary analyses will likely produce a more fine-grained picture of the roles of these and other transporters in inter-organ and inter-organismal Remote Sensing and Signaling.
Table 3: Combined functional data for OATS4. These data were manually curated and collected from genome-wide association, in vitro, and in vivo studies. Only statistically significant results from each study are included. Column A is the SLC22 transporter, column B is the metabolite, column C is the source of the data (rsid for GWAS, cell line for in vitro, and the physiological measurement for in vivo), column D is the quantitative metric (p value for GWAS; Km, Ki, IC50, or inhibition percentage compared to control for in vitro; and p value for in vivo), and column E is the citation.
Table 4: Genomic localization and tissue expression of SLC22 family. The following table describes the genomic localization and tissue expression patterns of all SLC22 members excluding the mouse-specific Slc22a19, Slc22a26, Slc22a27, Slc22a28, Slc22a29, and Slc22a30. Slc22a22 and Slc22a21 expression patterns described are from mouse (59,64). (m) denotes expression patterns observed exclusively in mice. Tissue expression data in humans were collected from various sources and databases (4,24,64). Expression is assumed from mRNA expression analysis, unless confirmed experimentally. Bolded transporters within subgroups are found in tandem within the human genome.
1: Pruned SLC22 network. All SLC22 transporters with functional data were initially included. Metabolites associated with only one transporter were removed for improved visualization. SLC22 transporters and metabolites are colored nodes. Each edge represents a significant transporter-metabolite association. Multiple edges connecting one metabolite to a specific transporter were bundled (e.g., in vitro and GWAS support).
F) OATS4 mapped onto mSlc22a7. In each panel, red sequences are subgroup specific motifs, blue sequences are OAT-major subgroup motifs, and green diamonds represent non-synonymous SNPs that affect serum metabolite levels. Conserved OAT-major subgroup motifs are assigned letters and specific, conserved OAT subgroup motifs are numbered. Data, including motif sequence identities, exact locations, and p-values can be found in Supplementary Tables 2-7.
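The pruning and edge bundling described in the "Pruned SLC22 network" caption can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `prune_network`, the tuple layout, and the placeholder transporter and metabolite names are assumptions made for illustration, and the records are not taken from the supplementary tables.

```python
from collections import defaultdict


def prune_network(associations):
    """Bundle parallel edges and drop metabolites tied to a single transporter.

    associations: iterable of (transporter, metabolite, evidence) tuples,
    where evidence is a label such as "GWAS", "in vitro", or "in vivo".
    Returns a dict mapping (transporter, metabolite) to a sorted list of
    evidence labels, keeping only metabolites linked to two or more
    distinct transporters.
    """
    # Bundle multiple edges between the same transporter and metabolite.
    bundled = defaultdict(set)
    for transporter, metabolite, evidence in associations:
        bundled[(transporter, metabolite)].add(evidence)

    # Collect the distinct transporters associated with each metabolite.
    partners = defaultdict(set)
    for transporter, metabolite in bundled:
        partners[metabolite].add(transporter)

    # Keep only edges whose metabolite has more than one transporter.
    return {
        edge: sorted(evidence)
        for edge, evidence in bundled.items()
        if len(partners[edge[1]]) > 1
    }


# Hypothetical placeholder records (not real data from the study).
example = [
    ("transporter_1", "metabolite_a", "GWAS"),
    ("transporter_1", "metabolite_a", "in vitro"),
    ("transporter_2", "metabolite_a", "in vivo"),
    ("transporter_2", "metabolite_b", "in vitro"),
]
pruned = prune_network(example)
```

In this toy input, `metabolite_b` is linked to only one transporter and is dropped, while the two lines of evidence connecting `transporter_1` to `metabolite_a` collapse into a single bundled edge.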