Subcellular Localization Signals of bHLH-PAS Proteins: Their Significance, Current State of Knowledge and Future Perspectives

The bHLH-PAS (basic helix-loop-helix/Period-ARNT-Single minded) proteins are a family of transcriptional regulators commonly occurring in living organisms. bHLH-PAS members act as sensors of intracellular and extracellular signals, initiating responses to endogenous and exogenous signals, including toxins, redox potential, and light. The activity of these proteins as transcription factors depends on nucleocytoplasmic shuttling: the signal received in the cytoplasm has to be transduced, via translocation, to the nucleus. This leads to the transcriptional activation of particular genes and determines the cell response to different stimuli. In this review, we aim to present the current state of knowledge concerning signals that affect shuttling of bHLH-PAS transcription factors. We summarize experimentally verified and published nuclear localization signals/nuclear export signals (NLSs/NESs) in the context of our in silico predictions, for which we used most of the available NLS/NES predictors. Importantly, all our results support the existence of a complex system of protein localization regulation that involves many localization signals, whose activity has to be precisely controlled. We conclude that the current state of knowledge in this area is still incomplete and that, for most bHLH-PAS proteins, experimental verification of the activity of further NLSs/NESs is needed.


Introduction
The bHLH-PAS (basic helix-loop-helix/Period-ARNT-Single minded) proteins are a family of transcriptional regulators commonly occurring in living organisms [1,2]. They play a significant role in essential physiological and developmental processes [2], and a number of them participate in adaptive responses to generalized and cellular stress [3]. bHLH-PAS family members act as sensors of intracellular and extracellular signals, initiating the primary response to endogenous compounds, foreign chemicals, gas molecules, redox potential, light, and others [4].
Although bHLH-PAS proteins perform highly diverse functions, their structural properties have been well conserved during evolution (Figure 1) [1]. The N-terminal part comprises two domains: bHLH and PAS. The bHLH domain can be divided into two specific fragments: the basic region responsible for DNA binding, and the HLH region, which takes part in protein dimerization [5]. The following PAS domain comprises two structurally conserved regions named PAS-1 and PAS-2 [6], separated by a poorly conserved linker [1]. PAS-1 takes part in the selection of a dimerization partner and ensures the specificity of target gene activation [2]. PAS-2 is typically responsible for ligand binding and for sensing diverse exogenous and endogenous signals, which enables regulation of protein activity [2,7]. bHLH-PAS proteins, like many other transcription factors (TFs), dimerize with other family members to create a functional heterodimer that regulates the expression of genes under its control [4]. Consequently, most bHLH-PAS proteins can be divided into two classes: class I proteins, whose expression is specifically regulated by diverse physiological states and/or environmental signals [11], and class II proteins, which are expressed constitutively and serve as heterodimerization partners for class I members [4].
For a number of TFs regulating gene expression in response to extracellular signals, translocation from the cytoplasm to the nucleus is an important event, enabling TFs to recruit coactivators [12,13]. As shown, steroid/nuclear receptors continuously shuttle between the cytoplasm and the nucleus, and their localization at any time is a consequence of the fine balance between the operational strength of the nuclear localization signal (NLS) and the nuclear export signal (NES) sequences [14]. Nuclear transport is usually mediated by the family of transport receptors known as karyopherins: importins, responsible for nuclear import, and exportins, like exportin-1 (CRM1, Chromosome region maintenance 1 protein homolog), responsible for nuclear export [15]. Karyopherins recognize specific NLS/NES signals present in the cargo protein to create the transport complex [15]. The best-described transport signal responsible for nuclear import is the classical NLS (cNLS), consisting of monopartite or bipartite motifs rich in basic amino acid residues [16]. The most common NES is a non-conserved motif containing hydrophobic residues, including repeated leucine residues [15]. Leptomycin B (LMB) is a known inhibitor of exportin-1-dependent protein export and is often used in studies of NES activity [17,18].
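The motif classes described above can be illustrated with a minimal pattern scan. The sketch below is not one of the predictors used in this review; the three regular expressions are simplified textbook consensus patterns chosen here as assumptions (the bipartite pattern, for instance, demands three consecutive basic residues, which is stricter than the usual definition), and the function name `scan` is hypothetical. The two test peptides are the widely cited SV40 large T antigen NLS and the PKI NES.

```python
import re

PHI = "[LIVFM]"  # hydrophobic positions of a leucine-rich NES (assumed residue set)

# Simplified consensus patterns; these are illustrative assumptions only.
MOTIFS = {
    "monopartite cNLS": r"K[KR].[KR]",
    "bipartite cNLS": r"[KR]{2}.{10,12}[KR]{3}",
    "leucine-rich NES": rf"{PHI}.{{2,3}}{PHI}.{{2,3}}{PHI}.{PHI}",
}


def scan(sequence):
    """Return (motif name, start, end, matched text); positions are 1-based, inclusive."""
    hits = []
    for name, pattern in MOTIFS.items():
        # Wrapping the pattern in a lookahead reports overlapping candidates too.
        for match in re.finditer(rf"(?=({pattern}))", sequence):
            text = match.group(1)
            start = match.start() + 1
            hits.append((name, start, start + len(text) - 1, text))
    return hits


for peptide in ("PKKKRKV", "LALKLAGLDI"):  # SV40 NLS, PKI NES
    for hit in scan(peptide):
        print(peptide, hit)
```

A scan of this kind only flags candidates. As the rest of this review stresses, a matching sequence is not evidence of an active signal, and active signals may not match any simple consensus.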
bHLH-PAS proteins are an important class of TFs whose activity depends on nucleocytoplasmic shuttling. To perform their function as sensors, they must receive the signal in the cytoplasm and transduce it, by translocation, to the nucleus. This leads to the transcriptional activation of particular genes. Previously, we performed a systematic search for NLS and NES motifs in the Drosophila melanogaster Methoprene tolerant protein (MET) [19] and its paralog, the germ cell-expressed (GCE) protein [20]. Then, we carried out similar analyses for the mammalian NPAS4 protein [21]. We determined an interesting pattern of overlapping signals with opposing activity in the bHLH and PAS domains as well as in the C-termini. Our results were the first to suggest the presence of multiple localization signals regulating TF shuttling and their complex localization pattern. Now, we ask whether this is a more general mechanism of bHLH-PAS protein localization regulation, and how precise and systematic the studies reported in this area to date have been.
In this review, we present the results of our predictions in the context of published data concerning NLSs/NESs that take part in the regulation of bHLH-PAS TF shuttling. As most of the published research and predictions were previously performed only for selected fragments of these proteins, the current state of knowledge in this area is still incomplete.

AHR Localization Regulation
AHR is the only known bHLH-PAS cytoplasmic sensor activated by small ligands. It is involved in the metabolism of dioxin and related environmental pollutants [32]. It was shown [33] that non-ligated AHR forms an inactive heterodimer with the Hsp90 chaperone protein and is present in the cytoplasm. Hsp90 prevents AHR proteolysis and maintains the receptor in a conformation susceptible to ligand binding [34]. Ligand binding enforces translocation of AHR to the nucleus, where AHR dimerizes with its partner bHLH-PAS protein, ARNT, to create a functional complex [35]. The AHR/ARNT heterodimer regulates genes encoding xenobiotic metabolizing enzymes and mediates severe toxicity, comprising a wasting syndrome, hepatotoxicity, teratogenesis, and tumour promotion [13,36]. Although the AHR is well studied as a mediator of toxicity, its normal physiological function still remains unclear [37]. Recent research data support the hypothesis that the AHR contributes to the proper functioning of the immune, hepatic, cardiovascular, vascular and reproductive systems [4,38]. AHR was shown to play a significant role in the cross-talk of signalling pathways governing cell proliferation, morphology, adhesion, cell migration and the cell cycle [38]. It has an important function in the regulation of hematopoietic stem cells (HSCs) [37]. Additionally, overexpression and constitutive activation of the AHR have been observed in various types of tumour [39].
AHR, as a TF, should be able to enter the nucleus; however, maintaining a predominantly cytoplasmic compartmentalization is important for its ligand binding [40]. It was shown that human AHR possesses opposing signals in the N-terminal bHLH domain: an NLS (13-39aa) [35] and an NES (55-75aa) [41]. Interestingly, Hsp90 is responsible for masking NLS activity in the bHLH domain of unligated AHR [34]. An additional NES residing in the PAS domain of mAHR (214-222aa) [42] was shown to be active independently of ligand binding. This signal sequence is highly conserved in hAHR (220-228aa). Very recently, Tkachenko et al. [40] described another putative NLS (648-671aa) and an NES in the C-terminus of hAHR. The activity of this NES depends on the presence of the V647 (or I647) residue. Interestingly, these signals are in close proximity or even partially overlapping.
The results of our predictions suggest the possible presence of additional localization signals in AHR (Table 1, Supplementary Materials 1, Supplementary Materials Summary), especially NLS in the PAS-2 domain (area of 247-280aa) and NES (area of 538-552aa) in the C-terminal region.

HIF-1-3α Localization Regulation
The hypoxia-inducible factor α-subunits (HIF-α) are key transcription factors in the mammalian response to oxygen deficiency. To function adequately, HIF-α levels, subcellular distribution, and activity have to be tightly regulated [43]. The mammalian hypoxia-inducible factor 1α (HIF-1α) plays an essential role in cellular and systemic oxygen homeostasis as a cytoplasmic sensor [44]. The HIF-1α/ARNT heterodimer regulates the transcription of genes related to angiogenesis, erythropoiesis, glycolysis, iron metabolism and cell survival [45]. Interestingly, the HIF-α proteins are regulated not only by hypoxia, but also in response to various stresses, growth and coagulation factors, hormones, or cytokines under normoxic conditions [43]. Accumulation of HIF-1α in the nucleus is observed during early development of organs, in response to ischaemia and in tumour tissue [45]. Thus, the activation of HIF-1α has been associated with proper embryonic development and with many diseases that generate a hypoxic microenvironment, such as cancer, stroke, and heart disease [44].
Under normoxic conditions, HIF-1α is ubiquitinated by the von Hippel-Lindau tumour suppressor (pVHL), translocated to the cytoplasm and targeted for proteasomal degradation [46]. In contrast, under hypoxic conditions, HIF-1α becomes stable and is translocated to the nucleus [47], where it dimerizes with ARNT, creating a functionally active complex. Phosphorylation of HIF-1α at the S641/S643 residues by extracellular signal-regulated kinase (ERK) masks an adjacent exportin-1-dependent NES and inhibits HIF-1α nuclear export, thereby increasing its nuclear concentration and transactivation ability [48]. Recently, Mylonis et al. [49] demonstrated an unconventional, ERK-controlled, non-genomic and anti-apoptotic function of HIF-1α. The protein can serve as an early protective mechanism upon oxygen limitation and promote cancer cell resistance to chemotherapy. Also recently, Depping and co-workers [50] proposed that the modulation of nuclear translocation of HIFs could have therapeutic applicability to tumours. Interestingly, one of the factors influencing HIF localization is insulin, which promotes its nuclear translocation [51].
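The effect of NES masking on nuclear accumulation described above can be illustrated with a toy calculation. The sketch below assumes a two-compartment model with first-order import and export, which is our simplification and not a model taken from the cited studies; the function name and the rate values are hypothetical and carry no experimental meaning.

```python
def nuclear_fraction(k_import, k_export):
    """Steady-state nuclear fraction for dN/dt = k_import*C - k_export*N, with N + C = 1.

    Both rate constants are hypothetical, in arbitrary units.
    """
    if k_import < 0 or k_export < 0 or k_import + k_export == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k_import / (k_import + k_export)


# NLS and NES of similar strength: the protein is split between compartments.
balanced = nuclear_fraction(k_import=1.0, k_export=1.0)

# Export reduced tenfold, as might follow from masking of an NES.
nes_masked = nuclear_fraction(k_import=1.0, k_export=0.1)

print(f"balanced: {balanced:.2f}, NES masked: {nes_masked:.2f}")
```

In this simplified picture the localization depends only on the ratio of the two rates, so weakening export and strengthening import are equivalent ways of shifting the protein to the nucleus.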
It was shown that human HIF-1α possesses two NLSs: one in the bHLH domain (17-33aa) and the other, responsible for import to the nucleus in hypoxia, in the C-terminal part of the protein (718-721aa [12]; 717-757aa [52]). Kallio et al. [12] proposed that PAS-2 is responsible for repression of nuclear import but did not indicate any NES position. Then, Mylonis et al. [48] documented a phosphorylation-dependent NES (616-658aa) in the C-terminal part of the protein. Importantly, the NLS and NES in the C-terminus are located in close proximity. Zheng et al. [53] showed that localization of HIF-1α was cell-type dependent. All these facts point to a complex system of protein localization regulation, involving many localization signals and their precisely balanced control. Interestingly, Chun and colleagues showed that a short variant of HIF-1α (1-516aa), lacking the C-terminal NLS, resides in the cytoplasm and is not able to enter the nucleus [54]. This suggests the possible presence of an additional NES, not detected to date.
Based on in silico analyses of the HIF-1α sequence, performed with various NLS and NES predictors (Table 1, Supplementary Materials 1, Supplementary Materials Summary), we suggest the presence of additional localization signals. While further NLSs were predicted by a single predictor, an NES not described previously (area of 558-572aa) was predicted by most of the NES predictors used. This putative motif is located in the N-terminal area of the protein (1-651aa) previously determined as cytoplasmic [12], which supports the likelihood of an NES in this region of the protein. We also emphasize the need for experimental verification of the supposed NES activity in the PAS-1 domain (82-97aa).
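The notion of a signal "predicted by most of the predictors" can be made concrete with a simple per-residue vote. The sketch below is an illustration under stated assumptions, not the procedure used to build Table 1: the predictor names and the interval values are hypothetical placeholders, and `consensus_regions` is a name introduced here.

```python
from collections import Counter


def consensus_regions(predictions, min_votes):
    """Merge residues supported by at least `min_votes` predictors into regions.

    predictions: dict of predictor name -> list of (start, end), 1-based inclusive.
    Returns a sorted list of (start, end) tuples.
    """
    votes = Counter()
    for intervals in predictions.values():
        covered = set()
        for start, end in intervals:
            covered.update(range(start, end + 1))
        votes.update(covered)  # each predictor counts once per residue

    supported = sorted(pos for pos, n in votes.items() if n >= min_votes)
    regions = []
    for pos in supported:
        if regions and pos == regions[-1][1] + 1:
            regions[-1][1] = pos  # extend the current region
        else:
            regions.append([pos, pos])  # start a new region
    return [tuple(region) for region in regions]


# Hypothetical outputs of three NES predictors (illustrative numbers only).
example = {
    "predictor_A": [(558, 572)],
    "predictor_B": [(560, 570), (82, 97)],
    "predictor_C": [(555, 568)],
}
print(consensus_regions(example, min_votes=2))
```

With a threshold of two votes, a region reported by only one predictor is dropped, while partially overlapping calls from different predictors are merged into a single candidate region.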

SIM1-2 Localization Regulation
The other members of bHLH-PAS family: SIM1 and SIM2, are capable of binding mammalian HRE (Hypoxia Response Element) sequences as heterodimers with ARNT. This results in the competition between HIF-1α and SIM proteins for binding to ARNT and DNA, and consequently, in attenuation of hypoxic reporter gene transcription in hypoxia. Such a complex interplay between the bHLH-PAS proteins in cells where the factors are co-expressed, may enable adaptation of the cell to multiple environmental and developmental signals [57].
A conserved NLS in SIM1 (368-388aa) and SIM2 (367-386aa) was described by Yamaki et al. [58] in the middle part of the proteins. Deletion of the NLS resulted in cytoplasmic localization. Isolated SIM1 fragments, 1-289aa and 295-333aa, were also located in the cytoplasm. The latter fragment (295-333aa) comprises highly hydrophobic aa residues and was suggested to act as an NES [58]. This idea is consistent with our prediction, which indicates the presence of a putative NES in this area (Table 1, Supplementary Materials 1, Supplementary Materials Summary). In the case of SIM2, a fragment comprising residues 530-760aa was detected in the cytoplasm [58]. However, no further studies concerning a putative NES were performed. The results of our in silico analyses show, for both SIM1 and SIM2, a highly probable NLS in the bHLH domain (SIM1 2-33aa; SIM2 2-35aa). Additionally, an NLS in the C-terminus of SIM2 (556-586aa) was predicted by both cNLS Mapper and NLStradamus (Table 1, Supplementary Materials 1, Supplementary Materials Summary). The predictors also indicated some sequences as NES candidates, especially in the bHLH domain.

CLOCK Localization Regulation
Circadian rhythms are internal processes that regulate all physiological functions and activities [59]. These rhythms, generated by the circadian clock, allow organisms to adjust their biology and behaviour to the daily light-dark cycles, as well as to nutrient availability [60,61]. Circadian rhythms rely on negative feedback loops: gene activation is later repressed by the gene's own protein product, and the cycle can reinitiate [61]. The mammalian clock is activated by the bHLH-PAS proteins CLOCK [62] and BMAL1 [63], which dimerize to create an active complex [61]. CLOCK/BMAL1 initiates the transcription of its own inhibitors, the PERIOD (PER) and CRYPTOCHROME (CRY) proteins. PER and CRY dimerize in the cytoplasm and translocate to the nucleus to inhibit CLOCK/BMAL1 and to stop further transcription. As suggested, CRY can play a significant role in the animal circadian system as a blue-light photoreceptor for photoentrainment of the circadian clock [64,65].

NPAS1-4 Localization Regulation
NPAS1 (MOP5) was detected only in certain regions of the brain [69] and was connected with neurogenesis and schizophrenia [70]. NPAS1 was shown to be able to repress the transactivation functions of both ARNT and ARNT2 [71]. The first localization studies determined NPAS1 to be a nuclear protein [70,72]. However, a few years later, Teh et al. [71] revealed the presence of an NES located in the PAS-2 domain of mNPAS1 (310-317aa). They proposed that nuclear localization of NPAS1 depends on dimerization with ARNT. In the absence of this partner, mNPAS1 is located in the cytoplasm as a result of the detected NES activity. Interestingly, Leptomycin B (an inhibitor of exportin-1-dependent protein export) did not inhibit the activity of this NES [71]. In contrast, a point mutation of this NES resulted in nuclear localization, suggesting the activity of one or more NLSs. However, the authors reported no positive result for NLS prediction [71]. We performed in silico analyses on the human NPAS1 sequence using currently accessible NLS and NES predictors [...] (33-76aa) and in close proximity to the PAS-1 (89-128aa) and PAS-2 (250-284aa) domains. Interestingly, additional NESs were predicted for the bHLH (64-78aa and 87-101aa) and PAS-2 (273-287aa) domains, suggesting a complex pattern of subcellular localization, based on many opposing and partially overlapping signals in NPAS1.
NPAS2 is a gas-responsive TF, which dimerizes with BMAL1. Similarly to CLOCK, NPAS2 regulates gene expression as a function of the day-night cycle. As a molecular sensor, NPAS2 regulates circadian oscillation of metabolic pathways, including heme catabolism. Interestingly, both PAS domains of this protein are able to bind heme. The second candidate for an NPAS2 ligand is CO [73]. The localization of monomeric NPAS2 was shown to be mainly nuclear. However, NPAS2 was also detected in the cytoplasmic fraction. After heterodimerization with BMAL1, localization was exclusively nuclear [74]. To date, there is no published research concerning localization signals in NPAS2. However, our in silico predictions [...] (4-47aa), proposed by all the predictors involved. We suggest experimental verification of the activity of this motif. Importantly, an NLS in the bHLH domain was detected previously for the NPAS2 homolog, CLOCK [67]. Additionally, a candidate NES sequence is located in the C-terminus of the protein (525-540aa).
NPAS3 is highly homologous to NPAS1 and is also expressed in the brain [75]. The first localization study placed the 901aa isoform of NPAS3 (Q8IXF0-4) in the nucleus, as a result of the activity of a bipartite NLS in the C-terminal part (568-585aa) [76]. Very recently, Luoma et al. tested the localization of the canonical human NPAS3 isoform (933aa) in HEK293T cells. The bHLH domain was localized in the cytoplasm, which they discussed as consistent with the result of their predictions with the NetNES server [77]. However, no experimental verification of the predicted NES was performed. The authors also confirmed the activity of the NLS in the C-terminus, previously defined by Kamnasaran et al. [76]. The expressed C-terminal part of the protein (451-933aa) was detected exclusively in the nucleus. Further analysis revealed that NPAS3 alone is present in both the nucleus and the cytoplasm, while co-expression with ARNT resulted in mainly nuclear localization [77]. Our predictions [...] (130-161aa), the linker between PAS domains (266-297aa) and the C-terminal part of the protein (727-773aa). Additionally, most of the predictors proposed a putative NES in the bHLH domain (88-103aa). Again, subcellular localization signals with opposite activity and located in close proximity were predicted in the bHLH domain of a bHLH-PAS protein.
NPAS4 was discovered in mammalian neurons [78]; however, further studies also detected it in non-neuronal tissues, such as human endothelial cells [79]. NPAS4 expression is highly induced to protect pancreatic β-cells from ER stress [80] and neuronal cells after ischaemia [81,82]. NPAS4 preferentially interacts with the class II bHLH-PAS partner ARNT2, but interaction with ARNT/BMAL1 is also possible [78,82]. The first studies concerning subcellular localization of NPAS4 in mammalian cells revealed strictly nuclear localization of this protein [83]. Shamloo et al. substantiated this finding for mammalian cell culture [81]; however, the authors also reported expression of NPAS4 in the cytoplasm of rat coronal brain tissue [81]. Finally, Sullivan et al. [84] reported that NPAS4, although mostly nuclear in mammalian cells, was also present in the cytoplasm. The cytoplasmic localization of this protein suggests additional roles in different cellular processes. NPAS4 was shown to induce autophagy in rat primary cortical neurons and to degrade tau proteins involved in the pathogenesis of Alzheimer's disease and other tauopathies [85]. The discrepancies in localization studies concerning NPAS4 led us to perform a detailed characterization of the subcellular localization motifs in the NPAS4 sequence [21]. NPAS4 localized in the nucleus or the cytoplasm of COS-7 and N2a cells. The proportion of nuclear to cytoplasmic NPAS4 was dependent on the glucose concentration in the medium. Furthermore, cytoplasmic localization of NPAS4 was LMB sensitive. In silico analysis suggested the presence of NLSs in the bHLH domain, the PAS-2 domain and the C-terminal region of NPAS4. Accordingly, putative NESs were expected to be located in the bHLH domain, the PAS-2 domain and the C-terminal region of NPAS4 [21]. For the purpose of this review, we have repeated all predictions (Table 1, Supplementary Materials 1, Supplementary Materials Summary), which are generally still consistent.
We performed detailed in vivo experiments, which revealed the presence of two overlapping signals in the bHLH domain: an NLS (10-52aa) and an NES (26-45aa). In addition, the region adjacent to the PAS-2 domain and the domain itself contain, in close proximity, an NLS (158-191aa), an NES (227-242aa) and a putative NLS (285-316aa) [21]. We demonstrated that the C-terminal region of NPAS4 contains an overlapping NES (591-600aa) and putative NLS (593-622aa). Additionally, we detected NES activity in the 460-580aa region; however, this area was not predicted as an NES and we were not able to identify the precise location of the specific sequence [21].
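The overlaps between opposing signals listed above can be checked directly from the residue ranges. The following sketch uses only the NPAS4 coordinates quoted in this section; the function name `opposing_overlaps` and the tuple layout are choices made here for illustration.

```python
from itertools import combinations

# NPAS4 signals as quoted in the text (1-based, inclusive residue ranges).
# Putative and confirmed signals are not distinguished in this sketch.
NPAS4_SIGNALS = [
    ("NLS", 10, 52),
    ("NES", 26, 45),
    ("NLS", 158, 191),
    ("NES", 227, 242),
    ("NLS", 285, 316),
    ("NES", 591, 600),
    ("NLS", 593, 622),
]


def opposing_overlaps(signals):
    """Return pairs of NLS/NES signals that share residues, with the shared range."""
    found = []
    for first, second in combinations(signals, 2):
        kind1, start1, end1 = first
        kind2, start2, end2 = second
        # Two closed intervals overlap when each starts before the other ends.
        if kind1 != kind2 and start1 <= end2 and start2 <= end1:
            shared = (max(start1, start2), min(end1, end2))
            found.append((first, second, shared))
    return found


for first, second, shared in opposing_overlaps(NPAS4_SIGNALS):
    print(first, second, "shared residues:", shared)
```

The check reports the bHLH pair and the C-terminal pair as overlapping, while the signals around the PAS-2 domain are adjacent but do not share residues.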

Regulation of the Subcellular Localization of Class II of bHLH-PAS Transcription Factors: ARNT1-4
As mentioned previously, the class I bHLH-PAS proteins dimerize with class II members to create a functional transcription factor complex regulating the expression of genes under their control [4]. One of the most common class II representatives is the aryl hydrocarbon receptor nuclear translocator (ARNT, HIF-1β), acting as a dimerization partner for several class I proteins: AHR, the HIF family, SIM1 and SIM2 [86]. However, each heterodimeric complex cooperates with distinct cis-acting DNA elements to regulate very different genes and pathways. Therefore, ARNT participates in signal transduction pathways engaged in xenobiotic metabolism, angiogenesis, vasculogenesis, the hypoxia response and many developmental processes [86]. Importantly, several ARNT variants have been identified in mammals: ARNT2, BMAL1 (ARNTL, ARNT3) and BMAL2 (ARNTL2, ARNT4). The defined bHLH and PAS domains are conserved in size and location between all proteins. However, all variants differ in the length of the aa sequence [3], which results from a high divergence between their C-terminal parts [86]. Importantly, the C-termini of bHLH-PAS proteins are believed to be responsible for modulation of protein function and regulation of protein activity [2]. ARNT and ARNT2 differ in tissue distribution, and only ARNT creates a functional dimer with AHR and SIM proteins [86]. While ARNT mRNA is expressed in all tissues, Arnt2 mRNA was detected only in brain, kidney, and embryos [87]. Similarly, BMAL1 and BMAL2, which are involved in circadian rhythm signalling and expressed in an oscillatory manner [88], show different tissue distributions. BMAL1 transcripts are highly expressed in brain, skeletal muscle and heart, while BMAL2 mRNA is expressed predominantly in the human fetal brain and adult liver [89]. Recently, it was shown that insulin promotes BMAL1 S42 phosphorylation and interaction with 14-3-3 protein, affecting the localization of this protein by exclusion from the nucleus to the cytoplasm [90].
The first studies concerning the localization of ARNT were inconclusive. However, subsequent analysis showed that ARNT is located predominantly in the nucleus, both in the absence and in the presence of ligands, and suggested a putative NLS, conserved between ARNT proteins, in their N-terminal part [91]. In the same year, Eguchi et al. [92] proposed that ARNT, though mainly a nuclear protein, can translocate to the cytoplasm under some conditions. The authors detected an active NLS in the bHLH domain of human ARNT (39-61aa), mutation of which shifted the mutant to the cytoplasm. The determined NLS motif is conserved in the mouse homologues ARNT and ARNT2, which suggests that its activity could also be conserved in human ARNT2 [92], which localized in the nucleus of almost all transfected cells. Inhibition of this NLS activity in the ARNT2/R46W mutant shifts this mutant to the cytoplasm [93]. These findings suggest the presence of an NES in both ARNT and ARNT2. Dougherty [86] proposed the presence of an active NES in the PAS-1 domain of ARNT2 on the basis of sequence similarity with the NES documented for BMAL1 [94]. Finally, we performed in silico analyses of putative NLSs and NESs in both human ARNT and ARNT2 (Table 2) [...]
BMAL1 was shown to localise equally in the nucleus and the cytoplasm when expressed alone, while it localized strictly in the nucleus in the presence of NPAS2. To characterise the motifs responsible for shuttling of this protein, Kwon et al. [94] performed in silico analysis of murine BMAL1. They proposed sequences in the 36-40aa and 82-88aa areas as putative NLSs, and sequences in the 109-116aa, 142-152aa and 360-369aa areas as putative NESs. Mutagenesis studies revealed that the active NLS is located in proximity to the bHLH domain (36-40aa), while active NESs are located in the PAS-1 (142-152aa) and PAS-2 (360-368aa) domains [94]. Interestingly, though Kwon et al. suggested that the putative NLS (82-88aa) is not active, Tamaru et al. [95] showed that phosphorylation of S90, which is located adjacent to the predicted NLS, is necessary for nuclear localization of BMAL1 [94]. We performed additional NLS/NES predictions using currently available servers (Table 2, Supplementary Materials 2, Supplementary Materials Summary). While the results of NES prediction were mostly consistent with those performed by the authors, we found further putative NLSs located in the linker between PAS domains (244-279aa) and in the C-terminus of the protein (471-502aa).
Ikeda and coworkers [89] documented the presence of two NLSs (13-16aa and 32-35aa) in the N-terminal part of BMAL2. Although deletion of the N-terminal part of the protein resulted in cytoplasmic localization of the mutant [89], no in silico or experimental analysis was performed to identify a putative NES. We performed predictions of both putative NLS and NES motifs (Table 2, Supplementary Materials 2, Supplementary Materials Summary). The results suggest the presence of three signals not described to date: an NLS (103-127aa) and an NES (140-156aa) in the bHLH domain, and an additional NLS (279-322aa) in the linker between PAS domains. Interestingly, a second putative NES (238-252aa) is located close to the same region between PAS domains.

Regulation of Subcellular Localization of Drosophila melanogaster bHLH-PAS Transcription Factors
Insect growth and development are controlled by the coordinated action of two hormones: 20-hydroxyecdysone (20E) and juvenile hormone (JH) [96]. Although the 20E receptor has been studied extensively, the identity and function of the JH receptor long remained elusive [97]. Finally, in 2011, MET was confirmed as the JH receptor [98]. MET belongs to the bHLH-PAS family and prevents precocious metamorphosis of D. melanogaster during development [99]. Deletion of the met gene is lethal to most species of insects; however, D. melanogaster has a MET paralog, GCE, which ensures survival of the met null mutants [100]. The functions of the paralogues are not fully redundant, and the proteins exhibit tissue-specific distribution [101]. It was shown that, in the absence of JH, MET is able to form inactive homodimers and heterodimers with GCE [102].
The first studies concerning MET subcellular localization were highly inconsistent [103,104]. Later, we showed that MET is able to translocate from the cytoplasm to the nucleus [19]. The translocation is associated with the presence of Hsp83 (the Drosophila homolog of Hsp90), which seems to be indispensable for JH binding and MET transport to the nucleus [105]. We showed that MET was mainly located in the nucleus. However, some cells presented strictly cytoplasmic distribution. We identified a dominant NLS in the PAS-1 domain (98-102aa) and a JH-inducible NLS (482-498aa) in the PAS-2 domain. MET NESs are located in PAS-1 (126-139aa) and PAS-2 (446-456aa) as well as in the C-terminus. As no NES motif was predicted in the C-terminal area, this signal was difficult to identify [19]. Finally, we determined the location of this NES (589-631aa) by sequence alignment with GCE [20]. Interestingly, in the prediction repeated for this review, we obtained a positive result in this area (Table 3, NetNES L603, Supplementary Materials 3, Supplementary Materials Summary), although still with very low probability. As mentioned previously, MET and GCE functions are not fully redundant. We showed that, in contrast to MET, GCE was distributed in both the nucleus and the cytoplasm. The homology between GCE and MET NLS and NES activities occurs only in the PAS-2 domains. The dominant NLS localized in the MET PAS-1 domain is absent in GCE, while an NLS in the C-terminus (840-859aa), unique to GCE, can be distinguished. Interestingly, the activity of this signal is crucial for ligand activation of the NLS (580-610aa) in the PAS-2 domain and for GCE translocation to the nucleus [20]. It is worth noting that in our studies [20] we used the 689aa isoform of GCE, which lacks 270 N-terminal aa residues. According to personal communication with A. Baumann, the GCE used in that study and its longer variant [106], deposited in the UniprotKB database as Q9VXW7, show similar function when expressed in transgenic D. melanogaster.
For the purpose of this review, we performed in silico predictions using the full-length GCE (Table 3, Supplementary Materials 3, Supplementary Materials Summary), and we renumbered the experimentally documented sequences according to the 959aa isoform. As suspected, there is no highly probable NLS motif in the first 270 N-terminal residues; however, NetNES predicted 44-50aa as an NES, which should be experimentally verified. We believe that all other active signals in the MET and GCE proteins are well described [19,20], and that all other predicted motifs are artefacts (Table 3). However, we realise that neighbouring or overlapping signals with opposite activities can mask each other and that their detection can be difficult. On the other hand, we have revealed the presence of signals not predicted by any of the servers used. Therefore, we conclude that only very systematic and detailed studies can lead to NLS/NES motif determination.
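The renumbering between the two GCE isoforms amounts to a fixed shift of 270 residues, as the short sketch below shows. The function names are hypothetical, and the example assumes that the GCE coordinates quoted in this section already follow the 959aa numbering; the corresponding 689aa positions are therefore computed here, not quoted from the original study.

```python
SHORT_LENGTH = 689   # isoform used in the experimental study (from the text)
FULL_LENGTH = 959    # full-length isoform used for renumbering (from the text)
OFFSET = FULL_LENGTH - SHORT_LENGTH  # 270 N-terminal residues absent from the short isoform


def to_full_length(start, end, offset=OFFSET):
    """Convert a 1-based range from short-isoform to full-length numbering."""
    return start + offset, end + offset


def to_short_isoform(start, end, offset=OFFSET):
    """Convert a full-length range back; fails if it lies in the missing N-terminus."""
    if start <= offset:
        raise ValueError("range starts within residues absent from the short isoform")
    return start - offset, end - offset


# C-terminal GCE NLS, assumed here to be given in 959aa numbering.
print(to_short_isoform(840, 859))
print(to_full_length(*to_short_isoform(840, 859)))
```

A range such as the NetNES-predicted 44-50aa lies entirely within the first 270 residues, so `to_short_isoform` rejects it: that candidate has no counterpart in the shorter isoform and could not have been tested with it.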
D. melanogaster is highly resistant to the lack of oxygen [107]. Interestingly, its hypoxia-responsive system shows significant similarity to the mammalian system and is based on the activity of two bHLH-PAS proteins: SIMA and TANGO [107,108]. SIMA, a homologue of mammalian HIF-1α, is sensitive to oxygen tension and shuttles continuously between the nucleus and the cytoplasm. It is mostly cytoplasmic in normoxia and accumulates in the nucleus in hypoxia. The nuclear export is mediated by exportin-1 [55]. The second protein, TANGO, is expressed constitutively, similarly to its mammalian class II homolog ARNT, and dimerizes with SIMA to create an active heterodimer regulating the expression of the appropriate genes [107]. As shown, TANGO subcellular localization is developmentally regulated. TANGO with no dimerization partner is found predominantly in the cytoplasm, while in the presence of SIMA and Trachealess (TRH) it is translocated to the nucleus [1,109]. Romero et al. [55] performed detailed in silico analysis of the SIMA sequence and determined putative NLSs in the 537-568aa, 1210-1230aa, and 1406-1409aa areas, and putative NESs in the 92-101aa, 115-124aa, 1011-1020aa and 1131-1140aa areas. They proved experimentally that the NLS motif at 1210-1230aa, a sequence highly conserved between all hypoxia factors, is active. Active NESs (92-101aa and 115-124aa) were also determined in the SIMA bHLH domain [55]. We used currently available predictors for SIMA sequence analysis and obtained some additional positive results (Table 3, Supplementary Materials 3, Supplementary Materials Summary). We suggest that experimental verification of the putative NLS in the bHLH domain of SIMA (46-91aa), predicted with high probability by most of the servers used, would be especially interesting.
In the case of TANGO, no NLS/NES motifs have been documented to date, as no detailed studies have been performed. Using the ClustalX server (www.clustal.org), we aligned the TANGO sequence with that of ARNT. However, there was no conservation in the area of the NLS documented for ARNT (Supplementary Materials 3, Supplementary Materials Summary). Our predictions suggest NLS activity in the bHLH and PAS-2 domains, while NESs are proposed in the PAS-1 and PAS-2 domains. We would recommend testing these experimentally, especially the activity of the NLS predicted with high probability in the bHLH domain (23-54aa) and of the putative NES (259-272aa) within the PAS-2 domain.

Concluding Remarks
The subcellular distribution of the bHLH-PAS proteins is one of the mechanisms regulating their functions and activities. Recently, it was shown that, in addition to functioning as transcriptional regulators, some bHLH-PAS transcription factors, when located in the cytoplasm, take part in the regulation of translational processes [110-112].
Most bHLH-PAS family members are known to possess NLS and/or NES motifs in the defined bHLH/PAS domains (Figure 3). The presence of localization signals within defined domains responsible for specific interactions and ligand binding can make the regulation of subcellular translocation a highly complex interplay of different factors. Importantly, in the case of many transcription factors presented in this review, at least one of the NLS/NES motifs is also located in the C-terminal region (Figure 3) [19-21,48,52,55,76], which is responsible for interaction with activators/repressors influencing protein activity. The simultaneous presence of an NLS and an NES of similar strength can be the reason for ubiquitous localization of proteins, or of specific parts of proteins, in cells and/or for difficulties with precise signal detection. The presence of multiple localization signals with opposing activities enables complex and precisely balanced regulation of bHLH-PAS TF shuttling, by masking and unmasking of specific localization signals in different parts of the proteins in response to different stimuli. Interestingly, the NLSs/NESs predicted in our review are often overlapping or located in close proximity. We believe that the subcellular localization of the presented TFs depends on the integrated action of several localization sequences rather than a single one, including those not identified to date. The activity of the signals can be modulated by ligands, posttranslational modifications (PTMs) and interactions with partner proteins. We emphasize the need for additional detailed studies to identify NLS/NES motifs not yet described.

Acknowledgments:
The authors apologize to investigators whose contributions were not cited more extensively because of space limitations.

Conflicts of Interest:
The authors declare no conflict of interest.