The PUF Protein Family: Overview on PUF RNA Targets, Biological Functions, and Post Transcriptional Regulation

Post-transcriptional regulation of gene expression plays a crucial role in many processes. In cells, it is mediated by diverse RNA-binding proteins. These proteins can influence mRNA stability, translation, and localization. The PUF protein family (Pumilio and FBF) is composed of RNA-binding proteins highly conserved among most eukaryotic organisms. Previous investigations indicated that they could be involved in many processes by binding corresponding motifs in the 3′UTR or by interacting with other proteins. To date, most of the investigations on PUF proteins have been focused on Caenorhabditis elegans, Drosophila melanogaster, and Saccharomyces cerevisiae, while only a few have been conducted on Arabidopsis thaliana. The present article provides an overview of the PUF protein family. It addresses their RNA-binding motifs, biological functions, and post-transcriptional control mechanisms in Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae, and Arabidopsis thaliana. These items of knowledge open onto new investigations into the relevance of PUF proteins in specific plant developmental processes.


Introduction
In most eukaryotic organisms, gene expression is regulated at both the transcriptional and post-transcriptional levels; this is considered a powerful strategy that allows these organisms to flexibly adapt their growth and development to environmental inputs. Extensive investigations have reported that RNA-binding proteins (RBPs) regulate many aspects of RNA processing, such as RNA splicing, polyadenylation, capping, modification, transport, localization, translation, and stability, collectively called RNA metabolism [1][2][3][4][5]. The resolution of protein structures and the functional characterization of RBPs have shown that these proteins possess several conserved motifs and domains, such as RNA-recognition motifs (RRMs), zinc fingers, K homology (KH) domains, DEAD/DEAH boxes (a highly conserved Asp-Glu-Ala-Asp motif in RNA helicases), Pumilio/FBF (PUF; FBF stands for Caenorhabditis elegans fem-3 binding factor) domains, and pentatricopeptide-repeat (PPR) domains [6].
The Pumilio RNA-binding protein family (the PUF family) is a large family of RBPs found in all eukaryotes; the number of PUF gene copies varies widely among model organisms. PUF proteins are mainly involved in post-transcriptional control: by binding to specific regulatory cis-elements of their mRNA targets, they govern RNA decay and translational repression [7]. They have also been reported to promote ribosome stalling, to facilitate the recruitment of microRNAs (miRNAs), and to be linked to chromosomal instability [8][9][10][11]. PUF proteins can therefore strongly influence the expression level of their target genes at the post-transcriptional level. For example, Puf6p inhibits the translation of Asymmetric Synthesis of HO (ASH1) mRNA in yeast: Gu et al. showed that ASH1 was expressed at a higher level in the puf6 mutant than in the wild type [12]. Suh et al. indicated that FBF, a PUF protein in Caenorhabditis elegans, represses gld-1 expression by interacting with gld-1 mRNA; the GLD-1 level was approximately sixfold higher in the fbf mutant than in the wild type [13].
The present article provides a brief overview of PUF proteins, especially their binding motifs, biological functions, and regulation mechanisms, which seem to be conserved among eukaryotes. Given that our knowledge of the functional roles of RBPs in plants lags far behind our understanding of their roles in other organisms, this article ends by briefly highlighting the interest of investigating the role of the PUF family in certain key mechanisms of plant functioning.

RNA-Binding Target of PUF Proteins
Drosophila melanogaster Pumilio (DmPUM) and Caenorhabditis elegans fem-3 binding factor (FBF) are the two founding members of the PUF protein family [14]. These canonical PUF proteins contain an extensively conserved RNA-binding domain (the Pumilio homology domain, PUM-HD), composed of eight consecutive α-helical PUF repeats that adopt a crescent-shaped structure [1,14][15][16][17][18]. The crystal structure of PUM-HD revealed that each of the eight PUF repeats specifically recognizes a single nucleotide in its target RNA, so that the domain can bind as many as eight consecutive nucleotides; this binding model is conserved [7,19]. PUF proteins initially appeared to bind RNAs containing a 5′-UGU-3′ triplet (Figure 1), and were thought to act cooperatively with other proteins [14,19][20][21][22][23][24]. For instance, DmPUM binds to the Nanos response element (NRE), which harbors motifs A (5′-GUUGU-3′) and B (5′-AUUGUA-3′), in the 3′UTR of hunchback (hb) mRNA [25][26][27]. Each motif contains the core UGU triplet and interacts with one Pumilio protein in a cooperative manner [28]. In Caenorhabditis elegans, FBF-1 and FBF-2 (C. elegans fem-3 mRNA-binding factors 1 and 2, two nearly identical proteins collectively called FBF) bind to the same core RNA-binding sites, which possess the UGU trinucleotide and an AU pair located 3 nucleotides downstream (5′-UGUDHHAUA-3′; D, A or G or U; and H, A or C or U) [29]. The binding activity of FBF-2 and other C. elegans PUF proteins (PUF6 and PUF11) is enhanced by an additional binding pocket for a cytosine located upstream [30]. In Saccharomyces cerevisiae, Puf3p, which localizes to mitochondria, binds the RNA sequence 5′-UGUANAUA-3′, while yeast Puf4p and Puf5p recognize 5′-UGUR-3′ (R, purine)-containing sites [31,32]. Experiments also indicated that yeast Puf4p and Puf5p mainly function in the nucleolus [33].
To be functional, PUF1p (Jsn1p) and the closely related protein PUF2p bind RNAs containing 5′-UAAU-3′ rather than the more common motif 5′-UGUR-3′. This difference is attributed to their "non-canonical" features, namely a smaller number of PUF repeats [34]. In mice, PUM2, which contains a C-terminal RNA-binding domain related to the Drosophila Pumilio homology domain (PUM-HD), can bind to the consensus sequence 5′-UGUANAUARNNNNBBBBSCCS-3′ (N, any base; R, A or G; B, C or G or U; and S, G or C) [35]. According to many authors [7,19,36], each PUF repeat may bind its RNA base in a similar way. However, PUF proteins can recognize RNA sequences beyond the PUM-HD scaffold and also interact with non-cognate sequences, which underlines the complexity and adaptability of their binding activity [37][38][39]. In support of this point, other studies showed that PUF proteins also bind within coding sequences (CDSs) or 5′UTRs. For example, Pumilio binds within the CDS of paralytic (para) mRNA, which encodes the Drosophila voltage-gated sodium channel [40]. In Cryptococcus neoformans, Pum1, an ortholog of both S. cerevisiae Puf3p and Drosophila melanogaster Pumilio, binds to the consensus binding element 5′-UGUACAUA-3′ in the 5′UTR of its own mRNA and thereby participates in the regulation of hyphal morphogenesis [41].
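The consensus motifs quoted above are written with IUPAC ambiguity codes, so a first-pass search for putative PUF sites in a transcript can be expressed as a pattern match. The following is a minimal sketch in Python, not a method taken from the cited studies: the function names `motif_to_regex` and `find_sites` and the example sequence are hypothetical and invented for illustration, and a sequence match alone says nothing about actual binding in vivo.

```python
import re

# IUPAC nucleotide ambiguity codes, written for RNA (U in place of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "S": "[CG]", "W": "[AU]",
    "K": "[GU]", "M": "[AC]",
    "B": "[CGU]", "D": "[AGU]", "H": "[ACU]", "V": "[ACG]",
    "N": "[ACGU]",
}


def motif_to_regex(motif):
    """Translate an IUPAC RNA consensus into a compiled regular expression."""
    pattern = "".join(IUPAC_RNA[base] for base in motif.upper())
    # A lookahead is used so that overlapping sites are all reported.
    return re.compile(f"(?=({pattern}))")


def find_sites(motif, sequence):
    """Return (0-based start, matched sequence) for every putative site."""
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input too
    return [(m.start(), m.group(1)) for m in motif_to_regex(motif).finditer(rna)]


# Hypothetical sequence fragment, invented for illustration only.
example_utr = "AAUGUGCCAUAGGCUGUACAUACC"

print(find_sites("UGUDHHAUA", example_utr))  # FBF-type consensus
print(find_sites("UGUANAUA", example_utr))   # Puf3p-type consensus
print(find_sites("UGUR", example_utr))       # Puf4p/Puf5p-type core
```

Reporting start positions together with the matched sequence makes it easy to check by eye which consensus a given site satisfies; any site found this way remains putative until it is supported by binding or functional data.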
In plants, few investigations have sought to identify PUF-binding sites and thereby the role of PUF proteins in plant growth and development. Experimental results from Tam et al. indicated that AtPum2, an Arabidopsis PUF protein, binds the RNA of the Drosophila Nanos Response Element I (NRE1), 5′-UGUAUAUA-3′, located in the 3′UTR [7]. They also showed that APUM1 to APUM22 can shuttle between the nucleus and the cytoplasm through the exportin1-mediated pathway, whereas APUM23 and APUM24 localize to the nucleus [7]. Through three-hybrid assays, Francischini and Quaggio showed that among the 25 PUF members identified in Arabidopsis, APUM1 to APUM6 can specifically bind to the Nanos response element sequence, which is also recognized by Drosophila Pumilio proteins [42]. By three-hybrid screening of an Arabidopsis RNA library, they also identified an APUM-binding consensus sequence, i.e., a 5′-UGUR-3′ tetranucleotide sequence reported to be present in all targets of the PUF family [1]. However, the binding sequence of the "non-canonical" Arabidopsis PUM23 (APUM23) is 10 nucleotides long, contains a 5′-UUGA-3′ core sequence, and has a preferred cytosine at nucleotide position 8 [43]. These investigations showed that the consensus PUF-binding motif may be ubiquitous among eukaryotes, but no study in plants has reported a PUF motif in regions other than the 3′UTR.

Figure 1. PUM-HD of APUM2 bound to the core nucleotides of 5′-UGUANAUA-3′ [42]. Each repeat of the PUM-HD binds to its corresponding nucleotide through van der Waals forces. The protein structure was generated by SWISS-MODEL (https://swissmodel.expasy.org/) [44][45][46][47].

Putative Biological Functions of PUF Proteins
Many studies have demonstrated that individual PUF proteins can recognize hundreds of unique transcripts, suggesting that this family of proteins can regulate many eukaryotic processes, including stem cell control, developmental patterning, neuron functioning, and organelle biogenesis (Table 1). Up to now, the most extensive investigations of PUF proteins have focused on Caenorhabditis elegans, Drosophila melanogaster, and Saccharomyces cerevisiae, and only a few reports are available for plants. Based on the biological functions analyzed so far, PUF proteins control diverse processes in these species. In Drosophila, Pumilio was initially identified through its requirement for embryonic development, in which it regulates hunchback (an important morphogen gene) in collaboration with the zinc finger protein Nanos [70]. Other processes, such as stem cell proliferation, motor neuron function, and memory formation, are also under the control of Pumilio [70]. In Caenorhabditis elegans, FBFs control gametogenesis by mediating the sperm/oocyte switch, while PUF8 displays several functions, including control of the sperm/oocyte switch during normal development and antagonistic effects on germline stem cell proliferation [14]. The latter depend on the genetic context: PUF8 and MEX3 (a KH-type RNA-binding protein) redundantly promote germline stem cell proliferation in Caenorhabditis elegans [52], whereas PUF8 acts as a repressor of germline stem cell proliferation in temperature-sensitive glp-1(ar202) gain-of-function mutants, in which GLP-1 activity is high [71]. Two groups of PUF RNA-binding proteins, PUF-3/11 and PUF-5/6/7, play different roles in Caenorhabditis elegans oogenesis. All of them are involved in oocyte formation, but PUF-3/11 limits oocyte growth while PUF-5/6/7 promotes oocyte organization and formation [72]. Salvetti et al. identified a homolog of Drosophila Pumilio in Dugesia japonica, named DjPum. It is expressed in planarian stem cells and is involved in the formation of the regenerative blastema [73]. Moreover, the same authors showed that DjPum is essential for neoblast maintenance.
The yeast PUF protein Mpt5p regulates the stability of HO mRNA by stimulating removal of its poly(A) tail [74]. HO is involved in mating-type switching in yeast: it introduces double-stranded DNA breaks that initiate recombination [75]. The PUF3 protein plays a key role because it can bind and regulate more than 100 mRNAs that encode proteins with mitochondrial functions [76]. A bioinformatic analysis predicted hmt1, which encodes a protein arginine N-methyltransferase, and dut1, which encodes a dUTP pyrophosphatase, as putative mRNA targets of PUF4p in yeast [33]. PUF5p is a broad RNA regulator in S. cerevisiae that binds to more than 1000 RNA targets, which make up around 16% of the yeast transcriptome. These RNAs are involved in many aspects of S. cerevisiae biology, such as embryonic cell cycle, cell wall integrity, or chromatin structure [77]. Nop9, an S. cerevisiae PUF protein, recognizes sequences and structural features of 20S pre-rRNA near the nuclease cleavage site. It also associates with the SSU processome/90S pre-ribosome through protein-protein interactions before its 20S pre-rRNA target site is transcribed [78]. Mpt5p (also called Puf5p or Uth4p) promotes temperature tolerance and increased replicative life span in S. cerevisiae through an unknown mechanism thought to partly involve the cell wall integrity (CWI) pathway. mpt5∆ mutants also have a short life span; this defect is suppressed when CWI signaling is activated [79].
Some reports describe the pathways in which PUF proteins are involved in other species. Peronophythora litchi PIM90 encodes a putative PUF protein; its expression is relatively low during cyst germination and plant infection, but high during asexual and sexual development [80]. In Plasmodium falciparum, PfPUF1 plays an important role in the differentiation and maintenance of gametocytes, especially female gametocytes [81]. In PfPUF1-disrupted lines, gametocytes appeared normal before stage III but subsequently exhibited a sharp decline in gametocytemia. In Cryptococcus, Pum1 is auto-repressive during growth, controls its own morphotype expression, and positively stabilizes the expression of ZNF2 (a filamentation regulator) to achieve the filamentous morphotype required for sexual development [41]. In humans, the two Pumilio proteins PUM1 and PUM2 were identified as positive regulators of retinoic acid-inducible gene I (RIG-I) signaling, which plays a pivotal role in innate immunity [82]. Overexpression of PUM1 and PUM2 increased the activity of the IFN-β promoter (IFN-β being an important factor in RIG-I signaling) induced by Newcastle disease virus (NDV), while the opposite effect was reported when these PUM proteins were knocked down [82].
PUF proteins may also act as post-transcriptional repressors through a conserved mechanism in plants. APUM5 is associated with both biotic and abiotic stress responses [66]. APUM5-overexpressing plants showed hypersensitive phenotypes under salt and drought treatment during germination and at the seedling and vegetative stages. Further results indicated that the APUM5-Pumilio homology domain (PHD) protein bound to the 3′UTR of many salt and drought stress-responsive genes that contain putative Pumilio RNA-binding motifs [66]. APUM23 regulates leaf morphogenesis by regulating the expression of KANADI (KAN) genes, which are members of the GARP family and key regulators of abaxial identity [68]. Moreover, PUF proteins have also been predicted to participate in many processes in Arabidopsis, such as responses to nutrients, light, and iron deficiency, ABA (abscisic acid) signaling, and osmotic stress [7]. For example, APUM23, a constitutively expressed nucleolar PUF-domain protein that is present at higher levels in metabolically active tissues, was upregulated in the presence of glucose or sucrose. APUM23 loss-of-function plants showed slow growth, serrated and scrunched leaves, and an abnormal venation pattern, associated with altered rRNA processing [4]. A transcriptome analysis in Arabidopsis revealed that several PUF members, in particular APUM9 and APUM11, showed higher transcript levels in the reduced dormancy 5 mutant during seed imbibition. This study indicated that PUF proteins might also be involved in seed dormancy in plants [63]. Some studies showed that APUM1 to APUM6 may be involved in early Arabidopsis growth and development by binding to the RNA of target genes such as CLAVATA-1, WUSCHEL, FASCIATA-2, and PINHEAD/ZWILLE, which are involved in the regulation of meristem growth and stem cell maintenance [42,83]. The PUF protein APUM24 was also recently described as being expressed in tissues undergoing rapid proliferation and cell division [65]. Moreover, APUM24 is required for the timely removal of rRNA byproducts for rapid cell division and early embryogenesis in Arabidopsis. APUM24 loss-of-function plants displayed defects in cell patterning.

PUF Proteins Control Post-Transcriptional Processes through Different Mechanisms
PUF proteins exert their post-transcriptional action through various mechanisms, such as activation of mRNA translation, repression of mRNA translation, and localization of mRNA [58,84,85]. One PUF repression mechanism probably correlates with shortening of the poly(A) tail of target mRNAs, although the link between deadenylation and repression awaits further research [1] (Figure 2). In yeast, PUF6p inhibits the initiation of ASH1 mRNA translation via interactions with Fun12p during transport of the mRNA; this repression can be relieved by CK2 phosphorylation of the N-terminal region of PUF6p when the mRNA reaches the bud tip [86]. PUF6p can also form a protein-RNA complex with She2p and repress translation by interacting with translation initiation factors and preventing ribosome transit [12]. Mpt5p, a yeast PUF protein, regulates HO mRNA and triggers shortening of its poly(A) tail. A yeast PUF protein physically binds Pop2p (a component of the Ccr4p-Pop2p-Not deadenylase complex), which is required for PUF repression activity. Simultaneously, the PUF protein recruits the deadenylase Ccr4p as well as Dcp1p and Dhh1p, which are involved in mRNA regulation. The PUF-Pop2p interaction is conserved in yeast, worms, and humans [60].
In Caenorhabditis elegans, FBF regulates the activation of gld-1 (defective in germline development-1). A possible mechanism of that regulation is linked to cytoplasmic polyadenylation, i.e., extension of the mRNA poly(A) tail by a cytoplasmic poly(A) polymerase [13]. FBF interacts with gld-1 mRNA and recruits the cytoplasmic poly(A) polymerase [87]. FBF is also involved in another mechanism: it can bind the 3′UTR of EGL-4, a cGMP-dependent protein kinase, and may localize translation near the sensory cilia and cell body. In that experiment, the photoconvertible stony coral protein Kaede was used as a reporter, and cell biology analysis showed that the subcellular distribution of newly synthesized Kaede changed dramatically in the fbf-1 mutant. This result suggests that the binding of FBF may direct the subcellular localization of EGL-4 translation and enhance its translation [88]. In S. cerevisiae, Nop9 is a PUF-like protein. It recognizes sequences and structural features of 20S pre-rRNA near the cleavage site of the nuclease Nob1 and thus reduces Nob1 cleavage efficiency [78]. Nob1 cleavage is the final processing step in the production of mature 18S small subunit ribosomal RNA.

Conclusions
Post-transcriptional regulation is an essential component of gene expression regulation. Numerous studies conducted over several decades have unveiled and characterized many factors involved in post-transcriptional regulation, such as microRNAs, poly(A)-binding proteins (PABPs), small nuclear RNAs (snRNAs), and RNA-binding proteins (RBPs). PUF family RNA-binding proteins are key post-transcriptional regulators present throughout eukaryotes. PUF proteins influence many aspects of different metabolic pathways, and the expression of PUF genes is regulated by many endogenous signals [25,48,89,90]. This article provides an overview of PUF proteins, i.e., their RNA targets, biological functions, and regulation mechanisms. These findings may help uncover further functions of plant PUF proteins, as current knowledge about the regulation of PUF gene expression and the role of PUF proteins in plant biology is scarce. Most studies on plant PUF proteins have focused only on Arabidopsis thaliana, in which 26 PUF family members have been reported [7,42]. The relevance of PUF proteins in specific plant developmental processes such as branching, rhizogenesis, and flowering, which are well known to be finely and flexibly controlled by endogenous and exogenous stimuli, remains to be investigated. For example, the 3′UTRs of some branching-related genes in Arabidopsis, such as MORE AXILLARY BRANCHES 2 (AtMAX2) and SMAX1-LIKE 6 (SMXL6), contain putative PUF-binding sites. In addition, putative PUF-binding sites were also found in the 3′UTRs of FLOWERING LOCUS T (FT) and TERMINAL FLOWER 1 (TFL1), which are related to flowering in Arabidopsis. The 3′UTR of RETARDED ROOT GROWTH (RRG), a rhizogenesis-related gene, harbors many putative PUF-binding sites. Therefore, some plant developmental processes may be controlled by PUF proteins at the post-transcriptional level. The detailed mechanisms underlying these developmental processes need to be studied in depth in the future.
