The WD40 Gene Family in Potato ( Solanum Tuberosum L.): Genome-Wide Analysis and Identiﬁcation of Anthocyanin and Drought-Related WD40s

: WD40 proteins, also known as WD40 domain proteins, constitute a large gene family in eukaryotes and play multiple roles in cellular processes. However, systematic identiﬁcation and analysis of WD40 proteins have not yet been reported in potato ( Solanum tuberosum L.). In the present study, 178 potato WD40 ( StWD40 ) genes were identiﬁed and their distribution on chromosomes, gene structure, and conserved motifs were assessed. According to their structural and phylogenetic protein features, these 178 StWD40 genes were classiﬁed into 14 clusters and 10 subfamilies. Collinearity analysis showed that segmental duplication events played a major role in the expansion of the StWD40 gene family. Synteny analysis indicated that 45 and 23 pairs of StWD40 genes were orthologous to Arabidopsis and wheat ( Triticum aestivum ), respectively, and that these gene pairs evolved under strong purifying selection. RNA-seq data from di ﬀ erent tissues and abiotic stresses revealed tissue-speciﬁc expression and abiotic stress-responsive StWD40 genes in doubled monoploid potato (DM). Furthermore, we further analyzed the WD40 genes might be involved in anthocyanin biosynthesis and drought stress in tetraploid potato cultivars based on RNA-seq data. In addition, a protein interaction network of two homologs of Arabidopsis TTG1 , which is involved in anthocyanin biosynthesis, was constructed to identify proteins that might be related to anthocyanin biosynthesis. The result showed that there were 112 pairs of proteins interacting with TTG1 , with 27 being di ﬀ erentially expressed in pigmented tissues. This study indicates that WD40 proteins in potato might be related to anthocyanin biosynthesis and abiotic stress responses.


Introduction
WD40 proteins, also known as WD40 domain proteins, constitute a large gene family of eukaryotes [1]. The first WD40 protein was identified as a subunit of the bovine β-transduction varieties. The results provide a basis for further study on functional characterization of the WD40 protein family.

Sequence Analysis and Structural Characterization in Potato
The coding sequence (CDS), protein sequence, and genomic sequence of the identified StWD40 genes were downloaded from PGSC. All StWD40 sequences were submitted to EXPASY (https: //web.expasy.org/protparam/) to analyze the number of amino acids, molecular weight (MW), and theoretical isoelectric point (pl) [41,42]. The MEME program (version 5.0.4, http://alternate.meme-suite. org/tools/meme) was used to identify motifs in the StWD40 sequence with the following parameters: 10 motifs with an optimal motif width of 6 to 50 amino acid residues and any number of repeats [43]. The exon-intron structures of the StWD40 genes were graphically displayed by the Gene Structure Display Server (GSDS2.0, http://gsds.cbi.pku.edu.cn/) [44].

Chromosomal Localization and Gene Duplication in Potato
The chromosome distribution information of each StWD40 gene was acquired from PGSC. MapChart software was used to draw the chromosomal positions map of StWD40 genes and the relative distances. The events of the StWD40 duplication was confirmed based on two criteria: (a) The length of the shorter aligned sequence covered >70% of the longer sequence, and (b) the similarity of the two aligned sequences was >70% [45,46]. Two genes located in the same chromosomal fragment of less than 100 kb and separated by five or fewer genes were identified as tandem duplicated genes [47]. The segmental duplications were confirmed by searching the Plant Genome Duplication Database (http://chibba.pgml.uga.edu/duplication/) [48]. MCScanX was used to analyze the synteny of the WD40 genes between potato and other plants [49], and figures were drawn by Circos v0.69 [50]. To further estimate the duplication events of WD40s, Ka (non-synonymous)/Ks (synonymous) was calculated using KaKs Calculator 2.0 [51].

Phylogenetic Analysis and Classification of StWD40 Genes
The full-length amino acid sequences of 232 AtWD40s and the newly identified 178 StWD40s were used for phylogenetic analysis. Multiple sequence alignments were performed using ClustalW progress. An unrooted maximum likelihood phylogenetic tree was constructed using MEGA (Molecular Evolutionary Genetics Analysis) 7.0 software [52] with a bootstrap test including 1000 iterations using the Poisson method.

Plant Materials and Treatments
The research materials were three potato varieties of different colors, including "Heimeiren" (HM), a purple potato cultivar with purple skin and purple flesh, "Xindaping" (XD), a white potato cultivar with white skin and white flesh, and "Lingtianhongmei" (LT), a red potato cultivar with red skin and red flesh, and were grown in a greenhouse at Gansu Agricultural University in Lanzhou, Gansu Province, China. Six fresh tubers of each variety were used (diameter = 4-5 cm) to harvest the skin tissues and flesh tissues. Skin tissues were carefully obtained from cortex tissue. Flesh tissues were isolated with at least 5 mm distance from the skin. The samples were immediately frozen in liquid nitrogen and stored in a refrigerator at −80 • C for use.
For the drought stress experiment, "Atlantic" (A, a drought-sensitive cultivar) and "Qingshu No. 9" (Q, a drought-tolerant cultivar) were grown in the field under a rain exclusion shelter at the Dingxi Academy of Agricultural Science in Dingxi, Gansu Province, China. The treatments included drought stress and a well-watered control (watering provided by a drip irrigation system). The experimental design was a randomized block with three replications of each treatment, with 10 plants in each experimental unit. Before seedling emergence, both treatments were irrigated suitably and equally. After emergence, plants under the drought stress treatment did not receive water for the remainder of the growing season. At the early flowering stage, full-blooming stage, and flower-falling stages, the foliage of three plants in each experimental unit were harvested and pooled as a single sample. Samples were immediately frozen in liquid nitrogen and stored at −80 • C until later use.

RNA Extraction, qPCR, and Statistical Analysis
For quantitative real-time PCR (qPCR) analysis, total RNA extractions of different tissues (including skin and flesh obtained from three color potato cultivars and leaves obtained from two potato cultivars under droughts stress) were carried out using an RNA extraction kit (Tiangen DP419, Beijing, China). The integrity of RNA was monitored by agarose gel electrophoresis and the concentration was detected using a Nanodrop ND-2000 spectrophotometer (Nanodrop Technologies, Wilmington, DE, USA). A total of 2 µg RNA per sample was used for first-strand cDNA synthesis. Elimination of genomic DNA contamination and first-strand cDNA synthesis were carried out using FastKing RT kit with gDNase (Tiangen KR116, Beijing, China). The qPCR was conducted with a CFX96 TouchTM Real-Time PCR Detection System (Bio-Rad, Hercules, CA, USA) using SuperReal PreMix Plus (SYBRGreen FP205; Tiangen, Beijing, China) with three biological replicates. The qPCR reaction conditions were as follows: 30 s at 95 • C, followed by 40 cycles of 5 s at 95 • C, 30 s at 60 • C, followed by 65-95 • C melting curve detection. The qPCR efficiency of each gene was obtained by analyzing the standard curve of a cDNA gradient dilution. The housekeeping gene was StEF-1α (AB061263) [53]. Relative expression was calculated using the comparative Ct (2 −∆∆Ct ) method [54]. The olidodT primers were used for qPCR and were synthesized by Sangon Biotech (Shanghai, China). The primers and melt curves are listed in Table S1 and Figure S1.
Data quantified from the qPCR of three biological replicates were performed with a one-way ANOVA analysis of variance using the statistical software package SPSS 22.0. Differences in gene expression levels were processed by Duncan's multiple range tests and the least significant difference (LSD). Significance was defined at a level of p < 0.05.

RNA-Seq Data Analysis
Total RNA of the aforementioned samples was chosen for further RNA-seq library construction. Enrichment of mRNA, fragmentation, addition of adapters, size selection, PCR amplification, and Next-Generation Illumina sequencing were performed by Biomarker Technologies Corporation (Beijing, China). After RNA sequencing, the raw data were uploaded to NCBI (Project ID PRJNA541919 and PRJNA541096). High-quality, clean reads were obtained by trimming the raw reads and filtering out contaminants, adapters, phred scores of less than 20, and uncertain bases. The cleaned data were aligned to PGSC_DM_v3.4 gene models downloaded from Solanaceae Genomics Resource at Michigan State University (http://solanaceae.plantbiology.msu.edu/pgsc_download.shtml) by Bowtie2 (v2.2.9). The number of mapped clean reads were counted (by fragment) and subjected to differential expression analysis using the edge R package (http://www.r-project.org/). Genes with the absolute value of log2FC (log2 fold change) of ≥1 and a false discovery rate (FDR) of <0.05 were considered to be significant differentially expressed genes (DEGs). The DEGs were annotated against nonredundant database (Nr), SwissProt/UniProt Plant Proteins, Kyoto Encyclopedia of Genes and Genomes (KEGG), Cluster of Orthologous Groups of proteins (COG/KOG) and the potato protein database (ftp://ftp.jgi-psf.org/pub/compgen/phytozome/v9.0/ Stuberosum) by BLASTX, with a cut-off E-value of 1.0e-5. Then, DEGs were subjected to enrichment analysis of GO (Gene Ontology) functions and KEGG pathways.
To study the expression patterns of WD40 genes in various tissues (carpels, petals, sepals, stamens, whole mature flowers, immature fruits, mature fruits, internal to the fruit, stolons, tubers, stems, leaves, petioles, callus, roots, and shoots) and under abiotic stress (salt treatment: 150 mM NaCl; mannitol-induced drought stress: 260 µM mannitol; heat treatment: 35°C), the same methods as mentioned above were used, Illumina RNA-seq data were downloaded from the PGSC [55], and TBtools software was used to generate the heatmap [56].

The Interaction Nework of StAN11(TTG1) Proteins
STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations and each protein-protein interaction is annotated with one or more "scores". These scores are indicators of confidence to show how likely STRING judges an interaction to be true, and to provide the available evidence. The threshold of combined score >400 indicate a confidence of medium or better.
Based on the RNA-seq dataset of pigmented cultivars, we selected PG0026477 and PG0000561 (annotated as StAN11) to predict the protein interaction network. Specific protein interactions were constructed using online STRING v11.0 software (Search Tool for the Retrieval of Interacting Genes/Proteins, http://string-db.org/) with a combined score >400 (medium confidence) [57]. As active interaction sources, text mining, experiments, databases, co-expression, neighborhood, gene fusion, and co-occurrence were selected. Genes with FPKM (fragments per kilobase of exon per million fragments mapped) > 5 were retained to construct the protein interaction network. The interaction network was visualized by Cytoscape v3.7.1 [58].

Identification of StWD40 Proteins
The Hidden Markov Model (HMM) search was performed against the potato and Arabidopsis genomes, excluding sequences without a typical WD40 domain via the Pfam (pfam00400) and SMART (SM00320) websites. Ultimately, a total of 178 StWD40 genes in potato and 232 AtWD40 genes in Arabidopsis were obtained, of which 230 AtWD40 genes were consistent with those identified by   [31] and two more AtWD40 genes were identified in our work. The StWD40 protein varied largely in length and physicochemical properties; the length of amino acid residues of StWD40 proteins ranged from 61 to 1552, the molecular weights were between 6.43 and 173.70 kDa, and the isoelectric point (pI) values ranged from 4.26 to 9.72 (Table S2). The transcript IDs were converted to their corresponding gene IDs, and the "SC0003DMG40" in the gene ID was omitted for convenience. The gene IDs are listed in Table S2.

Chromosomal Distribution, Gene Structure, and Subfamily Classification of StWD40 Proteins
The position of each StWD40 gene was determined by mapping their sequences to the chromosomes of the potato genome, showing that 177 StWD40 genes (PG0019022 was not located) were distributed extensively and unevenly on 12 chromosomes ( Figure 1A). A further 26 StWD40 genes were located on chromosome 3, which had the largest number of StWD40 genes, whereas chromosomes 5 and 10 had the least number of StWD40 genes ( Figure 1B). Most StWD40 genes were located at the proximal or distal ends of these chromosomes, and high densities of StWD40 genes were distributed at the bottoms of chromosomes 1, 3, 7, and 12. In contrast, relatively high densities of StWD40 genes were detected at the tops of chromosomes 5 and 11. Agronomy 2019, 9, x FOR PEER REVIEW 6 of 24 chromosomes 5 and 10 had the least number of StWD40 genes ( Figure 1B). Most StWD40 genes were located at the proximal or distal ends of these chromosomes, and high densities of StWD40 genes were distributed at the bottoms of chromosomes 1, 3, 7, and 12. In contrast, relatively high densities of StWD40 genes were detected at the tops of chromosomes 5 and 11. The full-length amino acid sequences of 178 newly identified StWD40s were used to construct an unrooted phylogenetic tree by MEGA 7.0 software, as shown in Figure 2A. In order to gain insight into the structures of the StWD40 genes, we analyzed their exon and intron organization. The results showed that the numbers of exons and introns in the StWD40 gene family differed greatly ( Figure  2B). There were 23 StWD40 genes (19.92%), which only had one exon and no intron. More than half (101) of the StWD40 genes contained 1-9 introns, and the rest of them (54) contained 10 or more introns. The details are listed in Table S2. The full-length amino acid sequences of 178 newly identified StWD40s were used to construct an unrooted phylogenetic tree by MEGA 7.0 software, as shown in Figure 2A. In order to gain insight into the structures of the StWD40 genes, we analyzed their exon and intron organization. The results showed that the numbers of exons and introns in the StWD40 gene family differed greatly ( Figure 2B). There were 23 StWD40 genes (19.92%), which only had one exon and no intron. More than half (101) of the StWD40 genes contained 1-9 introns, and the rest of them (54) contained 10 or more introns. The details are listed in Table S2. A, the light blue box represents subfamily B, the purple box represents subfamily C, the black box represents subfamily D, the orange box represents subfamily E, the red box represents subfamily F, the dark green box represents subfamily G, the yellow box represents subfamily H, the blue box represents subfamily I, and the grey box represents subfamily J (A). Exon/intron organization of the StWD40 genes. The blue box indicates the exon and the black line of the same length indicates the intron. The upstream/downstream area is indicated by a red box. The numbers 0, 1, and 2 represent the splicing phase of the intron (B). Distribution of conserved motifs in the StWD40 genes. Boxes of different colors represent 10 putative motifs (C). The 10 classification domains of StWD40 genes are represented by boxes of different colors (D).
To further study the diversification of these StWD40 genes in potato, we searched for conserved motifs of these proteins using the online MEME program. In total, 10 conserved motifs were identified and designated as motifs 1 to 10 ( Figure 2C). All 178 StWD40 proteins had motif 1 and motif 2, which contained 11 and 15 amino acids, respectively. There were 171 StWD40 proteins that contained motif 3. Motif 8 and motif 9 were the least distributed, of which nine StWD40 proteins contained motif 8 and six StWD40 proteins contained motif 9. We found that the WD40 domain was composed of a variety of motif combinations, such as motif 2 and motif 3, motif 3 and motif 6, motif 2 and motif 4, and motif 4 and motif 6. Motif 1, motif 2, and motif 6 predominantly occurred at the C-terminus of the WD40 domain, and motif 3 and motif 4 were predominantly present at the N-terminus of the WD40 domain. The details of the 10 motifs are in Table S3.
To analyze whether these motifs were conserved in orthologs of Arabidopsis and wheat, the special conserved motif in wheat and Arabidopsis, which were orthologous to StWD40, were identified using the MEME website (Table S4, Figure S2). We found that motif 1, motif 2, motif 4, motif 5, and motif 6 identified in StWD40 were relatively conserved in potato, wheat, and Arabidopsis, and that motif 1, motif 2, and the sequence "AFSPDGX" in motif 5 were more conserved than others. These motifs make up the largest part of the WD40 domain, but not entirely. The results indicated that only the WD40 domain was conserved during evolution in different species.
According to their domain compositions, the 178 StWD40s were grouped into 10 subfamilies, of which 129 StWD40s were classified into subfamily A, containing only WD40 domains, whereas the other 49 StWD40s containing the same domains were grouped into subfamilies B-J (Figure 2A,D, Table S5). Four StWD40s were classified into subfamily B, which contained the WD40 domain and CAF1 domains. Seven StWD40s, containing the WD40 domain and the LisH domain, were classified into subfamily C. Six StWD40s were grouped into subfamily D, which contained the WD40 domain and UTP (U3 small nucleolar RNA-associated protein) (12, 13, 15 or 21). Four StWD40s contained the WD40 domain and the zinc finger domain and were assigned to subfamily E. Five StWD40s were classified into subfamily F, containing the WD40 domain and the NLE (Notchless) (NUC139, NUC141) domain. Two StWD40s were classified into subfamily G, containing the WD40 domain, the Coatomer_WDAD (Coatomer WD associated region) domain and the COPI_C (Coatomer alpha subunit C-terminus) domain. Only one StWD40 contained the WD40 domain and the protein kinase domain and was grouped into subfamily F. Five StWD40s were classified into subfamily I, containing the WD40 domain and the F-BOX/U-BOX contained. Fifteen StWD40s were classified into subfamily J, which contained the WD40 domain and other domains.

Gene Duplication and Genome Synteny
Tandem duplication and segmental duplication contribute to expansion of new gene family members and new functions in plant genome evolution [59]. The segmental and tandem duplication events in the StWD40 family were investigated to elucidate the StWD40 gene duplication events in potato. In our study, three pairs of genes (3.37%) were identified as tandemly duplicated genes, of which two pairs were located on chromosome 3 and one pair was located on chromosome 7 ( Figure 1A, Table Agronomy 2020, 10, 401 9 of 24 S6). In addition, 14 pairs of genes (13.48%) might be related to segmental duplication events ( Figure 3, Table S6).

Gene Duplication and Genome Synteny
Tandem duplication and segmental duplication contribute to expansion of new gene family members and new functions in plant genome evolution [59]. The segmental and tandem duplication events in the StWD40 family were investigated to elucidate the StWD40 gene duplication events in potato. In our study, three pairs of genes (3.37%) were identified as tandemly duplicated genes, of which two pairs were located on chromosome 3 and one pair was located on chromosome 7 ( Figure  1A, Table S6). In addition, 14 pairs of genes (13.48%) might be related to segmental duplication events ( Figure 3, Table S6).  To further investigate the potential evolution processes of the StWD40 gene family, we analyzed the synteny relationship of WD40 genes among potato, Arabidopsis, and wheat. A total of 45 pairs of orthologs were identified between potato and Arabidopsis ( Figure 4, Table S7), and the Ka/Ks ranged from 0.0142 and 0.9421 with a mean value of 0.1797. Another 23 pairs of orthologs between potato and wheat were identified ( Figure 4, Table S8), and the Ka/Ks ranged from 0.0406 to 0.8655 with a mean value of 0.2763. These results imply that duplication events played an important role in evolution and functional divergence, and that segmental duplication played a predominant role in the expansion of the StWD40 family.
To further investigate the potential evolution processes of the StWD40 gene family, we analyzed the synteny relationship of WD40 genes among potato, Arabidopsis, and wheat. A total of 45 pairs of orthologs were identified between potato and Arabidopsis ( Figure 4, Table S7), and the Ka/Ks ranged from 0.0142 and 0.9421 with a mean value of 0.1797. Another 23 pairs of orthologs between potato and wheat were identified ( Figure 4, Table S8), and the Ka/Ks ranged from 0.0406 to 0.8655 with a mean value of 0.2763. These results imply that duplication events played an important role in evolution and functional divergence, and that segmental duplication played a predominant role in the expansion of the StWD40 family.

Phylogenetic Analysis of WD40 Genes
To analyze the evolutionary relationship of the WD40 family between potato and Arabidopsis, the protein sequences of 178 identified StWD40s and 232 identified AtWD40s were used to generate an unrooted phylogenetic tree using the Maximum Likelihood (M-L) method via MEGA 7.0 software. The WD40s were divided into 14 major distinct clusters (Clusters a-n): 13 AtWD40s and eight StWD40s belonged to Cluster a, 10 AtWD40s and nine StWD40s were assigned to Cluster b, seven AtWD40s and three StWD40s were classified into Cluster c, 10 AtWD40s and 11 StWD40s were grouped into Cluster d, and 10 AtWD40s and four StWD40s belonged to Cluster k ( Figure 5). The StWD40s of these clusters (a, b, c, d, and k) originated from subfamily A. The StWD40s belonging to subfamily B were assigned to Cluster n, which had four AtWD40s and 19 StWD40s. Noticeably, the genes in a set of orthologous pairs between potato and Arabidopsis were divided in the same cluster ( Figure 5, Table S7), indicating the conservation of synteny between potato and Arabidopsis proteins. We found that members of one subfamily were not assigned to one cluster; however, members of one subfamily and in one cluster had similar numbers and locations of WD40 domains.
StWD40s of these clusters (a, b, c, d, and k) originated from subfamily A. The StWD40s belonging to subfamily B were assigned to Cluster n, which had four AtWD40s and 19 StWD40s. Noticeably, the genes in a set of orthologous pairs between potato and Arabidopsis were divided in the same cluster ( Figure 5, Table S7), indicating the conservation of synteny between potato and Arabidopsis proteins. We found that members of one subfamily were not assigned to one cluster; however, members of one subfamily and in one cluster had similar numbers and locations of WD40 domains.

Expression Profiles of StWD40 Genes in Different Tissues
To understand the tissue-specific expression patterns of the StWD40 genes, the transcript abundance in different tissues, including flowers (carpels, petals, sepals, stamens and whole mature flowers), fruits (immature, mature and internal to the fruit), stolons, tubers, stems, leaves (leaves and petioles), and other tissues (callus, roots and shoots) of DM potato were analyzed based on transcriptome data downloaded from PGSC [55]. The results showed that 33.71% of StWD40 genes (60/178) were highly expressed in all tissues with FPKM > 5, 8.43% (15/178) of StWD40 genes showed little expression with FPKM less than 2, and one gene (PG0037204) was not expressed in any of the 16 tissues (FPKM = 0). Some of these displayed a tissue-specific expression pattern, such as PG0006520, PG1015451, and PG0010081, which were highly expressed only in stamens and mature flowers (FPKM > 14) and were expressed at low levels in the other tissues (FPKM < 2). Compared with other tissues, four genes were highly expressed in specific tissues (FPKM > 5 and log2FC > 1): PG0044039 was in fruit, PG0002566 was in tubers and calluses, and PG0026477 and PG0016973 were in tubers and stolons ( Figure S3A, Table S9).

Expression Profiles of StWD40 Genes in Pigmented Potato Cultivars
To investigate whether the StWD40 genes are involved in anthocyanin biosynthesis, expression profile analysis of the StWD40 genes in different tuber tissues (skin and flesh) of tetraploid pigmented potato cultivars was performed based on RNA-seq data. The upregulated and downregulated StWD40 genes in pigmented tuber skins of three potato varieties were analyzed. Six StWD40 genes (PG0002566, PG0013161, PG0015613, PG0016973, PG0028735, and PG0029342) were upregulated in pigmented skin compared with the white skin of XD. Nine StWD40 genes (PG0006156, PG0009000, PG0016078, PG0019207, PG0021607, PG0022734, PG1001493, PG0025006, and PG2012456) were downregulated in pigmented skin. Among them, PG0013161 belonged to subfamily D, PG0029342 belonged to subfamily J, and the other genes belonged to subfamily A. In pigmented skin, PG0002566 was upregulated both in the skin of HM and LT, and PG0013161, PG0015613, PG0016973, PG0028735, and PG0029342 were only upregulated in the skin of HM. Additionally, PG0019207 was downregulated in the skin of HM and LT. PG0006156, PG0016078, PG0021607, PG0022734, and PG1001493 were downregulated in the skin of HM, and PG0009000, PG0025006, and PG2012456 were downregulated in the skin of LT (FPKM > 1 and |log2FC)| > 1) ( Figure 6, Table S10). These downregulated genes all belonged to subfamily A. Seventeen StWD40 genes were not expressed (FPKM = 0) and 26 StWD40 genes showed little expression with an FPKM value less than 1.   Table S10). Among them, PG0010959 belonged to subfamily C, PG0025310 and PG0019361 belonged to subfamily J, and the other genes belonged to subfamily A.
Previous studies showed that TTG1 participates in the pigmentation of seed coat and the development of trichomes in Arabidopsis [60]. In the PGSC database, the best blast matches for TTG1 (StAN11) were PG0000561, PG0026477, and PG0044039, and their expression was analyzed in this RNA-seq dataset. PG0000561 showed higher expression in pigmented skin than in white skin (FPKM = 45.02 in HM and FPKM = 54.68 in LT with increases of 0.36 log2FC and 0.64 log2FC, respectively). PG0026477 was highly expressed in white and pigmented skin, with FPKM ranging from 42.17 to 54.11. In the flesh, PG0000561 and PG0026477 were upregulated in HM (log2FC > 1, FPKM > 1) and highly expressed in LT, with log2FC ranging from 0.94 to 0.97. However, PG0044039 showed little expression in all tissues, with FPKM < 1.
To further investigate how the StWD40 genes might respond to drought stress, the tetraploid drought-sensitive cultivar "Atlantic" (A) and the tetraploid drought-tolerant cultivar "Qingshu No. 9" (Q) were subjected to drought stress. The RNA-seq data showed that 15 StWD40 genes were not expressed (FPKM = 0), and 12 genes were upregulated or downregulated in response to drought stress, of which six StWD40 genes responded to drought stress at the early flowering stage, eight genes were expressed at the full-blooming stage, and five genes were expressed at the full-flowering stage (FPKM > 1 and |log2FC| > 1). PG0005946 showed no difference in A and Q at the early flowering stage and full-blooming stage but was downregulated in Q at the flower-falling stage under drought stress. PG0003339 was upregulated in Q at the flowering and full-blooming stages. PG0002566 was downregulated in Q from the early flowering stage until the flower-falling stage, and the other genes were only upregulated or downregulated at one stage ( Figure 6, Table S12). Combined with the data analysis of PGSC mannitol stress, we found that the upregulated genes (PG0011337, PG0006033m and PG0007963) in DM under mannitol stress were more highly expressed in drought-sensitive cultivar A than in drought-tolerant cultivar Q. The down-regulated genes (PG0009000, PG0010747, PG0015829, PG0017881, and PG0022734) in DM under mannitol stress were more highly expressed in drought-tolerant cultivar Q than in drought-sensitive cultivar A. These genes may be involved in drought stress and need further study.
We selected 19 StWD40 genes which had relatively higher expression levels in pigmented tissues or differential expression levels in Q and A under drought stress as targets to validate the reliability of the RNA-seq dataset via qPCR. The results confirmed that the qPCR expression patterns were in agreement with the RNA-Seq dataset (Figures 7 and 8), showing a high correlation (R 2 = 0.853) described by a simple linear regression equation (y = 1.0149x − 0.1912) between the RNA-seq dataset of pigmented potato cultivars and qPCR, and a high correlation (R 2 = 0.830) described by a simple linear regression equation (y = 0.8221x + 0.1007) between the RNA-seq dataset of two potato cultivars under drought stress and qPCR, indicating consistency between the two analytical methods.

The Interaction Network of StAN11 (TTG1) Proteins
Studies showed that TTG1 is involved in anthocyanin synthesis [23,61]. We further investigated proteins that interact with StAn11 (TTG1) and may be involved in the synthesis of anthocyanins. Based on the RNA-seq dataset of pigmented cultivars, we selected PG0026477 and PG0000561 (annotated as StAN11) to predict the protein interaction network. Genes with FPKM > 5 were retained to construct a protein interaction network with a median confidence (combination score >400). The results indicated that a total of 112 pairs of interacting proteins were predicted, of which 39 proteins interacted with PG0026477 and PG0000561, 27 proteins interacted with PG0026477, and seven proteins interacted with PG0000561. Among them, PG0026477 interacted with all MYB transcription factors in the network and with bHLH1-1. In addition, we found that PG0026477 directly interacted with structural genes such as leucoanthocyanidin oxygenase (ANS), flavonoid 3',5'-hydroxylase (F3'5'H), dihydroflavonol 4-reductase (DFR), flavanone 3 beta-hydroxylase (F3H) and flavanol synthase (FLS). Unlike PG0026477, PG0000561 did not interact with MYB and bHLH1-1, but with StWD40 genes in a drought-sensitive cultivar (Atlantic, A) and a drought-tolerant cultivar (Qingshu No. 9, Q) cultivar under drought stress at the early flowering stage (S1), full-blooming stage (S2), and flower-falling stage (S3). Data represent the mean of three biological replicates ± standard error of the mean. Standard errors are shown as bars above the columns. Different letters above bars denote significant difference at p < 0.05.

The Interaction Network of StAN11 (TTG1) Proteins
Studies showed that TTG1 is involved in anthocyanin synthesis [23,61]. We further investigated proteins that interact with StAn11 (TTG1) and may be involved in the synthesis of anthocyanins. Based on the RNA-seq dataset of pigmented cultivars, we selected PG0026477 and PG0000561 (annotated as StAN11) to predict the protein interaction network. Genes with FPKM > 5 were retained to construct a protein interaction network with a median confidence (combination score >400). The results indicated that a total of 112 pairs of interacting proteins were predicted, of which 39 proteins interacted with PG0026477 and PG0000561, 27 proteins interacted with PG0026477, and seven proteins interacted with PG0000561. Among them, PG0026477 interacted with all MYB transcription factors in the network and with bHLH1-1. In addition, we found that PG0026477 directly interacted with structural genes such as leucoanthocyanidin oxygenase (ANS), flavonoid 3 ,5 -hydroxylase (F3 5 H), dihydroflavonol 4-reductase (DFR), flavanone 3 beta-hydroxylase (F3H) and flavanol synthase (FLS). Unlike PG0026477, PG0000561 did not interact with MYB and bHLH1-1, but with DFR, F3H, and KCS10 (3-ketoacyl-CoA synthase 10), which were upregulated in pigmented tissues. (Figure 9, Tables S13 and S14).

Expansions of WD40 Gene Family in Potato
The series and segment repeat modes play a major role in expansion of the StWD40 family. In the present study, the proportion of tandem duplications was 3.37% of all StWD40 genes (6 of 178), and segmental duplications accounted for 13.48% of the total (24 of 178). All duplication events of StWD40s were from subfamily A, with no duplication events in other subfamilies, which was different from wheat [30]. Thus, duplication was the major driving force of the evolution of StWD40 subfamily A. This may be because StWD40 genes have multiple mutations at the C-terminus and the N-terminus and cannot form other conserved domains. Subfamily A contains 131 StWD40 genes with abundant functions. For example, AtJGB (AT2G26490) is a negative regulator of pollen germination [62] and PG0010081, PG0006520, and PG1015451 were clustered with AtJGB in one cluster, which was specifically expressed in mature flowers and stamens. ATG18A (AT3G62770) is involved in autophagy [63] and PG0024704 was clustered with ATG18A in one cluster. AtBUB31 (AT3G19590) is involved in cell division [64] and PG0025750 was orthologous to AtBUB31. PG2021193 was clustered with AtYAO (AT4G21130) in one cluster, which plays a role in embryo sac development [65]. AtTTG1 (AT5G24520) is involved in anthocyanin synthesis [61]. Based on the PGSC dataset, PG0026477, PG0000561, and PG0044039 were annotated as TTG1 (StAN11) in potato, but all members of this subfamily only contained the WD40 domain. In summary, gene duplication provided a large genome with numerous functional genes for adaptation to various environments in potato.

Phylogenetic Analysis and Evolution of StWD40s
Numerous phylogenetic analyses of WD40 proteins were conducted in Arabidopsis [31]. In order to obtain an overview of the 178 StWD40 proteins and their relationships with Arabidopsis, a phylogenetic tree combining potato and Arabidopsis WD40 proteins was constructed. The results showed that 410 WD40 members (178 StWD40 proteins and 232 AtWD40 proteins) were divided into 14 clades ( Figure 5). Comparing the full-length cDNA sequences with the corresponding genomic DNA sequences, we observed that 154 (86.52%) StWD40 coding sequences were interrupted by

Expansions of WD40 Gene Family in Potato
The series and segment repeat modes play a major role in expansion of the StWD40 family. In the present study, the proportion of tandem duplications was 3.37% of all StWD40 genes (6 of 178), and segmental duplications accounted for 13.48% of the total (24 of 178). All duplication events of StWD40s were from subfamily A, with no duplication events in other subfamilies, which was different from wheat [30]. Thus, duplication was the major driving force of the evolution of StWD40 subfamily A. This may be because StWD40 genes have multiple mutations at the C-terminus and the N-terminus and cannot form other conserved domains. Subfamily A contains 131 StWD40 genes with abundant functions. For example, AtJGB (AT2G26490) is a negative regulator of pollen germination [62] and PG0010081, PG0006520, and PG1015451 were clustered with AtJGB in one cluster, which was specifically expressed in mature flowers and stamens. ATG18A (AT3G62770) is involved in autophagy [63] and PG0024704 was clustered with ATG18A in one cluster. AtBUB31 (AT3G19590) is involved in cell division [64] and PG0025750 was orthologous to AtBUB31. PG2021193 was clustered with AtYAO (AT4G21130) in one cluster, which plays a role in embryo sac development [65]. AtTTG1 (AT5G24520) is involved in anthocyanin synthesis [61]. Based on the PGSC dataset, PG0026477, PG0000561, and PG0044039 were annotated as TTG1 (StAN11) in potato, but all members of this subfamily only contained the WD40 domain. In summary, gene duplication provided a large genome with numerous functional genes for adaptation to various environments in potato.

Phylogenetic Analysis and Evolution of StWD40s
Numerous phylogenetic analyses of WD40 proteins were conducted in Arabidopsis [31]. In order to obtain an overview of the 178 StWD40 proteins and their relationships with Arabidopsis, a phylogenetic tree combining potato and Arabidopsis WD40 proteins was constructed. The results showed that 410 WD40 members (178 StWD40 proteins and 232 AtWD40 proteins) were divided into 14 clades ( Figure 5). Comparing the full-length cDNA sequences with the corresponding genomic DNA sequences, we observed that 154 (86.52%) StWD40 coding sequences were interrupted by introns and 24 (13.48%) StWD40 sequences had no introns. Among them, PG0010959 had 30 introns, which was the most introns among the StWD40 genes. Furthermore, the StWD40 genes in the same branch not only had high sequence similarity, but also had similar intron-exon structures ( Figure 2B). We further analyzed the structure of 45 StWD40 genes that were orthologous to Arabidopsis. Genes in orthologous pairs usually exhibited the same genetic structure ( Figure S4). It is noteworthy that the exon length of the StWD40 genes, which were orthologous to Arabidopsis, were similar to that of Arabidopsis, but differed in intron length. The results suggested that introns in WD40 played a minor role in the evolution of the Arabidopsis and potato WD40 gene family through some unknown mechanisms. This is consistent with previous studies in cucumber [31].

Expression Profile Analysis of StWD40 Genes
Tissue expression profiling and analysis of published datasets revealed that many StWD40 genes were expressed in various tissues of potato, and some of them were tissue-specific. The JINGUBANG gene is a novel negative regulator of pollen germination in Arabidopsis, and JGB forms a regulatory complex with TCP4 to control pollen Jasmonic acid synthesis and ensure pollination in moist environments [61]. In potato, PG0010081, PG0006520, and PG1015451 were specifically expressed in mature flowers and stamens, which were annotated as JGB. PG0026477 and PG0016973 were differentially expressed in tubers and stolons of potato. PG0026477 was annotated as AN11 in the PGSC database, homologous with TTG1 in Arabidopsis, and PG0016973 was annotated as a transducing family protein. PG0002566 was annotated as Coatomer subunit beta'-2 in the PGSC database, and was specifically expressed in tuber and callus tissues. It is speculated that these may be involved in the formation of potato tubers but this has not yet been reported; their functions need to be further studied. PG0044039 and PG0000561 were a pair of tandem duplication genes, with PG0044039 being specifically expressed in fruit and PG0000561 being highly expressed in tubers. According to the results of previous studies, some evolving new members may lose their original functions or acquire new functions in gene duplication events [66], which may be a reason for differences in expression patterns between PG0044039 and PG0000561.
We also analyzed the expression profiles of StWD40 genes in different tuber tissues (skin and flesh) of tetraploid pigmented potato cultivars. Studies showed that TTG1 is involved in regulating anthocyanin biosynthesis [60,61]. PG0026477, PG0000561, and PG0044039 were annotated as StAN11 genes, namely TRANSPARENT TESTA GLABRA 1 (TTG1), with PG0044039 and PG000561 being a pair of tandem duplication genes. PG0026477 has already been cloned [67]. Previous studies found that PG0026477 is involved in anthocyanin biosynthesis [67,68], but PG0000561 and PG0044039 have not been reported. PG0026477 was only upregulated in purple and red flesh and less expressed in pigmented skin than in white skin. PG0000561 was upregulated in pigmented flesh and also was more highly expressed in pigmented skin. Based on the protein interaction network, PG0026477 interacted with HLH1-1 (PG0012891), which was upregulated in pigmented tissues but not expressed in white skin and flesh. Moreover, all of the MYB transcription factors in the interaction network interacted with PG0026477, further suggesting that PG0026477 synthesizes the MBW complex with HLH1-1 and MYB transcription factors and that this complex is involved in the synthesis of anthocyanins. In addition, PG0002566 was upregulated in pigmented skin (log2FC > 7) and HM flesh (log2FC = 3.48). PG0000235 and PG0004584 were upregulated in pigmented flesh, with log2FC > 2. These genes may be involved in the synthesis of anthocyanins, but they have not yet been reported in Arabidopsis. Therefore, their role in anthocyanin synthesis remains to be further studied.
Numerous WD40 proteins have been characterized by genetic analysis and found in response to various abiotic stresses [40,69,70]. There is no uniform gene expression pattern for StWD40 genes. A total of 178 StWD40 genes were differentially expressed under salt, mannitol, and heat stress, indicating that they may be involved in crosstalk between different signal transduction pathways in response to abiotic stresses [27,28,32]. Some StWD40 genes were induced under various stresses, however, some just responded to one stress stimulus. In addition, some StWD40 genes had different expression patterns, as some of these genes were highly expressed in one stress, but rarely expressed or did not change in response to other stresses.
In general, members within one cluster may have common evolutionary origins and conserved functions, therefore, the functions of certain members can be confirmed by the known functions of members in the same cluster [71]. PG0015970 was downregulated under salt, mannitol, and heat treatments. PG0015970 was clustered with three AtRACK1 homologous genes in one cluster, namely RACK1A (AT1G18080), RACK1B (AT1G48630), and RACK1C (AT3G18130), respectively. Previous studies showed that RACK1 can interact with many signal molecules and affect different signal transduction pathways [72][73][74]. Liu, et al. [72] found that overexpression of the MaRACK1 gene in Arabidopsis decreased tolerance to drought and salt stresses and MaRACK1 overexpression changed expression levels of genes in response to stress and stimuli. Guo et al. [73] found that the RACK1 gene may integrate ethylene signaling to regulate plant growth and development in Arabidopsis. PG1005658 was homologous to AT1G18080, which was downregulated under salt and mannitol treatments. The above results indicated that PG0015970 and PG1005658 may have similar functions to RACK1 under abiotic stress. Autophagy is a vital means of recovering damaged cellular components or degrading various proteins when plants are starved or oxidatively stressed [10,75]. The autophagy-related gene (ATG18), a phosphoinositide-binding protein that acts through the WD40 domain, is required for vesicle formation in autophagy [63]. In the present study, we found that PG0024704, which was in the same cluster as ATG18A, C, D, and E, was significantly downregulated under heat stress, but expression was increased under mannitol stress and did not respond to salt stress. These results were inconsistent with those from research in apple (Malus) [63]. PG0002566 and PG2005959 were differentially expressed in three periods (early flowering stage, full-blooming stage, and flower-falling stage) under drought stress in a drought-sensitive cultivar (A) and a drought-tolerant cultivar (Q) (Figure 8), and PG0002566 was also upregulated in pigmented skins. However, the function of PG0002566 has not yet been reported. In summary, StWD40 genes may play a role in response to abiotic stress in potato and possibly participate in the communication pathway during different signal transduction pathways.

The Interaction Network of StAN11 Proteins
In Arabidopsis, TTG1 interacts with MYB and bHLH1 to form MBW complexes that are involved in anthocyanin synthesis [23]. Based on the PGSC database, PG0026477 and PG0000561 were annotated as StAN11, which was homologous to Arabidopsis TTG1. PG0026477b not only interacted with the MYB transcription factor and bHLH1 (PG0012891), but also directly interacted with multiple genes that were involved in the flavonoid biosynthesis pathway, such as ANS (PG0003090, PG0003091, PG0003880, PG0027333, PG0027396, and PG0033567), F3 5 H (PG0000425), DFR (PG0003605), F3H (PG0003563), and FLS (PG1024988) [76][77][78]. PG0026477, annotated as StAN11, plays the same role as the TTG1 gene in anthocyanin synthesis, which was reported to form a MBW complex with the MYB transcription factor and bHLH1. In addition, PG0000516 was also annotated as StAN11, but did not interact with MYB and bHLH1. PG0000561 was highly expressed in pigmented tissues and directly interacted with DFR and F3H. KCS10 (PG0010819) interacted with PG0000561 and was upregulated in pigmented skin. Studies showed that KCS10 encodes a protein that is probably involved in the synthesis of long-chain lipids in the cuticle [79], but whether it is involved in the synthesis of anthocyanins has not been reported. Studies of StAN11 and the proteins interacting with it could provide further understanding regarding the complex regulatory system in potato anthocyanin synthesis.

Conclusions
In this study, 178 StWD40 genes were identified and analyzed at the genome-wide level. All these genes were distributed unevenly on 12 chromosomes. A total of 178 StWD40s were phylogenetically divided into 14 distinct clusters based on highly conserved gene structures and motifs. According to their domain compositions, the 178 StWD40s were grouped into 10 subfamilies. Collinearity analysis showed that segmental duplication events played a crucial role in the expansion of the StWD40 gene family. Synteny analysis indicated that 45 and 23 StWD40 genes were orthologous to Arabidopsis and wheat, respectively. According to Ka/Ks ratios, these gene pairs evolved under strong purifying selection. Based on RNA-seq, we performed a comprehensive expression analysis of StWD40 genes in different tissues, multiple abiotic stresses, colorful potato cultivars (XDP, HM, and LT), and different drought tolerance cultivars (Q and A) in order to identify associations of StWD40 genes with spatial distribution, anthocyanin biosynthesis, and multiple abiotic stresses. Furthermore, we analyzed two StAN11 protein interaction networks that were highly expressed in anthocyanin biosynthesis. The results of this study increase our understanding of the characterization of the StWD40 superfamily and provide valuable information regarding further functional elucidation of these genes in potato.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4395/10/3/401/s1. Table S1: Primers used for qPCR; Table S2: Characteristic features of 178 StWD40 genes; Table S3: Specific conserved motifs identified by MEME among potato WD40 proteins; Table S4: Specific conserved motifs identified by MEME among WD40 proteins in potato, wheat, and Arabidopsis; Table S5: Domain composition and number of members of 10 subfamilies of 178 StWD40s; Table S6: KaKs ratios of tandemly and segmentally duplicated StWD40s; Table S7: KaKs ratios of the syntenic relationships for WD40 between potato and Arabidopsis; Table  S8: KaKs ratios of the syntenic relationships for WD40 between potato and wheat;  Table S12: RNA-seq data of StWD40 genes under drought stress; Table S13: Protein interaction network of StAN11 (TTG1); Table S14: The FPKM value of genes in protein interaction networks; Figure S1: The melt curves of 19 StWD40 genes and the internal standard gene; Figure S2: Distribution of conserved motifs of the WD40 proteins in potato, wheat, and Arabidopsis. Boxes of different colors represent 10 putative motifs. Figure  Acknowledgments: We wish to thank Yuanyuan Pu, Lijuan Niu and Wenhao Li of Gansu Agricultural University for their help in the improvement of the manuscript. We also thank Shujuan Jiao, Yunchun Wei and Tiqian Han of Gansu Agricultural University for their help in the experiments.

Conflicts of Interest:
The authors declare no conflict of interest.