Identification and Expression Profiling of Protein Phosphatases (PP2C) Gene Family in Gossypium hirsutum L.

The protein phosphatase (PP2C) gene family, known to participate in cellular processes, is one of the momentous and conserved plant-specific gene families that regulate signal transduction in eukaryotic organisms. Recently, PP2Cs were identified in Arabidopsis and various other crop species, but analysis of PP2C in cotton is yet to be reported. In the current research, we found 87 (Gossypium arboreum), 147 (Gossypium barbadense), 181 (Gossypium hirsutum), and 99 (Gossypium raimondii) PP2C-encoding genes in total from the cotton genome. Herein, we provide a comprehensive analysis of the PP2C gene family in cotton, such as gene structure organization, gene duplications, expression profiling, chromosomal mapping, protein motif organization, and phylogenetic relationships of each species. Phylogenetic analysis further categorized PP2C genes into 12 subgroups based on conserved domain composition analysis. Moreover, we observed a strong signature of purifying selection among duplicated pairs (i.e., segmental and dispersed) of Gossypium hirsutum. We also observed the tissue-specific response of GhPP2C genes in organ and fiber development by comparing the RNA-sequence (RNA-seq) data reported on different organs. The qRT-PCR validation of 30 GhPP2C genes suggested their critical role in cotton by exposure to heat, cold, drought, and salt stress treatments. Hence, our findings provide an overview of the PP2C gene family in cotton based on various bioinformatic tools that demonstrated their critical role in organ and fiber development, and abiotic stress tolerance, thereby contributing to the genetic improvement of cotton for the resistant cultivar.


Introduction
The protein kinases (PKs) and protein phosphatases (PPs) are known to regulate the protein function, and are the fundamental molecular mechanism, by reversing protein phosphorylation during cellular signaling. Thus, it is involved in many biological processes, such as signal transduction, development, and environmental stimuli [1]. The PKs phosphorylate largely serine (Ser), threonine (Thr), and tyrosine (Tyr), whereas PPs can reverse this functioning by eliminating the phosphate group [2]. Mainly, the PPs are subcategorized into three major groups based on their requirement

Phylogenetic and Genomic Distribution and Organization Analysis of PP2C Gene Family
To study the phylogenetic relationship of the PP2C genes among cotton plants (G. arboreum, G. barbadense, G. hirsutum, and G. raimondii), a maximum likelihood (ML) tree was constructed with Arabidopsis using MEGA 7.0. The phylogenetic tree revealed that the PP2C genes can be further divided into 12 subgroups, as previously reported [26,27] (Supplementary Figures S1-S3). Moreover, 181 GhPP2C genes were clustered into 12 subgroups (A-L) with Arabidopsis. In the phylogenetic tree, subgroup D had the most members of the PP2C genes (38), followed by subgroup E (27), and the least number of PP2C genes was observed in subgroup L, having 4 GhPP2C genes (Figures 1 and 2a). Moreover, we also analyzed the conserved motifs and gene structure based on phylogenetic relationships to provide insight into the structural features of the PP2C members in G. hirsutum. For GhPP2C proteins, 10 conserved motifs were acquired with the help of MEME. The majority of the PP2C family contained motifs 7 and 2, except a few genes showed motifs 3 and 4. This indicates that all the identified PP2C genes have typical family features and the proteins classified into the same subgroup share similar amino acid sequences ( Figure 2b). Alongside, their logos were obtained by online MEME server. A total of 10 types of consensus motifs were obtained in all of the GhPP2C proteins and their distribution patterns are presented in Supplementary Figure S4. To better understand the gene structure of PP2C genes in cotton, exon-intron organizations of these genes were also tested ( Figure 2c) [26], and most of the subgroups contained 1-10 introns. These results specify that PP2C genes in the same subgroup show more or less similar exon-intron organization.

Chromosomal Localization and Syntenic Relationships of PP2C Gene Family
The chromosomal localization of A and D genomes of G. hirsutum was analyzed using Tbtools software. A total of 89 GhPP2C genes were allocated in D genome (D01-D13) ranging from 2-14 genes per chromosome and only 2 genes were found on the scaffold. Moreover, every chromosome showed variation in a number of genes, such as D04 exhibited the highest number of GhPP2C genes (14), followed by 12 genes in D02, and the least number of genes (2) were recorded for D07 ( Figure  3). On the other hand, we also demonstrated the chromosomal localization for A genome (A01-A12). A total of 79 genes were found ranging from 2-15 per chromosomes and 9 of them were located on

Chromosomal Localization and Syntenic Relationships of PP2C Gene Family
The chromosomal localization of A and D genomes of G. hirsutum was analyzed using Tbtools software. A total of 89 GhPP2C genes were allocated in D genome (D01-D13) ranging from 2-14 genes per chromosome and only 2 genes were found on the scaffold. Moreover, every chromosome showed variation in a number of genes, such as D04 exhibited the highest number of GhPP2C genes (14), followed by 12 genes in D02, and the least number of genes (2) were recorded for D07 ( Figure 3).
On the other hand, we also demonstrated the chromosomal localization for A genome (A01-A12). A total of 79 genes were found ranging from 2-15 per chromosomes and 9 of them were located on the scaffold. The highest number of genes was found on A05 (15), followed by A11 (8), and the least number of PP2C genes (2) were found on A13 (Figure 4). These findings suggested that GhPP2C were allocated unevenly to different chromosomal locations.
Moreover, a collinear correlation was also demonstrated between G. hirsutum and Arabidopsis (A and D genomes) (Figure 5a,b). To validate our results, we also performed a collinear relationship of PP2C genes using only G. hirsutum (Supplementary Figure S5). The results exhibited high conservation among PP2C members between the A and D genomes of cotton. the scaffold. The highest number of genes was found on A05 (15), followed by A11 (8), and the least number of PP2C genes (2) were found on A13 (Figure 4). These findings suggested that GhPP2C were allocated unevenly to different chromosomal locations. Moreover, a collinear correlation was also demonstrated between G. hirsutum and Arabidopsis (A and D genomes) (Figure 5a,b). To validate our results, we also performed a collinear relationship of PP2C genes using only G. hirsutum (Supplementary Figure S5). The results exhibited high conservation among PP2C members between the A and D genomes of cotton.    the scaffold. The highest number of genes was found on A05 (15), followed by A11 (8), and the least number of PP2C genes (2) were found on A13 (Figure 4). These findings suggested that GhPP2C were allocated unevenly to different chromosomal locations. Moreover, a collinear correlation was also demonstrated between G. hirsutum and Arabidopsis (A and D genomes) (Figure 5a,b). To validate our results, we also performed a collinear relationship of PP2C genes using only G. hirsutum (Supplementary Figure S5). The results exhibited high conservation among PP2C members between the A and D genomes of cotton.

Analysis of Putative Regulatory Cis-Element and Gene Duplication Analysis of PP2C Gene Family in Cotton
In the promoter region of PP2C genes, cis-acting elements play a critical role as stress-adaptive signaling in plants by interacting with their cognate transcription factor (TF). For instance, abscisic acid (ABA)-responsive elements (ABREs) are involved in salt, drought, and ABA signaling [28]. LTR is crucial to chilling response and regulation [29]. Likewise, TCA-elements and TGACG-motif are responsive to salicylic acid (SA) and MeJA treatments [30]. The exploration of the cis-acting element of G. hirsutum PP2C genes was executed by using the PlantCARE database and seven common cis-regulatory elements were briefly summarized ( Figure 6 and Supplementary Table S5). The results unveiled that most of the genes contributed in various signaling pathways such as phytohormones and biotic and abiotic regulatory stress factors. On the other hand, PP2C genes are recognized to show a major part in both biotic and abiotic stress phenomena. Most of the genes were highly (34.88%) responsive to light (AE-BOX, BOX-4, LAMP-ELEMENTS, GAG-motif, GATA-motif), followed by (24.32%) hormones (ABRE, CGTCA, TGA, TCA, AuxRe, GARE-motif), (18.61%) and other regulatory cis-elements (HD-ZIP3, o2-site, AT-Rich elements, CAT-BOX, A-Box, EIRE), while the fewest genes (0.93% and 0.85%) were mainly counted in circadian and enhancer elements (5UTR Py-rich stretch, TA-Rich Region, and GC-motif), respectively. In plants, the circadian cis-regulatory element is known to control the circadian rhythms. Moreover, these results indicate that PP2C genes are vital in various biotic-abiotic/hormone signaling which might be hypothesized by their diversity in nature.
For gene duplication analysis, we used MCScanX to determine the types of gene duplications and the results suggested most of the genes were segmental (167) and few of them were dispersed (12); thus, indicating that segmental duplication plays a major contribution to the expansion of the PP2C gene family. Moreover, the PP2C genes might have experienced functional discrepancy due to gene duplications, and few of them might have lost their unique functions, developed novel functions, or preserved partition of innovative functions [31,32]. During the evolutionary processes, genes are often exposed to various selective pressures such as positive, neutral, and purifying selection. Additionally, for a better understanding of the selection pressure between the duplicated genes, we calculated the Ka/Ks ratios among selected genes from segmental and dispersed ( Figure 7 and Table 1). It was shown that only two pairs have a positive selection (>1) Ka/Ks, while the rest were purifying in nature with <1.00, reducing divergence after duplication.

Analysis of Putative Regulatory Cis-Element and Gene Duplication Analysis of PP2C Gene Family in Cotton
In the promoter region of PP2C genes, cis-acting elements play a critical role as stress-adaptive signaling in plants by interacting with their cognate transcription factor (TF). For instance, abscisic acid (ABA)-responsive elements (ABREs) are involved in salt, drought, and ABA signaling [28]. LTR is crucial to chilling response and regulation [29]. Likewise, TCA-elements and TGACG-motif are responsive to salicylic acid (SA) and MeJA treatments [30]. The exploration of the cis-acting element of G. hirsutum PP2C genes was executed by using the PlantCARE database and seven common cis-regulatory elements were briefly summarized ( Figure 6 and Supplementary Table S5). The results unveiled that most of the genes contributed in various signaling pathways such as phytohormones and biotic and abiotic regulatory stress factors. On the other hand, PP2C genes are recognized to show a major part in both biotic and abiotic stress phenomena. Most of the genes were highly (34.88%) responsive to light (AE-BOX, BOX-4, LAMP-ELEMENTS, GAG-motif, GATA-motif), followed by (24.32%) hormones (ABRE, CGTCA, TGA, TCA, AuxRe, GARE-motif), (18.61%) and other regulatory cis-elements (HD-ZIP3, o2-site, AT-Rich elements, CAT-BOX, A-Box, EIRE), while the fewest genes (0.93% and 0.85%) were mainly counted in circadian and enhancer elements (5UTR Py-rich stretch, TA-Rich Region, and GC-motif), respectively. In plants, the circadian cis-regulatory element is known to control the circadian rhythms. Moreover, these results indicate that PP2C genes are vital in various biotic-abiotic/hormone signaling which might be hypothesized by their diversity in nature.
For gene duplication analysis, we used MCScanX to determine the types of gene duplications and the results suggested most of the genes were segmental (167) and few of them were dispersed (12); thus, indicating that segmental duplication plays a major contribution to the expansion of the PP2C gene family. Moreover, the PP2C genes might have experienced functional discrepancy due to gene duplications, and few of them might have lost their unique functions, developed novel functions, or preserved partition of innovative functions [31,32]. During the evolutionary processes, genes are often exposed to various selective pressures such as positive, neutral, and purifying selection. Additionally, for a better understanding of the selection pressure between the duplicated genes, we calculated the Ka/Ks ratios among selected genes from segmental and dispersed ( Figure 7 and Table 1). It was shown that only two pairs have a positive selection (>1) Ka/Ks, while the rest were purifying in nature with <1.00, reducing divergence after duplication.

Expression Profiling of PP2C Gene Family in Different Tissues of G. hirsutum
To gain insight into the tissue-specific expression patterns of the cotton, previously reported [24] transcriptome data was utilized for various tissues (root, stem, leaf, patel, and stamen) and fiber development (3,6,9,12, and 15 days post anthesis). For instance, Figure 8a reveals the expression levels of PP2C genes drastically varied in various tissues and most numbers of them were highly expressed. However, the expression of a few genes (GhPP2C12, 16
between these GhPP2C genes in tissue-specific responses, a correlation analysis was established based on the Pearson correlation coefficients (PCCs) (p = 0.05). Results showed a higher positive correlation among various specific tissues (Figure 8c and Supplementary Table S6).
Furthermore, the expression patterns of fiber developmental stages (3,6,9,12, and 15 days post anthesis) exhibited a dynamic expression level. Majority of the PP2C genes were highly expressed (Figure 9a), but some of them were not expressed, such as GhPP2C11, 21,25,26,31,40,42,45,54,76,80,86,89,91,98,100,110,117,137,148,155,171, and GhPP2C178, respectively. As shown in Figure  9b, the clustering analysis indicated the common developmental genes in 3 days post anthesis (DPA) (4), 6DPA (2), 9DPA (3), and 15DPA (1), respectively. The PCCs-based correlation analysis of the relative gene expression of selected genes suggested a high positive correlation and low inverse correlation between selected genes. In addition, some genes exhibited an inverse correlation among various fiber developmental stages (Figure 9c and Supplementary Table S6). These findings suggested that PP2C genes share diverse and high expression patterns in tissues and fiber development, which implied PP2C genes are conserved.  Furthermore, the expression patterns of fiber developmental stages (3,6,9,12, and 15 days post anthesis) exhibited a dynamic expression level. Majority of the PP2C genes were highly expressed (Figure 9a), but some of them were not expressed, such as GhPP2C11, 21 Figure 9b, the clustering analysis indicated the common developmental genes in 3 days post anthesis (DPA) (4), 6DPA (2), 9DPA (3), and 15DPA (1), respectively. The PCCs-based correlation analysis of the relative gene expression of selected genes suggested a high positive correlation and low inverse correlation between selected genes. In addition, some genes exhibited an inverse correlation among various fiber developmental stages (Figure 9c and Supplementary Table S6). These findings suggested that PP2C genes share diverse and high expression patterns in tissues and fiber development, which implied PP2C genes are conserved.

qRT-PCR Analysis of the Candidate PP2C Gene Family in Response to Various Stresses
A prediction of the cis-regulatory elements indicated that GhPP2C genes may participate in responses to heat, cold, drought, and NaCl stress tolerance. Moreover, the expression profiling of PP2C has been studied in various species after exposure to abiotic-biotic and hormones stresses [3,26]. To verify this hypothesis, we subjected the cotton seedlings to four various abiotic stress treatments such as heat, cold, drought, and NaCl stress, and then carefully selected 30 genes for qRT-PCR. The results showed that some GhPP2C genes exhibited high transcript levels after exposure to multiple treatments, but a few of them were induced by one or more treatments. For instance, heat stress possesses the dominant portion of down-regulated genes (52%). However, cold stress showed 70% of the genes were up-regulated and 30% decreased in transcript level, followed by drought stress, which exhibited about 53% and 47% of up-and down-regulated GhPP2C genes, respectively. On the other hand, exposure to salt stress resulted in 56% up-regulation and 44% down-regulation in genes ( Figure 10). Among these 30 genes, we also calculated the Pearson correlation coefficient (PCC) based on the expression by making three categories (i.e., highly positive

qRT-PCR Analysis of the Candidate PP2C Gene Family in Response to Various Stresses
A prediction of the cis-regulatory elements indicated that GhPP2C genes may participate in responses to heat, cold, drought, and NaCl stress tolerance. Moreover, the expression profiling of PP2C has been studied in various species after exposure to abiotic-biotic and hormones stresses [3,26]. To verify this hypothesis, we subjected the cotton seedlings to four various abiotic stress treatments such as heat, cold, drought, and NaCl stress, and then carefully selected 30 genes for qRT-PCR. The results showed that some GhPP2C genes exhibited high transcript levels after exposure to multiple treatments, but a few of them were induced by one or more treatments. For instance, heat stress possesses the dominant portion of down-regulated genes (52%). However, cold stress showed 70% of the genes were up-regulated and 30% decreased in transcript level, followed by drought stress, which exhibited about 53% and 47% of up-and down-regulated GhPP2C genes, respectively. On the other hand, exposure to salt stress resulted in 56% up-regulation and 44% down-regulation in genes ( Figure 10). Among these 30 genes, we also calculated the Pearson correlation coefficient (PCC) based on the expression by making three categories (i.e., highly positive >0.5, mild positive <0.5 and >0, and negative correlation <0). The results emphasized that both cold and salt stress showed 12 each PCC values having a highly positive correlation, while negative correlation was recorded in heat and drought with 6 each PCC values (Figure 11 and Supplementary Table S6). Taken together, all the 30 genes were induced by different abiotic stresses, but the diversity in the expression profiling of GhPP2C genes may suggest that these genes may be critical to abiotic-stress responses.  (Figure 11 and Supplementary  Table S6). Taken together, all the 30 genes were induced by different abiotic stresses, but the diversity in the expression profiling of GhPP2C genes may suggest that these genes may be critical to abiotic-stress responses.

Discussion
In this study, we systematically investigate the PP2C genes based on the genome-wide analysis. Although the PP2C gene family has been in many species, the knowledge of PP2C genes in cotton is yet to be elucidated and their systematic analysis has not been reported. Previously, the PP2C gene family has been reported in different plant species such as maize [12], rice [11], Arabidopsis [10], hot pepper [33], banana [34], and Brachypodium distachyon [3]. Thus, we observed high variations in the number of PP2C genes that might be present during whole-genome duplication (WGD) events. For exploring new biological functions, evolutionary implications and its expansions are mainly based on gene duplications [35]. Therefore, to study evolutionary analysis and polyploid formation, cotton is an excellent model and ideal crop by being typically allotetraploid [36]. The importance of evolutionary analysis is further reflected in the fact that most of the angiosperms have undergone either one or multiple polyploidization events [37][38][39]. Herein, we systematically found PP2C-encoding genes in four cotton species-87 (Gossypium arboreum), 147 (Gossypium barbadense), 181 (Gossypium hirsutum), and 99 (Gossypium raimondii). Cotton species represent the high number of genes, specifying that the PP2C gene family have undergone extensive expansion during the evolution of cotton. In the current study, a comprehensive genome-wide analysis was executed, such as gene identification, gene structure organization, phylogenetic characterization, syntenic relationships, chromosomal localization, and gene duplications. Moreover, transcriptional profiling of 30 PP2C genes was also performed, after exposure to heat, cold, drought, and salt stress was also carried out [12,27].
In this study, the subcellular predictions for most of the members of GaPP2C, GbPP2C, GhPP2C, and GrPP2C were mainly located in different organelles such as the chloroplast, nuclear region, mitochondria, cytoplasm, and others. In addition, the gene characteristics of GaPP2C, GbPP2C, GhPP2C, and GrPP2C drastically vary including the protein length (aa), molecular weight (MW), predicted isoelectric point (PI), and a grand average of hydropathicity (GRAVY), signifying that different PP2C proteins may play complex roles in variable microenvironments.
The estimate of evolutionary patterns, such as the rate of computing the selection pressure analysis (Ka/Ks), provides useful information about purifying, positive, and neutral selections of gene pairs during the rate of divergence [40]. In evolutionary events, if the value of the Ka/Ks ratio is <1.00, this represents purifying selection, a Ka/Ks ratio of 1.00 indicates neutral selection, and/or Ka/Ks >1.00 depicts positive selection [41,42]. Similarly, during evolutionary processes and expansion of a gene family, these indicators are used for the selection history. In this study, we calculated these values among segmental and dispersed pairs of PP2C genes with the help of the MEGA7.0 program. Thus, we estimated the selection pressure analysis of duplicated genes (i.e.,

Discussion
In this study, we systematically investigate the PP2C genes based on the genome-wide analysis. Although the PP2C gene family has been in many species, the knowledge of PP2C genes in cotton is yet to be elucidated and their systematic analysis has not been reported. Previously, the PP2C gene family has been reported in different plant species such as maize [12], rice [11], Arabidopsis [10], hot pepper [33], banana [34], and Brachypodium distachyon [3]. Thus, we observed high variations in the number of PP2C genes that might be present during whole-genome duplication (WGD) events. For exploring new biological functions, evolutionary implications and its expansions are mainly based on gene duplications [35]. Therefore, to study evolutionary analysis and polyploid formation, cotton is an excellent model and ideal crop by being typically allotetraploid [36]. The importance of evolutionary analysis is further reflected in the fact that most of the angiosperms have undergone either one or multiple polyploidization events [37][38][39]. Herein, we systematically found PP2C-encoding genes in four cotton species-87 (Gossypium arboreum), 147 (Gossypium barbadense), 181 (Gossypium hirsutum), and 99 (Gossypium raimondii). Cotton species represent the high number of genes, specifying that the PP2C gene family have undergone extensive expansion during the evolution of cotton. In the current study, a comprehensive genome-wide analysis was executed, such as gene identification, gene structure organization, phylogenetic characterization, syntenic relationships, chromosomal localization, and gene duplications. Moreover, transcriptional profiling of 30 PP2C genes was also performed, after exposure to heat, cold, drought, and salt stress was also carried out [12,27].
In this study, the subcellular predictions for most of the members of GaPP2C, GbPP2C, GhPP2C, and GrPP2C were mainly located in different organelles such as the chloroplast, nuclear region, mitochondria, cytoplasm, and others. In addition, the gene characteristics of GaPP2C, GbPP2C, GhPP2C, and GrPP2C drastically vary including the protein length (aa), molecular weight (MW), predicted isoelectric point (PI), and a grand average of hydropathicity (GRAVY), signifying that different PP2C proteins may play complex roles in variable microenvironments.
The estimate of evolutionary patterns, such as the rate of computing the selection pressure analysis (Ka/Ks), provides useful information about purifying, positive, and neutral selections of gene pairs during the rate of divergence [40]. In evolutionary events, if the value of the Ka/Ks ratio is <1.00, this represents purifying selection, a Ka/Ks ratio of 1.00 indicates neutral selection, and/or Ka/Ks >1.00 depicts positive selection [41,42]. Similarly, during evolutionary processes and expansion of a gene family, these indicators are used for the selection history. In this study, we calculated these values among segmental and dispersed pairs of PP2C genes with the help of the MEGA7.0 program. Thus, we estimated the selection pressure analysis of duplicated genes (i.e., segmental and dispersed) pairs. The majority of the GhPP2C pairs showed Ka/Ks ratios of less than 1.00, indicating the purifying selection, and only two pairs showed values more than 1.00, speculating the positive selection [31].
Plant productivity is always uncertain due to a range of climatic challenges such as heat, cold, drought, and salt stress being the major factors in limiting crop production. Previous studies reported that AtPP2CG1 regulates positively against salt tolerance in Arabidopsis and is induced by drought, salt, or exogenous ABA treatment [43]. In Arabidopsis, two members of PP2C genes responded inversely; for example, AP2C1 expression was powerfully tempted by drought, wounding, and cold but AP2C2 was slightly prompted by these treatments too [44]. These findings highlighted the significance of critical members of the PP2C gene family in the model plant Arabidopsis, but their specific functions may be significantly varied in cotton. For achieving gene expression patterns in diverse growth phases of GhPP2C, we systematically analyzed the previously reported transcriptome data in various tissues (root, stem, leaf, patel, and stamen) and fiber development (3,6,9,12,and 15 DPA). These results showed that most of the PP2C were highly varied, speculating the high diversity in their functions. However, few genes have shown tissue-specific expression, indicating their common importance to plant development. In GhPP2C genes, we also explored the promoter regions for identification of common conserved cis-regulatory elements. As a consequence, these results exhibit the participation of PP2C genes in various hormone signaling, including both biotic and abiotic stress factors. In the present study, to provide validation to our hypothesis, we also tested 30 candidate genes under various stresses using qRT-PCR. The majority of PP2C genes showed high striking transcriptional changes by exposure to heat, cold, drought, and salt stress, suggesting that PP2C genes might be crucial to stress tolerance in cotton. Noticeably, recent studies in Arabidopsis and rice also uncovered the pivotal role of PP2C candidate genes [11,45], however, in cotton, it is largely obscured. Moreover, 30 candidate genes involved in expression profiling demonstrated their critical role in Gossypium hirsutum.
Therefore, taken together, the results of our study provided valuable insight, signifying that PP2C has provided a stepping stone to the molecular mechanism in cotton. Alongside, the differential expression profiling and gene duplications analysis might have experienced functional divergence, and their further study will help considerably in improving the crop yield and quality and cultivating new resistant varieties.

Conclusions
We analyzed the genome analysis, evolutionary rates, and molecular characterization of PP2C genes in the Gossypium genome. Herein, we identified 87 (Gossypium arboreum), 147 (Gossypium barbadense), 181 (Gossypium hirsutum), and 99 (Gossypium raimondii) PP2C genes by bioinformatics analysis in cotton species. Gene synteny analysis showed that GhPP2C are highly conserved, while the gene duplications analysis reflected that only segmental duplication plays a starring role in the PP2C gene extension in cotton. Also, the results of the phylogenetic analysis categorized the PP2C genes into 12 subgroups. We further explored the previously published RNA-sequence (RNA-seq) data to compare the tissue-specific response of PP2C genes and their critical role in organ development and fiber. Various PP2C genes responded promptly to abiotic stresses, including heat, cold, salt, and drought, suggesting their crucial role in abiotic stress tolerance. As a consequence, our findings will facilitate advanced research on the functional analysis of PP2C genes regarding their critical role in tissues, fiber development, and abiotic stress tolerance.

Data Resources for Sequence Retrieval
For identification of PP2C genes in Gossypium and other species, we utilized the Plaza 4.0 database (https://bioinformatics.psb.ugent.be/plaza/) with the help of InterPro PP2C domain "IPR001932". The cotton genome sequences were downloaded from (https://www.cottongen.org/), and A. thaliana sequences were retrieved from TAIR (http://www.arabidopsis.org/). The domains of obtained GhPP2C proteins were further verified using the NCBI-Conserved Domain database (https://www. ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) search program and SMART databases (http://smart. embl-heidelberg.de/) [46]. Those proteins which lack PP2C domains were removed from further analysis. In addition, protein sequences that were found with obvious errors in their gene length or having less than 100 lengths were eliminated.

Multiple Sequence Alignment and Phylogenetic Analysis
The amino acid sequences of the GhPP2C proteins were used for further investigation, and multiple sequence alignment was performed by MUSCLE [47] using MEGA 7 software with the default options [48]. The phylogenetic trees were constructed using the maximum likelihood (ML) method. In order to determine the reliability of the resulting tree, bootstrap values of 1000 replications were performed with the Jones, Taylor, and Thornton amino acid substitution model (JTT model), while keeping the other parameters as a default.

Calculation of the Ka/Ks for Duplicated Genes
The Ka/Ks ratios were calculated for duplicated (segmental and dispersed) gene pairs using MEGA 7.0 [48].

Conserved Motifs, Exon-Intron Structure Analysis, and Physicochemical Parameters of PP2C Proteins
Conserved motif scanning of GhPP2C proteins was carried out through local MEME Suite Version 5.0.3. For this purpose, parameter settings were calibrated as follows: Maximum number of motifs 10, with a minimum width of 100 and a maximum of 150. The other parameters were set as default [49]. For the exon-intron structure, we used the Gene Structure Display Server (GSDS 2.0) (http://gsds.cbi. pku.edu.cn) [50]. The physicochemical properties of the proteins, including molecular weight (MW), isoelectronic points (pI), aliphatic index, and GRAVY values for each gene, were calculated using the ExPASY PROTPARAM tools (http://web.expasy.org/protparam/). The subcellular localization was predicted using the WOLF PSORT (https://wolfpsort.hgc.jp/) website.

Cis-Elements Predictions of GhPP2C
Every GhPP2C promoter sequence (selected as 2000 upstream bp) was imported in Generic File Format (GFF) file from the cotton genome. Then, the PlantCARE database (http://bioinformatics. psb.ugent.be/webtools/plantcare/html/) [51] was utilized to identify the cis-regulatory elements for promoters of each gene.

Chromosomal Location and Synteny Correlation Analysis
The chromosomal location of PP2C genes was illustrated from top to bottom concerning their position in the genome annotation by using TBtools software [52]. For synteny gene analysis, the relationships were verified between the homologs of A. thaliana and Gossypium hirsutum. Circos (using TBtools software) program was applied to exhibit the syntenic relationships among the chromosomes of G. hirsutum and A. thaliana [52].

Pearson Correlation Analyses (PCC)
Pearson correlation analysis was performed with the help of Excel 2013 in order to evaluate the PCC values that were used for qRT-PCR according to a previously reported study [53]. As well, the PCC of fragments per kilobase of transcript per million fragments mapped (FPKM) values was implemented using RStudio (R program) at 0.05 (p-value) significance level.

Plant Material and Treatments
In the present study, the germinated seeds of G. hirsutum cv. Junmian 1 were grown in plastic pots containing a mixture of soil and vermiculite (3:1). The pots were then placed in an artificial growth chamber for five weeks. The growth conditions were as follows: The temperature was set to 24/16 ºC , the photoperiod was 16/8 h, and the relative humidity was 65%-70%. The two-week-old seedling was exposed to specific treatment, as follows: For heat and cold treatments, seedlings were exposed to 38 ºC and 4 ºC , respectively. For NaCl and drought stress treatments, the seedlings were cultured in a nutrient solution medium with 250 µM and 6000 PEG (w/v). All treatments were carried out in continuous time intervals of 1, 6, and 12 h, respectively. Additionally, for every specific treatment, five randomly chosen whole seedlings were pooled to form a biological replicate. After that leaf samples were quick-frozen in liquid nitrogen and stored at −80 • C for further use.

RNA Isolation and Transcriptional Profiling of GhPP2C under Various Stresses
Total RNA was isolated from the treated frozen leaves with Trizol (Invitrogen) following the manufacturer's instructions. RNA was reverse-transcribed into cDNA using the Primer Script RT reagent kit (TAKARA, Dalian China) according to their instructions. Specific primers were designed using Becan Designer 7.9 and are presented in Supplementary Table S7. In order to check the specificity of the primers, we used the BLAST tool against the Gossypium hirsutum genome for confirmation. RT-PCR was performed according to the guidelines of previous studies [54]. Relative fold expression was calculated with the comparative Ct-method. The expression patterns of all GhPP2C genes were analyzed based on a previous study [55,56]. The cotton histone3 (AF024716) gene was used as the reference gene for qRT-PCR. In brief, the real-time PCR amplification reactions were performed on an ABI 7500 Real-Time PCR System (Applied Biosystems, California, CA, USA) using SYBR Green (Vazyme, Nanjing, China) with three replicates. The amplification parameters were denaturation at 95 • C for 10 min, 40 cycles of denaturation at 95 • C for 15 s, annealing at 60 • C for 15 s, and extension at 72 • C for 15 s.