Article

Evolutionary Expansion, Structural Diversification, and Functional Prediction of the GeBP Gene Family in Brassica oleracea

1 Department of Bioinformatics, School of Life Sciences, North China University of Science and Technology, Tangshan 063000, China
2 Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan 063000, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Horticulturae 2025, 11(8), 968; https://doi.org/10.3390/horticulturae11080968
Submission received: 25 June 2025 / Revised: 6 August 2025 / Accepted: 12 August 2025 / Published: 15 August 2025
(This article belongs to the Section Genetics, Genomics, Breeding, and Biotechnology (G2B2))

Abstract

The GLABROUS1 Enhancer Binding Protein (GeBP) gene family plays a crucial role in plant growth, development, and stress responses. In this study, 28 GeBP genes were identified in Brassica oleracea using HMMER and validated through multiple conserved domain databases. A phylogenetic tree was constructed based on the GeBP protein sequences from B. oleracea, Arabidopsis thaliana, Brassica rapa, and Brassica napus, dividing them into four evolutionary clades (A–D), which revealed a close evolutionary relationship within the genus Brassica. Conserved motif and gene structure analyses showed clade-specific features, while physicochemical property analysis indicated that most BoGeBP proteins are hydrophilic, nuclear-localized, and structurally diverse. Gene duplication and chromosomal localization analyses suggested that both segmental and tandem duplication events have contributed to the expansion of this gene family. Promoter cis-element analysis revealed a dominance of light-responsive and hormone-responsive elements, implying potential roles in photomorphogenesis and stress signaling pathways. Notably, the protein encoded by BolC01g019630.2J possesses both a transmembrane domain and characteristics of the Major Facilitator Superfamily (MFS) transporter family, and it is predicted to localize to the plasma membrane. This suggests that it may act as a molecular bridge between environmental signal perception and transcriptional regulation, potentially representing a novel signaling mechanism within the GeBP family. This unique feature implies its involvement in transmembrane signal perception and downstream transcriptional regulation under environmental stimuli, providing valuable insights for further investigation of its role in stress responses and metabolic regulation. Overall, this study provides a theoretical foundation for understanding the evolutionary patterns and functional diversity of the GeBP gene family in B. oleracea and lays a basis for future functional validation and breeding applications.

1. Introduction

Transcription factors (TFs) are key regulators in plants that play crucial roles in various growth and developmental processes, as well as in responses to abiotic stresses [1,2,3,4]. The GLABROUS1 Enhancer Binding Protein (GeBP) family is a plant-specific family of transcription factors whose members share a central DNA-binding domain; it was first discovered in Arabidopsis thaliana. GeBP and its homologs share two conserved regions: an unknown motif in the central region and a hypothesized leucine zipper motif at the C-terminus [5]. Both regions are crucial for the transactivation of downstream gene expression. At present, 16, 10, 10, 9, and 16 GeBP genes have been identified in Arabidopsis [5], Solanum lycopersicum [6], Mangifera indica L. [7], Glycine max [8], and Bambusoideae [9], respectively. Previous studies highlight the importance of the GeBP gene family in plant growth and development. For instance, GeBP regulates trichome development by controlling the expression of the GLABROUS1 (GL1) gene [10]. In Arabidopsis, GeBP also influences trichome elongation by modulating gibberellins and cytokinins in vivo [11].
Trichomes are hair-like epidermal structures found on the aerial parts of most terrestrial plants [12]. They are defined as unicellular or multicellular appendages that extend from the above-ground epidermal cells of plants [13]. These appendages play a key role in the development of plants and occur in a wide variety of species. Trichomes form a protective barrier against natural hazards, such as herbivores, ultraviolet (UV) irradiation, pathogen attacks, and excessive transpiration, and they aid in seed dispersal and seed protection [14]. In Arabidopsis, their initiation requires the activity of GLABROUS1 (GL1), which is expressed in the epidermis [15]. Curaba et al. identified and isolated the GL1 enhancer-binding protein (GeBP), which specifically binds to a regulatory element of the GL1 promoter and regulates the expression of GL1 [5]. GeBP is predicted to play a role in various hormonal pathways [16]. It is worth noting that the GeBP/GPL genes represent a newly defined class of leucine zipper (Leu zipper) transcription factors, and they play redundant roles in the regulation of the cytokinin hormone pathway [10]. A recent study demonstrated that the Arabidopsis GeBP-LIKE 4 (GPL4) transcription factor, an inhibitor of root growth, is rapidly induced in root tips in response to cadmium (Cd) [17]. These findings suggest that GeBP family genes are involved not only in plant development but also in protection against environmental stress. The constitutive expressor of pathogenesis-related gene-5 (CPR5) in Arabidopsis displays highly pleiotropic functions, particularly in pathogen responses, cell proliferation, cell expansion, and cell death. GeBP/GPLs were found to be involved in the control of cell expansion in a CPR5-dependent manner, but not in the control of cell proliferation, by regulating a set of genes that represents a subset of the CPR5 pathway [18].
Brassica oleracea is a diploid species (2n = 18) within the family Brassicaceae, encompassing a wide array of economically important vegetables, such as cabbage, broccoli, cauliflower, kale, brussels sprouts, and kohlrabi. These diverse morphotypes have arisen through centuries of selective breeding, leading to significant morphological and nutritional variation within the species [19].
Native to the coastal regions of southern and western Europe, B. oleracea has been cultivated since ancient times. Its wild ancestors are believed to have originated in the eastern Mediterranean, with early cultivation records dating back to Greek and Roman periods [20].
The species has undergone whole-genome duplication events, contributing to its genetic complexity and adaptability. Recent advancements have led to the development of high-quality reference genomes for various B. oleracea cultivars, including cabbage and broccoli, facilitating comparative genomic studies and providing insights into gene family evolution, structural variations, and metabolic regulation [21,22,23]. These genomic resources have proven invaluable in understanding the domestication processes, morphological diversification, and stress response mechanisms in B. oleracea, thereby supporting breeding programs aimed at improving crop yield, nutritional quality, and resilience to environmental stresses. In practice, the availability of high-resolution genomic data enables the identification of key loci associated with traits such as disease resistance, flavor enhancement, and texture improvement, thereby facilitating marker-assisted selection and accelerating the development of superior cabbage cultivars.
The genus Brassica includes economically important horticultural crops [24], with Brassica napus being an allopolyploid derived from hybridization between B. oleracea and Brassica rapa [25]. Understanding the evolutionary trajectories and functional diversification of gene families in these species can directly inform breeding strategies for yield stability, nutritional improvement, and stress resilience. Moreover, Arabidopsis, as the model plant of the family Brassicaceae, offers a well-annotated genomic reference and serves as a valuable experimental control for in silico comparative analyses [26].
Recently, GeBP genes have been characterized in many species. However, little is known about the evolutionary dynamics of the GeBP family in B. oleracea. In this study, we applied bioinformatic methods to predict and analyze 28 BoGeBP genes in B. oleracea, including phylogenetic analysis, chromosomal localization, homology analysis, tissue expression analysis, gene structure and protein structure analyses, and codon preference analysis. The general objective of this study is to generate fundamental genomic and functional insights into the GeBP transcription factor family in B. oleracea, with specific aims to (i) characterize its evolutionary expansion and duplication mechanisms in relation to B. rapa, B. napus, and Arabidopsis; (ii) predict functional diversification based on structural and regulatory features; and (iii) identify candidate genes with potential applications in molecular breeding for horticultural trait improvement. We provide a reference and theoretical basis for further functional studies of BoGeBP genes.

2. Materials and Methods

2.1. Acquisition of Materials and Identification of GeBP Gene Family Members

The complete genome sequences of Arabidopsis were obtained from the TAIR database (https://dev.arabidopsis.org, accessed on 3 March 2025) [27], while the genome data for B. oleracea (BOL20), B. rapa (Brara_Chiifu_V3.5), and B. napus (ZS11) were retrieved from the BRAD database (brassicadb.cn). Since the gene family under investigation contains only one type of conserved domain (DUF573), the Hidden Markov Model (HMM) profile of the GeBP family (Pfam ID: PF04504) was downloaded from the Pfam database by searching for “GeBP”.
Identification of GeBP gene family members in B. oleracea, B. rapa, and B. napus was conducted using HMMER searches against the respective genomic databases, employing the downloaded HMM profile. Candidate GeBP family members were filtered using an E-value threshold of <0.001. Redundant transcripts were removed, and for genes with multiple transcript variants, only the longest transcript was retained as the representative sequence.
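The filtering step described above (E-value cutoff, then one longest transcript per gene) can be illustrated with a minimal Python sketch. The record layout, the function name select_representatives, and the example identifiers are assumptions made for illustration only; they do not reproduce the HMMER output format or the authors' actual scripts.

```python
E_VALUE_CUTOFF = 1e-3  # threshold stated in the text (E-value < 0.001)


def select_representatives(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits below the E-value cutoff, then the longest transcript per gene.

    `hits` is an iterable of (gene_id, transcript_id, protein_length, e_value)
    tuples; this layout is a hypothetical simplification of parsed search output.
    """
    best = {}
    for gene_id, transcript_id, length, e_value in hits:
        if e_value >= cutoff:
            continue  # hit fails the E-value filter
        current = best.get(gene_id)
        if current is None or length > current[1]:
            best[gene_id] = (transcript_id, length)
    return {gene: transcript for gene, (transcript, _) in best.items()}


# Hypothetical hits (made-up identifiers, not real BRAD records)
example_hits = [
    ("geneA", "geneA.t1", 310, 2e-40),
    ("geneA", "geneA.t2", 355, 5e-42),  # longer isoform of geneA
    ("geneB", "geneB.t1", 120, 0.5),    # rejected by the E-value cutoff
    ("geneC", "geneC.t1", 280, 9e-4),
]
representatives = select_representatives(example_hits)
print(representatives)
```

In this toy input, geneB is discarded by the cutoff and geneA is represented by its longer transcript.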
GeBP gene IDs in Arabidopsis were retrieved based on previous literature and validated through the PlantTFDB database (http://planttfdb.gao-lab.org/index.php, accessed on 3 March 2025) [28], resulting in the identification of 23 family members. Corresponding gene sequences were extracted from the genome accordingly.
All candidate sequences were subsequently validated using domain analysis tools, including SMART (http://smart.embl-heidelberg.de/, accessed on 3 March 2025) [29], InterPro (https://www.ebi.ac.uk/interpro/, accessed on 3 March 2025) [30], and CDD (https://www.ncbi.nlm.nih.gov/cdd, accessed on 3 March 2025) [31]. Only sequences confirmed by at least one of these tools were retained, and the union of validated sequences was considered as the final set of GeBP gene family members.

2.2. Multiple Sequence Alignment and Phylogenetic Tree Construction of the GeBP Gene Family in B. oleracea

A total of 28 B. oleracea GeBP protein sequences were subjected to multiple sequence alignment using MUSCLE (https://sourceforge.net/projects/muscle/, accessed on 5 March 2025) [32]. Phylogenetic analysis was then performed using FastTree version 2.1.11, and a maximum likelihood (ML) method was employed to construct the phylogenetic tree of GeBP proteins across four species.
Based on the conserved domain structures and MEME motif analysis of GeBP family members from all species (Figure S1), as well as reference to previous studies, B. oleracea GeBP proteins were classified into four clades: Groups A, B, C, and D. The corresponding B. oleracea family members contained within each clade of the phylogenetic tree were accordingly assigned to these four groups to facilitate subsequent analyses.

2.3. Gene Structure and Conserved Motif Analyses of the GeBP Gene Family in B. oleracea

Conserved motifs within the GeBP gene family across the studied species were identified using the online MEME suite (https://meme-suite.org/meme/, accessed on 6 March 2025) [33], with the number of motifs set to 10. The resulting motif data were visualized using CFVisual_V2.1.5 software (https://github.com/ChenHuilong1223/CFVisual, accessed on 6 March 2025) [34] to illustrate the distribution and composition of conserved motifs among GeBP family members.

2.4. Physicochemical Property Analysis of GeBP Gene Family Members in B. oleracea

Various physicochemical properties of B. oleracea GeBP proteins, including amino acid length, molecular weight (MW), theoretical isoelectric point (pI), instability index, aliphatic index, and grand average of hydropathicity (GRAVY), were analyzed using TBtools-II v2.323 (https://github.com/CJ-Chen/TBtools-Manual, accessed on 7 March 2025) [35]. Subcellular localization predictions were conducted using the Plant-mPLoc tool (http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/, accessed on 7 March 2025) [36]. Transmembrane domains were identified using TMHMM 2.0 (https://services.healthtech.dtu.dk/services/TMHMM-2.0/, accessed on 7 March 2025) [37], and the presence of signal peptides was predicted using SignalP 5.0 (https://services.healthtech.dtu.dk/services/SignalP-5.0/, accessed on 7 March 2025) [38].

2.5. Chromosomal Distribution of GeBP Gene Family Members in B. oleracea

The chromosomal locations of B. oleracea GeBP gene family members were determined and visualized using MapChart 2.32 software (https://www.wur.nl/en/show/Mapchart.htm, accessed on 8 March 2025) [39].

2.6. Protein Structure Prediction of GeBP Family Members in B. oleracea

The secondary structures of B. oleracea GeBP proteins were predicted using the online SOPMA tool (https://npsa.lyon.inserm.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html, accessed on 9 March 2025) [40] with default parameters. Tertiary structure prediction was conducted using SWISS-MODEL (https://swissmodel.expasy.org, accessed on 9 March 2025) [41], also under default settings. The best-quality models were selected based on their evaluation scores, and their corresponding PDB files were downloaded. These structures were then visualized and refined using VMD 1.9.4a53 software (https://www.ks.uiuc.edu/Development/Download/download.cgi?PackageName=VMD, accessed on 9 March 2025) [42] for further presentation enhancement.

2.7. Cis-Acting Element Analysis of Promoter Regions in B. oleracea GeBP Gene Family Members

Promoter sequences, defined as the 2000 bp upstream regions of the coding sequences (CDS) of B. oleracea GeBP genes, were extracted using TBtools. These sequences were then submitted to the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/, accessed on 10 March 2025) [43] for the prediction of cis-acting regulatory elements. The identified elements were subsequently visualized using CFVisual to facilitate interpretation of their distribution and functional classification.
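As an illustration of what a 2000 bp upstream region means in coordinate terms, the following Python sketch computes the promoter interval for a gene on either strand. It is not the TBtools implementation; the function name, the 1-based inclusive coordinate convention, and the example coordinates are assumptions.

```python
def promoter_interval(cds_start, cds_end, strand, chrom_length, upstream=2000):
    """Return the 1-based inclusive (start, end) of the upstream region.

    cds_start <= cds_end are genome coordinates of the coding sequence.
    For '-' strand genes the upstream region lies at higher coordinates,
    and the extracted sequence would need to be reverse-complemented.
    Returns None when no upstream sequence exists (gene at chromosome edge).
    """
    if strand == "+":
        end = cds_start - 1
        start = max(1, end - upstream + 1)   # clip at chromosome start
    elif strand == "-":
        start = cds_end + 1
        end = min(chrom_length, start + upstream - 1)  # clip at chromosome end
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    if start > end:
        return None
    return start, end


# Hypothetical coordinates for illustration
plus_region = promoter_interval(5001, 6000, "+", chrom_length=100_000)
minus_region = promoter_interval(5001, 6000, "-", chrom_length=100_000)
clipped_region = promoter_interval(501, 900, "+", chrom_length=100_000)
print(plus_region, minus_region, clipped_region)
```

Genes located closer than 2000 bp to a chromosome end yield a shorter interval, which is worth keeping in mind when comparing element counts between promoters.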

2.8. Protein–Protein Interaction Network Prediction of B. oleracea GeBP Family Members

The protein–protein interaction (PPI) network of each B. oleracea GeBP protein was predicted using the STRING database (https://cn.string-db.org/, accessed on 10 March 2025) [44] with default parameters. The types and numbers of interacting proteins were identified and summarized to explore potential functional associations.

2.9. GO and KEGG Functional Annotation of B. oleracea GeBP Gene Family Members

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations of B. oleracea GeBP family members were conducted using the online tool eggNOG-Mapper (http://eggnog-mapper.embl.de/, accessed on 11 March 2025) [45]. The resulting annotation data were further processed using TBtools. Subsequently, each GO and KEGG entry was manually checked and classified through the Gene Ontology (https://www.geneontology.org/, accessed on 11 March 2025) [46] and KEGG (https://www.genome.jp/kegg/, accessed on 11 March 2025) [47] databases to obtain detailed functional categories and pathway names.

2.10. Codon Usage Bias Analysis of B. oleracea GeBP Gene Family Members

The CDS sequences of Arabidopsis were downloaded from the TAIR database, while those of B. oleracea, B. napus, and B. rapa were obtained from the BRAD database. The GeBP family CDS sequences from these four species were extracted and subjected to codon usage bias analysis using CodonW (https://sourceforge.net/projects/codonw/, accessed on 12 March 2025) [48] with default parameters.
Based on the effective number of codons (ENc) values, genes were ranked, and the top and bottom 10% of the ranked dataset (rounded to the nearest integer) were selected as representative high-expression and low-expression gene sets, respectively. The difference in relative synonymous codon usage (ΔRSCU = RSCU_high − RSCU_low) was calculated between the two sets. Codons with ΔRSCU > 0.08 were identified as high-expression preferred codons [49].
Additionally, the average RSCU value of each codon across all GeBP members was calculated. For each amino acid, the codon with the highest average RSCU value was identified. If this value exceeded 1, the codon was considered a high-frequency preferred codon. Codons that were both high-expression preferred and high-frequency preferred were ultimately defined as optimal codons [49].
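The two criteria above can be sketched in Python as follows. This is a simplified illustration rather than the CodonW procedure: RSCU is computed here from pooled codon counts of each gene set (the text averages per-gene RSCU values), the high- and low-expression sets are passed in directly instead of being derived from ENc ranks, and all function names and example sequences are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Synonymous codon families (stop codons excluded)
FAMILIES = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES.setdefault(aa, []).append(codon)


def rscu(cds_list):
    """RSCU = observed codon count / mean count of its synonymous family."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    values = {}
    for codons in FAMILIES.values():
        total = sum(counts[c] for c in codons)
        for c in codons:
            values[c] = counts[c] * len(codons) / total if total else 0.0
    return values


def optimal_codons(high_set, low_set, all_set, delta_cutoff=0.08):
    """Codons that are both high-expression preferred and high-frequency."""
    high, low, overall = rscu(high_set), rscu(low_set), rscu(all_set)
    preferred = {c for c in high if high[c] - low[c] > delta_cutoff}
    frequent = set()
    for codons in FAMILIES.values():
        top = max(codons, key=overall.get)  # most used codon per amino acid
        if overall[top] > 1:
            frequent.add(top)
    return preferred & frequent


# Toy alanine-only sequences, for illustration
high_genes = ["GCTGCTGCTGCC"]
low_genes = ["GCTGCCGCAGCG"]
result = optimal_codons(high_genes, low_genes, high_genes + low_genes)
print(result)
```

In the toy input, GCT has a ΔRSCU of 2.0 and the highest overall RSCU among the alanine codons, so it is the only codon satisfying both criteria.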

2.11. Tissue Expression Analysis of B. oleracea GeBP Gene Family Members

Tissue expression data were downloaded from the NCBI database (https://www.ncbi.nlm.nih.gov, accessed on 13 March 2025) [50] under the accession numbers PRJNA672666, PRJNA562412, PRJNA641876, PRJNA649862, PRJNA688405, PRJNA548819, and PRJNA1051351. Specifically, PRJNA672666 contains expression data of endosperm tissues subjected to different temperature and treatment durations (26 °C for 72 h, 26 °C for 24 h, 26 °C for 1 h, 16 °C for 72 h, 16 °C for 24 h, and 16 °C for 1 h). PRJNA562412 provides expression data for leaf tissues, PRJNA641876 for embryo tissues, PRJNA649862 for 6-day-old sprouts under blue light, PRJNA688405 for root tissues, PRJNA548819 for stem tissues, and PRJNA1051351 for leaf tissues under drought stress conditions in B. oleracea.
Quality control of each transcriptome dataset was performed using Trimmomatic (version 0.38) with modified parameters (Trimmomatic-0.38/adapters/TruSeq3-PE.fa:2:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:50). Transcripts per million (TPM) values of the GeBP family genes were extracted from the processed expression data. Expression levels of BoGeBP genes were normalized by log2-transformation of TPM values and visualized using TBtools.
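The log2 normalization can be sketched as below. The text does not state how zero TPM values were handled, so the pseudocount of 1 used here is an assumption, as are the function name and the example values.

```python
import math


def log2_transform(tpm_by_gene, pseudocount=1.0):
    """Return log2(TPM + pseudocount) for every gene and sample.

    The pseudocount (assumed, not taken from the text) avoids log2(0)
    for genes that are not expressed in a given sample.
    """
    return {
        gene: [math.log2(value + pseudocount) for value in values]
        for gene, values in tpm_by_gene.items()
    }


# Hypothetical TPM values across three samples
example_tpm = {"geneA": [0.0, 3.0, 15.0], "geneB": [1.0, 7.0, 31.0]}
transformed = log2_transform(example_tpm)
print(transformed)
```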

2.12. Collinearity Analysis of B. oleracea GeBP Gene Family

Collinearity analysis of the GeBP gene family in B. oleracea was performed using the MCScanX plugin integrated in TBtools (RCAC—Software: MCScanX-SuperFast, https://purdue.edu, accessed on 14 March 2025 [51]).

2.13. Analysis of Gene Duplication Types in the B. oleracea GeBP Gene Family

Gene duplication types within the B. oleracea GeBP gene family were analyzed using the downstream tool duplicate_gene_classifier in MCScanX.

3. Results

3.1. Identification and Retrieval of GeBP Gene Family Members

Using the downloaded Hidden Markov Model (HMM) profile of GeBP, HMMER searches were conducted against the B. oleracea genome database. Candidate genes were further validated through domain confirmation using InterPro, SMART, and CDD databases. The intersection of results from these three tools yielded 28 B. oleracea GeBP genes: BolC01g001690.2J, BolC01g019250.2J, BolC01g019630.2J, BolC01g037440.2J, BolC01g055250.2J, BolC03g069820.2J, BolC04g027520.2J, BolC04g041600.2J, BolC04g042880.2J, BolC04g042890.2J, BolC04g042900.2J, BolC04g042910.2J, BolC04g042920.2J, BolC04g042930.2J, BolC04g047580.2J, BolC04g061600.2J, BolC05g009330.2J, BolC05g060710.2J, BolC06g017320.2J, BolC06g018980.2J, BolC06g018990.2J, BolC07g015340.2J, BolC07g039050.2J, BolC07g051920.2J, BolC08g008250.2J, BolC08g055380.2J, BolC09g005140.2J, and BolC09g019010.2J.
Using the same approach, 20 GeBP gene family members were identified in B. rapa, and 44 were identified in B. napus (Table 1).
Whole-genome replication events, including whole-genome duplication (WGD) and whole-genome triplication (WGT), are generally accompanied by substantial gene loss [52,53]. The genomes of ancestral Brassica species experienced WGD events, diverged from the Arabidopsis lineage, and then underwent a WGT event specific to the Brassicaceae lineage; B. rapa and B. oleracea later hybridized to form the allotetraploid B. napus, a process that also involved extensive genome reorganization and gene loss [21,25,54]. Arabidopsis has 23 GeBP family genes, so, in theory, B. rapa, B. oleracea, and B. napus should carry 69 (23 × 3), 69 (23 × 3), and 138 (23 × 3 × 2) GeBP genes, respectively, whereas the observed numbers are 20, 28, and 44. Hence, relative to these expectations, B. rapa has lost 49 genes, B. oleracea 41 genes, and B. napus 94 genes in the GeBP family since the WGT event in ancestral Brassica. In addition, four (20 + 28 − 44) genes were lost in B. napus after the hybridization between B. rapa and B. oleracea.
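The expected and lost gene counts quoted above follow from simple arithmetic, reproduced in this short Python sketch. The variable names are illustrative; the numbers are those given in the text, and the expectation assumes full retention of every copy after triplication and hybridization.

```python
ARABIDOPSIS_GEBP = 23  # GeBP genes in Arabidopsis, as stated in the text

# Copy-number multiplier relative to Arabidopsis: WGT gives x3,
# and the allotetraploid B. napus combines two triplicated genomes (x3 x2)
fold = {"B. rapa": 3, "B. oleracea": 3, "B. napus": 3 * 2}
observed = {"B. rapa": 20, "B. oleracea": 28, "B. napus": 44}

expected = {species: ARABIDOPSIS_GEBP * f for species, f in fold.items()}
lost = {species: expected[species] - observed[species] for species in observed}

# Genes missing in B. napus relative to the sum of its two progenitors
lost_after_hybridization = (
    observed["B. rapa"] + observed["B. oleracea"] - observed["B. napus"]
)

for species in observed:
    print(f"{species}: expected {expected[species]}, "
          f"observed {observed[species]}, lost {lost[species]}")
print("lost after hybridization:", lost_after_hybridization)
```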

3.2. Multiple Sequence Alignment and Phylogenetic Tree Construction of the GeBP Gene Family

Based on the conserved domain and MEME motif analyses of GeBP family members from B. oleracea, B. rapa, B. napus, and Arabidopsis, a maximum likelihood (ML) phylogenetic tree was constructed and classified into four groups: Group A (yellow), Group B (blue), Group C (green), and Group D (red) (Figure 1). Among them, B. oleracea contains 11 genes in Group A, 4 in Group B, 6 in Group C, and 7 in Group D.
The GeBP genes from the four species largely clustered together, suggesting a relatively close evolutionary relationship. Notably, genes from the three Brassica species (B. oleracea, B. rapa, and B. napus) clustered more closely with one another than with those from Arabidopsis, which appeared in fewer shared clusters, reflecting its greater divergence from the Brassica lineage.

3.3. Gene Structure and Conserved Motif Analyses of the GeBP Gene Family in B. oleracea

Among the 28 GeBP genes identified in B. oleracea, none contained UTR regions (Table 2). The gene with the greatest number of introns and CDSs was BolC04g061600.2J in Group C. The domain positions within each group were generally consistent, with all genes containing at least motif 10 (blue), motif 2 (orange), and motif 3 (red). In addition to Group D, the other three groups also contained motif 5 (Figure 2). Most members of Groups A and D included motif 6 (purple), except for BolC01g001690.2J and BolC03g069820.2J, which lacked this motif.

3.4. Physicochemical Properties of BoGeBP Gene Family Members

All 28 GeBP proteins in B. oleracea were predicted to be hydrophilic (Table 3). Most proteins were localized in the nucleus. Specifically, BolC01g019630.2J (Group A) was located in the plasma membrane, BolC09g005140.2J and BolC01g055250.2J (Group B) in both the chloroplast and nucleus, BolC06g018980.2J (Group C) in the chloroplast and Golgi apparatus, and BolC03g069820.2J (Group D) in both the plasma membrane and nucleus.
The number of amino acids varied greatly among the proteins, ranging from 126 (BolC04g042900.2J) to 881 (BolC01g019630.2J). Similarly, the molecular weight spanned a wide range from 1.7 kDa (BolC04g042890.2J) to 96.1 kDa (BolC01g019630.2J). The isoelectric point (pI) values ranged from 4.45 to 9.63, with most proteins falling within the acidic to neutral range (4.45–7.89), and a few exhibiting basic properties, such as BolC06g018990.2J (pI = 9.63).
Most proteins had instability indices greater than 40, indicating a tendency toward instability in vitro. For example, BolC04g042900.2J (Group D) had an instability index of 79.05, suggesting that it may be rapidly degraded or may depend on post-translational modifications for functional stability. By contrast, BolC04g041600.2J (Group A) showed a relatively stable profile with an index of 35.4, implying potential for prolonged existence in membrane structures.
The aliphatic index ranged from 48.58 to 92.59, suggesting that most BoGeBP proteins possess favorable thermostability. Notably, BolC03g069820.2J (Group D) had the highest aliphatic index (92.59), indicating potential adaptation to high-temperature environments. All proteins exhibited negative GRAVY (Grand Average of Hydropathy) values (ranging from –1.942 to –0.027), supporting their overall hydrophilic nature.
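For readers unfamiliar with these two indices, the following Python sketch shows how they are commonly defined: GRAVY as the mean Kyte-Doolittle hydropathy of all residues, and the aliphatic index as X(Ala) + 2.9 × X(Val) + 3.9 × (X(Ile) + X(Leu)), where X is the mole percentage of each residue. This is an illustration of the standard formulas, not the TBtools code, and the peptide sequences are toy examples rather than BoGeBP proteins.

```python
# Kyte-Doolittle hydropathy scale
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity; negative values indicate hydrophilicity."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Relative volume occupied by aliphatic side chains (Ala, Val, Ile, Leu)."""
    n = len(sequence)

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy peptides for illustration only
hydrophobic = "AVIL"
hydrophilic = "DEKR"
print(gravy(hydrophobic), aliphatic_index(hydrophobic))
print(gravy(hydrophilic), aliphatic_index(hydrophilic))
```

The charged toy peptide gives a strongly negative GRAVY and an aliphatic index of zero, which is the direction of the hydrophilic profile reported for the BoGeBP proteins.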
Additionally, two BoGeBP proteins were identified as transmembrane proteins: BolC01g019630.2J (Group A), with transmembrane regions at positions 413–431, 461–483, 490–512, 522–544, 557–579, 583–605, 657–679, 728–750, 757–779, 799–821, and 828–850; and BolC03g069820.2J (Group D), with transmembrane regions at positions 160–182 and 589–611.

3.5. Chromosomal Distribution of BoGeBP Gene Family Members

The chromosomal locations of BoGeBP genes were mapped across the B. oleracea genome (Figure 3). Chromosome 1 contains five BoGeBP genes, while chromosome 3 harbors one. Chromosome 4 carries the highest number, with ten BoGeBP genes, indicating the presence of tandem gene duplications. Chromosomes 5, 8, and 9 each contain two BoGeBP genes. Chromosomes 6 and 7 contain three BoGeBP genes each.
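The per-chromosome counts can be cross-checked directly from the 28 gene identifiers listed in Section 3.1. The Python sketch below assumes that the two digits after "BolC" in each identifier denote the chromosome number, which is consistent with the distribution described here; the regular expression and function name are illustrative.

```python
import re
from collections import Counter

# The 28 BoGeBP gene IDs reported in Section 3.1
BOGEBP_IDS = """
BolC01g001690.2J BolC01g019250.2J BolC01g019630.2J BolC01g037440.2J
BolC01g055250.2J BolC03g069820.2J BolC04g027520.2J BolC04g041600.2J
BolC04g042880.2J BolC04g042890.2J BolC04g042900.2J BolC04g042910.2J
BolC04g042920.2J BolC04g042930.2J BolC04g047580.2J BolC04g061600.2J
BolC05g009330.2J BolC05g060710.2J BolC06g017320.2J BolC06g018980.2J
BolC06g018990.2J BolC07g015340.2J BolC07g039050.2J BolC07g051920.2J
BolC08g008250.2J BolC08g055380.2J BolC09g005140.2J BolC09g019010.2J
""".split()

# Assumed ID layout: BolC<two-digit chromosome>g<locus number>.2J
ID_PATTERN = re.compile(r"^BolC(\d{2})g\d+\.2J$")


def genes_per_chromosome(gene_ids):
    """Count genes per chromosome by parsing the chromosome field of each ID."""
    counts = Counter()
    for gene_id in gene_ids:
        match = ID_PATTERN.match(gene_id)
        if match is None:
            raise ValueError(f"unexpected gene ID format: {gene_id}")
        counts[int(match.group(1))] += 1
    return counts


chromosome_counts = genes_per_chromosome(BOGEBP_IDS)
for chromosome in sorted(chromosome_counts):
    print(f"C{chromosome:02d}: {chromosome_counts[chromosome]}")
```

The tally agrees with the distribution given in the text, with chromosome 4 carrying ten genes and no BoGeBP gene on chromosome 2.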

3.6. Protein Structure Analysis of BoGeBP Family Members

The secondary structure analysis revealed distinct patterns among the four groups of BoGeBP proteins (Table 4). In Group A, the average content of α-helices was 44.76%, while random coils accounted for 42.93%, indicating a relatively balanced structure. This balance suggests that these proteins possess both structural stability and potentially functional active regions. Group B proteins exhibited an average α-helix content of 34.15% and a significantly higher proportion of random coils at 57.05%, implying a more flexible and less ordered structure. Group C proteins displayed a notably high α-helix content, averaging 53.05%, with some proteins reaching nearly 60% (e.g., BolC06g018980.2J), suggesting a strong helical character. In Group D, the average α-helix content was 39.01%, with substantial variability ranging from 19.58% (BolC03g069820.2J) to 53.55% (BolC04g042890.2J). The average random coil proportion was 44.47%. Some proteins in this group also showed a relatively high proportion of extended chain structures, such as BolC03g069820.2J (27.08%), which may indicate specialized functions.
The tertiary structure analysis revealed substantial diversity across the entire gene family, while members within the same group exhibited relatively similar structures (Figure 4). For example, Group A members, including BolC04g041600.2J, BolC07g051920.2J, BolC04g027520.2J, BolC09g019010.2J, BolC05g009330.2J, and BolC04g047580.2J, showed structural resemblance. Similar intra-group structural consistency was observed in Group B (e.g., BolC09g005140.2J, BolC07g039050.2J, BolC05g060710.2J), Group C (e.g., BolC06g018990.2J, and BolC07g015340.2J), and Group D (e.g., BolC04g042880.2J, BolC04g042930.2J, BolC04g042920.2J, and BolC04g042910.2J). Proteins with similar tertiary structures may possess analogous functions and possibly share closer evolutionary relationships.

3.7. Cis-Acting Regulatory Elements in the Promoter Regions of BoGeBP Genes

Among the cis-acting regulatory elements identified in the promoter regions of the 28 BoGeBP genes, light-responsive elements accounted for the highest proportion (46%), followed by hormone-responsive elements (25%), stress-responsive elements (21%), and other (8%) (Figure 5).
This result suggests that most BoGeBP genes are potentially regulated by light signals, indicating that GeBP genes may be involved not only in stress responses but also in photomorphogenesis or light signal transduction pathways.
The substantial presence of hormone-responsive elements—such as ABRE (involved in abscisic acid response) and CGTCA-motif (related to methyl jasmonate response)—suggests that BoGeBP genes may play important roles in hormone-mediated stress responses, highlighting their potential for improving plant stress resistance through molecular breeding.
Among all genes, the following three had the highest number of cis-elements: BolC06g017320.2J (24 elements), which exhibited an even distribution among light-, stress-, and hormone-responsive elements, indicating its diverse regulatory functions in plant development and stress response; BolC07g051920.2J (21 elements), which predominantly contained light-responsive elements (13 in total), including 4 G-box and 3 GT1 motifs, suggesting a strong role in light signaling; and BolC08g055380.2J (20 elements), which contained 8 light-responsive and 7 hormone-responsive elements, further supporting its involvement in light and hormonal regulatory pathways.

3.8. Protein–Protein Interaction Network Prediction of BoGeBP Family Members

Except for BolC03g069820.2J, all members of the BoGeBP family exhibit complex predicted protein–protein interaction networks (Figure 6). Although most proteins, such as BolC01g001690.2J and BolC04g042900.2J, were annotated as “uncharacterized proteins,” the functions of their interacting partners provided crucial clues. For instance, BolC01g001690.2J interacts with members of the ATG9 family (e.g., A0A0D3BT90 and A0A0D3C3E1), suggesting its potential involvement in autophagosome formation or cytoplasm-to-vacuole targeting (Cvt) processes. This implies that this gene cluster may regulate the assembly of autophagy-related membrane structures in response to nutrient deprivation or pathogen invasion.
BolC05g060710.2J, annotated as a PALP domain-containing protein, interacts with numerous thioredoxin domain-containing proteins (e.g., A0A0D3A515 and A0A0D3BVL7), indicating a possible role in redox homeostasis or sulfur metabolism regulation.
BolC04g047580.2J and BolC04g042900.2J interact with A0A0D3C304, a phytocyanin domain-containing protein, suggesting that the BoGeBP family plays an important role in the function of this protein, which may act as a signal transduction component or a metal ion-binding protein involved in intercellular communication or copper/iron homeostasis. Proteins such as BolC04g042880.2J, BolC04g042890.2J, and BolC04g042900.2J interact with A0A0D3E8D6, which contains an ERCC4 domain frequently associated with DNA repair, indicating a potential core role for the family in maintaining genome stability. BolC07g039050.2J and BolC09g005140.2J interact with bifunctional dihydrofolate reductase-thymidylate synthase proteins (e.g., A0A0D3A1Z0 and A0A0D3B7L5), which are directly involved in dTMP synthesis and folate metabolism, possibly influencing DNA replication and repair efficiency. BolC01g019630.2J, annotated as an MFS transporter protein, interacts with pectinesterase and peroxidase proteins, potentially coordinating cell wall softening (via pectin degradation) and reactive oxygen species scavenging, thus contributing to pathogen defense or developmental regulation.

3.9. GO and KEGG Analyses of BoGeBP Family Members

Gene Ontology (GO) functional annotation was performed for the BoGeBP family, and 28 BoGeBP genes were assigned a total of 133 GO terms (Table S1). These GO terms were classified into three main categories: molecular function (MF), cellular component (CC), and biological process (BP).
Among the 133 GO terms, 77 were associated with BP. Of these, 17 terms were enriched in 19 BoGeBP members, while 41 terms were enriched in only one member. For the CC category, 29 GO terms were identified, including 5 terms enriched in 10 members, 5 in 9 members, and 10 in 8 members. In the MF category, 27 GO terms were annotated, with 4 terms enriched in 12 members and 13 terms enriched in only one member.
A statistical analysis of the number of GO terms assigned to the 28 BoGeBP members (Figure 7) showed that GO terms related to biological processes were the most abundant, with a total of 503 annotations. The number of BP-related terms per gene ranged from 0 to 35. GO terms under the cellular component category totaled 214, with most genes associated with approximately 25 terms. GO terms under molecular function were the least numerous, with only 118 annotations, and most genes were associated with around 9 terms.
These findings suggest that BoGeBP family members are extensively involved in various biological processes in B. oleracea, while also playing important roles in cellular structure and molecular function.
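The category totals reported above can be cross-checked with simple arithmetic. The following Python sketch uses only the counts stated in the text; the variable names are illustrative and the script is not part of the original analysis pipeline.

```python
# Counts copied from the text: distinct GO terms and total gene-term
# annotations in each of the three GO categories.
distinct_terms = {"BP": 77, "CC": 29, "MF": 27}
annotations = {"BP": 503, "CC": 214, "MF": 118}

# The distinct terms should add up to the 133 GO terms reported in Table S1.
total_terms = sum(distinct_terms.values())
total_annotations = sum(annotations.values())

# Share of all annotations contributed by each category, in percent.
annotation_share = {
    cat: round(100 * n / total_annotations, 1) for cat, n in annotations.items()
}

print(total_terms)
print(annotation_share)
```

Under these counts, the biological process category accounts for the largest share of annotations, in line with the description of Figure 7.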
According to Tables 5 and 6, BolC05g060710.2J was annotated with the KEGG Orthology entry K01738, which corresponds to cysteine synthase (EC 2.5.1.47). This enzyme catalyzes a key step in cysteine biosynthesis by combining O-acetylserine and sulfide to form cysteine. It is directly involved in both cysteine and methionine metabolism (map00270) and sulfur metabolism (map00920), acting as a core enzyme in the incorporation of sulfur into amino acids. Moreover, it is also associated with carbon metabolism (map01200), biosynthesis of amino acids (map01230), and biosynthesis of secondary metabolites (map01110), indicating its multifunctional role in primary metabolism.
BolC07g039050.2J was annotated with K12951 [cobalt/nickel transporting P-type ATPase (EC 7.2.2.-)] and K15441 [tRNA-specific adenosine deaminase 2 (EC 3.5.4.-)], suggesting potential roles in the transmembrane transport of heavy metal ions and in tRNA editing, which may contribute to the regulation of translation fidelity.
BolC01g019630.2J was annotated with K13783, a member of the major facilitator superfamily (MFS), predicted to function as a glycerol-3-phosphate transporter. This implies its possible involvement in energy metabolism (e.g., the glycerol-3-phosphate shuttle) or lipid biosynthesis, thereby influencing cellular energy homeostasis.

3.10. Codon Usage Bias Analysis of the GeBP Gene Family in B. oleracea

The Nc (effective number of codons) values of the BoGeBP genes ranged from 47.84 to 59.89, with an average of 53.6282. The CAI (codon adaptation index) values ranged from 0.166 to 0.301, with an average of 0.2318. The CBI (codon bias index) values ranged from −0.158 to 0.052, with a mean of −0.0432 (Table 7). For BrGeBP genes, the Nc values ranged from 47.74 to 57.67 (mean: 52.289), the CAI values ranged from 0.132 to 0.287 (mean: 0.2330), and the CBI values ranged from −0.188 to 0.053 (mean: −0.0450). For BnGeBP genes, the Nc values ranged from 37.25 to 58.36 (mean: 52.5648), the CAI values ranged from 0.168 to 0.298 (mean: 0.2284), and the CBI values ranged from −0.249 to 0.067 (mean: −0.0440). For AtGeBP, the Nc values ranged from 40.95 to 61.00, with an average of 50.2983; the CAI values ranged from 0.189 to 0.301 (mean: 0.2393), and the CBI values ranged from −0.229 to 0.081 (mean: −0.0641). These results indicate that GeBP genes in all studied species exhibit relatively weak codon usage bias.
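The conclusion of weak codon usage bias can be illustrated with the reported Nc ranges. The Python sketch below assumes the commonly used convention that Nc spans 20 (one codon per amino acid) to 61 (equal use of all synonymous codons) and that Nc ≤ 35 marks strong bias; this cutoff and the helper name are assumptions introduced for illustration, not values taken from this study.

```python
# Reported Nc ranges (min, max) per gene family, copied from the text.
nc_ranges = {
    "BoGeBP": (47.84, 59.89),
    "BrGeBP": (47.74, 57.67),
    "BnGeBP": (37.25, 58.36),
    "AtGeBP": (40.95, 61.00),
}

# Assumption: Nc <= 35 is a conventional cutoff for strong codon bias.
STRONG_BIAS_CUTOFF = 35.0


def has_strongly_biased_member(nc_min: float, cutoff: float = STRONG_BIAS_CUTOFF) -> bool:
    """Return True if the lowest Nc in a family falls at or below the cutoff."""
    return nc_min <= cutoff


# Flag each family by its most biased member (the minimum Nc).
flags = {name: has_strongly_biased_member(lo) for name, (lo, _hi) in nc_ranges.items()}
print(flags)
```

With this cutoff, no family contains a strongly biased member, although the lowest BnGeBP value (37.25) lies closest to the threshold.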
According to Figure 8 and Table 8, BoGeBP genes possessed 11 optimal codons, of which 8 ended with A/U(T) and 3 with G/C. The high proportion of optimal codons ending in A/U suggests that members of the gene family may tend to be expressed under conditions of low translational efficiency or in stress-specific contexts. BrGeBP genes had 10 optimal codons (6 ending with A/U and 4 with G/C), while BnGeBP had 12 optimal codons (8 ending with A/U and 4 with G/C). AtGeBP genes had the highest number of optimal codons, totaling 15, with 13 ending in A/U and only 2 in G/C. These findings suggest that the GeBP gene family across all studied species preferentially uses codons ending in A or U. Two optimal codons, UUG and UCU, were found to be conserved across all four species, while UGU was conserved among species of the genus Brassica.
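To make the notion of codon preference concrete, the sketch below computes relative synonymous codon usage (RSCU) for a single coding sequence in Python. It is a simplified illustration only: optimal codons in this study were determined with CodonW, whereas the function here merely lists codons with RSCU > 1, and the function names and the toy sequence are hypothetical. Codons are written in the DNA alphabet, so UUG and UCU appear as TTG and TCT.

```python
from collections import Counter

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' = stop).
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds: str) -> dict:
    """Relative synonymous codon usage for one coding sequence."""
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)
    values = {}
    for codon, aa in CODON_TABLE.items():
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        family_total = sum(counts[c] for c in family)
        # Skip stop codons, single-codon amino acids (Met, Trp),
        # and amino acids that do not occur in the sequence.
        if aa == "*" or len(family) == 1 or family_total == 0:
            continue
        # Observed count divided by the count expected under equal usage.
        values[codon] = counts[codon] * len(family) / family_total
    return values


def preferred_codons(values: dict) -> list:
    """Codons used more often than expected under equal synonymous usage."""
    return sorted(c for c, v in values.items() if v > 1.0)


# Toy sequence (hypothetical): Leu x4 (TTG x3, CTT x1) and Ser x3 (TCT x2, AGC x1).
example = rscu("TTGTTGTTGCTTTCTTCTAGC")
print(preferred_codons(example))
```

Counting how many of the preferred codons end in A or T(U) gives the kind of tally summarized in Table 8; a production analysis would pool all coding sequences of a family rather than score a single gene.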

3.11. Tissue-Specific Expression Analysis of BoGeBP Family Members

BoGeBP genes exhibited tissue-specific expression patterns (Figure 9). The highest expression levels were observed in roots, followed by stems and floral buds. Among all members, BolC09g019010.2J showed the highest overall expression, indicating its potential key role in specific tissue functions. Notably, BolC08g055380.2J showed no detectable expression in any of the three tested tissues, suggesting it may be a pseudogene or only expressed under very specific circumstances not included in this study.

3.12. Synteny Analysis of GeBP Gene Family Among Cabbage and Other Species

To investigate the evolutionary relationships of the GeBP gene family, a synteny analysis was conducted among B. oleracea (BoGeBP), Arabidopsis (AtGeBP), B. rapa (BrGeBP), and B. napus (BnGeBP). A total of 19 syntenic gene pairs were identified between BoGeBP and AtGeBP (Figure 10), 45 pairs between BoGeBP and BrGeBP (Figure 11), and up to 72 pairs between BoGeBP and BnGeBP (Figure 12), revealing distinct differences in the degree of genomic collinearity among species.
The degree of synteny was positively correlated with phylogenetic proximity: the closer the evolutionary relationship between species, the greater the number of conserved syntenic gene pairs. This pattern arises because more recently diverged species have experienced fewer chromosomal rearrangements, gene losses, and sequence divergences, so that ancestral genomic blocks remain more intact; furthermore, genes with essential or conserved functions are often subject to purifying selection, which helps to maintain their genomic context and reinforces synteny among close relatives. The highest number of collinear pairs was found between BoGeBP and BnGeBP (72 pairs), consistent with their close taxonomic relationship—B. napus is an allotetraploid derived from hybridization between B. oleracea and B. rapa [25]. This result not only reflects their shared evolutionary history but also suggests that a large number of conserved genome duplication blocks in the GeBP gene family have been retained between these two species.
Although the number of syntenic pairs between BoGeBP and BrGeBP (45 pairs) is lower than that with BnGeBP, it is still considerably higher than that with AtGeBP (19 pairs), which is consistent with the fact that both B. oleracea and B. rapa are diploid members of the genus Brassica. By contrast, the lowest number of syntenic gene pairs, found between BoGeBP and AtGeBP, reflects their greater evolutionary divergence and lower conservation within the GeBP gene family.
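The ordering described above can be summarized numerically. The short Python sketch below uses only the pair counts reported in this section; the fold values are a descriptive convenience introduced here and are not a statistic reported by the study.

```python
# Syntenic gene pairs between BoGeBP and each comparison species, from the text.
syntenic_pairs = {
    "A. thaliana": 19,
    "B. rapa": 45,
    "B. napus": 72,
}

# Order the comparison species from fewest to most shared syntenic pairs.
ranked = sorted(syntenic_pairs, key=syntenic_pairs.get)

# Fold difference relative to the most distant comparison (A. thaliana).
fold_vs_arabidopsis = {
    sp: round(n / syntenic_pairs["A. thaliana"], 2) for sp, n in syntenic_pairs.items()
}

print(ranked)
print(fold_vs_arabidopsis)
```

Note that raw pair counts also depend on genome size and ploidy, so the larger value for the allotetraploid B. napus should not be read as a direct measure of sequence similarity.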

3.13. Analysis of Gene Duplication Types in the GeBP Gene Family

The BoGeBP genes predominantly originated from whole-genome duplication (WGD; 57%) and tandem duplication (32%), suggesting that local tandem duplication may have rapidly generated functionally redundant genes, which, in combination with WGD, contributed to enhanced adaptability. In BrGeBP genes, dispersed duplication accounted for the majority (55%), along with a moderate proportion of WGD (30%). This pattern implies that transposition or chromosomal rearrangement may play a role in dynamically modulating gene function. For BnGeBP genes, WGD accounted for 82% of the family members, consistent with the allotetraploid origin of B. napus, indicating that large-scale genome duplication events, such as the polyploidization that accompanies interspecific hybridization, are the primary drivers of gene family expansion (Table 9). By contrast, AtGeBP genes were primarily derived from dispersed duplication (57%) and lacked both tandem and proximal duplication types, suggesting that the flexibility of the Arabidopsis genome may facilitate rapid adaptation through dispersed duplication events.
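The percentages above can be translated into approximate gene counts. The Python sketch below combines them with the family sizes given in the Discussion (28, 20, 44, and 23 genes); the resulting counts are inferred by rounding and are an assumption of this illustration, so Table 9 remains the authoritative source.

```python
# Family sizes as stated in the Discussion; percentages as stated in this section.
family_size = {"BoGeBP": 28, "BrGeBP": 20, "BnGeBP": 44, "AtGeBP": 23}
reported_percent = {
    "BoGeBP": {"WGD": 57, "tandem": 32},
    "BrGeBP": {"dispersed": 55, "WGD": 30},
    "BnGeBP": {"WGD": 82},
    "AtGeBP": {"dispersed": 57},
}


def implied_count(percent: float, n_genes: int) -> int:
    """Nearest whole number of genes consistent with a reported percentage."""
    return round(percent * n_genes / 100)


# Inferred (not reported) gene counts per duplication type.
implied = {
    fam: {kind: implied_count(p, family_size[fam]) for kind, p in kinds.items()}
    for fam, kinds in reported_percent.items()
}

for fam, kinds in implied.items():
    print(fam, kinds)
```

For example, 57% of 28 BoGeBP genes corresponds to about 16 WGD-derived genes, and 32% to about 9 tandem duplicates.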

4. Discussion

4.1. Expansion Mechanisms and Structural Conservation of the GeBP Gene Family

In this study, 28 GeBP genes were identified in B. oleracea, a number intermediate between the diploid B. rapa (20 genes) and the allotetraploid B. napus (44 genes), and significantly higher than that in the model plant Arabidopsis (23 genes) [21]. This pattern reflects the pivotal roles of polyploidization and gene duplication in the expansion of the GeBP family, consistent with previous findings that whole-genome duplication events underpin transcription factor family proliferation in Brassica and other plant lineages [25,55]. Phylogenetic analysis classified all GeBP proteins into four subgroups (A–D), with members from different Brassica species clustering tightly within each subgroup, suggesting a conserved and lineage-shared expansion during species evolution. Conserved domain analysis further revealed that all B. oleracea members retained motifs 2, 3, and 10 across subgroups, indicating strong structural conservation related to DNA binding and transcriptional regulation, likely maintained by purifying selection following duplication [56,57]. Chromosomal mapping showed that BoGeBP genes are unevenly distributed across the B. oleracea genome, with chromosome 4 harboring ten genes, many of which form tandem duplications. Duplication type analysis indicated that 57% of BoGeBP genes originated from whole-genome or segmental duplication (WGD) and 32% from tandem duplication, implying that multiple duplication mechanisms have collectively driven the family expansion. The preferential retention of WGD-derived genes may preserve dosage-sensitive core functions in polyploid genomes, as predicted by the gene balance hypothesis and observed in other plant systems [58]. By contrast, tandem duplications—frequently implicated in forming localized resistance and stress-responsive gene clusters—provide raw material for rapid environmental adaptation [59].
By comparison, GeBP genes in Arabidopsis and B. rapa are mainly derived from dispersed duplication, which is often associated with transposable element-mediated relocation and the acquisition of novel regulatory elements, thereby facilitating functional innovation and tissue-specific expression divergence [60].
In summary, the expansion of the GeBP gene family in B. oleracea reflects both the conserved legacy of polyploidy and the adaptive contributions of tandem duplication. These divergent duplication patterns underscore distinct genome evolutionary strategies between the genera Brassica and Arabidopsis and provide a foundation for leveraging specific GeBP members in breeding programs to enhance stress resilience and agronomic performance in B. oleracea.

4.2. Structural Diversity and Cis-Regulatory Characteristics Reveal the Potential Functional Divergence and Applied Value of BoGeBP Genes

BoGeBP proteins are generally hydrophilic and predominantly localized in the nucleus, consistent with typical features of transcription factors [61]. However, some members, such as BolC01g019630.2J (Group A) and BolC03g069820.2J (Group D), possess not only predicted transmembrane helical regions but also structural domains associated with transporters like MFS or P-ATPase, suggesting their potential dual role in signal sensing and transcriptional regulation. This type of “membrane–nucleus dual function” is rare among known GeBP proteins [10] and holds significant research and application potential.
From a structural perspective, BoGeBP members exhibit notable differences in their secondary structure composition. Proteins in Clade C have the highest α-helix content (averaging 53%), suggesting strong structural stability and possible roles in sustained signal transduction or structural support [62]. Clade B is characterized by a high proportion of random coils (average 57%), indicating greater flexibility and dynamic regulatory potential, possibly contributing to rapid responses to environmental stress [63]. Clade D shows the highest structural variability, hinting at a trend toward functional diversification. Additionally, variations in protein instability index and aliphatic index across groups imply differing physiological roles in heat adaptation and protein stability.
Promoter cis-acting element analysis further supported these functional inferences. Among all identified cis-elements, 46% are associated with light response, 25% with plant hormone responses (such as ABA and MeJA), and 21% with abiotic stresses, indicating that BoGeBP genes are widely involved in crosstalk among light, hormonal, and stress signaling pathways. BolC06g017320.2J, enriched with all three types of regulatory elements and highly expressed across multiple tissues, is presumed to be a core integrator of multiple signals. BolC07g051920.2J is highly expressed in leaf tissue, its promoter contains abundant light-responsive elements, and it interacts with several photosynthesis-related proteins, suggesting a potential role in photosynthetic regulation and photomorphogenesis.
By integrating structural features, expression patterns, and functional annotations, this study identifies several key candidate genes with potential biological significance. BolC05g060710.2J is a well-characterized cysteine synthase gene involved in sulfur metabolism and amino acid biosynthesis. It is significantly upregulated under drought stress [64] and may play a central role in antioxidant responses and the synthesis of sulfur-containing secondary metabolites (e.g., glucosinolates) [65], making it a promising candidate for stress resistance regulation. BolC01g019630.2J is a large MFS-type transporter potentially associated with energy metabolism, cell wall remodeling, and signal transduction. It is well suited for functional validation under pathogen stress or salt/drought conditions (Table 5) [66]. BolC03g069820.2J is a protein enriched in β-sheet structures and has a high aliphatic index, suggesting strong thermostability and membrane localization. It is speculated to play a unique role in responses to heat or heavy metal stress.
In summary, the BoGeBP gene family exhibits a unique combination of structural conservation and diversity, and demonstrates high potential for functional differentiation in terms of cis-regulation, subcellular localization, and biological function. These insights lay the groundwork for exploring their roles in light response, hormone signaling, and stress adaptation and identify key genes that may serve as valuable genetic resources for the breeding of stress-tolerant, nutrient-rich, or pest-resistant cabbage cultivars.

4.3. Future Perspectives

By integrating synteny and gene duplication pattern analyses, this study reveals that the expansion strategies of the GeBP gene family differ significantly among Brassica species. In future research, the following approaches are recommended to further elucidate the biological roles and regulatory mechanisms of GeBP genes.
CRISPR/Cas9-mediated functional validation: this genome-editing approach enables precise dissection of the roles of key GeBP genes in growth, development, and stress responses, providing direct causal evidence for gene function. Its advantages include high specificity and the ability to generate targeted knockouts or allelic variants; however, challenges may arise from genetic redundancy within the family, potential off-target effects, and genotype-dependent transformation efficiency in B. oleracea [67].
Integration of protein interactomics and metabolomics: this combined strategy can reveal the protein partners and downstream metabolic pathways of GeBP proteins, particularly those with predicted dual functions in signaling and transcriptional regulation. The main advantage lies in capturing the multi-layered regulatory network; however, the challenge is that data integration is computationally intensive, and protein detection may be hindered by low abundance or transient interactions [68].
Spatiotemporal transcriptome analysis under environmental gradients: this approach provides high-resolution insight into the dynamic expression patterns of GeBP genes across tissues, developmental stages, and diverse abiotic stress or hormonal conditions. It offers a comprehensive view of regulatory plasticity, but the large-scale datasets generated require robust statistical models, and distinguishing primary from secondary stress responses remains a methodological hurdle [69].
Collectively, these strategies will systematically clarify the synergistic roles of GeBP members in light, hormone, and stress signaling networks, while also highlighting their functional diversity. The anticipated advantages of these methods can accelerate functional characterization and breeding application, although the outlined challenges should be addressed to ensure reproducibility and translational impact. Ultimately, this integrated framework will provide valuable insights and genetic resources for molecular design breeding of cabbage and other Brassica vegetables.

4.4. Limitations and Significance of This Study

This study systematically analyzed the evolutionary characteristics, structural features, and expression patterns of the GeBP transcription factor family in B. oleracea, identifying several key candidate genes with potential roles in stress resistance and regulatory functions. However, there are some limitations to this study. Firstly, all conclusions were drawn based on public databases and bioinformatic analyses, lacking experimental validation such as confirmation of transcription factor binding sites or functional assays through gene knockout/overexpression. Secondly, the expression data were primarily derived from reference genomes and a limited number of samples, resulting in constraints regarding environmental and varietal diversity. Additionally, the accuracy of protein interaction predictions, functional annotations, and GO/KEGG enrichment analyses is influenced by algorithmic limitations and the completeness of current database annotations.
Despite these constraints, the results reveal that BoGeBP genes exhibit both structural conservation and regulatory divergence, suggesting their roles in coordinating light, hormone, and abiotic stress responses. From a horticultural perspective, the identification of lineage-specific expansion patterns and stress-responsive cis-regulatory modules offers practical avenues for crop improvement. WGD-derived genes may function as stable regulators of core metabolic processes, while tandem duplications enriched in stress-related elements could mediate rapid responses to environmental stimuli. Notably, key candidates such as BolC05g060710.2J, BolC01g019630.2J, and BolC06g017320.2J hold promise for molecular breeding through marker-assisted selection or gene editing, especially in the development of cabbage cultivars with enhanced resilience, nutritional quality, and environmental adaptability. This research therefore contributes valuable knowledge for advancing sustainable and resilient horticultural production systems and highlights the importance of integrating bioinformatics with applied breeding strategies in Brassica crops.

5. Conclusions

This study provides new insights into the evolutionary dynamics and regulatory complexity of the GeBP transcription factor family in B. oleracea. By resolving lineage-specific expansion patterns and uncovering structural divergence—particularly the unexpected identification of members (e.g., BolC01g019630.2J) that potentially bridge membrane perception and nuclear transcription—the work refines our understanding of how transcription factor families diversify and integrate environmental signals in polyploid crop genomes. The characterization of promoter architectures that coordinate light, hormone, and stress responsiveness further illuminates the multilayered regulatory roles of GeBP proteins and suggests how signal convergence is achieved at the cis-regulatory level.
Beyond descriptive cataloguing, the study advances the field by prioritizing concrete candidate genes (BolC05g060710.2J, BolC01g019630.2J, and BolC06g017320.2J) with putative central roles in metabolism, signaling, and stress integration, thus providing a rational basis for downstream functional genomics and metabolic engineering. The integrative framework—combining synteny, structural motif conservation, expression specificity, and cis-element profiling—serves as a transferable template for dissecting other transcription factor families in complex plant genomes.
Ultimately, these findings contribute both to basic evolutionary biology (by clarifying duplication-driven innovation and conservation in a recently diversified gene family) and to applied crop improvement, offering prioritized molecular targets and mechanistic hypotheses to accelerate the breeding of cabbage and related Brassica vegetables with enhanced stress resilience and trait optimization.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/horticulturae11080968/s1. Figure S1. Conserved motifs and domains of the GeBP gene family. Table S1. A total of 133 GO terms in BoGeBP family.

Author Contributions

Conceptualization, Z.Z. and K.J.; methodology, Z.Z.; software, Z.Z. and K.J.; validation, Z.Z. and K.J.; formal analysis, Z.Z.; investigation, Z.Z. and K.J.; resources, Z.Z. and K.J.; data curation, Z.Z.; writing—original draft preparation, Z.Z.; writing—review and editing, Z.Z. and K.J.; visualization, Z.Z. and K.J.; supervision, Z.W.; project administration, Z.W.; funding acquisition, Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (32000405 to Z.W.) and the Tangshan Science and Technology Program Project (21130228C to Z.W.).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yoon, J.; Cho, L.-H.; Yang, W.; Pasriga, R.; Wu, Y.; Hong, W.-J.; Bureau, C.; Wi, S.J.; Zhang, T.; Wang, R.; et al. Homeobox transcription factor OsZHD2 promotes root meristem activity in rice by inducing ethylene biosynthesis. J. Exp. Bot. 2020, 71, 5348–5364.
  2. Hu, Y.; Han, X.; Yang, M.; Zhang, M.; Pan, J.; Yu, D. The Transcription Factor INDUCER Of CBF EXPRESSION1 Interacts with ABSCISIC ACID INSENSITIVE5 and DELLA Proteins to Fine-Tune Abscisic Acid Signaling during Seed Germination in Arabidopsis. Plant Cell 2019, 31, 1520–1538.
  3. Liu, C.; Ou, S.; Mao, B.; Tang, J.; Wang, W.; Wang, H.; Cao, S.; Schläppi, M.R.; Zhao, B.; Xiao, G.; et al. Early selection of bZIP73 facilitated adaptation of japonica rice to cold climates. Nat. Commun. 2018, 9, 3302.
  4. Kim, T.W.; Wang, Z.Y. Brassinosteroid signal transduction from receptor kinases to transcription factors. Annu. Rev. Plant Biol. 2010, 61, 681–704.
  5. Curaba, J.; Herzog, M.; Vachon, G. GeBP, the first member of a new gene family in Arabidopsis, encodes a nuclear protein with DNA-binding activity and is regulated by KNAT1. Plant J. 2003, 33, 305–317.
  6. Sato, S.; Tabata, S.; Hirakawa, H.; Asamizu, E.; Shirasawa, K.; Isobe, S.; Kaneko, T.; Nakamura, Y.; Shibata, D.; Aoki, K.; et al. The tomato genome sequence provides insights into fleshy fruit evolution. Nature 2012, 485, 635–641.
  7. Zhang, H.; Liu, Z.; Luo, R.; Sun, Y.; Yang, C.; Li, X.; Gao, A.; Pu, J. Genome-Wide Characterization, Identification and Expression Profile of MYB Transcription Factor Gene Family during Abiotic and Biotic Stresses in Mango (Mangifera indica). Plants 2022, 11, 3141.
  8. Liu, S.; Liu, Y.; Liu, C.; Zhang, F.; Wei, J.; Li, B. Genome-wide characterization and expression analysis of GeBP family genes in soybean. Plants 2022, 11, 1848.
  9. Li, L.; Shi, Q.; Li, Z.; Gao, J. Genome-wide identification and functional characterization of the PheE2F/DP gene family in Moso bamboo. BMC Plant Biol. 2021, 21, 158.
  10. Chevalier, F.; Perazza, D.; Laporte, F.; Le Hénanff, G.; Hornitschek, P.; Bonneville, J.-M.; Herzog, M.; Vachon, G. GeBP and GeBP-like proteins are noncanonical leucine-zipper transcription factors that regulate cytokinin response in Arabidopsis. Plant Physiol. 2008, 146, 1142–1154.
  11. Yanai, O.; Shani, E.; Dolezal, K.; Tarkowski, P.; Sablowski, R.; Sandberg, G.; Samach, A.; Ori, N. Arabidopsis KNOXI Proteins Activate Cytokinin Biosynthesis. Curr. Biol. 2005, 15, 1566–1571.
  12. Hülskamp, M.; Miséra, S.; Jürgens, G. Genetic dissection of trichome cell development in Arabidopsis. Cell 1994, 76, 555–566.
  13. Johnson, H.B. Plant pubescence: An ecological perspective. Bot. Rev. 1975, 41, 233–258.
  14. Wang, X.; Shen, C.; Meng, P.; Tan, G.; Lv, L. Analysis and review of trichomes in plants. BMC Plant Biol. 2021, 21, 70.
  15. Hauser, M.-T.; Harr, B.; Schlötterer, C. Trichome Distribution in Arabidopsis thaliana and its Close Relative Arabidopsis lyrata: Molecular Analysis of the Candidate Gene GLABROUS1. Mol. Biol. Evol. 2001, 18, 1754–1763.
  16. Perazza, D.; Vachon, G.; Herzog, M. Gibberellins Promote Trichome Formation by Up-Regulating GLABROUS1 in Arabidopsis. Plant Physiol. 1998, 117, 375–383.
  17. Khare, D.; Mitsuda, N.; Lee, S.; Song, W.Y.; Hwang, D.; Ohme-Takagi, M.; Martinoia, E.; Lee, Y.; Hwang, J.U. Root avoidance of toxic metals requires the GeBP-LIKE 4 transcription factor in Arabidopsis thaliana. New Phytol. 2017, 213, 1257–1273.
  18. Perazza, D.; Laporte, F.; Balagué, C.; Chevalier, F.; Remo, S.; Bourge, M.; Larkin, J.; Herzog, M.; Vachon, G. GeBP/GPL transcription factors regulate a subset of CPR5-dependent processes. Plant Physiol. 2011, 157, 1232–1242.
  19. Li, X.; Wang, Y.; Cai, C.; Ji, J.; Han, F.; Zhang, L.; Chen, S.; Zhang, L.; Yang, Y.; Tang, Q.; et al. Large-scale gene expression alterations introduced by structural variation drive morphotype diversification in Brassica oleracea. Nat. Genet. 2024, 56, 517–529.
  20. Kim, H.A.; Lim, C.J.; Kim, S.; Choe, J.K.; Jo, S.H.; Baek, N.; Kwon, S.Y. High-throughput sequencing and de novo assembly of Brassica oleracea var. Capitata L. for transcriptome analysis. PLoS ONE 2014, 9, e92087.
  21. Liu, S.; Liu, Y.; Yang, X.; Tong, C.; Edwards, D.; Parkin, I.A.; Zhao, M.; Ma, J.; Yu, J.; Huang, S.; et al. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes. Nat. Commun. 2014, 5, 3930.
  22. Lv, H.; Wang, Y.; Han, F.; Ji, J.; Fang, Z.; Zhuang, M.; Li, Z.; Zhang, Y.; Yang, L. A high-quality reference genome for cabbage obtained with SMRT reveals novel genomic features and evolutionary characteristics. Sci. Rep. 2020, 10, 12394.
  23. Guo, N.; Wang, S.; Wang, T.; Duan, M.; Zong, M.; Miao, L.; Han, S.; Wang, G.; Liu, X.; Zhang, D. A graph-based pan-genome of Brassica oleracea provides new insights into its domestication and morphotype diversification. Plant Commun. 2024, 5, 100791.
  24. Cartea, M.E.; Lema, M.; Francisco, M.; Velasco, P. Basic information on vegetable Brassica crops. In Genetics, Genomics and Breeding of Vegetable Brassicas; CRC Press: Boca Raton, FL, USA, 2011; pp. 1–33.
  25. Chalhoub, B.; Denoeud, F.; Liu, S.; Parkin, I.A.; Tang, H.; Wang, X.; Chiquet, J.; Belcram, H.; Tong, C.; Samans, B.; et al. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science 2014, 345, 950–953.
  26. Ferjani, A.; Tsukagoshi, H.; Vassileva, V. Model organisms in plant science: Arabidopsis thaliana. Front. Plant Sci. 2023, 14, 1279230.
  27. Swarbreck, D.; Wilks, C.; Lamesch, P.; Berardini, T.Z.; Garcia-Hernandez, M.; Foerster, H.; Li, D.; Meyer, T.; Muller, R.; Ploetz, L.; et al. The Arabidopsis Information Resource (TAIR): Gene structure and function annotation. Nucleic Acids Res. 2008, 36, D1009–D1014.
  28. Jin, J.; Tian, F.; Yang, D.C.; Meng, Y.Q.; Kong, L.; Luo, J.; Gao, G. PlantTFDB 4.0: Toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res. 2017, 45, D1040–D1045.
  29. Letunic, I.; Khedkar, S.; Bork, P. SMART: Recent updates, new developments and status in 2020. Nucleic Acids Res. 2021, 49, D458–D460.
  30. Blum, M.; Andreeva, A.; Florentino, L.C.; Chuguransky, S.R.; Grego, T.; Hobbs, E.; Pinto, B.L.; Orr, A.; Paysan-Lafosse, T.; Ponamareva, I.; et al. InterPro: The protein sequence classification resource in 2025. Nucleic Acids Res. 2025, 53, D444–D456.
  31. Marchler-Bauer, A.; Lu, S.; Anderson, J.B.; Chitsaz, F.; Derbyshire, M.K.; DeWeese-Scott, C.; Fong, J.H.; Geer, L.Y.; Geer, R.C.; Gonzales, N.R.; et al. CDD: A Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res. 2011, 39, D225–D229.
  32. Edgar, R.C. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32, 1792–1797.
  33. Bailey, T.L.; Johnson, J.; Grant, C.E.; Noble, W.S. The MEME suite. Nucleic Acids Res. 2015, 43, W39–W49.
  34. Chen, H.; Song, X.; Shang, Q.; Feng, S.; Ge, W. CFVisual: An interactive desktop platform for drawing gene structure and protein architecture. BMC Bioinform. 2022, 23, 178.
  35. Chen, C.; Chen, H.; Zhang, Y.; Thomas, H.R.; Frank, M.H.; He, Y.; Xia, R. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol. Plant 2020, 13, 1194–1202.
  36. Chou, K.C.; Shen, H.B. Plant-mPLoc: A top-down strategy to augment the power for predicting plant protein subcellular localization. PLoS ONE 2010, 5, e11335.
  37. Chen, Y.; Yu, P.; Luo, J.; Jiang, Y. Secreted protein prediction system combining CJ-SPHMM, TMHMM, and PSORT. Mamm. Genome 2003, 14, 859–865.
  38. Almagro Armenteros, J.J.; Tsirigos, K.D.; Sønderby, C.K.; Petersen, T.N.; Winther, O.; Brunak, S.; von Heijne, G.; Nielsen, H. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 2019, 37, 420–423.
  39. Voorrips, R.E. MapChart: Software for the graphical presentation of linkage maps and QTLs. J. Hered. 2002, 93, 77–78.
  40. Geourjon, C.; Deléage, G. SOPMA: Significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments. Comput. Appl. Biosci. 1995, 11, 681–684.
  41. Waterhouse, A.; Bertoni, M.; Bienert, S.; Studer, G.; Tauriello, G.; Gumienny, R.; Heer, F.T.; de Beer, T.A.P.; Rempfer, C.; Bordoli, L.; et al. SWISS-MODEL: Homology modelling of protein structures and complexes. Nucleic Acids Res. 2018, 46, W296–W303.
  42. Humphrey, W.; Dalke, A.; Schulten, K. VMD: Visual molecular dynamics. J. Mol. Graph. 1996, 14, 33–38.
  43. Lescot, M.; Déhais, P.; Thijs, G.; Marchal, K.; Moreau, Y.; Van de Peer, Y.; Rouzé, P.; Rombauts, S. PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res. 2002, 30, 325–327.
  44. Szklarczyk, D.; Nastou, K.; Koutrouli, M.; Kirsch, R.; Mehryary, F.; Hachilif, R.; Hu, D.; Peluso, M.E.; Huang, Q.; Fang, T.; et al. The STRING database in 2025: Protein networks with directionality of regulation. Nucleic Acids Res. 2025, 53, D730–D737.
  45. Cantalapiedra, C.P.; Hernández-Plaza, A.; Letunic, I.; Bork, P.; Huerta-Cepas, J. eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale. Mol. Biol. Evol. 2021, 38, 5825–5829.
  46. Aleksander, S.A.; Balhoff, J.; Carbon, S.; Cherry, J.M.; Drabkin, H.J.; Ebert, D.; Feuermann, M.; Gaudet, P.; Harris, N.L.; Hill, D.P.; et al. The Gene Ontology knowledgebase in 2023. Genetics 2023, 224, iyad031.
  47. Kanehisa, M.; Furumichi, M.; Sato, Y.; Matsuura, Y.; Ishiguro-Watanabe, M. KEGG: Biological systems database as a model of the real world. Nucleic Acids Res. 2025, 53, D672–D677.
  48. Feng, J.-W. Codon Pattern of Papillomavirus (Type I) from Bos Grunniens Based on the CodonW Software. Eng. Technol. Res. 2017.
  49. Chen, H.; Ji, K.; Bai, Y.; Li, Y.; Liu, Y.; Liu, F.; Cui, Y.; Ge, W.; Wang, Z. The calmodulin-binding transcriptional activator transcription factor family in foxtail millet (Setaria italica L.): Molecular characterization, codon bias, and evolutionary trajectory. Plant Gene 2025, 43, 100522.
  50. Sharma, S.; Ciufo, S.; Starchenko, E.; Darji, D.; Chlumsky, L.; Karsch-Mizrachi, I.; Schoch, C.L. The NCBI biocollections database. Database 2018, 2018, bay006.
  51. Wang, Y.; Tang, H.; DeBarry, J.D.; Tan, X.; Li, J.; Wang, X.; Lee, T.-H.; Jin, H.; Marler, B.; Guo, H. MCScanX: A toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012, 40, e49. [Google Scholar] [CrossRef]
  52. Mun, J.H.; Kwon, S.J.; Yang, T.J.; Seol, Y.J.; Jin, M.; Kim, J.A.; Lim, M.H.; Kim, J.S.; Baek, S.; Choi, B.S.; et al. Genome-wide comparative analysis of the Brassica rapa gene space reveals genome shrinkage and differential loss of duplicated genes after whole genome triplication. Genome Biol. 2009, 10, R111. [Google Scholar] [CrossRef] [PubMed]
  53. De Bodt, S.; Maere, S.; Van de Peer, Y. Genome duplication and the origin of angiosperms. Trends Ecol. Evol. 2005, 20, 591–597. [Google Scholar] [CrossRef]
  54. Cheng, F.; Wu, J.; Wang, X. Genome triplication drove the diversification of Brassica plants. Hortic. Res. 2014, 1, 14024. [Google Scholar] [CrossRef] [PubMed]
  55. Van de Peer, Y.; Mizrachi, E.; Marchal, K. The evolutionary significance of polyploidy. Nat. Rev. Genet. 2017, 18, 411–424. [Google Scholar] [CrossRef] [PubMed]
  56. Flagel, L.E.; Wendel, J.F. Gene duplication and evolutionary novelty in plants. New Phytol. 2009, 183, 557–564. [Google Scholar] [CrossRef]
  57. Lynch, M.; Conery, J.S. The evolutionary fate and consequences of duplicate genes. Science 2000, 290, 1151–1155. [Google Scholar] [CrossRef]
  58. Freeling, M. Bias in plant gene content following different sorts of duplication: Tandem, whole-genome, segmental, or by transposition. Annu. Rev. Plant Biol. 2009, 60, 433–453. [Google Scholar] [CrossRef]
  59. Reams, A.B.; Neidle, E.L. Selection for gene clustering by tandem duplication. Annu. Rev. Microbiol. 2004, 58, 119–142. [Google Scholar] [CrossRef]
  60. Qiao, X.; Li, Q.; Yin, H.; Qi, K.; Li, L.; Wang, R.; Zhang, S.; Paterson, A.H. Gene duplication and evolution in recurring polyploidization–diploidization cycles in plants. Genome Biol. 2019, 20, 38. [Google Scholar] [CrossRef]
  61. Mikenberg, I.; Widera, D.; Kaus, A.; Kaltschmidt, B.; Kaltschmidt, C. Transcription factor NF-κB is transported to the nucleus via cytoplasmic dynein/dynactin motor complex in hippocampal neurons. PLoS ONE 2007, 2, e589. [Google Scholar] [CrossRef]
  62. Anantharaman, V.; Balaji, S.; Aravind, L. The signaling helix: A common functional theme in diverse signaling proteins. Biol. Direct 2006, 1, 25. [Google Scholar] [CrossRef]
  63. Sun, X.; Rikkerink, E.H.; Jones, W.T.; Uversky, V.N. Multifarious roles of intrinsic disorder in proteins illustrate its broad impact on plant biology. Plant Cell 2013, 25, 38–55. [Google Scholar] [CrossRef]
  64. Zagorchev, L.; Seal, C.E.; Kranner, I.; Odjakova, M. A central role for thiols in plant tolerance to abiotic stress. Int. J. Mol. Sci. 2013, 14, 7405–7432. [Google Scholar] [CrossRef]
  65. Sasaki-Sekimoto, Y.; Taki, N.; Obayashi, T.; Aono, M.; Matsumoto, F.; Sakurai, N.; Suzuki, H.; Hirai, M.Y.; Noji, M.; Saito, K. Coordinated activation of metabolic pathways for antioxidants and defence compounds by jasmonates and their roles in stress tolerance in Arabidopsis. Plant J. 2005, 44, 653–668. [Google Scholar] [CrossRef]
  66. Lin, H.-C.; Yu, P.-L.; Chen, L.-H.; Tsai, H.-C.; Chung, K.-R. A major facilitator superfamily transporter regulated by the stress-responsive transcription factor Yap1 is required for resistance to fungicides, xenobiotics, and oxidants and full virulence in Alternaria alternata. Front. Microbiol. 2018, 9, 2229. [Google Scholar] [CrossRef]
  67. Das, A.; Sharma, N.; Prasad, M. CRISPR/Cas9: A novel weapon in the arsenal to combat plant diseases. Front. Plant Sci. 2019, 9, 2008. [Google Scholar] [CrossRef] [PubMed]
  68. Wani, N.; Raza, K. Integrative approaches to reconstruct regulatory networks from multi-omics data: A review of state-of-the-art methods. Comput. Biol. Chem. 2019, 83, 107120. [Google Scholar] [CrossRef] [PubMed]
  69. Roychowdhury, R.; Das, S.P.; Gupta, A.; Parihar, P.; Chandrasekhar, K.; Sarker, U.; Kumar, A.; Ramrao, D.P.; Sudhakar, C. Multi-omics pipeline and omics-integration approach to decipher plant’s abiotic stress tolerance responses. Genes 2023, 14, 1281. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Phylogenetic tree of the GeBP gene family in B. oleracea, B. rapa, B. napus, and Arabidopsis. Yellow represents Group A, blue represents Group B, green represents Group C, and red represents Group D.
Figure 2. Conserved motifs and domains of the BoGeBP gene family.
Figure 3. Chromosomal distribution of BoGeBP gene family members.
Figure 4. Tertiary structure of BoGeBP family proteins. Blue indicates α-helix, red indicates β-sheet, yellow indicates turn, and green indicates coil.
Figure 5. Cis-acting regulatory elements in the promoter regions of BoGeBP gene family members.
Figure 6. Protein–protein interaction network of BoGeBP family members.
Figure 7. Classification and statistical analysis of GO terms associated with BoGeBP genes.
Figure 8. Venn diagram of optimal codons in GeBP gene family members across studied species.
Figure 9. Expression heatmap of BoGeBP genes across various tissues and in 6-day-old sprouts under blue light treatment.
Figure 10. Syntenic relationships of GeBP genes between B. oleracea and Arabidopsis.
Figure 11. Syntenic relationships of GeBP genes between B. oleracea and B. napus.
Figure 12. Syntenic relationships of GeBP genes between B. oleracea and B. rapa.
Table 1. Number of genes in the GeBP family of the study species.

| Species | Number |
| --- | --- |
| B. oleracea | 28 |
| B. rapa | 20 |
| B. napus | 44 |
| Arabidopsis | 23 |
| Total | 115 |
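Table 2. Gene structure and functional information of the GeBP gene family.

| Group | Gene ID | Gene Length (bp) | Introns | CDS |
| --- | --- | --- | --- | --- |
| A | BolC08g008250.2J | 807 | 0 | 1 |
| A | BolC04g041600.2J | 1107 | 0 | 1 |
| A | BolC07g051920.2J | 1104 | 0 | 1 |
| A | BolC01g019630.2J | 5035 | 3 | 4 |
| A | BolC04g027520.2J | 1107 | 0 | 1 |
| A | BolC01g037440.2J | 9646 | 2 | 3 |
| A | BolC01g001690.2J | 2940 | 5 | 6 |
| A | BolC09g019010.2J | 1143 | 0 | 1 |
| A | BolC08g055380.2J | 898 | 2 | 3 |
| A | BolC05g009330.2J | 1200 | 0 | 1 |
| A | BolC04g047580.2J | 927 | 0 | 1 |
| B | BolC09g005140.2J | 1185 | 0 | 1 |
| B | BolC07g039050.2J | 1236 | 0 | 1 |
| B | BolC01g055250.2J | 1086 | 0 | 1 |
| B | BolC05g060710.2J | 1278 | 0 | 1 |
| C | BolC01g019250.2J | 857 | 1 | 2 |
| C | BolC04g061600.2J | 1750 | 6 | 7 |
| C | BolC06g017320.2J | 1187 | 3 | 4 |
| C | BolC06g018980.2J | 698 | 2 | 3 |
| C | BolC06g018990.2J | 826 | 1 | 2 |
| C | BolC07g015340.2J | 776 | 1 | 2 |
| D | BolC04g042880.2J | 938 | 1 | 2 |
| D | BolC03g069820.2J | 2428 | 3 | 4 |
| D | BolC04g042890.2J | 468 | 0 | 1 |
| D | BolC04g042900.2J | 381 | 0 | 1 |
| D | BolC04g042930.2J | 930 | 0 | 1 |
| D | BolC04g042920.2J | 954 | 0 | 1 |
| D | BolC04g042910.2J | 969 | 0 | 1 |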
Table 3. Physicochemical properties of BoGeBP gene family members.

| Group | Protein ID | Number of Amino Acids | Molecular Weight (Da) | Theoretical pI | Instability Index | Aliphatic Index | Grand Average of Hydropathicity | Predicted Location(s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | BolC08g008250.2J | 268 | 30336.91 | 5.3 | 61.55 | 60.75 | −1.101 | Nucleus |
| A | BolC04g041600.2J | 368 | 40319.46 | 5.95 | 35.4 | 70.22 | −0.722 | Nucleus |
| A | BolC07g051920.2J | 367 | 40774.86 | 5.72 | 55.33 | 60.63 | −0.948 | Nucleus |
| A | BolC01g019630.2J | 881 | 96100.42 | 7.89 | 36.52 | 84.71 | −0.027 | Cell membrane |
| A | BolC04g027520.2J | 368 | 40714.44 | 5.7 | 52.4 | 56.98 | −0.942 | Nucleus |
| A | BolC01g037440.2J | 275 | 31108.63 | 5.51 | 58.51 | 48.58 | −1.203 | Nucleus |
| A | BolC01g001690.2J | 300 | 33355.71 | 7.61 | 54.81 | 79.03 | −0.561 | Nucleus |
| A | BolC09g019010.2J | 380 | 42131.82 | 5.15 | 57.92 | 58.26 | −1.043 | Nucleus |
| A | BolC08g055380.2J | 154 | 17306.1 | 4.6 | 37.66 | 60.71 | −0.794 | Nucleus |
| A | BolC05g009330.2J | 399 | 43515.02 | 6.45 | 46.72 | 56.24 | −0.97 | Nucleus |
| A | BolC04g047580.2J | 308 | 34729.83 | 5.51 | 56.01 | 58.6 | −1.016 | Nucleus |
| B | BolC09g005140.2J | 394 | 42983.25 | 5.07 | 62.83 | 76.47 | −0.515 | Chloroplast, Nucleus |
| B | BolC07g039050.2J | 411 | 44855.04 | 4.87 | 60.69 | 71.58 | −0.581 | Nucleus |
| B | BolC01g055250.2J | 361 | 40035.86 | 5.5 | 66.36 | 72.88 | −0.685 | Chloroplast, Nucleus |
| B | BolC05g060710.2J | 425 | 46780.05 | 4.88 | 63.15 | 69.2 | −0.64 | Nucleus |
| C | BolC01g019250.2J | 225 | 25317.33 | 8.81 | 36.78 | 80.53 | −0.322 | Nucleus |
| C | BolC04g061600.2J | 391 | 44768.53 | 4.45 | 47.17 | 67.31 | −0.978 | Nucleus |
| C | BolC06g017320.2J | 297 | 33781.84 | 5.02 | 32.72 | 73.84 | −0.792 | Nucleus |
| C | BolC06g018980.2J | 130 | 15463.72 | 9.24 | 62.35 | 78.69 | −0.832 | Chloroplast, Golgi apparatus |
| C | BolC06g018990.2J | 247 | 27935.72 | 9.63 | 35.88 | 62.83 | −0.884 | Nucleus |
| C | BolC07g015340.2J | 232 | 26136.54 | 9.32 | 47.01 | 67.33 | −0.869 | Nucleus |
| D | BolC04g042880.2J | 284 | 32271.2 | 6.99 | 48.2 | 52.89 | −1.13 | Nucleus |
| D | BolC03g069820.2J | 613 | 67325.31 | 9.28 | 62.2 | 92.59 | 0.037 | Cell membrane, Nucleus |
| D | BolC04g042890.2J | 155 | 17691.3 | 9.01 | 27.18 | 66.06 | −0.927 | Nucleus |
| D | BolC04g042900.2J | 126 | 14210.77 | 5.09 | 79.05 | 69.68 | −0.835 | Nucleus |
| D | BolC04g042930.2J | 309 | 35205.8 | 8.97 | 53.26 | 56.18 | −0.974 | Nucleus |
| D | BolC04g042920.2J | 317 | 35462.67 | 6.43 | 46.12 | 54.48 | −1.012 | Nucleus |
| D | BolC04g042910.2J | 322 | 36233.76 | 6.97 | 42.9 | 58.48 | −0.924 | Nucleus |
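
The grand average of hydropathicity (GRAVY) column in Table 3 is conventionally the mean Kyte-Doolittle hydropathy value over all residues, with negative values indicating a hydrophilic protein. The following is a minimal Python sketch of that calculation. It assumes the standard Kyte-Doolittle scale; the tool the authors used is not named in this section, and the sequence `toy_peptide` is a hypothetical example, not a BoGeBP protein.

```python
# Standard Kyte-Doolittle hydropathy scale (one value per amino acid).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean hydropathy of a protein sequence; residues outside the
    20 standard amino acids are ignored (an assumption of this sketch)."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)


# Hypothetical, strongly charged peptide used only for illustration.
toy_peptide = "MKRSEEDKKR"
print(f"GRAVY = {gravy(toy_peptide):.3f}")  # negative, i.e. hydrophilic
```

Under this definition, the only BoGeBP protein in Table 3 with a positive GRAVY value is BolC03g069820.2J (0.037), one of the two proteins predicted to localize to the cell membrane.
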
Table 4. Secondary structure composition of BoGeBP family proteins.

| Group | Protein ID | α-Helix (%) | Extended Chain (%) | β-Turn (%) | Random Coil (%) |
| --- | --- | --- | --- | --- | --- |
| A | BolC08g008250.2J | 46.27 | 4.48 | 2.61 | 46.64 |
| A | BolC04g041600.2J | 47.83 | 7.61 | 3.26 | 41.30 |
| A | BolC07g051920.2J | 43.05 | 8.72 | 2.72 | 45.50 |
| A | BolC01g019630.2J | 45.86 | 13.05 | 6.24 | 34.85 |
| A | BolC04g027520.2J | 43.75 | 7.07 | 2.72 | 46.47 |
| A | BolC01g037440.2J | 45.45 | 3.64 | 2.55 | 48.36 |
| A | BolC01g001690.2J | 35.67 | 14.00 | 9.00 | 41.33 |
| A | BolC09g019010.2J | 38.68 | 11.58 | 4.47 | 45.26 |
| A | BolC08g055380.2J | 49.35 | 11.04 | 3.90 | 35.71 |
| A | BolC05g009330.2J | 40.60 | 7.77 | 4.76 | 46.87 |
| A | BolC04g047580.2J | 55.84 | 1.62 | 2.60 | 39.94 |
| B | BolC09g005140.2J | 34.77 | 4.06 | 1.52 | 59.64 |
| B | BolC07g039050.2J | 34.06 | 6.08 | 1.95 | 57.91 |
| B | BolC01g055250.2J | 36.01 | 7.48 | 3.05 | 53.46 |
| B | BolC05g060710.2J | 31.76 | 8.24 | 2.82 | 57.18 |
| C | BolC01g019250.2J | 28.89 | 17.78 | 4.00 | 49.33 |
| C | BolC04g061600.2J | 56.52 | 7.42 | 2.30 | 33.76 |
| C | BolC06g017320.2J | 57.58 | 2.69 | 2.02 | 37.71 |
| C | BolC06g018980.2J | 59.23 | 7.69 | 8.46 | 24.62 |
| C | BolC06g018990.2J | 58.30 | 4.05 | 2.43 | 35.22 |
| C | BolC07g015340.2J | 57.76 | 3.88 | 3.45 | 34.91 |
| D | BolC04g042880.2J | 43.66 | 8.10 | 3.17 | 45.07 |
| D | BolC03g069820.2J | 19.58 | 27.08 | 7.99 | 45.35 |
| D | BolC04g042890.2J | 53.55 | 8.39 | 5.16 | 32.90 |
| D | BolC04g042900.2J | 21.43 | 19.05 | 7.94 | 51.59 |
| D | BolC04g042930.2J | 45.31 | 8.09 | 0.97 | 45.63 |
| D | BolC04g042920.2J | 46.69 | 5.99 | 3.15 | 44.16 |
| D | BolC04g042910.2J | 42.86 | 6.83 | 3.73 | 46.58 |
Table 5. KEGG pathway annotations (K-numbers) of BoGeBP genes.

| KEGG K-number | KEGG Name | Gene |
| --- | --- | --- |
| K13783 | MFS transporter, OPA family, solute carrier family 37 (glycerol-3-phosphate transporter), member 1/2 | BolC01g019630.2J |
| K01738 | cysteine synthase [EC:2.5.1.47] | BolC05g060710.2J |
| K12951 | cobalt/nickel-transporting P-type ATPase D [EC:7.2.2.-] | BolC07g039050.2J |
| K15441 | tRNA-specific adenosine deaminase 2 [EC:3.5.4.-] | BolC07g039050.2J |
Table 6. KEGG pathway annotations of BoGeBP genes.

| KEGG Pathway | Pathway Term | Gene |
| --- | --- | --- |
| map00270 | Cysteine and methionine metabolism | BolC05g060710.2J |
| map00920 | Sulfur metabolism | BolC05g060710.2J |
| map01100 | Metabolic pathways | BolC05g060710.2J |
| map01110 | Biosynthesis of secondary metabolites | BolC05g060710.2J |
| map01120 | Microbial metabolism in diverse environments | BolC05g060710.2J |
| map01200 | Carbon metabolism | BolC05g060710.2J |
| map01230 | Biosynthesis of amino acids | BolC05g060710.2J |
Table 7. Codon usage bias analysis of GeBP gene family members in the studied species.

| Species | Statistic | T3s | C3s | A3s | G3s | CAI | CBI | Fop | Nc | GC3s | GC | L_sym | L_aa | Gravy | Aromo |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| B. oleracea | range | 0.2229~0.5532 | 0.117~0.4247 | 0.1903~0.3762 | 0.2315~0.4583 | 0.166~0.301 | −0.158~0.052 | 0.333~0.47 | 47.84~59.89 | 0.381~0.656 | 0.4~0.557 | 123~842 | 126~881 | −1.20255~0.037357 | 0.043597~0.102157 |
| B. oleracea | average | 0.3960 | 0.2672 | 0.2763 | 0.3662 | 0.2318 | −0.0432 | 0.4078 | 53.6282 | 0.4838 | 0.4642 | 320.2143 | 332.3929 | −0.7924 | 0.0653 |
| B. rapa | range | 0.2646~0.5025 | 0.1719~0.3772 | 0.1828~0.3522 | 0.3~0.4362 | 0.132~0.287 | −0.188~0.053 | 0.328~0.474 | 47.74~57.67 | 0.378~0.615 | 0.412~0.537 | 125~614 | 130~640 | −1.08734~−0.26256 | 0.043127~0.08046 |
| B. rapa | average | 0.3982 | 0.2677 | 0.2661 | 0.3732 | 0.2330 | −0.0450 | 0.407 | 52.289 | 0.4892 | 0.4683 | 354.25 | 367.2 | −0.8411 | 0.0594 |
| B. napus | range | 0.2755~0.5058 | 0.1471~0.3498 | 0.1463~0.3484 | 0.2857~0.5584 | 0.168~0.298 | −0.249~0.067 | 0.283~0.477 | 37.25~58.36 | 0.408~0.606 | 0.421~0.539 | 99~1772 | 105~1830 | −1.0876~−0.1017 | 0.04213~0.10484 |
| B. napus | average | 0.3924 | 0.2686 | 0.2681 | 0.3701 | 0.2284 | −0.0440 | 0.4058 | 52.5648 | 0.4888 | 0.4667 | 561.65 | 581.725 | −0.7279 | 0.0651 |
| Arabidopsis | range | 0.3719~0.5914 | 0.129~0.3222 | 0.2383~0.3981 | 0.2597~0.4158 | 0.189~0.301 | −0.229~0.081 | 0.3~0.474 | 40.95~61 | 0.296~0.483 | 0.361~0.489 | 125~546 | 132~572 | −1.1504~−0.1692 | 0.04444~0.1049 |
| Arabidopsis | average | 0.4749 | 0.1971 | 0.2973 | 0.3505 | 0.2393 | −0.0641 | 0.3993 | 50.2983 | 0.4130 | 0.4231 | 298.5217391 | 310.173913 | −0.7881 | 0.0651 |

Note: T3s, C3s, A3s, and G3s, frequencies of T, C, A, and G at the third position of synonymous codons; CAI, codon adaptation index; CBI, codon bias index; Fop, frequency of optimal codons; Nc (ENc), effective number of codons; GC3s, GC content at the third position of synonymous codons; GC, GC content of the gene; L_sym, number of synonymous codons; L_aa, number of amino acids; Gravy, grand average of hydropathicity of the encoded protein; Aromo, aromaticity of the encoded protein. Ranges are given as minimum~maximum.
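
To make the GC3s column of Table 7 concrete, the following Python sketch computes GC content at the third position of synonymous codons for a single coding sequence. It is a simplified approximation and not CodonW's implementation: it excludes the codons that have no synonymous alternative in the standard genetic code (ATG, TGG) and the three stop codons, and it skips codons containing ambiguous bases. The function name `gc3s` and the example sequence are hypothetical.

```python
# Codons with no synonymous alternative (Met, Trp) plus the stop codons
# of the standard genetic code; these carry no information on synonymous
# third-position choice.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds: str) -> float:
    """Fraction of synonymous codons whose third base is G or C."""
    cds = cds.upper().replace("U", "T")
    usable_length = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable_length, 3)]
    informative = [
        codon for codon in codons
        if codon not in NON_SYNONYMOUS and set(codon) <= set("ACGT")
    ]
    if not informative:
        raise ValueError("no synonymous codons found")
    gc_third = sum(codon[2] in "GC" for codon in informative)
    return gc_third / len(informative)


# Hypothetical mini-CDS: ATG GCC GCT TAA. Only GCC and GCT are counted.
print(gc3s("ATGGCCGCTTAA"))
```

A value below 0.5, as in the species averages of Table 7 (0.4130 to 0.4892), indicates that A or T is slightly favored at synonymous third positions.
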
Table 8. Summary of optimal codons in GeBP gene family members across studied species.

| Intersection | Count | Codons |
| --- | --- | --- |
| Bol & Bra & Bna & Ath | 2 | UUG; UCU |
| Bol & Bna & Ath | 4 | AUU; CAU; AAG; UUU |
| Bol & Bra & Bna | 1 | UGU |
| Bra & Bna & Ath | 1 | CCU |
| Bol & Bna | 1 | GUG |
| Bol & Ath | 1 | GCU |
| Bra & Bna | 1 | CAG |
| Bra & Ath | 2 | AAU; GGU |
| Bna & Ath | 2 | ACU; GAU |
| Bol | 2 | CCA; ACC |
| Bra | 3 | UAC; GAA; AGG |
| Ath | 3 | GUU; UAU; AGA |

Note: Bol, B. oleracea; Bra, B. rapa; Bna, B. napus; Ath, Arabidopsis.
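
Each row of Table 8 is an exclusive region of the Venn diagram in Figure 8: the codons listed are optimal in exactly the named species and in no other. The sketch below rebuilds the full optimal-codon set of each species from those regions and then recomputes a region from the sets. The data are transcribed from Table 8; the variable and function names are illustrative assumptions, and the authors' plotting tool is not reproduced here.

```python
# Exclusive Venn regions transcribed from Table 8.
REGIONS = {
    ("Bol", "Bra", "Bna", "Ath"): {"UUG", "UCU"},
    ("Bol", "Bna", "Ath"): {"AUU", "CAU", "AAG", "UUU"},
    ("Bol", "Bra", "Bna"): {"UGU"},
    ("Bra", "Bna", "Ath"): {"CCU"},
    ("Bol", "Bna"): {"GUG"},
    ("Bol", "Ath"): {"GCU"},
    ("Bra", "Bna"): {"CAG"},
    ("Bra", "Ath"): {"AAU", "GGU"},
    ("Bna", "Ath"): {"ACU", "GAU"},
    ("Bol",): {"CCA", "ACC"},
    ("Bra",): {"UAC", "GAA", "AGG"},
    ("Ath",): {"GUU", "UAU", "AGA"},
}

# Full optimal-codon set per species: union of every region it belongs to.
species_sets = {}
for members, codons in REGIONS.items():
    for species in members:
        species_sets.setdefault(species, set()).update(codons)


def exclusive_region(members, sets):
    """Codons shared by all `members` and absent from every other species."""
    shared = set.intersection(*(sets[s] for s in members))
    others = set().union(*(sets[s] for s in sets if s not in members))
    return shared - others


for species, codons in species_sets.items():
    print(species, len(codons), sorted(codons))
print(sorted(exclusive_region(("Bol", "Bra", "Bna", "Ath"), species_sets)))
```

Read this way, Table 8 implies 11 optimal codons in B. oleracea, 10 in B. rapa, 12 in B. napus, and 15 in Arabidopsis, and B. napus has no optimal codon that is unique to it.
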
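Table 9. Gene duplication types of the GeBP gene family.

| Species | Singleton (Genome) | Singleton (GeBP) | Dispersed (Genome) | Dispersed (GeBP) | Proximal (Genome) | Proximal (GeBP) | Tandem (Genome) | Tandem (GeBP) | WGD or Segmental (Genome) | WGD or Segmental (GeBP) | Total (Genome) | Total (GeBP) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Bol | 1895 | 0 | 22523 | 3 | 3489 | 0 | 3915 | 9 | 26620 | 16 | 58442 | 28 |
| Bra | 2323 | 0 | 10098 | 11 | 866 | 1 | 2713 | 2 | 23612 | 6 | 39612 | 20 |
| Bna | 2716 | 0 | 28759 | 5 | 3361 | 3 | 3268 | 0 | 67338 | 36 | 105442 | 44 |
| Ath | 5124 | 0 | 11328 | 13 | 1091 | 2 | 3238 | 0 | 7715 | 8 | 28496 | 23 |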
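
The counts in Table 9 can be turned into proportions to compare how GeBP genes are distributed across duplication types relative to the whole genome. The following Python sketch does this with values transcribed from Table 9; the category identifiers and the `shares` helper are illustrative names introduced here, not part of the authors' pipeline.

```python
CATEGORIES = ("singleton", "dispersed", "proximal", "tandem", "wgd_or_segmental")

# Gene counts per duplication type, transcribed from Table 9
# (same order as CATEGORIES).
TABLE9 = {
    "Bol": {"genome": (1895, 22523, 3489, 3915, 26620), "gebp": (0, 3, 0, 9, 16)},
    "Bra": {"genome": (2323, 10098, 866, 2713, 23612), "gebp": (0, 11, 1, 2, 6)},
    "Bna": {"genome": (2716, 28759, 3361, 3268, 67338), "gebp": (0, 5, 3, 0, 36)},
    "Ath": {"genome": (5124, 11328, 1091, 3238, 7715), "gebp": (0, 13, 2, 0, 8)},
}


def shares(counts):
    """Convert per-category counts into fractions of their total."""
    total = sum(counts)
    return {category: n / total for category, n in zip(CATEGORIES, counts)}


for species, row in TABLE9.items():
    family = shares(row["gebp"])
    genome = shares(row["genome"])
    print(
        f"{species}: WGD/segmental {family['wgd_or_segmental']:.1%} of GeBP "
        f"vs {genome['wgd_or_segmental']:.1%} genome-wide; "
        f"tandem {family['tandem']:.1%} vs {genome['tandem']:.1%}"
    )
```

For B. oleracea this gives 16 of 28 GeBP genes from WGD or segmental duplication and 9 of 28 from tandem duplication, which matches the statement in the abstract that both mechanisms contributed to the expansion of the family.
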
MDPI and ACS Style

Zhu, Z.; Ji, K.; Wang, Z. Evolutionary Expansion, Structural Diversification, and Functional Prediction of the GeBP Gene Family in Brassica oleracea. Horticulturae 2025, 11, 968. https://doi.org/10.3390/horticulturae11080968