Article

Comparative Genomics and Co-Expression Profiling of MADS-Box Genes Reveal Conserved Candidate Regulators of Secondary Cell Wall Formation in Lignified Endocarp and Seed Coat Across Four Angiosperm Species

1
Key Laboratory for Forest Resources Conservation and Utilization in the Southwest Mountains of China, Ministry of Education, Southwest Forestry University, Kunming 650224, China
2
Yunnan Academy of Forestry and Grassland, Kunming 650201, China
3
Yunnan Provincial Key Laboratory for Conservation and Utilization of In-Forest Resource, Southwest Forestry University, Kunming 650224, China
4
Yunnan Key Laboratory of Crop Wild Relatives Omics, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
*
Authors to whom correspondence should be addressed.
Horticulturae 2026, 12(5), 626; https://doi.org/10.3390/horticulturae12050626 (registering DOI)
Submission received: 30 March 2026 / Revised: 14 May 2026 / Accepted: 15 May 2026 / Published: 19 May 2026

Abstract

Fruit endocarp and seed coat are essential protective structures that influence key agronomic and mechanical traits in species with lignified protective tissues, yet their regulatory mechanisms remain incompletely understood. Here, we conducted a comprehensive genome-wide analysis of the MADS-box gene family in four angiosperm species: Juglans sigillata, Carya illinoinensis, Macadamia integrifolia, and Ricinus communis. A total of 58, 55, 57, and 57 MADS-box genes were identified, respectively, and systematically characterized through phylogenetic, structural, and evolutionary analyses. Comparative results revealed that MIKCc-type genes are highly conserved and primarily expanded via segmental duplication under strong purifying selection. Co-expression network analysis identified MADS-box genes as high-connectivity hub candidates that are strongly associated with genes involved in tissue specification, hormone signaling, and secondary cell wall biosynthesis. Promoter analysis indicated that these genes contain diverse cis-regulatory elements; however, these results are based on sequence prediction and do not demonstrate functional regulatory interactions. Across species, MADS-box genes exhibited analogous temporal expression dynamics during lignified endocarp and seed coat development, consistent with a potentially conserved transcriptional framework. Collectively, this study provides new insights into the evolutionary diversification and putative functions of MADS-box genes, and proposes a putative hierarchical regulatory framework for lignified endocarp and seed coat development. These findings supply valuable candidate target genes for future molecular breeding aimed at improving shell thickness, hardness, and related agronomic traits in woody nut and oilseed species.

1. Introduction

Fruits originate from the ovary of the carpel and are composed of distinct tissue layers collectively termed the pericarp [1]. This structure is typically subdivided into the exocarp, mesocarp, and endocarp, representing the outer, middle, and inner regions, respectively [2]. Among these layers, the endocarp, which directly encloses the seed, exhibits remarkable structural and functional diversity across plant lineages [3]. In drupe-type fruits, the endocarp undergoes pronounced lignification and hardening during maturation, forming a rigid protective layer [1,4,5]. This characteristic is observed in a wide range of economically important woody species, including peach, cherry, almond, olive, walnut, and pecan. In addition to endocarp, lignified seed coat (testa) can also form hard protective tissues in certain species, representing an alternative developmental origin of mechanically resistant structures.
The transition of the endocarp from soft parenchymatous tissue to a rigid lignified structure is primarily driven by coordinated secondary cell wall (SCW) deposition and massive lignin accumulation [6,7]. Although the molecular regulatory networks governing fruit lignification remain understudied, conserved regulatory parallels can be drawn from vascular tissue development [8,9]. During xylem differentiation, plant cells undergo sequential developmental processes including cell expansion, SCW thickening, and programmed cell death [10,11,12]. The mature SCW is enriched in cellulose, hemicellulose, and lignin, collectively conferring mechanical rigidity and structural strength [13,14]. Similarly, lignified endocarps serve as physical barriers, enhancing seed tolerance to biotic and abiotic stresses [15]. The thickness, hardness, and brittleness of lignified shells vary substantially among species and directly determine their economic and industrial utilization value [16]. Notably, lignification has been extensively characterized in woody vascular tissues; however, whether conserved transcriptional regulatory modules are shared between lignified fruit endocarps and lignified seed coats remains largely unresolved.
Previous studies on endocarp development have mainly focused on individual enzymes and biochemical pathways associated with lignin biosynthesis [17,18,19,20]. Accumulated physiological evidence indicates that lignin deposition in lignified fruit tissues can reach or even exceed the levels observed in typical woody stems, implying a highly specialized and intensified SCW biosynthesis program in reproductive protective tissues [21]. The initiation of lignification tightly coincides with the onset of endocarp hardening, followed by coordinated modulation of cell proliferation, cell expansion, and sclerenchyma differentiation across subsequent developmental stages [22]. The specification and patterning of pericarp tissues are governed by complex transcriptional regulatory networks. In Arabidopsis, MADS-box transcription factors act as core regulators controlling reproductive organ identity and tissue differentiation [23,24]. AGAMOUS (AG) determines carpel identity; SEEDSTICK (STK) is the central regulator of ovule identity, while SHATTERPROOF1 (SHP1) and SHP2 primarily regulate valve margin formation and fruit dehiscence, and cooperate with STK to modulate integument and seed coat development [25]. Mutation or disruption of these genes causes abnormal ovule morphogenesis and impaired seed development [26,27]. In woody nut species, MADS-box homologs have been functionally implicated in endocarp differentiation and shell formation, as exemplified by the SHELL locus in oil palm and STK-like homologs in highly lignified fruit species [2,28]. SHP1/SHP2 homologs modulate the formation of lignified tissues associated with fruit dehiscence, whereas FRUITFULL (FUL) restricts their spatial expression domain to ensure normal fruit tissue patterning [29,30]. Conserved regulatory modules involving these homologs also exist in fleshy fruit species [31]. 
In peach, SHP and STK homologs exhibit early expression in developing endocarp, followed by expression decline prior to massive lignin accumulation, while FUL homologs are predominantly expressed in non-lignified pericarp regions [2,32]. Subsequently, SCW regulatory genes such as NST1-like factors are strongly activated, indicating a sequential developmental transition from tissue identity specification to structural reinforcement [33]. In oil palm, mutations in the STK homolog SHELL alter endocarp formation and significantly affect shell thickness, further supporting the putative conserved roles of MADS-box members in fruit structural development [34].
The MADS-box gene family constitutes a central regulatory hub in plant reproductive growth and development [23]. Members are classified into Type I and Type II subfamilies, among which Type II MIKCc genes play predominant roles in floral organogenesis and fruit morphogenesis [35,36,37,38]. Mechanistically, MADS-box proteins form higher-order complexes that bind CArG-box cis-elements, enabling precise spatiotemporal modulation of downstream gene expression programs [39,40]. Functionally, MADS-box genes play important roles in fruit development by regulating tissue differentiation, secondary cell wall formation, and developmental patterning through modulation of cell proliferation and cell wall remodeling [41,42,43,44,45]. As E-class MADS-box members, SEPALLATA (SEP) genes act as essential co-factors by interacting with other MADS-box proteins to maintain reproductive organ identity and coordinate fruit developmental progression [46]. Functional investigations in tomato and strawberry further support the involvement of SEP homologs in fruit development and ripening [47,48,49]. Current evidence from Arabidopsis and model fruit systems suggests that MADS-box genes may function primarily as upstream modulators of SCW transcriptional networks, rather than direct regulators of individual lignin biosynthetic enzyme genes [29,50,51,52]. This plausible regulatory framework remains to be systematically examined in woody angiosperms with lignified endocarp or seed coat structures. Notably, accumulating evidence indicates that several MADS-box genes not only regulate ovule and seed development but are also closely associated with key agronomic and developmental traits, including inflorescence architecture, seed size, and seed coat formation [53,54,55]. 
These functional links suggest that MADS-box genes represent promising molecular markers for trait-associated variation and may facilitate marker-assisted selection and genetic improvement of fruit and seed-related characteristics in crop species.
Despite advances in understanding lignification in woody tissues, it remains unclear whether conserved MADS-box-mediated regulatory mechanisms govern lignified endocarp formation versus lignified seed coat formation across angiosperms. This study addressed two core hypotheses: (i) conserved MADS-box members are recruited to regulate lignified protective tissue formation across phylogenetically distant species; (ii) divergent regulatory profiles distinguish endocarp-based lignification in nut species from seed-coat lignification in non-nut woody species. To test these hypotheses, four species covering three plant families with distinct lignified protective tissue origins were deliberately selected: (1) iron walnut (Juglans sigillata, Juglandaceae) and pecan (Carya illinoinensis, Juglandaceae): typical nut species with lignified endocarp forming the hard shell; (2) Macadamia (Macadamia integrifolia, Proteaceae): a woody species in which the hard protective shell is derived from lignified seed coat, rather than endocarp; (3) castor (Ricinus communis, Euphorbiaceae): an oilseed species with a well-developed lignified seed coat, included as a complementary system to facilitate comparison between seed coat-based and endocarp-based lignified tissues. Through integrative phylogenomic, structural, evolutionary, transcriptomic, and regulatory network analyses, we systematically characterized the MADS-box family evolution and functional divergence. This comparative framework allows the identification of conserved and lineage-specific regulatory modules associated with lignified endocarp and seed coat formation, and provides valuable candidate genes for molecular genetic improvement of shell-related agronomic traits in woody nut and oilseed crops.

2. Materials and Methods

2.1. Identification and Classification of MADS-Box Genes

Protein-coding sequences from four species, including Juglans sigillata (iron walnut), Carya illinoinensis (pecan), Macadamia integrifolia (macadamia), and Ricinus communis (castor), were retrieved from publicly available genome databases. Specifically, genome assemblies for iron walnut (accession: GWHDEDA00000000), macadamia (accession: GWHBAUK00000000), and castor (accession: GWHJENL00000000.1), were obtained from the NGDC Genome Warehouse, while pecan genomic resources were obtained from NCBI (BioProject: PRJNA435846).
To comprehensively identify MADS-box family members, a combined homology-based and domain-based strategy was employed. Hidden Markov Model (HMM) profiles for the conserved MADS domain (PF00319) and K-box domain (PF01486) were downloaded from the Pfam database and used as queries to search protein datasets using HMMER v3.3.2 [56]. In parallel, local BLASTP searches were performed using known MADS-box protein sequences from Arabidopsis thaliana as queries, with an E-value threshold of 1e−10. Candidate sequences identified from both approaches were merged, and redundant entries were removed. All candidates were further validated for the presence of a complete MADS domain using conserved domain databases. Proteins lacking an intact MADS domain were discarded. To avoid redundancy and annotation bias, only the longest transcript isoform for each gene was retained for subsequent analyses. Physicochemical properties, including molecular weight (MW) and theoretical isoelectric point (pI), were calculated using the ExPASy ProtParam tool v3.0 (https://web.expasy.org/ accessed on 20 January 2026) [57]. Subcellular localization was predicted using Plant-mPLoc v2.0 (http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/ accessed on 20 January 2026). Multiple sequence alignment was conducted using MUSCLE v3.8 (https://www.ebi.ac.uk/jdispatcher/msa/muscle accessed on 25 January 2026) with default parameters [58,59]. Ambiguous or poorly aligned regions were automatically trimmed to improve phylogenetic reliability. Phylogenetic trees were constructed using IQ-TREE (2.4.0) under the Maximum Likelihood (ML) framework, with the best-fit evolutionary model automatically selected by the ModelFinder Plus (MFP) module [60]. Node support was evaluated using 1000 ultrafast bootstrap replicates. The resulting phylogenetic trees were initially visualized in MEGA10 and further refined, annotated, and formatted for publication using iTOL v7 (https://itol.embl.de/ accessed on 15 February 2026) [61].
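The merging and isoform-selection steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the protein-to-gene mapping, and the toy identifiers are hypothetical, and the domain-validation step (checking for a complete MADS domain) is assumed to have been applied separately.

```python
from typing import Dict, Iterable, Set


def merge_candidates(hmm_hits: Iterable[str], blast_hits: Iterable[str]) -> Set[str]:
    """Union the protein IDs from both searches; a set removes redundant entries."""
    return set(hmm_hits) | set(blast_hits)


def longest_isoform_per_gene(
    candidates: Iterable[str],
    protein_to_gene: Dict[str, str],
    protein_length: Dict[str, int],
) -> Dict[str, str]:
    """Return a mapping gene -> protein ID of its longest candidate isoform.

    Ties are broken by protein ID order so the result is deterministic
    (an assumption of this sketch; the source does not describe tie handling).
    """
    best: Dict[str, str] = {}
    for pid in sorted(candidates):
        gene = protein_to_gene[pid]
        if gene not in best or protein_length[pid] > protein_length[best[gene]]:
            best[gene] = pid
    return best


# Hypothetical toy data, not taken from the study.
hmm_hits = ["g1.t1", "g1.t2", "g2.t1"]
blast_hits = ["g1.t2", "g3.t1"]
protein_to_gene = {"g1.t1": "g1", "g1.t2": "g1", "g2.t1": "g2", "g3.t1": "g3"}
protein_length = {"g1.t1": 200, "g1.t2": 250, "g2.t1": 180, "g3.t1": 90}

candidates = merge_candidates(hmm_hits, blast_hits)
representatives = longest_isoform_per_gene(candidates, protein_to_gene, protein_length)
print(representatives)
```

Taking the union rather than the intersection of the two searches favors sensitivity; specificity then rests on the subsequent domain validation.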

2.2. Conserved Domain and Gene Structure Analyses

Conserved domains of the identified MADS-box proteins from all four species were predicted using the SMART program v10 (http://smart.embl-heidelberg.de/ accessed on 14 February 2026) and InterProScan v5 (https://www.ebi.ac.uk/interpro/search/sequence/ accessed on 14 February 2026) [59,62]. The characteristic M, I/I-like, K, and C domains of MADS-box proteins were further identified and manually annotated according to previously established domain classification criteria [63,64,65,66,67]. WebLogo v3 (https://weblogo.threeplusone.com accessed on 26 February 2026) was employed to visualize consensus sequences and amino acid residue frequency distributions based on multiple sequence alignment results of all four species [68]. Positional conservation was quantified using bit scores, with higher bit values indicating a higher degree of amino acid conservation at the corresponding site.
Gene structures, including exon–intron organization, were visualized using the Gene Structure Display Server (GSDS v2.0) (http://gsds.cbi.pku.edu.cn/ accessed on 22 February 2026). Protein secondary structural features, including solvent accessibility and structural disorder regions, were predicted using NetSurfP-3.0 (https://dtu.biolib.com/NetSurfP-3/ accessed on 23 February 2026) [69]. Three-dimensional (3D) protein structures were predicted using AlphaFold Server (https://deepmind.google/science/alphafold/alphafold-server/ accessed on 14 May 2026). Predicted models with a pLDDT confidence score ≥ 70 were considered high-confidence and used for subsequent structural analysis. The predicted structures were subsequently visualized and structurally aligned using PyMOL 3.0 software (https://pymol.org/#download accessed on 26 February 2026) [70].

2.3. Chromosomal Distribution and Evolutionary Analysis

Chromosomal locations of all identified MADS-box genes in the four studied species were retrieved from genome annotation files and visualized using TBtools v2.467 [71]. Gene duplication events, including tandem and segmental duplications, were identified using MCScanX v1.0 [72]. To investigate both intra- and interspecies evolutionary relationships, collinearity analyses were performed at the whole-genome level as well as specifically for MADS-box genes. Syntenic relationships among duplicated gene pairs within each species and orthologous gene pairs across the four species were identified and visualized using jcvi [73]. Further analyses of syntenic blocks, synonymous substitution rates (Ks), nonsynonymous/synonymous substitution ratios (Ka/Ks), and interspecies divergence were conducted using WGDI (Whole-Genome Duplication Integrated analysis). Gene pairs with Ks values greater than 2 were excluded to reduce the influence of substitution saturation [74,75].
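The Ks cutoff and the interpretation of Ka/Ks can be illustrated with a short sketch. It assumes Ka and Ks have already been estimated for each duplicated pair; the `GenePair` type, the function names, and the example values are hypothetical, and the additional requirement that Ks be greater than zero is an assumption added here to avoid division by zero.

```python
from typing import List, NamedTuple


class GenePair(NamedTuple):
    gene_a: str
    gene_b: str
    ka: float  # nonsynonymous substitution rate
    ks: float  # synonymous substitution rate


def drop_saturated(pairs: List[GenePair], ks_max: float = 2.0) -> List[GenePair]:
    """Keep pairs with 0 < Ks <= ks_max; pairs with Ks above the cutoff are excluded."""
    return [p for p in pairs if 0.0 < p.ks <= ks_max]


def selection_mode(pair: GenePair) -> str:
    """Classify a pair by its Ka/Ks ratio (the conventional reading of the ratio)."""
    ratio = pair.ka / pair.ks
    if ratio < 1.0:
        return "purifying"
    if ratio > 1.0:
        return "positive"
    return "neutral"


# Hypothetical values for illustration only.
pairs = [
    GenePair("geneA", "geneB", ka=0.10, ks=0.50),
    GenePair("geneC", "geneD", ka=0.30, ks=2.50),  # Ks > 2, treated as saturated
    GenePair("geneE", "geneF", ka=0.60, ks=0.40),
]

kept = drop_saturated(pairs)
modes = [selection_mode(p) for p in kept]
print(list(zip(kept, modes)))
```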

2.4. Transcriptome-Based Expression Profiling and Co-Expression Network Analysis

In this study, transcriptome data were integrated from three sources: our previously published datasets for iron walnut and pecan, newly generated RNA-seq data for castor, and publicly available omics datasets for macadamia. For iron walnut (Dapao variety), endocarp transcriptome data from our previous study were retrieved from the NCBI Sequence Read Archive (SRA) under BioProject accession PRJNA928586, covering three representative developmental stages: rapid expansion (P1), structural hardening (P2), and maturation (P3) [76]. Similarly, published endocarp transcriptomes of pecan (Caddo variety) were obtained from the National Genomics Data Center (NGDC) database under BioProject accession PRJCA040435, encompassing three sequential developmental stages (KD1–KD3) [77]. Publicly available RNA-seq data across five shell developmental stages (shell1–shell5) of macadamia (cv. Hinde) were downloaded from NCBI under BioProject accession PRJNA706119 [78]. For castor (ZB306), seed coat tissues at five developmental stages (SOC1–SOC5) were freshly sampled and subjected to transcriptome sequencing. Briefly, total RNA was isolated from each sample, and RNA integrity and purity were assessed by agarose gel electrophoresis and a Nanodrop2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) [79,80]. High-quality RNA samples were used for cDNA library construction, followed by paired-end sequencing on the Illumina NovaSeq 6000 platform (Illumina, San Diego, CA, USA). The newly generated castor transcriptome data have been deposited in NGDC under BioProject accession PRJCA062922. All above samples in this study were collected with three independent biological replicates per developmental stage.
Raw sequencing reads were processed using fastp (v0.24.0) to remove adapter sequences and low-quality reads [81]. Clean reads were subsequently aligned to their respective reference genomes using HISAT2 v2.2.1 with default parameters [82]. Gene expression levels were quantified using featureCounts v2.0.1 implemented in the Subread package [83]. Gene expression patterns were analyzed using K-means clustering, and the results were visualized with the pheatmap package in R v4.4.2 [84]. To screen conserved stage-specific marker genes with comparable expression trajectories across the four species, all-to-all similarity searches were performed using MMseqs2 v13 with the parameters -e 1e-5 --min-seq-id 0.3 to identify homologous sequences between castor and the other three species [85]. Because castor exhibited the most clearly resolved developmental clustering patterns, its gene IDs and K-means clustering assignments were used as the reference framework for cross-species comparisons. Genes from the other three species were considered candidate marker genes when their orthologs displayed highly similar expression patterns and were preferentially expressed during corresponding developmental stages across species. Subsequently, Gene Ontology (GO, https://geneontology.org/ accessed on 5 March 2026) and Kyoto Encyclopedia of Genes and Genomes (KEGG) [86] enrichment analyses were independently performed for each expression cluster and for the conserved stage-specific marker genes. Enrichment significance was determined using Fisher’s exact test, with the threshold parameters set as adjusted p-value < 0.05 and FDR < 0.05 to identify significantly enriched biological processes and metabolic pathways.
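The reference-based marker screening can be sketched as follows. This is one possible reading of the procedure, not the authors' implementation: it assumes each castor gene has at most one retained homolog per species, that every gene has already been assigned to a stage through its expression cluster, and that a marker requires agreement in all three comparison species. All names and identifiers are hypothetical.

```python
from typing import Dict


def conserved_markers(
    reference_stage: Dict[str, str],
    homologs: Dict[str, Dict[str, str]],
    stage_by_species: Dict[str, Dict[str, str]],
) -> Dict[str, str]:
    """Return reference genes whose homologs share the same stage in every species.

    reference_stage:  reference (castor) gene -> stage
    homologs:         reference gene -> {species: homologous gene}
    stage_by_species: species -> {gene: stage}
    """
    markers: Dict[str, str] = {}
    for ref_gene, stage in reference_stage.items():
        partners = homologs.get(ref_gene, {})
        # Require a homolog in every comparison species.
        if set(partners) != set(stage_by_species):
            continue
        if all(stage_by_species[sp].get(g) == stage for sp, g in partners.items()):
            markers[ref_gene] = stage
    return markers


# Hypothetical toy data, not taken from the study.
reference_stage = {"Rc1": "early", "Rc2": "late", "Rc3": "middle"}
homologs = {
    "Rc1": {"walnut": "Js1", "pecan": "Ci1", "macadamia": "Mi1"},
    "Rc2": {"walnut": "Js2", "pecan": "Ci2", "macadamia": "Mi2"},
    "Rc3": {"walnut": "Js3", "pecan": "Ci3"},  # no macadamia homolog
}
stage_by_species = {
    "walnut": {"Js1": "early", "Js2": "late", "Js3": "middle"},
    "pecan": {"Ci1": "early", "Ci2": "middle", "Ci3": "middle"},
    "macadamia": {"Mi1": "early", "Mi2": "late"},
}

markers = conserved_markers(reference_stage, homologs, stage_by_species)
print(markers)
```

In this toy example only `Rc1` is retained: `Rc2` fails because its pecan homolog peaks at a different stage, and `Rc3` fails because it lacks a homolog in one species.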

2.5. Cis-Regulatory Elements Analysis

To investigate the regulatory mechanisms controlling MADS-box gene expression, the 2000 bp upstream promoter regions of all target genes were extracted. For genome-wide background comparison, equal numbers of 2 kb upstream promoter sequences were randomly selected from non-MADS-box genes in each species with matched GC content [87]. Cis-regulatory elements were identified using the PlantPAN 4.0 database (https://plantpan.itps.ncku.edu.tw/plantpan4/index.html accessed on 5 March 2026). Fisher’s exact test with FDR correction was performed to evaluate the statistical enrichment of each cis-element, and fold-enrichment values were calculated. Only elements with FDR < 0.05 and fold-enrichment > 1.5 were defined as significantly enriched. Cis-regulatory elements were classified according to their putative functions. The distribution and enrichment results of these elements were visualized using TBtools [71].
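The enrichment criteria above (Fisher's exact test, FDR correction, and a fold-enrichment cutoff) can be sketched with the standard library alone. Several choices here are assumptions not stated in the text: the test is taken to be one-sided (over-representation), the FDR procedure is taken to be Benjamini-Hochberg, and fold enrichment is defined as the ratio of the fraction of target promoters carrying an element to the corresponding background fraction. The counts are hypothetical.

```python
from math import comb
from typing import List


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    2x2 table: a = target promoters with the element, b = target without,
    c = background promoters with the element, d = background without.
    """
    n = a + b + c + d
    row1 = a + b  # number of target promoters
    col1 = a + c  # promoters carrying the element
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, row1)


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fold_enrichment(a: int, b: int, c: int, d: int) -> float:
    """Ratio of the element frequency in target versus background promoters."""
    target_rate = a / (a + b)
    background_rate = c / (c + d)
    return float("inf") if background_rate == 0 else target_rate / background_rate


# Hypothetical counts (a, b, c, d) for three elements; 57 promoters per set.
tables = {
    "element_1": (30, 27, 10, 47),
    "element_2": (20, 37, 18, 39),
    "element_3": (12, 45, 11, 46),
}

pvalues = [fisher_exact_greater(*t) for t in tables.values()]
qvalues = benjamini_hochberg(pvalues)
significant = [
    name
    for name, q in zip(tables, qvalues)
    if q < 0.05 and fold_enrichment(*tables[name]) > 1.5
]
print(significant)
```

Requiring both a significance threshold and a minimum effect size guards against elements that are statistically detectable but only marginally more frequent than in the background.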

3. Results

3.1. Identification and Characterization of MADS-Box Genes

A total of 58, 55, 57, and 57 MADS-box genes were identified in iron walnut, pecan, macadamia, and castor, respectively, using a combined strategy involving BLAST v.2.16.0 searches and HMMER-based detection of the conserved MADS (PF00319) and K-box (PF01486) domains (Figure S1). All candidate proteins were further validated via Batch CD-Search and SMART analyses [62,88]. Sequences lacking a complete MADS domain, as well as redundant transcripts generated by alternative splicing, were excluded from subsequent analysis. The identified genes were renamed according to their chromosomal positions, resulting in the designations JsiMADS1-58, CilMADS1-55, MiMADS1-57, and RcMADS1-57 (Tables S1–S4). The highly similar numbers of MADS-box genes among the four species suggest that this gene family has remained remarkably conserved in size despite substantial phylogenetic divergence. Protein lengths ranged from 68 to 978 amino acids (aa), with the majority (71–84%) falling between 200 and 400 aa (Figure S2). Average protein lengths varied slightly across species, from approximately 245.59 aa in walnut to 311.05 aa in pecan. Predicted isoelectric point (pI) values ranged from 4.11 to 10.82, with average values between 7.31 and 8.28, indicating that most proteins are weakly basic. Molecular weights ranged from 7.98 to 56.71 kDa, with average values of 27.93, 35.38, 28.59, and 28.42 kDa for JsiMADSs, CilMADSs, MiMADSs, and RcMADSs, respectively. Subcellular localization analysis predicted nuclear localization for all MADS-box proteins, consistent with their conserved function as transcriptional regulators. Collectively, these results demonstrate that the MADS-box gene family exhibits substantial conservation in both gene number and basic physicochemical properties across the four species.

3.2. Conserved Domains and Structural Features

Based on domain composition, MADS-box proteins were classified into Type I and Type II groups [31]. Type I proteins generally consist of M, intervening-like (I-like), and C-terminal (C) domains, whereas Type II proteins possess the canonical MIKC structure (M, I, K, and C domains) [89]. Across all four species, Type II proteins displayed highly conserved structural characteristics and predominantly adopted a canonical 4α–2β configuration (Figures S3–S6). This structure was identified in 36, 33, 25, and 39 proteins in iron walnut, pecan, macadamia, and castor, respectively. In contrast, Type I proteins exhibited markedly greater structural variability, ranging from simplified conformations to expanded structures. The particularly high divergence in pecan and macadamia suggests lineage-specific structural evolution of Type I members. The M domain (approximately 58–60 aa in length; Tables S1–S4) was highly conserved in both Type I and Type II proteins. Several residues, including Arg (3), Glu (9), Asp (20), Lys (23), and Leu (35) (Figure 1), were strongly conserved, consistent with their established roles in DNA binding and structural stability [63,90]. Positively charged residues (particularly Arg and Lys) likely contribute to CArG-box motif recognition, whereas acidic and hydrophobic residues may stabilize protein structure. The I-like domain of Type I proteins was relatively shorter (10–20 aa) and weakly conserved, typically containing residues such as Val, Leu/Ile, Asp, Arg, and aromatic residues (Phe/Tyr), indicating a preference for hydrophobic and partially charged amino acids. In contrast, the I domain of Type II proteins showed moderate conservation across species, with conserved residues including Ser (2), Met (6), Thr (9), Leu/Ile (10), Glu (11), Arg (12), and Tyr (13). 
Compositional differences between Type I and Type II proteins suggest early evolutionary divergence and functional specialization, particularly in protein–protein interaction specificity and dimerization [63]. The K domain—specific to Type II proteins and mediating protein–protein interactions—was generally conserved (approximately 88–100 aa in length). However, several proteins contained truncated K domains (40–70 aa), suggesting partial degeneration or functional diversification following duplication. Conserved residues such as Leu, Lys, Arg, and Gly further support the role of the K domain in dimerization and higher-order complex formation [64,90,91]. In contrast, the C-terminal region showed extensive variation in both length and amino acid composition, ranging from only a few residues to more than 200 aa (Figure 1 and Figures S7–S10). This pronounced variability may reflect diversification in transcriptional activation capacity and regulatory specificity.
Gene structure analysis revealed clear differences between the two major groups (Figure 2: iron walnut; Figures S11–S13: pecan, macadamia, and castor, respectively). More than 90% of Type I genes contained either zero or one intron, whereas Type II genes displayed considerably more complex exon–intron structures, with an average of 6.7–7.8 introns per gene. These results indicate that Type II MADS-box genes have retained a highly conserved structural framework, whereas Type I genes have undergone more extensive evolutionary diversification.

3.3. Phylogenetic Analysis

Phylogenetic analysis grouped all MADS-box genes into the canonical Type I (Mα, Mβ, Mγ) and Type II (MIKCc, MIKC*) subfamilies [92]. Among these groups, the MIKCc clade was predominant in all four species, accounting for approximately 45.6–67.2% of the identified genes. In contrast, the Mα subgroup represented the largest fraction of Type I genes, whereas Mβ, Mγ, and MIKC* members were less abundant and showed greater interspecific variation (Figure 3 and Figure S14). Iron walnut contained 13 Type I and 45 Type II MADS-box genes, including 10 Mα genes, 2 Mβ (JsiMADS12/20) genes, and 1 Mγ (JsiMADS30) gene. In pecan, macadamia, and castor, 16/39, 26/31, and 22/35 Type I/Type II genes were identified, respectively (Figures S15–S17). The distributions of Mα/Mβ/Mγ genes were 10/2/1, 9/2/5, 14/6/6, and 13/5/4 in iron walnut, pecan, macadamia, and castor, respectively. Similarly, the MIKCc/MIKC* distributions were 39/6, 34/5, 26/5, and 28/7 (Figure 3). The predominance of MIKCc genes across all species highlights their evolutionary conservation and suggests they represent the principal functional component of the MADS-box family involved in reproductive organ development and shell formation. In contrast, the relatively greater expansion of Type I genes in macadamia and castor may reflect lineage-specific adaptation or functional specialization.
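As a quick consistency check, the subfamily counts reported above can be summed and converted to MIKCc proportions. The counts are those given in the text; the dictionary layout and helper name are illustrative only.

```python
from typing import Dict

# Subfamily counts as reported in the text (Mα/Mβ/Mγ and MIKCc/MIKC*).
counts: Dict[str, Dict[str, int]] = {
    "iron walnut": {"Malpha": 10, "Mbeta": 2, "Mgamma": 1, "MIKCc": 39, "MIKC*": 6},
    "pecan": {"Malpha": 9, "Mbeta": 2, "Mgamma": 5, "MIKCc": 34, "MIKC*": 5},
    "macadamia": {"Malpha": 14, "Mbeta": 6, "Mgamma": 6, "MIKCc": 26, "MIKC*": 5},
    "castor": {"Malpha": 13, "Mbeta": 5, "Mgamma": 4, "MIKCc": 28, "MIKC*": 7},
}


def mikcc_percent(subfamilies: Dict[str, int]) -> float:
    """Percentage of a species' MADS-box genes that belong to the MIKCc clade."""
    return 100.0 * subfamilies["MIKCc"] / sum(subfamilies.values())


totals = {species: sum(sub.values()) for species, sub in counts.items()}
percents = {species: round(mikcc_percent(sub), 1) for species, sub in counts.items()}

for species in counts:
    print(f"{species}: total={totals[species]}, MIKCc={percents[species]}%")
```

The totals reproduce the family sizes of 58, 55, 57, and 57 genes, and the lowest and highest MIKCc shares (macadamia and iron walnut) match the reported range of 45.6–67.2%.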

3.4. Chromosome Distribution, Gene Duplication, and Evolutionary Analysis

MADS-box genes were unevenly distributed across chromosomes in all four species (Figure 4), with pronounced variation in local gene density. The highest numbers of genes were located on Chr10 of iron walnut (9 genes), Chr15 of pecan (7 genes), Chr07 of macadamia (11 genes), and Chr02/Chr05 of castor (10 genes each) (Figures S18 and S19). Analysis of duplication patterns indicated that segmental duplication was the predominant force driving the expansion of the MADS-box gene family. Specifically, 40 of 58 genes (69%) in iron walnut, 42 of 55 genes (76%) in pecan, 21 of 57 genes (36%) in macadamia, and 26 of 57 genes (46%) in castor were associated with segmental duplication events. Notably, more than 86% of these duplicated genes belonged to the Type II group. In contrast, tandem duplication contributed minimally, accounting for only 1–4 duplicated gene pairs per species. Ka/Ks analysis showed that all duplicated gene pairs exhibited ratios ranging from 0.085 to 0.38 (Tables S5 and S6), well below 1, indicating that these genes have been predominantly subjected to strong purifying selection, thereby maintaining their functional conservation. In addition, 44–53% of duplicated gene pairs displayed higher Ks values than the genome-wide average, suggesting these duplicates likely originated from relatively ancient duplication events. Notably, the proportion of segmentally duplicated MADS-box genes in Juglandaceae species (iron walnut and pecan) was nearly twice that observed in macadamia and castor. This disparity is consistent with the well-documented lineage-specific whole-genome duplication (WGD) event in Juglandaceae. Further synteny analysis revealed that 82% (33/40) of duplicated gene pairs in iron walnut and 79% (33/42) in pecan were located within WGD-derived collinear blocks (Figure 4 and Figure S19, Tables S7 and S8), supporting a major contribution of this paleopolyploidy event to MADS-box gene expansion in these species. 
In contrast, the macadamia genome did not exhibit signatures of the ancestral γ whole-genome triplication event characteristic of core eudicots, although lineage-specific duplication events have been reported [78]. Similarly, castor has experienced the ancestral eudicot hexaploidization but lacks additional species-specific genome duplication events [93]. Collectively, these results suggest that segmental duplication, largely associated with whole-genome duplication events, has played a central role in shaping the expansion of the MADS-box gene family, while long-term purifying selection has contributed to the preservation of their functional integrity across species.

3.5. Expression Patterns During Fruit Endocarp and Seed Coat Development

Based on our previous studies of endocarp development in iron walnut and pecan, as well as seed coat development in castor, combined with the morphological framework of macadamia shell (derived from seed coat) development reported by Lin et al., the developmental processes of lignified endocarp and seed coat across the four species were classified into three conserved stages: early, middle, and late [76,77,78,79,80]. Transcriptome datasets from lignified endocarp and seed coat development were used for temporal expression clustering. Genes with FPKM > 0.1 were defined as expressed genes and retained for further analysis. To minimize batch effects among RNA-seq datasets from different species, within-species Z-score normalization was applied to standardize expression patterns and enable valid cross-species trend comparisons. Elbow plot analysis was applied to statistically determine the optimal cluster number for each species: three clusters (A1–A3) for iron walnut, three clusters (B1–B3) for pecan, four clusters (C1–C4) for macadamia, and four clusters (D1–D4) for castor (Figure 5 and Figure S20).
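The expression filter and within-species standardization can be sketched as below. Two details are assumptions, since the text does not specify them: a gene is treated as expressed if its FPKM exceeds 0.1 in at least one stage, and Z-scores are computed per gene across stages using the sample standard deviation. The gene names and values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def expressed_genes(
    fpkm: Dict[str, List[float]], threshold: float = 0.1
) -> Dict[str, List[float]]:
    """Keep genes whose FPKM exceeds the threshold in at least one stage."""
    return {gene: values for gene, values in fpkm.items() if max(values) > threshold}


def zscore_rows(fpkm: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Standardize each gene across stages within one species.

    Genes with no variation across stages are set to zero rather than dropped.
    """
    scaled: Dict[str, List[float]] = {}
    for gene, values in fpkm.items():
        mu = mean(values)
        sd = stdev(values)
        if sd == 0:
            scaled[gene] = [0.0] * len(values)
        else:
            scaled[gene] = [(v - mu) / sd for v in values]
    return scaled


# Hypothetical stage means (early, middle, late) for one species.
fpkm = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [0.0, 0.05, 0.02],  # never exceeds 0.1, so it is filtered out
    "geneC": [5.0, 5.0, 5.0],    # expressed but constant across stages
}

kept = expressed_genes(fpkm)
zscores = zscore_rows(kept)
print(zscores)
```

Because each gene is scaled against its own mean and spread within a species, the resulting profiles describe the shape of the expression trajectory rather than absolute abundance, which is what makes trend comparisons across datasets from different species meaningful.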
For iron walnut endocarp RNA-seq data, genes highly expressed at P1 (60 days after anthesis, DAA; cluster A1), P2 (90 DAA; A2), and P3 (120 DAA; A3) were assigned to the early, middle, and late developmental stages, respectively (Figure 5A). Similarly, for pecan endocarp RNA-seq data, KD1 (60 days after pollination, DAP; B1), KD2 (90 DAP; B2), and KD3 (120 DAP; B3) corresponded to the early, middle, and late stages, respectively (Figure 5B). In macadamia shell transcriptomes, shell1 (C1), shell2–4 (C2–C3), and shell5 (C4) represented the early, middle, and late stages, respectively (Figure 5C). For castor seed coat transcriptomes, SOC1 (D1), SOC2–3 (D2), and SOC4–5 (D3–D4) were assigned to the early, middle, and late stages, respectively (Figure 5D).
To evaluate the reliability of the clustering and stage assignments, a total of 777 genes showing stable and conserved expression patterns across the three developmental stages in all four species were identified as stage-specific marker genes (Table S9). Among them, 328, 227, and 222 genes were assigned to the early, middle, and late stages, respectively. Gene Ontology (GO) enrichment analysis revealed that early-stage marker genes were significantly enriched in terms related to microtubule-associated complexes (GO:0005875), regulation of cell growth (GO:0001558), and regulation of cell cycle phase transition (GO:1901987) (Figure S21), including genes such as AGAMOUS-like 65 (AGL65), Homeobox-leucine zipper protein (ATHB13), and indole-3-acetic acid inducible 27 (IAA27) (Table S9). Middle-stage marker genes were mainly enriched in RNA modification (GO:0009451), cell maturation (GO:0048469), and cuticle development (GO:0042335) (Figure S21), including alpha-mannosidase 3 (MNS3), beta-1,3-galactosyltransferase 20 (GALT20), and Vacuolar-sorting receptor (VSR) (Table S9). In contrast, late-stage marker genes were significantly enriched in carbohydrate storage (GO:0005985), autophagic cell death (GO:0048102), and L-serine ammonia-lyase activity (GO:0016846) (Figure S21), represented by genes such as 4-coumarate:CoA ligase-like (4CL-like), caffeoyl-CoA O-methyltransferase (CCoAOMT), Metallothionein-like protein type 3-like (MT3), ABSCISIC ACID INSENSITIVE 3 (ABI3), and AGL19 (Table S9).
Based on these unified developmental classifications, we further investigated the expression patterns of MADS-box genes across the four species (Figure 6). In iron walnut, 51 of the 58 MADS-box genes were expressed (FPKM > 0.1) and classified into A1–A3, corresponding to the unified early, middle, and late developmental stages. Most genes (34) showed peak expression during the early stage (A1), whereas nine and eight genes exhibited maximal expression during the middle (A2) and late (A3) stages, respectively (Figure 6A, Figure S22, and Table S10). Similar clustering patterns were observed in pecan (Figure 6B, Table S11). In macadamia, 36 of the 57 MADS-box genes were grouped into four clusters, among which 16 genes in C1 were preferentially expressed during the early stage, 14 genes in C2–C3 during the middle stage, and six genes in C4 during the late stage (Figure 6C, Table S12). In castor, 48 of the 57 MADS-box genes were similarly grouped into early-, middle-, and late-stage clusters, including 12 genes in D1, 20 genes in D2–D3, and 16 genes in D4 (Figure 6D, Table S13). Within each species, MIKCc members generally exhibited broad and sustained expression across multiple developmental stages, whereas Type I MADS-box genes displayed more restricted and stage-specific expression patterns. Notably, only two MADS-box genes showed highly conserved stage-specific expression across all four species: AGL65 (MIKC* subgroup) was specifically expressed during the early stage, whereas AGL19 (MIKCc subgroup) was preferentially expressed during the late stage. In addition, several MADS-box genes displayed divergent expression patterns among species. For example, the well-characterized lignified endocarp and seed coat regulator STK (MIKCc subgroup) was specifically expressed during the early stage in iron walnut, pecan, and castor, but showed late-stage-specific expression in macadamia (Tables S10–S13).
Collectively, these results indicate that although the overall developmental transcriptional programs underlying lignified endocarp and seed coat formation are broadly conserved across species, substantial divergence exists in the temporal expression dynamics of individual MADS-box genes, suggesting both conserved and lineage-specific regulatory mechanisms during shell development.
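The cross-species comparison of stage assignments described in this section can be sketched as a simple set operation. The example below is a hedged illustration, not the authors' procedure: it assumes that orthologs have already been matched under a shared label, the function name `conserved_markers` is hypothetical, and the toy table contains only the three genes whose stage assignments are stated in the text (AGL65, AGL19, and STK).

```python
def conserved_markers(stage_by_species: dict[str, dict[str, str]]) -> dict[str, str]:
    """Return genes assigned to the same developmental stage in every species."""
    species = list(stage_by_species)
    # Only genes with an assignment in all species can be compared.
    shared = set.intersection(*(set(stage_by_species[s]) for s in species))
    conserved = {}
    for gene in sorted(shared):
        stages = {stage_by_species[s][gene] for s in species}
        if len(stages) == 1:
            conserved[gene] = stages.pop()
    return conserved


# Stage assignments as stated in the text for three MADS-box genes.
toy = {
    "iron walnut": {"AGL65": "early", "AGL19": "late", "STK": "early"},
    "pecan": {"AGL65": "early", "AGL19": "late", "STK": "early"},
    "macadamia": {"AGL65": "early", "AGL19": "late", "STK": "late"},
    "castor": {"AGL65": "early", "AGL19": "late", "STK": "early"},
}
result = conserved_markers(toy)
print(result)
```

STK is excluded because its assignment in macadamia differs from that in the other three species, which mirrors the divergence noted above.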

3.6. Gene Co-Expression Analysis

To better elucidate the regulatory roles of MADS-box genes, genes exhibiting stage-specific expression patterns during early, middle, and late development were classified as co-expressed genes. During the early developmental stage, co-expressed genes were related to the spliceosome (ko03040), plant hormone signal transduction (ko04075), and ATP-dependent chromatin remodeling (ko03036). MADS-box genes belonging to the SEP (JsiMADS4/6, CilMADS3, MiMADS15/37), FUL/AP1 (JsiMADS5/11, CilMADS28/39), AG (JsiMADS36, CilMADS15, MiMADS46, RcMADS8), and AGL6 (JsiMADS16, MiMADS57) subfamilies exhibited high expression during this stage (Figure S23, Tables S10–S13). These genes were co-expressed with multiple transcription factors, including auxin response factor (ARF), basic helix-loop-helix (bHLH), basic leucine-zipper (bZIP), cyclin D, homeobox protein, LOB domain-containing protein, microtubule-associated protein 65 (MAP65), and MYB, as well as genes involved in cell expansion, such as EXORDIUM-like 2/3, expansin A, FASCICLIN-like arabinogalactan-protein, and xyloglucan endotransglucosylase/hydrolase (XTH). These results indicate that early shell development is primarily associated with tissue specification, active cell proliferation, and extensive cellular expansion.
At the middle developmental stage, co-expressed genes were enriched in mRNA surveillance (ko03015), biosynthesis of secondary metabolites (ko01110), and biosynthesis of amino acids (ko01230) pathways (Figure S24, Tables S10–S13). Several MADS-box subfamilies, including FUL/AP1 (JsiMADS17/28), AP3/PI (JsiMADS25, CilMADS49), SVP (JsiMADS44, CilMADS17), AGL6 (CilMADS50, MiMADS19), and Mα (JsiMADS23, CilMADS45, MiMADS5, RcMADS46), showed stage-specific expression. These genes were co-expressed with BEL1-like homeodomain 1, NAC, and NFYA TFs, together with genes encoding cellulose synthase-like proteins, gibberellin 2-oxidase 6, and glucan synthase-like proteins. This expression pattern suggests a transition from early developmental patterning toward secondary wall deposition and tissue differentiation.
At the late stage, genes were strongly enriched in carbon metabolism (ko01200), protein processing in endoplasmic reticulum (ko04141), glyoxylate and dicarboxylate metabolism (ko00630), 2-oxocarboxylic acid metabolism (ko01210), and multiple pathways associated with lignin and secondary wall formation, e.g., phenylpropanoid biosynthesis (ko00940). MIKC* (JsiMADS39, CilMADS55, RcMADS10), AG (JsiMADS13/34, MiMADS25, RcMADS33), and AGL6 (JsiMADS9, CilMADS26, MiMADS19, RcMADS1) subfamily genes were prominently expressed at this stage (Figure S25, Tables S10–S13). They were co-expressed with genes encoding 4CL2, cinnamate-4-hydroxylase (C4H), and gibberellin 2-oxidase (GA2ox), as well as several regulatory factors involved in lignification and cell wall reinforcement. Collectively, these results reveal a dynamic developmental progression in which MADS-box genes participate in a hierarchical regulatory network that shifts from early cell identity specification to late-stage lignification and secondary wall formation.
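Because co-expression is defined in this section by shared stage-specific expression, the partner lists can be sketched as a grouping by stage label. The following is a minimal illustration under that reading; the function name `coexpressed_partners` is hypothetical, and the toy table uses a few gene names mentioned in the text only as labels, not as a reproduction of Tables S10–S13.

```python
from collections import defaultdict


def coexpressed_partners(stage_of: dict[str, str], mads: set[str]) -> dict[str, list[str]]:
    """For each MADS-box gene, list the non-MADS genes sharing its stage label."""
    by_stage = defaultdict(list)
    for gene, stage in stage_of.items():
        by_stage[stage].append(gene)
    return {
        g: sorted(x for x in by_stage[stage_of[g]] if x not in mads)
        for g in sorted(mads)
        if g in stage_of
    }


# Toy stage labels; names are borrowed from the text purely for illustration.
stage_of = {
    "JsiMADS4": "early", "ARF": "early", "XTH": "early",
    "JsiMADS17": "middle", "NAC": "middle",
    "JsiMADS39": "late", "C4H": "late", "4CL2": "late",
}
partners = coexpressed_partners(stage_of, {"JsiMADS4", "JsiMADS17", "JsiMADS39"})
for gene, others in partners.items():
    print(gene, "->", others)
```

Shared stage membership of this kind indicates temporal association only and does not by itself establish a regulatory relationship.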

3.7. Identification and Analysis of Cis-Regulatory Elements in MADS-Box Genes

Analysis of the 2000 bp upstream promoter regions identified 12 major classes of cis-regulatory elements in the MADS-box genes across the four species. The distribution patterns of cis-regulatory elements in the promoter regions of JsiMADS genes are presented in Figure 7, whereas the corresponding distributions in pecan, macadamia, and castor are shown in Figures S26–S28, respectively. Seven classes were associated with plant hormone signaling, including ABA-, auxin-, salicylic acid-, gibberellin-, and ethylene-responsive elements, whereas the remaining elements corresponded to TF binding sites, e.g., CArG, bZIP, HD, MYB, MYC, and E2F motifs. To assess whether these elements are over-represented in MADS-box promoters (Figure S29), we performed a statistical enrichment analysis by comparing their frequencies in MADS-box promoters with a genome-wide background (1000 randomly selected non-MADS-box gene promoters per species). Fisher’s exact test (p < 0.05) revealed that CArG-boxes, MYB binding sites, and ABRE motifs were significantly over-represented across all four species (Figure S30). Specifically, CArG-boxes (canonical MADS-domain binding motifs) showed 2.3–3.1-fold enrichment, MYB binding sites showed 2.1–2.7-fold enrichment, and ABRE motifs showed 1.8–2.2-fold enrichment relative to the background. These results indicate that the elements are not randomly distributed in MADS-box promoters, which is consistent with, but does not by itself demonstrate, functional constraint on MADS-box gene regulation.
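The enrichment comparison described above can be sketched with a one-sided Fisher's exact test on a 2×2 table of promoters with and without a given motif. This is an illustration under several assumptions: the text does not state whether the test was one- or two-sided or whether motifs were counted per promoter or per occurrence, the function names are hypothetical, and the counts below are invented for demonstration and are not the study's data.

```python
from math import comb


def fisher_right_tail(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test, P(X >= a), for the 2x2 table [[a, b], [c, d]].

    a, b: MADS-box promoters with / without the motif
    c, d: background promoters with / without the motif
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Hypergeometric tail: sum the probabilities of tables at least as extreme as observed.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def fold_enrichment(a: int, b: int, c: int, d: int) -> float:
    """Ratio of motif frequency in the target set to that in the background."""
    return (a / (a + b)) / (c / (c + d))


# Invented counts: 40 of 58 target promoters vs. 300 of 1000 background promoters.
p_value = fisher_right_tail(40, 18, 300, 700)
fold = fold_enrichment(40, 18, 300, 700)
print(f"fold enrichment = {fold:.2f}, one-sided p = {p_value:.2e}")
```

The test is written with exact integer arithmetic from the standard library so that the example has no external dependencies; a statistics package would normally be used in practice.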
Among all identified elements, bZIP and MYC/ABRE motifs were the two most frequently detected (Figures S29 and S30), while ERF elements showed relatively stable frequencies across species, suggesting that ethylene signaling may broadly participate in the regulation of lignified endocarp and seed coat development. Several hormone-responsive elements displayed species-specific differences: ABA-responsive elements were more abundant in iron walnut (3.23%) and pecan (3.42%) than in macadamia (2.58%) and castor (1.81%), whereas ERF3 elements showed the greatest variation (5.59% in castor vs. 7.98% in macadamia). These differences may reflect divergence in hormonal responsiveness and developmental programs among species with distinct shell architectures. The widespread presence of CArG-boxes is consistent with potential autoregulatory and cross-regulatory interactions among MADS-box genes. HD and MYB binding sites were also highly abundant, suggesting potential coordination between MADS-box proteins and additional transcriptional regulators involved in development and lignification. Low-frequency elements (e.g., E2F, SAUR motifs) may be associated with more specialized functions, including cell cycle regulation and cell growth processes (Tables S14–S17). Overall, the coexistence of hormone-responsive elements and TF binding sites, together with their significant enrichment, suggests that MADS-box genes may occupy a central position at the intersection of hormonal signaling and transcriptional regulation during lignified endocarp and seed coat development.

4. Discussion

This study provides a highly integrative comparative analysis of the MADS-box gene family across multiple species possessing lignified reproductive protective tissues, including woody nut endocarp and oil crop seed coat. By integrating phylogenetic classification, conserved domain characterization, gene structure organization, spatiotemporal transcriptomic profiling, and co-expression network inference, we identified conserved expression signatures of MIKCc subfamily MADS-box genes during lignified tissue development. Although these tissues differ anatomically and developmentally among species, their shared transcriptional features suggest the existence of partially conserved regulatory programs associated with reproductive tissue lignification in angiosperms.
The formation of lignified endocarp and seed coat tissues depends on tightly coordinated developmental patterning processes involving carpel and ovule differentiation [87,94]. In the present study, canonical MIKCc subfamily genes, including AG, SEP, and FUL/AP1-related homologs, exhibited preferential expression during early developmental stages across these four species (Figure 6). These expression profiles are consistent with previous functional studies in Arabidopsis thaliana, where AG specifies carpel identity, while STK, SHP1, and SHP2 participate in ovule specification and integument differentiation [95,96,97,98]. Importantly, our data suggest that MADS-box genes may contribute to the early developmental establishment of lignified versus non-lignified tissues; however, the current evidence is based primarily on temporal co-expression and comparative transcriptomic analyses rather than direct functional assays. A key emerging insight of this study is the identification of a conserved temporal transition from early developmental patterning to late-stage lignification and SCW formation. Early developmental phases were predominantly associated with hormone signal transduction and chromatin remodeling pathways, whereas later stages showed progressive enrichment of phenylpropanoid biosynthesis and SCW formation (Figures S23–S25). This developmental trajectory resembles the hierarchical transcriptional cascade previously described during xylem differentiation, in which NAC transcription factors (e.g., NST1, SND1) activate downstream MYB regulators that subsequently induce lignin biosynthetic genes [99,100,101]. Our temporal expression and co-expression analyses are therefore consistent with a model in which MADS-box genes function upstream of lignification-associated regulatory networks and may participate in developmental transitions preceding SCW deposition.
Nevertheless, this hierarchical positioning remains hypothetical because co-expression relationships do not establish direct transcriptional regulation or causality.
Interestingly, comparative analysis across the four species revealed both conserved and lineage-specific transcriptional dynamics of MADS-box genes. Only two genes, AGL65 and AGL19, displayed highly conserved stage-specific expression patterns across all species, whereas other genes, such as STK, exhibited substantial divergence in temporal expression. These observations suggest that the overall developmental framework governing reproductive tissue lignification may be evolutionarily conserved, while individual regulatory components have undergone lineage-specific rewiring during species diversification. Previous studies investigating lignified tissue development have mainly focused on drupe endocarp formation [5,6,102,103].
Our analysis of castor seed coat development extends these observations by demonstrating that similar MADS-box subfamilies are also associated with testa lignification. In particular, late-stage enrichment of AG, AGL6, and MIKC*-related genes coincided with activation of phenylpropanoid biosynthesis and SCW-associated pathways, suggesting potential involvement in sclerenchyma differentiation and cell wall reinforcement. However, these similarities can be interpreted under two alternative evolutionary scenarios: (i) conservation of an ancestral regulatory program associated with carpel-derived protective tissues, or (ii) convergent recruitment of similar transcriptional modules during independent evolution of lignified structures. Because both endocarp and seed coat are homologous reproductive structures derived from carpellar tissues [104,105], the hypothesis of partial regulatory conservation is biologically plausible. Nonetheless, the current data cannot distinguish homology from convergent co-option, and broader phylogenomic sampling combined with functional studies will be required to resolve this question.
Co-expression network analysis further identified extensive associations between MADS-box genes and genes involved in hormone signaling, chromatin remodeling, transcriptional regulation, and lignin biosynthesis. Notably, multiple MADS-box genes were co-expressed with NAC and MYB TFs as well as key phenylpropanoid pathway enzymes, including C4H and 4CL (Figure 5; Tables S9–S13). These interactions suggest that MADS-box genes may participate in multilayered transcriptional coordination during lignified tissue development, although direct regulatory relationships remain unverified. Similar regulatory architectures involving MADS-domain proteins have been reported in diverse systems, including Arabidopsis inflorescence development, pine secondary growth, pear stone cell formation, and loquat flesh lignification, indicating that MADS-mediated transcriptional coordination may represent a recurrent feature of plant lignified tissue differentiation [106,107,108,109].
Cis-regulatory element profiling of the 2 kb upstream promoter regions of MADS-box genes identified abundant hormone-responsive motifs and transcription factor binding sites across all four species. Prominent enrichment of ABA-, auxin-, and gibberellin-responsive elements implies that MADS-box transcriptional activity is potentially modulated by developmental and hormonal signals (Figure 7 and Figures S26–S28). Widespread enrichment of MYB, NAC, and bHLH binding sites further indicates potential combinatorial regulatory interactions, although these predicted interactions remain to be experimentally validated. Within this regulatory network, SEP subfamily proteins are inferred to function as core components of higher-order MADS-box protein complexes, consistent with the canonical combinatorial interaction mode of plant MADS-box proteins. Similar complex formation and regulatory roles have been documented in wheat endosperm cellularization, rice floret meristem specification, and barley spike morphogenesis [110,111,112]. Taken together, our findings are consistent with a hierarchical regulatory model in which MADS-box genes may act upstream of lignification-associated transcriptional networks, contributing to early tissue specification and the initiation of SCW-related developmental programs. Nevertheless, this model is built entirely on comparative transcriptomics, co-expression network inference, and cis-element enrichment, and should be regarded as provisional pending direct functional validation. Beyond establishing conserved transcriptional patterns, this study provides an integrated evolutionary and mechanistic framework for understanding the origin, conservation, and lineage-specific divergence of carpel-derived lignified protective structures in angiosperms.

5. Conclusions

This study presents a comprehensive genome-wide characterization and comparative analysis of the MADS-box gene family across four phylogenetically divergent species with lignified endocarp or seed coat tissues, including two Juglandaceae woody nuts, macadamia, and castor. Our integrated analyses indicate that MIKCc-type MADS-box genes are evolutionarily conserved and are candidate upstream regulatory components throughout fruit shell and seed coat developmental progression. Spatiotemporal transcriptome dynamics collectively point to a conserved developmental cascade, transitioning from early cell fate specification and tissue patterning to subsequent lignification and SCW deposition. Our results highlight MADS-box genes as candidate upstream components of the hierarchical transcriptional network modulating SCW biosynthesis and lignin accumulation. We propose a refined working model in which MADS-box genes potentially integrate phytohormonal signaling and downstream transcriptional cascades to govern lignified tissue formation. This study advances our understanding of the molecular regulation and evolutionary conservation underlying nut shell and seed coat development. In terms of molecular breeding applications, this work provides valuable candidate genes for the genetic improvement of shell hardness, thickness, and overall nut commodity quality. Particularly promising targets include conserved stage-specific MADS-box members, namely the early-stage-specific AGL65 and the late-enriched AGL19, as well as conserved hub genes from the SEP, FUL/AP1, AG, and AGL6 subfamilies that show sustained expression across developmental stages and tight co-expression with core lignin pathway genes. These conserved, stage-biased, and network-hub MADS-box homologs represent priority targets for future functional characterization and molecular breeding design in woody oil and nut crops.
Future functional validation of these core MIKCc hub genes and their downstream regulatory interactions will further clarify the molecular mechanism governing lignified endocarp and seed coat development and facilitate precision molecular breeding in woody oil crops.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/horticulturae12050626/s1, Figure S1: Venn diagram illustrating the overlap between MADS-box genes identified by BLAST and HMM searches in four species; Figure S2: Distribution of protein lengths of MADS-box genes in four species; Figure S3: Secondary structure prediction of MADS-box proteins in iron walnut; Figure S4: Secondary structure prediction of MADS-box proteins in pecan; Figure S5: Secondary structure prediction of MADS-box proteins in macadamia; Figure S6: Secondary structure prediction of MADS-box proteins in castor; Figure S7: Predicted three-dimensional structures of MADS-box proteins in iron walnut; Figure S8: Predicted three-dimensional structures of MADS-box proteins in pecan; Figure S9: Predicted three-dimensional structures of MADS-box proteins in macadamia; Figure S10: Predicted three-dimensional structures of MADS-box proteins in castor; Figure S11: Phylogenetic relationships, conserved domain, and gene structure of MADS-box genes in pecan (CilMADSs); Figure S12: Phylogenetic relationships, conserved domain, and gene structure of MADS-box genes in macadamia (MiMADSs); Figure S13: Phylogenetic relationships, conserved domain, and gene structure of MADS-box genes in castor (RcMADSs); Figure S14: Distribution of MADS-box genes among five subfamilies (Mα, Mβ, Mγ, MIKCc, and MIKC*) in four species; Figure S15: Phylogenetic relationships of MADS-box genes in pecan and Arabidopsis; Figure S16: Phylogenetic relationships of MADS-box genes in macadamia and Arabidopsis; Figure S17: Phylogenetic relationships of MADS-box genes in castor and Arabidopsis; Figure S18: Chromosomal distribution of MADS-box genes in four species; Figure S19: Segmental duplication of MADS-box genes across four species; Figure S20: Elbow plots for determining the optimal number of clusters for expressed genes in four species; Figure S21: GO enrichment analysis of selected marker genes at the three developmental stages; Figure S22: Elbow plots for determining the optimal number of clusters for expressed MADS-box genes in four species; Figure S23: KEGG enrichment analysis of co-expressed genes at the early developmental stage in four species; Figure S24: KEGG enrichment analysis of co-expressed genes at the middle developmental stage in four species; Figure S25: KEGG enrichment analysis of co-expressed genes at the late developmental stage in four species; Figure S26: Cis-regulatory element distribution in promoter regions of CilMADS genes; Figure S27: Cis-regulatory element distribution in promoter regions of MiMADS genes; Figure S28: Cis-regulatory element distribution in promoter regions of RcMADS genes; Figure S29: Proportion of cis-regulatory elements in the promoter regions of MADS-box genes in four species; Figure S30: Proportion of cis-regulatory elements in the promoter regions of whole genome genes in four species; Table S1: Comprehensive annotation of JsiMADS genes in iron walnut: gene features, conserved domains, structural characterization, and subcellular localization; Table S2: Comprehensive annotation of CilMADS genes in pecan: gene features, conserved domains, structural characterization, and subcellular localization; Table S3: Comprehensive annotation of MiMADS genes in macadamia: gene features, conserved domains, structural characterization, and subcellular localization; Table S4: Comprehensive annotation of RcMADS genes in castor: gene features, conserved domains, structural characterization, and subcellular localization; Table S5: Synonymous (Ks) and nonsynonymous (Ka) substitution rates and Ka/Ks ratios of duplicated MADS-box gene pairs within species; Table S6: Synonymous (Ks) and nonsynonymous (Ka) substitution rates and Ka/Ks ratios of duplicated MADS-box gene pairs among four species; Table S7: Duplication patterns and subfamily classification of MADS-box genes within species; Table S8: Duplication patterns and subfamily classification of MADS-box genes among four species; Table S9: Detailed information of the selected cluster-specific marker genes; Table S10: Detailed characteristics of MADS-box genes in each cluster of iron walnut; Table S11: Detailed characteristics of MADS-box genes in each cluster of pecan; Table S12: Detailed characteristics of MADS-box genes in each cluster of macadamia; Table S13: Detailed characteristics of MADS-box genes in each cluster of castor; Table S14: Cis-regulatory elements identified in the promoter regions of JsiMADS genes in iron walnut; Table S15: Cis-regulatory elements identified in the promoter regions of CilMADS genes in pecan; Table S16: Cis-regulatory elements in the promoter regions of MiMADS genes in macadamia; Table S17: Cis-regulatory elements in the promoter regions of RcMADS genes in castor.

Author Contributions

Methodology, J.S.; Data curation, J.S. and Z.Z.; Writing – original draft, Z.W. and A.Y.; Visualization, F.W., M.W. and F.M.; Resources, X.X.; Funding acquisition, A.Y. and A.L.; Writing – review & editing, A.Y. and A.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Yunnan Fundamental Research Projects (202401AT070269), the National Natural Science Foundation of China (NSFC, 32360475), the Forestry Innovation Programs of Southwest Forestry University (LXXK-2023Z02), and the Fund of Yunnan Key Laboratory of Crop Wild Relatives Omics (CWR-2024-05).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.

Acknowledgments

We thank all the individuals who have helped us in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
XTH    Xyloglucan endotransglucosylases/hydrolases
SCW    Secondary Cell Wall
E2F    E2 Factor
ERF3    Ethylene Response Factor 3
C4H    Cinnamate-4-Hydroxylase
4CL    4-Coumarate:CoA Ligase
KD    Caddo
DAP    Days After Pollination
MW    Molecular Weight
pI    Isoelectric point
ML    Maximum Likelihood
kDa    kiloDalton
ARF    auxin response factor
bHLH    basic helix-loop-helix
bZIP    basic leucine zipper
HD    homeodomain transcription factor
CArG    CC(A/T)6GG
aa    amino acids
CW    Cell wall
ABA    Abscisic Acid

References

  1. Cerri, M.; Reale, L. Anatomical Traits of the Principal Fruits: An Overview. Sci. Hortic. 2020, 270, 109390.
  2. Dardick, C.; Callahan, A. Evolution of the Fruit Endocarp: Molecular Mechanisms Underlying Adaptations in Seed Protection and Dispersal Strategies. Front. Plant Sci. 2014, 5, 284.
  3. Khan, M.K.U.; Muhammad, N.; Jia, Z.; Peng, J.; Liu, M. Mechanism of Stone (Hardened Endocarp) Formation in Fruits: An Attempt toward Pitless Fruits, and Its Advantages and Disadvantages. Genes 2022, 13, 2123.
  4. Canton, M.; Drincovich, M.F.; Lara, M.V.; Vizzotto, G.; Walker, R.P.; Famiani, F.; Bonghi, C. Metabolism of Stone Fruits: Reciprocal Contribution Between Primary Metabolism and Cell Wall. Front. Plant Sci. 2020, 11, 1054.
  5. Mao, X.; Zhao, X.; Luo, Z.; He, A.; Yang, M.; Liu, M.; Zhao, J.; Liu, P. Transcriptome-Based Analysis of Lignin Accumulation in the Regulation of Fruit Stone Development and Endocarp Hardening in Chinese Jujube. J. Integr. Agric. 2025, 24, 2217–2228.
  6. Huss, J.C.; Antreich, S.J.; Felhofer, M.; Mayer, K.; Eder, M.; Vieira Dias dos Santos, A.C.; Ramer, G.; Lendl, B.; Gierlinger, N. Hydrolyzable Tannins are Incorporated into the Endocarp during Sclerification of the Water Caltrop Trapa Natans. Plant Physiol. 2023, 194, 94–105.
  7. Wang, Y.; Gui, C.; Wu, J.; Gao, X.; Huang, T.; Cui, F.; Liu, H.; Sethupathy, S. Spatio-Temporal Modification of Lignin Biosynthesis in Plants: A Promising Strategy for Lignocellulose Improvement and Lignin Valorization. Front. Bioeng. Biotechnol. 2022, 10, 917459.
  8. Li, Z.; Fernie, A.R.; Persson, S. Transition of Primary to Secondary Cell Wall Synthesis. Sci. Bull. 2016, 61, 838–846.
  9. Yang, J.H.; Wang, H. Molecular Mechanisms for Vascular Development and Secondary Cell Wall Formation. Front. Plant Sci. 2016, 7, 356.
  10. Růžička, K.; Ursache, R.; Hejátko, J.; Helariutta, Y. Xylem Development–from the Cradle to the Grave. New Phytol. 2015, 207, 519–535.
  11. Bollhöner, B.; Prestele, J.; Tuominen, H. Xylem Cell Death: Emerging Understanding of Regulation and Function. J. Exp. Bot. 2012, 63, 1081–1094.
  12. Déjardin, A.; Laurans, F.; Arnaud, D.; Breton, C.; Pilate, G.; Leplé, J.-C. Wood Formation in Angiosperms. Comptes Rendus Biol. 2010, 333, 325–334.
  13. de Souza, W.R.; Mitchell, R.A.C.; Cesarino, I. Editorial: The Plant Cell Wall: Advances and Current Perspectives. Front. Plant Sci. 2023, 14, 1235749.
  14. Kang, X.; Kirui, A.; Dickwella Widanage, M.C.; Mentink-Vigier, F.; Cosgrove, D.J.; Wang, T. Lignin-Polysaccharide Interactions in Plant Secondary Cell Walls Revealed by Solid-State NMR. Nat. Commun. 2019, 10, 347.
  15. Liu, N.; Liu, Z.; Tian, G.; Zhao, S.; Chu, H.; Hu, Y.; Zhao, Y.; Zhang, Y.; Cheng, K.; Wang, D.; et al. Roles of MADS-box transcription factors in plant responses to abiotic and biotic stresses. Plant Commun. 2026, 7, 101778.
  16. Du, F.; Tan, T. Recent Studies in Mechanical Properties of Selected Hard-Shelled Seeds: A Review. JOM 2021, 73, 1723–1735.
  17. Luo, X.; Li, H.; Wu, Z.; Yao, W.; Zhao, P.; Cao, D.; Yu, H.; Li, K.; Poudel, K.; Zhao, D.; et al. The pomegranate (Punica granatum L.) draft genome dissects genetic divergence between soft- and hard-seeded cultivars. Plant Biotechnol. J. 2020, 18, 955–968.
  18. Alba, R.; Cordonnier-Pratt, M.-M.; Pratt, L.H. Fruit-Localized Phytochromes Regulate Lycopene Accumulation Independently of Ethylene Production in Tomato. Plant Physiol. 2000, 123, 363–370.
  19. Hayama, H.; Shimada, T.; Fujii, H.; Ito, A.; Kashimura, Y. Ethylene-Regulation of Fruit Softening and Softening-Related Genes in Peach. J. Exp. Bot. 2006, 57, 4071–4077.
  20. Lombardo, V.A.; Osorio, S.; Borsani, J.; Lauxmann, M.A.; Bustamante, C.A.; Budde, C.O.; Andreo, C.S.; Lara, M.V.; Fernie, A.R.; Drincovich, M.F. Metabolic profiling during peach fruit development and ripening reveals the metabolic networks that underpin each developmental stage. Plant Physiol. 2011, 157, 1696–1710.
  21. Landucci, L.; Smith, R.A.; Liu, S.; Karlen, S.D.; Ralph, J. Eudicot Nutshells: Cell-Wall Composition and Biofuel Feedstock Potential. Energy Fuels 2020, 34, 16274–16283.
  22. Pesquet, E.; Cesarino, I.; Kajita, S.; Pawlowski, K. Physiological Roles of Lignins–Tuning Cell Wall Hygroscopy and Biomechanics. New Phytol. 2025, 248, 2674–2706.
  23. Nam, J. Antiquity and Evolution of the MADS-Box Gene Family Controlling Flower Development in Plants. Mol. Biol. Evol. 2003, 20, 1435–1447.
  24. Rounsley, S.D.; Ditta, G.S.; Yanofsky, M.F. Diverse Roles for MADS Box Genes in Arabidopsis Development. Plant Cell 1995, 7, 1259–1269.
  25. Favaro, R.; Pinyopich, A.; Battaglia, R.; Kooiker, M.; Borghi, L.; Ditta, G.; Yanofsky, M.F.; Kater, M.M.; Colombo, L. MADS-Box Protein Complexes Control Carpel and Ovule Development in Arabidopsis. Plant Cell 2003, 15, 2603–2611.
  26. Brambilla, V.; Kater, M.; Colombo, L. Ovule Integument Identity Determination in Arabidopsis. Plant Signal. Behav. 2008, 3, 246–247.
  27. Malabarba, J.; Buffon, V.; Mariath, J.E.A.; Gaeta, M.L.; Dornelas, M.C.; Margis-Pinheiro, M.; Pasquali, G.; Revers, L.F. The MADS-Box Gene Agamous-like 11 Is Essential for Seed Morphogenesis in Grapevine. J. Exp. Bot. 2017, 68, 1493–1506.
  28. Singh, R.; Leslie Low, E.-T.; Ooi, L.C.-L.; Ong-Abdullah, M.; Chin, T.N.; Nagappan, J.; Nookiah, R.; Amiruddin, M.D.; Rosli, R.; Abdul Manaf, M.A.; et al. The Oil Palm Shell Gene Controls Oil Yield and Encodes a Homologue of SEEDSTICK. Nature 2013, 500, 340–344.
  29. Ferrándiz, C.; Liljegren, S.J.; Yanofsky, M.F. Negative Regulation of the SHATTERPROOF Genes by FRUITFULL during Arabidopsis Fruit Development. Science 2000, 289, 436–438.
  30. Liljegren, S.J.; Ditta, G.S.; Eshed, Y.; Savidge, B.; Bowman, J.L.; Yanofsky, M.F. SHATTERPROOF MADS-Box Genes Control Seed Dispersal in Arabidopsis. Nature 2000, 404, 766–770.
  31. Pabón-Mora, N.; Wong, G.K.-S.; Ambrose, B.A. Evolution of Fruit Development Genes in Flowering Plants. Front. Plant Sci. 2014, 5, 300.
  32. Dardick, C.D.; Callahan, A.M.; Chiozzotto, R.; Schaffer, R.J.; Piagnani, M.C.; Scorza, R. Stone Formation in Peach Fruit Exhibits Spatial Coordination of the Lignin and Flavonoid Pathways and Similarity to Arabidopsis Dehiscence. BMC Biol. 2010, 8, 13.
  33. Fang, S.; Shang, X.; Yao, Y.; Li, W.; Guo, W. NST- and SND-Subgroup NAC Proteins Coordinately Act to Regulate Secondary Cell Wall Formation in Cotton. Plant Sci. 2020, 301, 110657. [Google Scholar] [CrossRef]
  34. Singh, R.; Ong-Abdullah, M.; Low, E.-T.L.; Manaf, M.A.A.; Rosli, R.; Nookiah, R.; Ooi, L.C.-L.; Ooi, S.; Chan, K.-L.; Halim, M.A.; et al. Oil Palm Genome Sequence Reveals Divergence of Interfertile Species in Old and New Worlds. Nature 2013, 500, 335–339. [Google Scholar] [CrossRef]
  35. Becker, A.; Theißen, G. The Major Clades of MADS-Box Genes and Their Role in the Development and Evolution of Flowering Plants. Mol. Phylogenet. Evol. 2003, 29, 464–489. [Google Scholar] [CrossRef]
  36. Dai, Y.; Wang, Y.; Zeng, L.; Jia, R.; He, L.; Huang, X.; Zhao, H.; Liu, D.; Zhao, H.; Hu, S.; et al. Genomic and Transcriptomic Insights into the Evolution and Divergence of MIKC-Type MADS-Box Genes in Carica papaya. Int. J. Mol. Sci. 2023, 24, 14039. [Google Scholar] [CrossRef]
  37. Li, Y.; Wu, R.; Chen, T.; Qin, D.; An, X. Comprehensive Genome-Wide Characterization of the MIKC-Type MADS-Box Family Members and the Dynamic Expression Profiling throughout the Development of Floral Buds in Populus tomentosa. Ind. Crops Prod. 2024, 222, 119968. [Google Scholar] [CrossRef]
  38. Tahmasebi, S.; Jonoubi, P.; Majdi, M.; Majd, A.; Heidari, P. Genome-Wide Characterization and Expression Profiling of MADS-Box Family Genes during Organ Development and Drought Stress in Camelina sativa L. Sci. Rep. 2025, 15, 9327. [Google Scholar] [CrossRef]
  39. Aerts, N.; de Bruijn, S.; van Mourik, H.; Angenent, G.C.; van Dijk, A.D.J. Comparative Analysis of Binding Patterns of MADS-Domain Proteins in Arabidopsis thaliana. BMC Plant Biol. 2018, 18, 131. [Google Scholar] [CrossRef]
  40. Tripathi, A.; Vishwakarma, K.; Tripathi, S.; Jadaun, J.S.; Nayak, A.K. Utilization of MADS-Box Genes for Agricultural Advancement: Current Insights and Future Prospects. Mol. Biol. Rep. 2025, 53, 20. [Google Scholar] [CrossRef]
  41. Busatto, N.; Herrera, R. Fruit Development and Ripening—A Molecular and Physiological View Modulating and Enhancing Fruit Quality. J. Plant Growth Regul. 2025, 44, 1069–1071. [Google Scholar] [CrossRef]
  42. García-Cruz, K.V.; García-Ponce, B.; Garay-Arroyo, A.; Sanchez, M.D.L.P.; Ugartechea-Chirino, Y.; Desvoyes, B.; Pacheco-Escobedo, M.A.; Tapia-López, R.; Ransom-Rodríguez, I.; Gutierrez, C.; et al. The MADS-Box XAANTAL1 Increases Proliferation at the Arabidopsis Root Stem-Cell Niche and Participates in Transition to Differentiation by Regulating Cell-Cycle Components. Ann. Bot. 2016, 118, 787–796. [Google Scholar] [CrossRef]
  43. Li, C.; Lu, X.; Xu, J.; Liu, Y. Regulation of Fruit Ripening by MADS-Box Transcription Factors. Sci. Hortic. 2023, 314, 111950. [Google Scholar] [CrossRef]
  44. Ng, M.; Yanofsky, M.F. Function and Evolution of the Plant MADS-Box Gene Family. Nat. Rev. Genet. 2001, 2, 186–195. [Google Scholar] [CrossRef] [PubMed]
  45. Zhang, Z.; Zou, W.; Lin, P.; Wang, Z.; Chen, Y.; Yang, X.; Zhao, W.; Zhang, Y.; Wang, D.; Que, Y.; et al. Evolution and Function of MADS-Box Transcription Factors in Plants. Int. J. Mol. Sci. 2024, 25, 13278. [Google Scholar] [CrossRef]
  46. Morel, P.; Chambrier, P.; Boltz, V.; Chamot, S.; Rozier, F.; Rodrigues Bento, S.; Trehin, C.; Monniaux, M.; Zethof, J.; Vandenbussche, M. Divergent Functional Diversification Patterns in the SEP/AGL6/AP1 MADS-Box Transcription Factor Superclade. Plant Cell 2019, 31, 3033–3056. [Google Scholar] [CrossRef]
  47. Ampomah-Dwamena, C.; Morris, B.A.; Sutherland, P.; Veit, B.; Yao, J.-L. Down-Regulation of TM29, a Tomato SEPALLATA Homolog, Causes Parthenocarpic Fruit Development and Floral Reversion. Plant Physiol. 2002, 130, 605–617. [Google Scholar] [CrossRef]
  48. Seymour, G.B.; Ryder, C.D.; Cevik, V.; Hammond, J.P.; Popovich, A.; King, G.J.; Vrebalov, J.; Giovannoni, J.J.; Manning, K. A SEPALLATA Gene Is Involved in the Development and Ripening of Strawberry (Fragaria × ananassa Duch.) Fruit, a Non-Climacteric Tissue. J. Exp. Bot. 2011, 62, 1179–1188. [Google Scholar] [CrossRef]
  49. Vallarino, J.G.; Merchante, C.; Sánchez-Sevilla, J.F.; de Luis Balaguer, M.A.; Pott, D.M.; Ariza, M.T.; Casañal, A.; Posé, D.; Vioque, A.; Amaya, I.; et al. Characterizing the Involvement of FaMADS9 in the Regulation of Strawberry Fruit Receptacle Development. Plant Biotechnol. J. 2020, 18, 929–943. [Google Scholar] [CrossRef]
  50. Bemer, M.; Karlova, R.; Ballester, A.R.; Tikunov, Y.M.; Bovy, A.G.; Wolters-Arts, M.; de Barros Rossetto, P.; Angenent, G.C.; de Maagd, R.A. The Tomato FRUITFULL Homologs TDR4/FUL1 and MBP7/FUL2 Regulate Ethylene-Independent Aspects of Fruit Ripening. Plant Cell 2012, 24, 4437–4451. [Google Scholar] [CrossRef] [PubMed]
  51. Fujisawa, M.; Shima, Y.; Nakagawa, H.; Kitagawa, M.; Kimbara, J.; Nakano, T.; Kasumi, T.; Ito, Y. Transcriptional Regulation of Fruit Ripening by Tomato FRUITFULL Homologs and Associated MADS Box Proteins. Plant Cell 2014, 26, 89–101. [Google Scholar] [CrossRef]
  52. Meng, D.; Liu, X.; Cao, Y.; Cai, Y.; Duan, J. PbMADS49 Regulates Lignification During Stone Cell Development in “Dangshansuli” (Pyrus bretschneideri) Fruit. Plant Cell Environ. 2025, 48, 4161–4177. [Google Scholar] [CrossRef] [PubMed]
  53. Jin, D.; Yang, C.; Liu, X.; Lan, J.; Jiang, T.; Chen, H.; Zhang, J.; Jiang, Y.; Peng, B.; Wang, L. Palea and grain shrunken Encoding OsMADS15 Determines Palea Identity to Affect Rice Grain Yield and Quality. J. Genet. Genom. 2026. [Google Scholar] [CrossRef]
  54. Yin, X.; Liu, X.; Xu, B.; Lu, P.; Dong, T.; Yang, D.; Ye, T.; Feng, Y.-Q.; Wu, Y. OsMADS18, a Membrane-Bound MADS-Box Transcription Factor, Modulates Plant Architecture and the Abscisic Acid Response in Rice. J. Exp. Bot. 2019, 70, 3895–3909. [Google Scholar] [CrossRef]
  55. Mizzotti, C.; Mendes, M.A.; Caporali, E.; Schnittger, A.; Kater, M.M.; Battaglia, R.; Colombo, L. The MADS Box Genes SEEDSTICK and ARABIDOPSIS Bsister Play a Maternal Role in Fertilization and Seed Development. Plant J. 2012, 70, 409–420. [Google Scholar] [CrossRef]
  56. Sun, J.; Zhou, Z.; Meng, F.; Wen, M.; Liu, A.; Yu, A. Characterization Analyses of MADS-Box Genes Highlighting Their Functions with Seed Development in Ricinus communis. Front. Plant Sci. 2025, 16, 1589915. [Google Scholar] [CrossRef] [PubMed]
  57. Gasteiger, E.; Gattiker, A.; Hoogland, C.; Ivanyi, I.; Appel, R.D.; Bairoch, A. ExPASy: The Proteomics Server for in-Depth Protein Knowledge and Analysis. Nucleic Acids Res. 2003, 31, 3784–3788. [Google Scholar] [CrossRef] [PubMed]
  58. Chou, K.-C.; Shen, H.-B. Plant-mPLoc: A Top-Down Strategy to Augment the Power for Predicting Plant Protein Subcellular Localization. PLoS ONE 2010, 5, e11335. [Google Scholar] [CrossRef]
  59. Madeira, F.; Madhusoodanan, N.; Lee, J.; Eusebi, A.; Niewielska, A.; Tivey, A.R.N.; Lopez, R.; Butcher, S. The EMBL-EBI Job Dispatcher Sequence Analysis Tools Framework in 2024. Nucleic Acids Res. 2024, 52, W521–W525. [Google Scholar] [CrossRef] [PubMed]
  60. Nguyen, L.-T.; Schmidt, H.A.; von Haeseler, A.; Minh, B.Q. IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol. Biol. Evol. 2015, 32, 268–274. [Google Scholar] [CrossRef]
  61. Letunic, I.; Bork, P. Interactive Tree of Life (iTOL) v6: Recent Updates to the Phylogenetic Tree Display and Annotation Tool. Nucleic Acids Res. 2024, 52, W78–W82. [Google Scholar] [CrossRef] [PubMed]
  62. Schultz, J.; Copley, R.R.; Doerks, T.; Ponting, C.P.; Bork, P. SMART: A Web-Based Tool for the Study of Genetically Mobile Domains. Nucleic Acids Res. 2000, 28, 231–234. [Google Scholar] [CrossRef]
  63. Lai, X.; Vega-Léon, R.; Hugouvieux, V.; Blanc-Mathieu, R.; van der Wal, F.; Lucas, J.; Silva, C.S.; Jourdain, A.; Muino, J.M.; Nanao, M.H.; et al. The Intervening Domain Is Required for DNA-Binding and Functional Identity of Plant MADS Transcription Factors. Nat. Commun. 2021, 12, 4760. [Google Scholar] [CrossRef] [PubMed]
  64. Henschel, K.; Kofuji, R.; Hasebe, M.; Saedler, H.; Münster, T.; Theißen, G. Two Ancient Classes of MIKC-Type MADS-Box Genes are Present in the Moss Physcomitrella patens. Mol. Biol. Evol. 2002, 19, 801–814. [Google Scholar] [CrossRef]
  65. Gramzow, L.; Theissen, G. A Hitchhiker’s Guide to the MADS World of Plants. Genome Biol. 2010, 11, 214. [Google Scholar] [CrossRef]
  66. Riese, M.; Faigl, W.; Quodt, V.; Verelst, W.; Matthes, A.; Saedler, H.; Münster, T. Isolation and Characterization of New MIKC-Type MADS-Box Genes from the Moss Physcomitrella patens. Plant Biol. 2005, 7, 307–314. [Google Scholar] [CrossRef]
  67. Rahman, M.A.; Balasubramani, S.P.; Basha, S.M. Molecular Characterization and Phylogenetic Analysis of MADS-Box Gene VroAGL11 Associated with Stenospermocarpic Seedlessness in Muscadine Grapes. Genes 2021, 12, 232. [Google Scholar] [CrossRef] [PubMed]
  68. Crooks, G.E.; Hon, G.; Chandonia, J.-M.; Brenner, S.E. WebLogo: A Sequence Logo Generator. Genome Res. 2004, 14, 1188–1190. [Google Scholar] [CrossRef]
  69. Lin, Y.; Qi, X.; Wan, Y.; Chen, Z.; Fang, H.; Liang, C. Genome-Wide Analysis of the MADS-Box Gene Family in Lonicera japonica and a Proposed Floral Organ Identity Model. BMC Genom. 2023, 24, 447. [Google Scholar] [CrossRef]
  70. Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly Accurate Protein Structure Prediction with AlphaFold. Nature 2021, 596, 583–589. [Google Scholar] [CrossRef]
  71. Chen, C.; Chen, H.; Zhang, Y.; Thomas, H.R.; Frank, M.H.; He, Y.; Xia, R. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol. Plant 2020, 13, 1194–1202. [Google Scholar] [CrossRef]
  72. Wang, Y.; Tang, H.; Debarry, J.D.; Tan, X.; Li, J.; Wang, X.; Lee, T.; Jin, H.; Marler, B.; Guo, H.; et al. MCScanX: A Toolkit for Detection and Evolutionary Analysis of Gene Synteny and Collinearity. Nucleic Acids Res. 2012, 40, e49. [Google Scholar] [CrossRef]
  73. Tang, H.; Krishnakumar, V.; Zeng, X.; Xu, Z.; Taranto, A.; Lomas, J.S.; Zhang, Y.; Huang, Y.; Wang, Y.; Yim, W.C.; et al. JCVI: A Versatile Toolkit for Comparative Genomics Analysis. iMeta 2024, 3, e211. [Google Scholar] [CrossRef]
  74. Sun, P.; Jiao, B.; Yang, Y.; Shan, L.; Li, T.; Li, X.; Xi, Z.; Wang, X.; Liu, J. WGDI: A User-Friendly Toolkit for Evolutionary Analyses of Whole-Genome Duplications and Ancestral Karyotypes. Mol. Plant 2022, 15, 1841–1851. [Google Scholar] [CrossRef]
  75. Zhang, Z.; Li, J.; Zhao, X.-Q.; Wang, J.; Wong, G.K.-S.; Yu, J. KaKs_Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging. Genom. Proteom. Bioinform. 2006, 4, 259–263. [Google Scholar] [CrossRef]
  76. Yu, A.; Zou, H.; Li, P.; Yao, X.; Guo, J.; Sun, R.; Wang, G.; Xi, X.; Liu, A. Global Transcriptomic Analyses Provide New Insight into the Molecular Mechanisms of Endocarp Formation and Development in Iron Walnut (Juglans sigillata Dode). Int. J. Mol. Sci. 2023, 24, 6543. [Google Scholar] [CrossRef] [PubMed]
  77. Wen, M.; Zhou, Z.; Sun, J.; Meng, F.; Xi, X.; Liu, A.; Yu, A. A Genome-Wide Characterization of the Xyloglucan Endotransglucosylase/Hydrolase Family Genes and Their Functions in the Shell Formation of Pecan. Horticulturae 2025, 11, 609. [Google Scholar] [CrossRef]
  78. Lin, J.; Zhang, W.; Zhang, X.; Ma, X.; Zhang, S.; Chen, S.; Wang, Y.; Jia, H.; Liao, Z.; Lin, J.; et al. Signatures of Selection in Recently Domesticated Macadamia. Nat. Commun. 2022, 13, 242. [Google Scholar] [CrossRef]
  79. Yu, A.; Wang, Z.; Zhang, Y.; Li, F.; Liu, A. Global Gene Expression of Seed Coat Tissues Reveals a Potential Mechanism of Regulating Seed Size Formation in Castor Bean. Int. J. Mol. Sci. 2019, 20, 1282. [Google Scholar] [CrossRef] [PubMed]
  80. Han, B.; Wu, D.; Zhang, Y.; Li, D.-Z.; Xu, W.; Liu, A. Epigenetic Regulation of Seed-Specific Gene Expression by DNA Methylation Valleys in Castor Bean. BMC Biol. 2022, 20, 57. [Google Scholar] [CrossRef] [PubMed]
  81. Chen, S. Ultrafast One-Pass FASTQ Data Preprocessing, Quality Control, and Deduplication Using Fastp. iMeta 2023, 2, e107. [Google Scholar] [CrossRef]
  82. Kim, D.; Paggi, J.M.; Park, C.; Bennett, C.; Salzberg, S.L. Graph-Based Genome Alignment and Genotyping with HISAT2 and HISAT-Genotype. Nat. Biotechnol. 2019, 37, 907–915. [Google Scholar] [CrossRef] [PubMed]
  83. Liao, Y.; Smyth, G.K.; Shi, W. featureCounts: An Efficient General Purpose Program for Assigning Sequence Reads to Genomic Features. Bioinformatics 2014, 30, 923–930. [Google Scholar] [CrossRef]
  84. Hauser, M.; Steinegger, M.; Söding, J. MMseqs software suite for fast and deep clustering and searching of large protein sequence sets. Bioinformatics 2016, 32, 1323–1330. [Google Scholar] [CrossRef]
  85. Kanehisa, M.; Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28, 27–30. [Google Scholar] [CrossRef]
  86. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003, 13, 2498–2504. [Google Scholar] [CrossRef]
  87. Muley, V.Y. Prediction and Analysis of Transcription Factor Binding Sites: Practical Examples and Case Studies Using R Programming. In Reverse Engineering of Regulatory Networks; Mandal, S., Ed.; Springer: New York, NY, USA, 2024; pp. 199–225. ISBN 978-1-0716-3461-5. [Google Scholar]
  88. Wang, J.; Chitsaz, F.; Derbyshire, M.K.; Gonzales, N.R.; Gwadz, M.; Lu, S.; Marchler, G.H.; Song, J.S.; Thanki, N.; Yamashita, R.A.; et al. The Conserved Domain Database in 2023. Nucleic Acids Res. 2023, 51, D384–D388. [Google Scholar] [CrossRef]
  89. Qiu, Y.; Li, Z.; Walther, D.; Köhler, C. Updated Phylogeny and Protein Structure Predictions Revise the Hypothesis on the Origin of MADS-Box Transcription Factors in Land Plants. Mol. Biol. Evol. 2023, 40, msad194. [Google Scholar] [CrossRef]
  90. Kwantes, M.; Liebsch, D.; Verelst, W. How MIKC* MADS-Box Genes Originated and Evidence for Their Conserved Function Throughout the Evolution of Vascular Plant Gametophytes. Mol. Biol. Evol. 2012, 29, 293–302. [Google Scholar] [CrossRef] [PubMed]
  91. Espinosa-Soto, C.; Immink, R.G.; Angenent, G.C.; Alvarez-Buylla, E.R.; de Folter, S. Tetramer Formation in Arabidopsis MADS Domain Proteins: Analysis of a Protein-Protein Interaction Network. BMC Syst. Biol. 2014, 8, 9. [Google Scholar] [CrossRef] [PubMed]
  92. Fatima, M.; Ma, X.; Zhang, J.; Ming, R. Genome-Wide Analysis of MADS-Box Genes and Their Expression Patterns in Unisexual Flower Development in Dioecious Spinach. Sci. Rep. 2024, 14, 18635. [Google Scholar] [CrossRef]
  93. Xu, W.; Wu, D.; Yang, T.; Sun, C.; Wang, Z.; Han, B.; Wu, S.; Yu, A.; Chapman, M.A.; Muraguri, S.; et al. Genomic Insights into the Origin, Domestication and Genetic Basis of Agronomic Traits of Castor Bean. Genome Biol. 2021, 22, 113. [Google Scholar] [CrossRef]
  94. Liu, H.; Li, J.; Gong, P.; He, C. The Origin and Evolution of Carpels and Fruits from an Evo-Devo Perspective. J. Integr. Plant Biol. 2023, 65, 283–298. [Google Scholar] [CrossRef]
  95. Ehlers, K.; Bhide, A.S.; Tekleyohans, D.G.; Wittkop, B.; Snowdon, R.J.; Becker, A. The MADS Box Genes ABS, SHP1, and SHP2 Are Essential for the Coordination of Cell Divisions in Ovule and Seed Coat Development and for Endosperm Formation in Arabidopsis thaliana. PLoS ONE 2016, 11, e0165075. [Google Scholar] [CrossRef]
  96. Jack, T. New Members of the Floral Organ Identity AGAMOUS Pathway. Trends Plant Sci. 2002, 7, 286–287. [Google Scholar] [CrossRef]
  97. Liu, Z.; Franks, R.G. Molecular Basis of Fruit Development. Front. Plant Sci. 2015, 6, 28. [Google Scholar] [CrossRef]
  98. Losa, A.; Colombo, M.; Brambilla, V.; Colombo, L. Genetic Interaction between AINTEGUMENTA (ANT) and the Ovule Identity Genes SEEDSTICK (STK), SHATTERPROOF1 (SHP1) and SHATTERPROOF2 (SHP2). Sex. Plant Reprod. 2010, 23, 115–121. [Google Scholar] [CrossRef]
  99. Nakano, Y.; Yamaguchi, M.; Endo, H.; Rejab, N.A.; Ohtani, M. NAC-MYB-Based Transcriptional Regulation of Secondary Cell Wall Biosynthesis in Land Plants. Front. Plant Sci. 2015, 6, 288. [Google Scholar] [CrossRef]
  100. Zhong, R.; Lee, C.; Ye, Z.-H. Functional Characterization of Poplar Wood-Associated NAC Domain Transcription Factors. Plant Physiol. 2010, 152, 1044–1055. [Google Scholar] [CrossRef]
  101. Zhong, R.; Ye, Z.-H. Complexity of the Transcriptional Network Controlling Secondary Wall Biosynthesis. Plant Sci. 2014, 229, 193–207. [Google Scholar] [CrossRef]
  102. Nichol, J.B.; Samuel, M.A. Characterizing the Role of Endocarp a and b Cells Layers during Pod (Silique) Development in Brassicaceae. Plant Signal. Behav. 2024, 19, 2384243. [Google Scholar] [CrossRef] [PubMed]
  103. Sánchez Piñero, M.; Martín-Palomo, M.J.; Moriana, A.; Corell, M.; Pérez López, D. Endocarp Development Study in Full Irrigated Olive Orchards and Impact on Fruit Features at Harvest. Plants 2022, 11, 3541. [Google Scholar] [CrossRef]
  104. Beeckman, T.; De Rycke, R.; Viane, R.; Inzé, D. Histological Study of Seed Coat Development in Arabidopsis thaliana. J. Plant Res. 2000, 113, 139–148. [Google Scholar] [CrossRef]
  105. Meade, L.E.; Plackett, A.R.G.; Hilton, J. Reconstructing Development of the Earliest Seed Integuments Raises a New Hypothesis for the Evolution of Ancestral Seed-Bearing Structures. New Phytol. 2021, 229, 1782–1794. [Google Scholar] [CrossRef]
  106. Cruz, N.; Méndez, T.; Ramos, P.; Urbina, D.; Vega, A.; Gutiérrez, R.A.; Moya-León, M.A.; Herrera, R. Induction of PrMADS10 on the Lower Side of Bent Pine Tree Stems: Potential Role in Modifying Plant Cell Wall Properties and Wood Anatomy. Sci. Rep. 2019, 9, 18981. [Google Scholar] [CrossRef]
  107. Ge, H.; Shi, Y.; Zhang, M.; Li, X.; Yin, X.; Chen, K. The MADS-Box Transcription Factor EjAGL65 Controls Loquat Flesh Lignification via Direct Transcriptional Inhibition of EjMYB8. Front. Plant Sci. 2021, 12, 652959. [Google Scholar] [CrossRef]
  108. Xue, Y.; Chen, S.; Hao, Y.; Shan, M.; Zheng, P.; Wang, R.; Zhang, M.; Wu, J.; Xue, C. The PbrMADS1–PbrMYB169 Complex Has Uniquely Emerged to Regulate Lignification of Stone Cells in Pear. J. Integr. Plant Biol. 2026, 68, 239–256. [Google Scholar] [CrossRef] [PubMed]
  109. Zhang, Y.; Cao, G.; Qu, L.-J.; Gu, H. Characterization of Arabidopsis MYB Transcription Factor Gene AtMYB17 and Its Possible Regulation by LEAFY and AGL15. J. Genet. Genom. 2009, 36, 99–107. [Google Scholar] [CrossRef]
  110. Khanday, I.; Yadav, S.R.; Vijayraghavan, U. Rice LHS1/OsMADS1 Controls Floret Meristem Specification by Coordinated Regulation of Transcription Factors and Hormone Signaling Pathways. Plant Physiol. 2013, 161, 1970–1983. [Google Scholar] [CrossRef]
  111. Li, G.; Kuijer, H.N.J.; Yang, X.; Liu, H.; Shen, C.; Shi, J.; Betts, N.; Tucker, M.R.; Liang, W.; Waugh, R.; et al. MADS1 Maintains Barley Spike Morphology at High Ambient Temperatures. Nat. Plants 2021, 7, 1093–1107. [Google Scholar] [CrossRef] [PubMed]
  112. Zhang, J.; Zhang, Z.; Zhang, R.; Yang, C.; Zhang, X.; Chang, S.; Chen, Q.; Rossi, V.; Zhao, L.; Xiao, J.; et al. Type I MADS-Box Transcription Factor TaMADS-GS Regulates Grain Size by Stabilizing Cytokinin Signalling during Endosperm Cellularization in Wheat. Plant Biotechnol. J. 2024, 22, 200–215. [Google Scholar] [CrossRef]
Figure 1. Conserved domain architecture of MADS-box proteins across iron walnut, pecan, macadamia, and castor. Sequence logos illustrate the amino acid conservation patterns of distinct domains in MADS-box proteins. For Type I MADS-box proteins, the M, I-like, and C domains are shown, whereas for Type II proteins, the M, I, K, and C domains are presented. The M domains exhibit a high degree of conservation across both types, while the I/I-like and K domains show moderate conservation, and the C-terminal regions display substantial variability. The height of each letter corresponds to the relative frequency of that amino acid at a given position, while the total height of each stack represents the degree of sequence conservation measured in bits.
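The letter and stack heights described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes the standard information-content definition for a 20-letter amino acid alphabet (stack height = log2(20) minus the Shannon entropy of the column) and omits the small-sample correction that logo generators typically apply. The function name `logo_column_heights` and the example columns are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def logo_column_heights(column, alphabet_size=20):
    """Return (stack_height_bits, {residue: letter_height_bits}) for one
    alignment column. Gap characters ('-') are ignored. No small-sample
    correction is applied (a simplifying assumption)."""
    residues = [c for c in column if c != "-"]
    n = len(residues)
    if n == 0:
        return 0.0, {}
    counts = Counter(residues)
    freqs = {aa: k / n for aa, k in counts.items()}
    # Shannon entropy of the observed residue frequencies, in bits
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    # Stack height: maximum possible entropy minus observed entropy
    stack_bits = math.log2(alphabet_size) - entropy
    # Each letter takes a share of the stack proportional to its frequency
    heights = {aa: f * stack_bits for aa, f in freqs.items()}
    return stack_bits, heights


# Hypothetical columns: one fully conserved, one split between two residues
conserved_bits, conserved_heights = logo_column_heights("RRRR")
mixed_bits, mixed_heights = logo_column_heights("RKRK")
```

Under these assumptions a fully conserved column reaches the maximum of log2(20) bits, and a column split evenly between two residues loses exactly one bit.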
Figure 2. Phylogenetic relationships, conserved domains, and gene structures of MADS-box genes in iron walnut (JsiMADSs). The left panel shows a maximum likelihood phylogenetic tree classifying JsiMADS proteins into Type I (Mα, Mβ, Mγ) and Type II (MIKC* and MIKCc) subfamilies. The middle panel illustrates the conserved domain architecture of each protein, with the M, I, I-like, K, and C domains represented in blue, pink, orange, yellow, and beige, respectively. The right panel displays the exon–intron organization of each gene, where exons are indicated by cyan boxes and introns by black lines.
Figure 3. Phylogenetic relationships of MADS-box genes in iron walnut and Arabidopsis. A circular maximum likelihood phylogenetic tree constructed using full-length MADS-box protein sequences from Juglans sigillata and Arabidopsis thaliana. The proteins are classified into major subfamilies, including Type I groups (Mα, Mβ, Mγ) and Type II groups (MIKC* and MIKCc), as well as further MIKCc subclades such as AG, SEP, AP1/FUL, SOC1, SVP, and AGL6. Distinct colors are used to denote different subfamilies.
Figure 4. Collinearity relationships of MADS-box genes among four species. Syntenic relationships of MADS-box genes were identified among Juglans sigillata (Jsi), Carya illinoinensis (Cil), Macadamia integrifolia (Mi), and Ricinus communis (Rc). Chromosomes are represented by colored bars, and gray lines indicate all syntenic blocks identified among the four genomes. Colored connecting lines highlight collinear relationships involving MADS-box genes between species pairs, illustrating the conservation and evolutionary divergence of MADS-box gene loci across these species.
Figure 5. Expression profiles of expressed genes during endocarp/seed coat development. Expression patterns of expressed genes across different developmental stages in four species: (A) iron walnut (stages P1–P3), (B) pecan (stages KD1–KD3), (C) macadamia (stages shell1–shell5), and (D) castor (seed coat, stages SOC1–SOC5). Genes were grouped into major clusters in each species based on their temporal expression patterns, corresponding to early, middle, and late developmental stages, denoted as A1–A3, B1–B3, C1–C4, and D1–D4, respectively. Heatmaps represent gene expression levels after Z-score normalization across developmental stages, where red indicates relatively high expression and blue indicates relatively low expression.
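The per-gene Z-score normalization mentioned in the Figure 5 caption can be sketched as follows. This is an illustration only: the caption does not say which software or which standard deviation (population or sample) was used, so the population form is assumed here. The gene identifiers and expression values are invented placeholders, not data from the study.

```python
from statistics import mean, pstdev


def zscore_rows(expression):
    """Z-score each gene's expression across developmental stages.

    expression: dict mapping a gene ID to a list of expression values,
    one value per stage. Returns a dict of the same shape. Genes with no
    variation across stages are set to 0.0 at every stage.
    """
    scaled = {}
    for gene, values in expression.items():
        mu = mean(values)
        sd = pstdev(values)  # population SD (assumption, see lead-in)
        if sd == 0:
            scaled[gene] = [0.0] * len(values)
        else:
            scaled[gene] = [(v - mu) / sd for v in values]
    return scaled


# Hypothetical expression values for three stages (e.g. P1, P2, P3)
example = {
    "geneA": [1.0, 2.0, 3.0],  # rises through development
    "geneB": [5.0, 5.0, 5.0],  # flat, so no Z-score can be computed
}
scaled = zscore_rows(example)
```

After this transformation each gene has mean 0 across stages, so the heatmap colors show when a gene is relatively high or low rather than its absolute abundance.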
Figure 6. Expression profiles of expressed MADS-box genes during endocarp/seed coat development. Expression patterns of expressed MADS-box genes across different developmental stages in four species: (A) iron walnut (stages P1–P3), (B) pecan (stages KD1–KD3), (C) macadamia (stages shell1–shell5), and (D) castor (seed coat, stages SOC1–SOC5). Expressed MADS-box genes were grouped into major clusters in each species based on their temporal expression patterns, corresponding to early, middle, and late developmental stages, denoted as A1–A3, B1–B3, C1–C4, and D1–D4, respectively. Heatmaps represent gene expression levels after Z-score normalization across developmental stages, where red indicates relatively high expression and blue indicates relatively low expression.
Figure 7. Cis-regulatory element distribution in promoter regions of JsiMADS genes. Distribution of cis-regulatory elements within the 2 kb upstream promoter regions of JsiMADS genes in iron walnut. Each horizontal line represents an individual gene, and colored boxes indicate the positions of predicted cis-regulatory elements along the promoter sequence. Different colors correspond to distinct categories of regulatory elements.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Sun, J.; Zhou, Z.; Wang, Z.; Wei, F.; Meng, F.; Wen, M.; Xi, X.; Liu, A.; Yu, A. Comparative Genomics and Co-Expression Profiling of MADS-Box Genes Reveal Conserved Candidate Regulators of Secondary Cell Wall Formation in Lignified Endocarp and Seed Coat Across Four Angiosperm Species. Horticulturae 2026, 12, 626. https://doi.org/10.3390/horticulturae12050626
