Article

Meta-Analysis of Wild Relatives and Domesticated Species of Rice, Tomato, and Soybean Using Publicly Available Transcriptome Data

by Makoto Yumiya 1 and Hidemasa Bono 1,2,*
1 Graduate School of Integrated Sciences for Life, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima 739-0046, Japan
2 Genome Editing Innovation Center, Hiroshima University, 3-10-23 Kagamiyama, Higashi-Hiroshima 739-0046, Japan
* Author to whom correspondence should be addressed.
Life 2025, 15(7), 1088; https://doi.org/10.3390/life15071088
Submission received: 3 June 2025 / Revised: 5 July 2025 / Accepted: 9 July 2025 / Published: 11 July 2025
(This article belongs to the Special Issue Recent Advances in Crop Genetics and Breeding)

Abstract

The domesticated species currently available in the market have been developed through the breeding of wild relatives. Breeding strategies using wild relatives with high genetic diversity are attracting attention as an important approach for addressing climate change and ensuring sustainable food supply. However, studies examining gene expression variation in multiple wild and domesticated species are limited. Therefore, we aimed to investigate the changes in gene expression associated with domestication. We performed a meta-analysis of public gene expression data of domesticated species of rice, tomato, and soybean and their presumed ancestral species using 21 pairs for rice, 36 pairs for tomato, and 56 pairs for soybean. In wild relatives, the expression of genes involved in osmotic, drought, and wound stress tolerance was upregulated, with 18 genes included in the top 5% of DW scores. In domesticated species, upregulated expression was observed in genes related to auxin and those involved in the efflux of heavy metals and harmful substances, with 36 genes included in the top 5% of DW scores. These findings provide insights into how domestication influences changes in crop traits and may contribute to rapid breeding and the development of new varieties capable of growing in harsh natural environments. They are also relevant to “de novo domestication”, a proposed approach that combines the genetic diversity of currently unused wild relatives with genome editing technologies that enable rapid breeding.

1. Introduction

Owing to global warming, extreme weather events, and changes in the Earth’s environment, the yields of major global crops such as maize, rice, and soybeans could decrease by 12–20% by the end of this century [1,2,3,4]. Additionally, as the global population increases, food demand is expected to rise significantly; by 2050, global food demand is projected to be approximately 1.5 times higher than that in 2010 [5]. To address these issues, the development of crop varieties with desirable traits in a short period is highly anticipated. In recent years, genome editing technologies, particularly CRISPR/Cas9, have garnered significant attention. Genome editing is a technology that enables the targeting of specific genes or nucleotide sequences to induce loss-of-function or introduce genes derived from other organisms. Using this technology, new varieties of various organisms, such as tomatoes with high gamma-aminobutyric acid content and red sea bream (Madai) with increased edible parts, have been developed [6,7]. Additionally, “de novo domestication” efforts utilizing wild relatives, which are genetic resources that have been underutilized until now, are also gaining attention for achieving sustainable crop production. Cases of “de novo domestication” of wild tomatoes and wild rice have been reported in the literature [8,9,10]. This interest arises because the currently distributed domesticated species have been selectively bred with a particular focus on yield, resulting in concerns about their low genetic diversity [11,12,13,14,15]. To implement such methods effectively, obtaining insights into the differences in traits between domesticated species and their wild counterparts is essential. However, studies examining gene expression variation in multiple wild and domesticated species are limited.
Meta-analysis is a valuable method that integrates multiple research findings to provide new insights. Meta-analyses utilizing public gene expression data have been performed and reported in the literature. For example, reports include a meta-analysis of soybean transcriptome data under heat, water, and drought stress [16], as well as an application that enables the identification and visualization of stress-responsive genes in Arabidopsis thaliana by applying analytical indices comparable to those used in the present study [17]. Additionally, the number of gene expression datasets registered in public databases is expected to increase in the future [18]. Therefore, the reliability of the analysis results is expected to increase as the number of datasets available for the meta-analysis increases. Given this background, we performed a meta-analysis using gene expression data from wild and domesticated species of rice, tomato, and soybean registered by multiple research groups in public databases to identify the gene groups specifically expressed in wild relatives and crops.
This study aimed to investigate the changes in gene expression associated with domestication and provide insights for developing new crop improvement strategies. Although the species analyzed in this study were not closely related and the number of gene expression datasets used was limited, this analysis provides valuable insights by focusing on changes in gene expression between wild and domesticated species during domestication.

2. Materials and Methods

2.1. Curation of Public Gene Expression Data

RNA sequencing (RNA-seq) data were obtained from public databases, primarily the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) [19]. To supplement the datasets not available in NCBI GEO, additional RNA-seq data were collected from published studies available online. In NCBI GEO, we performed searches using the scientific names of wild relatives, specifically “Oryza rufipogon” and “Glycine soja”. For tomato, two wild relatives, “Solanum pennellii” and “Solanum arcanum”, were utilized. The domesticated species paired with wild relatives and used in this analysis were “Oryza sativa japonica” for rice, “Solanum lycopersicum” for tomato, and “Glycine max” for soybean. To further refine the search results, a filter for “expression profiling by high-throughput sequencing” was applied. RNA-seq data were searched using the methods described above, and datasets containing paired wild and domesticated species from the same project were curated and used for subsequent analyses. The rationale for using data from the same project was to standardize the cultivation environments and experimental conditions as much as possible, thereby reducing batch effects.

2.2. Gene Expression Quantification

Each RNA-seq dataset was obtained using the prefetch (version 3.0.10) and fasterq-dump (version 3.0.10) commands from the SRA Toolkit [20]. Quality control of the raw reads and removal of adapter sequences were performed using fastp (version 0.23.4) [21]. Subsequently, the output files from the quality check were aggregated into a single file using MultiQC (version 1.18) [22]. Transcripts were quantified using Salmon (version 1.10.1) [23]. For transcript quantification, reference cDNAs obtained from Ensembl Plants were used: Oryza sativa japonica and Oryza rufipogon with IRGSP-1.0; Solanum pennellii, Solanum arcanum, and Solanum lycopersicum with SL3.0; and Glycine soja and Glycine max with Glycine_max_v2.1. The quantified RNA-seq data were expressed as transcripts per million (TPM). Subsequently, transcript-level TPM values were summarized at the gene level using the tximport (version 1.28.0) [24] package in R.
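As a minimal sketch of the gene-level summarization step, the snippet below sums transcript-level TPM values for each gene, which is a common way to derive gene-level abundance from transcript abundance. It is written in Python for illustration only and is not the tximport code used in this study; the function name, transcript-to-gene map, and TPM values are hypothetical.

```python
from collections import defaultdict


def summarize_to_gene(transcript_tpm, tx2gene):
    """Sum transcript-level TPM values for each gene.

    transcript_tpm: dict mapping transcript ID -> TPM
    tx2gene: dict mapping transcript ID -> gene ID
    Transcripts that are missing from tx2gene are skipped.
    """
    gene_tpm = defaultdict(float)
    for tx_id, tpm in transcript_tpm.items():
        gene_id = tx2gene.get(tx_id)
        if gene_id is None:
            continue  # no gene assignment for this transcript
        gene_tpm[gene_id] += tpm
    return dict(gene_tpm)


# Hypothetical transcript IDs, gene IDs, and TPM values
tx2gene = {"tx1.1": "geneA", "tx1.2": "geneA", "tx2.1": "geneB"}
transcript_tpm = {"tx1.1": 3.0, "tx1.2": 1.5, "tx2.1": 10.0, "tx9.1": 2.0}
gene_tpm = summarize_to_gene(transcript_tpm, tx2gene)
```

In this toy example, the two transcripts of geneA are combined into a single gene-level value, and the unmapped transcript is ignored.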

2.3. Calculation of the DW Ratio

Gene expression data were normalized to the DW ratio. ‘D’ and ‘W’ represent ‘domesticated’ and ‘wild’, respectively. The DW ratio was calculated using the following equation:
DW Ratio = (Domesticated TPM + 1)/(Wild TPM + 1)
When calculating the DW ratio, to avoid division by zero and prevent errors caused by genes with zero expression, 1 was added to the TPM values of both domesticated species and wild relatives.
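The DW ratio described above can be sketched in a few lines of Python. This is an illustrative example rather than the code used in this study; the function name and the `pseudocount` parameter are assumptions introduced here, with the pseudocount set to 1 as stated in the text.

```python
def dw_ratio(domesticated_tpm, wild_tpm, pseudocount=1.0):
    """Return the ratio of domesticated to wild expression.

    The pseudocount (1 in the text) is added to both TPM values so that
    genes with zero expression do not cause division by zero.
    """
    return (domesticated_tpm + pseudocount) / (wild_tpm + pseudocount)


# Hypothetical TPM values for one gene in one wild/domesticated pair
higher_in_domesticated = dw_ratio(9.0, 4.0)   # (9 + 1) / (4 + 1)
not_expressed = dw_ratio(0.0, 0.0)            # (0 + 1) / (0 + 1)
higher_in_wild = dw_ratio(0.0, 3.0)           # (0 + 1) / (3 + 1)
```

A gene that is not expressed in either sample has a ratio of 1, so it is treated as unchanged rather than producing an error.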

2.4. Classification of Differentially Expressed Genes (DEGs) Based on the DW Ratio

To evaluate genes showing expression changes between domesticated species and their wild relatives, all genes were classified into three groups. Specifically, the three groups were upregulated, unchanged, and downregulated. These groupings were determined according to preestablished thresholds. Genes were classified as upregulated if their DW ratio exceeded an upper threshold, downregulated if their DW ratio fell below a lower threshold, and unchanged if they did not meet either of these criteria. For the upregulated category, 20 thresholds ranging from 1.5-fold to 200-fold were tested, and a 2-fold threshold was adopted. For the downregulated category, 20 thresholds ranging from 1/1.5 to 1/200 were tested, and a threshold of 1/2 was adopted [25].
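The three-group classification can be illustrated with the following Python sketch. It assumes the adopted thresholds of 2 and 1/2 and treats "exceeded" and "fell below" as strict comparisons, which is an interpretation of the text rather than a confirmed detail of the original code; the function and label names are hypothetical.

```python
def classify_dw_ratio(ratio, upper=2.0, lower=0.5):
    """Classify one DW ratio as upregulated, downregulated, or unchanged.

    upper and lower default to the thresholds adopted in the text.
    Values exactly equal to a threshold are treated as unchanged
    (an assumption made for this sketch).
    """
    if ratio > upper:
        return "upregulated"
    if ratio < lower:
        return "downregulated"
    return "unchanged"


# Hypothetical DW ratios for one gene across four pairs
ratios = [3.0, 2.0, 0.25, 1.1]
labels = [classify_dw_ratio(r) for r in ratios]
```

Other thresholds, such as those in the tested range of 1.5-fold to 200-fold, can be explored by passing different `upper` and `lower` values.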

2.5. Calculation of the DW Score

To evaluate DEGs by integrating different experiments based on the DW ratio, a DW score was calculated. The DW score was calculated by subtracting the number of pairs classified as upregulated from the number of pairs classified as downregulated. A pair refers to a set comprising one wild relative and one domesticated species from the same project.
DW Score = Downregulated pair count − Upregulated pair count
The DW ratio and DW score were calculated using code from a previous study [26].
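For illustration, the DW score for a single gene can be sketched in Python as follows. The snippet follows the sign convention stated in the equation above and does not reproduce the code from the previous study [26]; the function name and the per-pair labels are hypothetical.

```python
def dw_score(labels):
    """Compute a DW score from per-pair classification labels.

    labels: iterable of "upregulated", "downregulated", or "unchanged",
    one label per wild/domesticated pair for a single gene.
    Follows the equation in the text:
    downregulated pair count minus upregulated pair count.
    """
    labels = list(labels)
    upregulated = sum(1 for label in labels if label == "upregulated")
    downregulated = sum(1 for label in labels if label == "downregulated")
    return downregulated - upregulated


# Hypothetical labels for one gene across five pairs
example_labels = ["downregulated", "downregulated", "unchanged",
                  "upregulated", "downregulated"]
score = dw_score(example_labels)  # 3 downregulated - 1 upregulated
```

Pairs classified as unchanged do not affect the score, so a gene with consistent changes across many pairs receives a score of large magnitude.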

2.6. Gene Set Enrichment Analysis

Gene set enrichment analysis was performed using ShinyGO 0.81 [27] on the top- and bottom-ranking genes based on their DW scores. For the rice analysis, “Oryza sativa japonica Group gene IRGSP-1.0” was selected as the species, and “Gene Ontology (GO) Biological Process” was chosen as the pathway database. Default settings were used for all the other parameters. No enriched terms were observed when using species-specific annotations for tomato and soybean. Therefore, the gene IDs for both species were converted to Arabidopsis thaliana gene IDs, and enrichment analysis was performed using “Arabidopsis thaliana genes (TAIR10)”.

2.7. Commonly Upregulated Genes in Wild and Domesticated Species

To perform cross-species analysis, gene IDs from each species were converted to their corresponding Arabidopsis thaliana gene IDs (TAIR10). For this process, we used Ensembl Plants BioMart [28] to create a correspondence table linking the gene IDs of rice, tomato, and soybean with the gene IDs of Arabidopsis thaliana (TAIR10). Using the DW score for each species, we performed comparisons focusing on three ranges: the top 1%, 3%, and 5% of the upregulated and downregulated genes.
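The cross-species comparison can be sketched as three steps: select the top fraction of genes by DW score in each species, map them to Arabidopsis thaliana gene IDs, and intersect the resulting sets. The Python example below is a simplified illustration under stated assumptions: all function names, gene IDs, scores, and mapping tables are hypothetical, and ties in score are broken by input order.

```python
import math


def top_fraction(scores, fraction, highest=True):
    """Return the gene IDs in the top `fraction` of genes ranked by score.

    scores: dict mapping gene ID -> DW score
    highest: rank from the highest score if True, from the lowest if False
    """
    n = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=highest)
    return set(ranked[:n])


def to_arabidopsis(gene_ids, ortholog_table):
    """Map species gene IDs to Arabidopsis IDs; unmapped genes are dropped."""
    mapped = set()
    for gene_id in gene_ids:
        mapped.update(ortholog_table.get(gene_id, ()))
    return mapped


# Hypothetical DW scores and correspondence tables for three species
rice_scores = {"Os1": 5, "Os2": 1, "Os3": -2, "Os4": 0}
tomato_scores = {"Sl1": 4, "Sl2": 3, "Sl3": 0, "Sl4": -1}
soybean_scores = {"Gm1": 6, "Gm2": 2, "Gm3": 1, "Gm4": -3}
rice_map = {"Os1": ["AT1G00001"], "Os2": ["AT2G00002"]}
tomato_map = {"Sl1": ["AT1G00001"], "Sl2": ["AT3G00003"]}
soybean_map = {"Gm1": ["AT1G00001", "AT4G00004"]}

# A fraction of 0.5 is used only because the toy data are tiny;
# the text uses 0.01, 0.03, and 0.05.
per_species = [
    to_arabidopsis(top_fraction(rice_scores, 0.5), rice_map),
    to_arabidopsis(top_fraction(tomato_scores, 0.5), tomato_map),
    to_arabidopsis(top_fraction(soybean_scores, 0.5), soybean_map),
]
common = set.intersection(*per_species)
```

Genes without an Arabidopsis counterpart in the correspondence table drop out at the mapping step, so they cannot appear in the common set.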

3. Results

3.1. Overview of the Study

In this study, a meta-analysis was performed to identify DEGs using domesticated species and their ancestral wild relatives. Figure 1 presents an overview of the study.

3.2. Curation of RNA-Seq Data from Public Databases and the Literature

The RNA-seq data of wild and domesticated species used in this study were primarily collected from NCBI GEO [19]. Compared to domesticated species, data on wild relatives were substantially limited. Therefore, we decided to use three species—rice, tomato, and soybean—for the analysis because data from multiple projects were available for these species. The number of samples and their tissue types used in this analysis are shown in Figure 2.
To collect RNA-seq data, keyword searches were performed using the names of three wild relatives: Oryza rufipogon, Solanum pennellii/Solanum arcanum, and Glycine soja. RNA-seq data were obtained from five BioProjects in the NCBI GEO and used for analysis. However, because the number of available datasets from NCBI GEO was very limited, RNA-seq data were also collected from ArrayExpress in the European Bioinformatics Institute BioStudies [29] and the relevant literature. Ultimately, RNA-seq data from four BioProjects not registered in the NCBI GEO were obtained from the Sequence Read Archive [30], resulting in a total of nine BioProjects used for analysis. The metadata for the samples used in this analysis are provided in Supplementary Tables S1–S3, which correspond to Oryza rufipogon and Oryza sativa japonica (Table S1); Solanum pennellii, Solanum arcanum, and Solanum lycopersicum (Table S2); and Glycine soja and Glycine max (Table S3), respectively.

3.3. Classification of DEGs and Enrichment Analysis in Oryza rufipogon and Oryza sativa japonica

The expression quantification data for each species are provided in Supplementary Tables S4–S6, which correspond to Oryza rufipogon and Oryza sativa japonica (Table S4); Solanum pennellii, Solanum arcanum, and Solanum lycopersicum (Table S5); and Glycine soja and Glycine max (Table S6), respectively. Subsequently, both the expression ratio (DW ratio) and DW score of the wild and domesticated species were calculated. Based on the DW score ranking, lists of DEGs were obtained for both the wild and domesticated species. Thresholds of 2-fold for upregulation and 1/2-fold for downregulation were used as the criteria for identifying differentially expressed genes, as described in Section 2.4. Lists of genes and their DW scores obtained using these thresholds are provided in Supplementary Tables S7–S9, which correspond to Oryza rufipogon and Oryza sativa japonica (Table S7); Solanum pennellii, Solanum arcanum, and Solanum lycopersicum (Table S8); and Glycine soja and Glycine max (Table S9), respectively.
To investigate functional biases, we performed enrichment analysis on DEGs from pairs of Oryza rufipogon and Oryza sativa japonica, utilizing approximately the top 3% ranked using DW scores (Figure 3a). For rice, an enrichment analysis was initially performed using approximately the top 1% of the DEGs based on DW scores; however, no enriched terms were identified. Therefore, for the rice analysis only, the scope was expanded to include DEGs in approximately the top 3% based on DW scores. In the enrichment analysis targeting genes with upregulated expression in Oryza rufipogon, GO terms related to environmental responses, such as “cellular response to cold” and “response to wounding”, were found to be enriched (Figure 3b). In the results for Oryza sativa japonica, the GO terms related to photosynthesis and photosystems were enriched (Figure 3c). The genes included in the GO terms identified in Oryza rufipogon and Oryza sativa japonica are listed in Supplementary Tables S10 and S11, corresponding to Oryza rufipogon (Table S10) and Oryza sativa japonica (Table S11), respectively.

3.4. Classification of DEGs and Enrichment Analysis of Solanum pennellii, Solanum arcanum, and Solanum lycopersicum

For the tomato enrichment analysis, as described in Section 2.6, tomato gene IDs were converted to Arabidopsis thaliana gene IDs before the analysis was performed.
Enrichment analysis was performed on DEGs from pairs of Solanum pennellii, Solanum arcanum, and Solanum lycopersicum, utilizing approximately the top 1% ranked using DW scores (Figure 4a). In the enrichment analysis of gene groups with upregulated expression in Solanum pennellii and Solanum arcanum, GO terms such as “ascorbate glutathione cycle” and “purine nucleoside transmembrane transport” were found to be enriched. Additionally, GO terms related to stress responses, such as “response to hydrogen peroxide” and “response to reactive oxygen species”, were also enriched (Figure 4b). In Solanum lycopersicum, GO terms related to sulfur metabolism and plant hormones, such as “sulfate reduction”, “jasmonic acid metabolic process”, and “salicylic acid metabolic process”, were enriched (Figure 4c). The genes included in the GO terms identified in Solanum pennellii, Solanum arcanum, and Solanum lycopersicum are listed in Supplementary Tables S12 and S13, corresponding to Solanum pennellii and Solanum arcanum (Table S12) and Solanum lycopersicum (Table S13), respectively.

3.5. Classification of DEGs and Enrichment Analysis in Glycine soja and Glycine max

Similar to tomato, soybean gene IDs were converted to Arabidopsis thaliana gene IDs, and enrichment analysis was performed.
Enrichment analysis was performed on DEGs from pairs of Glycine soja and Glycine max utilizing approximately the top 1% ranked using DW scores (Figure 5a). In Glycine soja, similar to the results observed in other wild relatives, GO terms such as “response to oxidative stress” and “cellular detoxification”, which contribute to defense mechanisms against environmental stress, were enriched. Additionally, a larger number of genes were classified under these terms (Figure 5b). In Glycine max, the enrichment analysis revealed that GO terms associated with secondary metabolites, such as triterpenoid and isoprenoid biosynthesis, as well as pathways related to energy metabolism, were enriched (Figure 5c). The genes included in the GO terms identified in Glycine soja and Glycine max are listed in Supplementary Tables S14 and S15, corresponding to Glycine soja (Table S14) and Glycine max (Table S15), respectively.

3.6. Common DEGs in Wild and Domesticated Species

To investigate genes commonly differentially expressed between wild and domesticated species of rice, tomato, and soybean, the gene IDs of each species were converted to Arabidopsis thaliana gene IDs. The gene ID correspondence tables, created using BioMart [28] from Ensembl Plants and covering the top 5% of genes by DW score for each species (Figures 3a–5a), are provided in Supplementary Tables S16–S21, corresponding to Oryza rufipogon (Table S16), Oryza sativa japonica (Table S17), Solanum pennellii and Solanum arcanum (Table S18), Solanum lycopersicum (Table S19), Glycine soja (Table S20), and Glycine max (Table S21), respectively.
Eighteen genes were commonly upregulated in approximately the top 5% of DW scores for wild relatives, while 36 genes were identified for domesticated species (Figure 6a,b). These genes are listed in Table 1 and Table 2. Enrichment analysis results for the genes that were commonly upregulated in wild relatives and domesticated species are shown in Figure 6c,d. In wild relatives, GO terms related to environmental stress responses were enriched, consistent with the previous enrichment analysis results. However, in domesticated species, GO terms related to chemical compound export and detoxification, as well as the regulation of hormone responses, were enriched, differing from the enrichment analysis results observed in each individual domesticated species.

4. Discussion

In this study, we obtained gene expression data for wild and domesticated species of rice, tomato, and soybean from public databases and investigated the DEGs resulting from domestication effects. For this purpose, we utilized the DW ratio and DW score as analytical metrics to perform a meta-analysis comparing gene expression data from multiple research projects. Enrichment analysis was performed to investigate the characteristics of DEGs in both wild and domesticated species. Additionally, to identify genes that showed common differential expression across different species, the gene IDs of rice, tomato, and soybean were converted to Arabidopsis thaliana gene IDs. Based on these analyses, the genes upregulated in wild relatives included gene groups involved in environmental stress responses, which enable plants to adapt to harsh conditions. This finding supports the previously reported high environmental adaptability of wild relatives. In contrast, the genes upregulated in domesticated species included those involved in detoxification and export of chemical compounds. This may partly reflect the increased use of chemical fertilizers in crop cultivation. Based on these findings, the meta-analysis utilizing gene expression data from public databases suggests the high environmental adaptability of wild relatives and changes in crop traits and characteristics resulting from domestication.

4.1. Genes Commonly Upregulated Across Wild Relatives

In wild relatives, 18 genes were commonly upregulated within the top 5% of DW scores. The gene groups included in this analysis were associated with four GO terms: response to osmotic stress (GO:0006970), response to abscisic acid (GO:0001101), response to alcohol (GO:0097305), and response to lipid (GO:0033993), as shown in the enrichment analysis results in Figure 6c. A notable result is the enrichment of the term for the response to abscisic acid, a hormone that plays a central role in responses to various stresses, such as drought, salinity, and low temperatures, across multiple species [31,32,33,34]. Additionally, GO terms related to the osmotic stress response and maintenance of cellular homeostasis were also enriched. Thus, the genes commonly upregulated in wild relatives included gene groups involved in environmental stress responses, suggesting that wild relatives possess a higher capacity to adapt to harsh natural environments than domesticated species.
Subsequently, we investigated whether the individual genes listed in Table 1—those commonly upregulated across wild relatives—have been previously reported to primarily function in environmental stress tolerance. HKT1 is a key ion transporter involved in the plant salt stress response and salt tolerance. It primarily limits the translocation of Na+ from roots to leaves and stems, thereby contributing to the maintenance of ion homeostasis [35]. A study comparing salt tolerance between HKT1 knockout plants and those with phloem-specific overexpression of HKT1 reported that overexpression lines exhibited reduced Na+ translocation to the leaves, whereas knockout lines showed significant Na+ accumulation in the leaves. Additionally, plants with overexpressed HKT1 produce more seeds and have higher overall yield under saline conditions compared to control plants [36]. Therefore, the molecular mechanism of salt tolerance mediated by HKT1, which exhibits upregulated expression in wild relatives, holds promise for developing crop varieties capable of thriving in saline-affected soils while maintaining superior yields.
The RD22 gene functions as a molecular link connecting abscisic acid (ABA) signaling and abiotic stress responses and plays a critical role in plant drought stress adaptation [37]. In Arabidopsis thaliana, RD22 exists as a single-copy gene, whereas certain plant species, such as grapevines, possess multiple paralogs, forming an expanded RD22 family [38]. The expression of the RD22 gene is regulated by two transcription factors. When plants are exposed to drought stress, ABA is synthesized in the initial phase, triggering the production of MYB2 and MYC2—the transcription factors responsible for RD22. These factors promote RD22 gene expression, resulting in the synthesis of RD22 gene products that confer drought tolerance in plants [37,38,39,40,41].
The transcription factors HB-7 and HB-12 belong to the homeodomain-leucine zipper subfamily I. HB-12 contributes to enhanced seed production under water stress conditions, while HB-7 is involved in leaf development and photosynthesis promotion in mature plants [42]. These two genes are cooperatively regulated depending on developmental stages and environmental conditions, modulating processes associated with plant growth and water stress responses. DOX1 is an enzyme that catalyzes the initial oxidation of fatty acids and possesses diverse functions, including pathogen defense, aphid-induced wound response, and protection against oxidative stress and cell death [43,44,45].
MYB74 and MYB102 belong to the R2R3-MYB transcription factor family and regulate stress responses and other plant-specific processes. MYB102 is involved in the osmotic stress response and wound signaling pathway [46]. Specifically, in experiments investigating the feeding effects caused by Pieris rapae larvae, MYB102 knockout mutants exhibited accelerated larval development rates and considerably higher pupation rates than control plants [47]. Thus, MYB102 contributes to herbivory resistance. Similar to MYB102, MYB74 is associated with environmental stress tolerance. Specifically, MYB74 overexpression enhances osmotic stress tolerance. However, MYB74 overexpression lines exhibit detrimental effects on growth compared with control plants [48]. These findings show that although MYB74 overexpression enhances stress tolerance, particularly to osmotic stress, it negatively impacts plant growth. The present analytical results, showing higher MYB74 expression in wild relatives and lower expression in domesticated species, suggest that domestication prioritizes yield. This implies a trade-off relationship; domesticated species lost the high stress tolerance inherent in wild relatives but gained increased yield, as supported by functional studies of MYB74.
In summary, the gene function analysis of wild relatives revealed that genes contributing to traits essential for survival in harsh environments, including biotic and environmental stresses, were highly expressed. Furthermore, the fact that genes highly expressed in wild relatives exhibited reduced expression in domesticated species suggests that domestication may have led to the gradual loss of these stress-resistance genes. Therefore, breeding approaches utilizing wild relatives—which retain the high stress tolerance lost in domesticated species—hold potential as valuable genetic resources for developing crop varieties adapted to increasingly harsh environmental conditions.

4.2. Genes Commonly Upregulated Across Domesticated Species

Among the genes commonly upregulated in domesticated species, the terms identified in the enrichment analyses of individual domesticated species were rarely observed. This difference can be attributed to the fact that rice, tomato, and soybean are not closely related species, and each species has undergone domestication and breeding to acquire different characteristics and traits. Conversely, the fact that genes involved in environmental stress tolerance were commonly detected in wild relatives, even in analyses targeting non-closely related plant species, suggests that these genes may play important roles across a wide range of plant species.
Next, we investigated the functions of genes that were commonly upregulated in domesticated species. The gene list for domesticated species included several auxin-related genes, such as SHY2, IAA3, AXR3, IAA7, IAA14, IAA1, and ATAUX2-11. Auxins are plant hormones involved in various functions, including the promotion of cell elongation and regulation of plant growth and development [49,50]. Although these genes have not been reported to directly contribute to increased crop yield, the observed upregulation of auxin-related genes in domesticated species may reflect domestication-driven selection.
Among the genes showing upregulated expression in domesticated species, a particularly distinctive feature was the inclusion of multiple multidrug and toxic compound extrusion (MATE) family genes. ALF5, a member of the MATE family, confers resistance to tetramethylammonium. Furthermore, studies suggest that engineering plants to overexpress specific MATE proteins could enable their growth in chemically contaminated soils [51]. The MATE family genes that exhibited upregulated expression in this study were considerably enriched in GO terms related to chemical export and detoxification (Figure 6d), including detoxification (GO:0098754), response to toxic substance (GO:0009636), xenobiotic export (GO:0046618), xenobiotic detoxification by transmembrane export across the plasma membrane (GO:1990961), xenobiotic transport (GO:0042908), export from cell (GO:0140352), and export across plasma membrane (GO:0140115). One of the genes included in these terms, DTX1, mediates the efflux of the heavy metal cadmium as well as toxic compounds [52]. One possible reason for the increased expression of MATE gene family members in domesticated species is the change in the soil environment resulting from the increased use of chemical fertilizers. The adverse effects of heavy metal and pesticide accumulation in modern agricultural soils on plant growth and health have been extensively documented [53]. Considering these factors, the agricultural environments in which domesticated species are grown may exhibit elevated heavy metal accumulation in soils compared to traditional systems, driven by increased agrochemical use (e.g., pesticides and chemical fertilizers) and pollution linked to modern agricultural practices. Consequently, we hypothesized that domesticated species exhibit an upregulated expression of genes associated with the detoxification and export of harmful substances, a phenomenon that is not observed in their wild counterparts. However, most MATE family genes, other than DTX1, remain unnamed and are likely to be understudied, warranting further detailed functional analyses.
XTH32, another gene showing upregulated expression in domesticated species, is a member of the enzyme family involved in plant cell wall remodeling. It catalyzes the cleavage and polymerization of xyloglucan, thereby contributing to plant growth and development [54]. Therefore, XTH32 may contribute to the development of the high-yielding individuals commonly observed in domesticated species. Furthermore, this gene family is involved in adaptation to external stresses [55] and is mentioned in meta-analyses of Arabidopsis abiotic stress responses performed using methods similar to this analysis [56,57].
This study has several limitations that need to be taken into consideration. First, the number of datasets used for the meta-analysis was small, which may have introduced bias in the results. This is due to the current scarcity of research projects in public databases that include RNA-seq data for both wild and domesticated species. This issue will likely be resolved in the future when studies comparing wild and domesticated species are conducted and their respective RNA-seq data accumulate. Second, well-annotated transcriptome data for wild relatives were lacking. The reference transcriptome used to quantify expression levels was derived from domesticated species. Consequently, the accuracy of gene expression quantification and identification of specific genes in wild relatives may have been compromised. To address this issue, the establishment of high-quality data for wild relatives is essential. Third, when comparing genes with variable expression across different species, the analysis was standardized to Arabidopsis gene names. However, because tomato, rice, and soybean are not closely related, several genes could not be matched. Consequently, genes exhibiting variations in expression in both wild and domesticated species may not be reflected in the results of this study. This issue is likely to occur when standardizing gene names to those of specific species. However, improvements are expected as the identification of gene functions advances in many species beyond model organisms. Finally, because the identification of genes with different expression levels in this study did not involve statistical testing, the results should be interpreted with caution. Despite these limitations, this study is a valuable resource for understanding trait changes due to domestication and for applications in breeding, given that few studies have examined gene expression variation in multiple wild and domesticated species.

5. Conclusions

A meta-analysis involving multiple wild relatives and domesticated species enabled the identification of DEGs associated with the effects of domestication. In wild relatives, increased expression of multiple genes that contribute to environmental stress tolerance was observed. This finding suggests that wild relatives have the potential to grow and survive in harsh environmental conditions, and it supports recent trends in breeding methods that utilize valuable traits of wild relatives. In contrast, in domesticated species, particularly rice, an upregulation of photosynthesis-related genes was observed. Since photosynthesis is closely associated with crop yield, this result may reflect the selection and breeding of high-yielding individuals during the domestication process. In the three domesticated species, the commonly upregulated genes included those involved in chemical response, efflux, and detoxification. These results may reflect the impact of increased agrochemical use (e.g., pesticides and chemical fertilizers) in contemporary crop cultivation. Additionally, auxin-related genes and enzymes involved in cell-wall remodeling were identified among the upregulated genes in domesticated species, suggesting their potential contribution to yield improvement through domestication-driven selection.
The findings of this study provide valuable insights into the genetic basis of crop domestication and contribute to the identification of candidate genes for breeding that are expected to be applied in future crop improvement efforts. In particular, the utilization of wild relatives to develop new crop varieties holds significant potential as a valuable approach for addressing various challenges in future crop breeding. Although the number of species and samples used in this analysis was limited, combining analyses involving a greater diversity of species with emerging scientific technologies such as genome editing is expected to enable the development of crop varieties with desirable traits in a shorter timeframe than that required for traditional methods.

Supplementary Materials

The following supporting information can be downloaded at: https://doi.org/10.6084/m9.figshare.c.7801100.v1. Table S1: Oryza rufipogon and Oryza sativa japonica sample metadata. Table S2: Solanum pennellii, Solanum arcanum, and Solanum lycopersicum sample metadata. Table S3: Glycine soja and Glycine max metadata. Table S4: TPM data of Oryza rufipogon and Oryza sativa japonica. Table S5: TPM data of Solanum pennellii, Solanum arcanum, and Solanum lycopersicum. Table S6: TPM data of Glycine soja and Glycine max. Table S7: DW scores for Oryza rufipogon and Oryza sativa japonica. Table S8: DW scores for Solanum pennellii, Solanum arcanum, and Solanum lycopersicum. Table S9: DW scores for Glycine soja and Glycine max. Table S10: Gene list for Oryza rufipogon enrichment analysis. Table S11: Gene list for Oryza sativa japonica enrichment analysis. Table S12: Gene list for Solanum pennellii and Solanum arcanum enrichment analysis. Table S13: Gene list for Solanum lycopersicum enrichment analysis. Table S14: Gene list for Glycine soja enrichment analysis. Table S15: Gene list for Glycine max enrichment analysis. Table S16: Arabidopsis thaliana–Oryza rufipogon DW score Top5 Percent_Gene_Correspondence. Table S17: Arabidopsis thaliana–Oryza sativa japonica DW score Top5 Percent_Gene_Correspondence. Table S18: Arabidopsis thaliana–Solanum pennellii and Solanum arcanum DW score Top5 Percent_Gene_Correspondence. Table S19: Arabidopsis thaliana–Solanum lycopersicum DW score Top5 Percent_Gene_Correspondence. Table S20: Arabidopsis thaliana–Glycine soja DW score Top5 Percent_Gene_Correspondence. Table S21: Arabidopsis thaliana–Glycine max DW score Top5 Percent_Gene_Correspondence.

Author Contributions

Conceptualization, M.Y. and H.B.; methodology, M.Y. and H.B.; software, M.Y.; validation, M.Y. and H.B.; formal analysis, M.Y.; investigation, M.Y.; resources, H.B.; data curation, M.Y.; writing—original draft preparation, M.Y.; writing—review and editing, M.Y. and H.B.; visualization, M.Y.; supervision, H.B.; project administration, H.B.; funding acquisition, H.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Center of Innovation for Bio-Digital Transformation (BioDX), an open innovation platform for industry-academia co-creation (COI-NEXT), the Japan Science and Technology Agency (JST) (grant number JPMJPF2010).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in FigShare at https://doi.org/10.6084/m9.figshare.c.7801100.v1.

Acknowledgments

Computations were performed using the computers at the Hiroshima University Genome Editing Innovation Center.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
RNA-seq: RNA sequencing
NCBI GEO: National Center for Biotechnology Information Gene Expression Omnibus
TPM: Transcripts per million
DW: Domesticated and wild
DEG: Differentially expressed gene
GO: Gene Ontology
ABA: Abscisic acid
MATE: Multidrug and toxic compound extrusion

References

  1. Challinor, A.J.; Watson, J.; Lobell, D.B.; Howden, S.M.; Smith, D.R.; Chhetri, N. A Meta-Analysis of Crop Yield under Climate Change and Adaptation. Nat. Clim. Change 2014, 4, 287–291.
  2. Aggarwal, P.; Vyas, S.; Thornton, P.; Campbell, B.M.; Kropff, M. Importance of Considering Technology Growth in Impact Assessments of Climate Change on Agriculture. Glob. Food Secur. 2019, 23, 41–48.
  3. Lobell, D.B.; Gourdji, S.M. The Influence of Climate Change on Global Crop Productivity. Plant Physiol. 2012, 160, 1686–1697.
  4. Wheeler, T.; von Braun, J. Climate Change Impacts on Global Food Security. Science 2013, 341, 508–513.
  5. van Dijk, M.; Morley, T.; Rau, M.L.; Saghai, Y. A Meta-Analysis of Projected Global Food Demand and Population at Risk of Hunger for the Period 2010–2050. Nat. Food 2021, 2, 494–501.
  6. Kishimoto, K.; Washio, Y.; Yoshiura, Y.; Toyoda, A.; Ueno, T.; Fukuyama, H.; Kato, K.; Kinoshita, M. Production of a Breed of Red Sea Bream Pagrus major with an Increase of Skeletal Muscle Mass and Reduced Body Length by Genome Editing with CRISPR/Cas9. Aquaculture 2018, 495, 415–427.
  7. Nonaka, S.; Arai, C.; Takayama, M.; Matsukura, C.; Ezura, H. Efficient Increase of γ-Aminobutyric Acid (GABA) Content in Tomato Fruits by Targeted Mutagenesis. Sci. Rep. 2017, 7, 7057.
  8. Yu, H.; Lin, T.; Meng, X.; Du, H.; Zhang, J.; Liu, G.; Chen, M.; Jing, Y.; Kou, L.; Li, X.; et al. A Route to de Novo Domestication of Wild Allotetraploid Rice. Cell 2021, 184, 1156–1170.e14.
  9. Zsögön, A.; Čermák, T.; Naves, E.R.; Notini, M.M.; Edel, K.H.; Weinl, S.; Freschi, L.; Voytas, D.F.; Kudla, J.; Peres, L.E.P. De Novo Domestication of Wild Tomato Using Genome Editing. Nat. Biotechnol. 2018, 36, 1211–1216.
  10. Zsögön, A.; Cermak, T.; Voytas, D.; Peres, L.E.P. Genome Editing as a Tool to Achieve the Crop Ideotype and de Novo Domestication of Wild Relatives: Case Study in Tomato. Plant Sci. 2017, 256, 120–130.
  11. Krug, A.S.; Drummond, E.B.M.; Van Tassel, D.L.; Warschefsky, E.J. The next Era of Crop Domestication Starts Now. Proc. Natl. Acad. Sci. USA 2023, 120, e2205769120.
  12. Curtin, S.; Qi, Y.; Peres, L.E.P.; Fernie, A.R.; Zsögön, A. Pathways to de Novo Domestication of Crop Wild Relatives. Plant Physiol. 2022, 188, 1746–1756.
  13. Yu, H.; Li, J. Breeding Future Crops to Feed the World through de Novo Domestication. Nat. Commun. 2022, 13, 1171.
  14. Gasparini, K.; dos Reis Moreira, J.; Peres, L.E.P.; Zsögön, A. De Novo Domestication of Wild Species to Create Crops with Increased Resilience and Nutritional Value. Curr. Opin. Plant Biol. 2021, 60, 102006.
  15. Razzaq, A.; Saleem, F.; Wani, S.H.; Abdelmohsen, S.A.M.; Alyousef, H.A.; Abdelbacki, A.M.M.; Alkallas, F.H.; Tamam, N.; Elansary, H.O. De-Novo Domestication for Improving Salt Tolerance in Crops. Front. Plant Sci. 2021, 12, 681367.
  16. Shafiq, M.; Azeem, F.; Waheed, Y.; Pamirsky, I.E.; Feng, X.; Golokhvast, K.S.; Nawaz, M.A. Meta-Analysis of RNA-Seq Data of Soybean under Heat, Water, and Drought Stresses. Plant Biotechnol. Rep. 2025, 19, 205–222.
  17. Fukuda, Y.; Kawaguchi, K.; Fukushima, A. AtSRGA: A Shiny Application for Retrieving and Visualizing Stress-Responsive Genes in Arabidopsis thaliana. Plant Physiol. 2025, 197, kiaf105.
  18. Available online: https://www.ncbi.nlm.nih.gov/sra/docs/sragrowth/ (accessed on 1 April 2025).
  19. Barrett, T.; Wilhite, S.E.; Ledoux, P.; Evangelista, C.; Kim, I.F.; Tomashevsky, M.; Marshall, K.A.; Phillippy, K.H.; Sherman, P.M.; Holko, M.; et al. NCBI GEO: Archive for Functional Genomics Data Sets-Update. Nucleic Acids Res. 2013, 41, D991–D995.
  20. Ncbi/Sra-Tools: SRA Tools. Available online: https://github.com/ncbi/sra-tools (accessed on 5 March 2025).
  21. Chen, S.; Zhou, Y.; Chen, Y.; Gu, J. Fastp: An Ultra-Fast All-in-One FASTQ Preprocessor. Bioinformatics 2018, 34, i884–i890.
  22. Ewels, P.; Magnusson, M.; Lundin, S.; Käller, M. MultiQC: Summarize Analysis Results for Multiple Tools and Samples in a Single Report. Bioinformatics 2016, 32, 3047–3048.
  23. Patro, R.; Duggal, G.; Love, M.I.; Irizarry, R.A.; Kingsford, C. Salmon Provides Fast and Bias-Aware Quantification of Transcript Expression. Nat. Methods 2017, 14, 417–419.
  24. Tximport. Available online: http://bioconductor.org/packages/tximport/ (accessed on 20 March 2025).
  25. Yonezawa, S. Yonesora56/HS_rice. (21 December 2023). Jupyter Notebook. [Online]. Available online: https://github.com/yonesora56/HS_rice (accessed on 20 March 2025).
  26. Ono, Y.; Bono, H. Multi-Omic Meta-Analysis of Transcriptomes and the Bibliome Uncovers Novel Hypoxia-Inducible Genes. Biomedicines 2021, 9, 582.
  27. Ge, S.X.; Jung, D.; Yao, R. ShinyGO: A Graphical Gene-Set Enrichment Tool for Animals and Plants. Bioinformatics 2020, 36, 2628–2629.
  28. Kinsella, R.J.; Kähäri, A.; Haider, S.; Zamora, J.; Proctor, G.; Spudich, G.; Almeida-King, J.; Staines, D.; Derwent, P.; Kerhornou, A.; et al. Ensembl BioMarts: A Hub for Data Retrieval across Taxonomic Space. Database 2011, 2011, bar030.
  29. Athar, A.; Füllgrabe, A.; George, N.; Iqbal, H.; Huerta, L.; Ali, A.; Snow, C.; Fonseca, N.A.; Petryszak, R.; Papatheodorou, I.; et al. ArrayExpress Update – from Bulk to Single-Cell Expression Data. Nucleic Acids Res. 2019, 47, D711–D715.
  30. Leinonen, R.; Sugawara, H.; Shumway, M.; on behalf of the International Nucleotide Sequence Database Collaboration. The Sequence Read Archive. Nucleic Acids Res. 2011, 39, D19–D21.
  31. Lopez-Molina, L.; Mongrand, S.; Chua, N.-H. A Postgermination Developmental Arrest Checkpoint Is Mediated by Abscisic Acid and Requires the ABI5 Transcription Factor in Arabidopsis. Proc. Natl. Acad. Sci. USA 2001, 98, 4782–4787.
  32. Sah, S.K.; Reddy, K.R.; Li, J. Abscisic Acid and Abiotic Stress Tolerance in Crop Plants. Front. Plant Sci. 2016, 7, 571.
  33. Finkelstein, R.R.; Lynch, T.J. Abscisic Acid Inhibition of Radicle Emergence But Not Seedling Growth Is Suppressed by Sugars. Plant Physiol. 2000, 122, 1179–1186.
  34. Huang, Y.; Zhou, J.; Li, Y.; Quan, R.; Wang, J.; Huang, R.; Qin, H. Salt Stress Promotes Abscisic Acid Accumulation to Affect Cell Proliferation and Expansion of Primary Roots in Rice. Int. J. Mol. Sci. 2021, 22, 10892.
  35. Horie, T.; Hauser, F.; Schroeder, J.I. HKT Transporter-Mediated Salinity Resistance Mechanisms in Arabidopsis and Monocot Crop Plants. Trends Plant Sci. 2009, 14, 660–668.
  36. Uchiyama, T.; Saito, S.; Yamanashi, T.; Kato, M.; Takebayashi, K.; Hamamoto, S.; Tsujii, M.; Takagi, T.; Nagata, N.; Ikeda, H.; et al. The HKT1 Na+ Transporter Protects Plant Fertility by Decreasing Na+ Content in Stamen Filaments. Sci. Adv. 2023, 9, eadg5495.
  37. Abe, H.; Yamaguchi-Shinozaki, K.; Urao, T.; Iwasaki, T.; Hosokawa, D.; Shinozaki, K. Role of Arabidopsis MYC and MYB Homologs in Drought- and Abscisic Acid-Regulated Gene Expression. Plant Cell 1997, 9, 1859–1868.
  38. Matus, J.T.; Aquea, F.; Espinoza, C.; Vega, A.; Cavallini, E.; Santo, S.D.; Cañón, P.; Guardia, A.R.-H.; Serrano, J.; Tornielli, G.B.; et al. Inspection of the Grapevine BURP Superfamily Highlights an Expansion of RD22 Genes with Distinctive Expression Features in Berry Development and ABA-Mediated Stress Responses. PLoS ONE 2014, 9, e110372.
  39. Abe, H.; Urao, T.; Ito, T.; Seki, M.; Shinozaki, K.; Yamaguchi-Shinozaki, K. Arabidopsis AtMYC2 (bHLH) and AtMYB2 (MYB) Function as Transcriptional Activators in Abscisic Acid Signaling. Plant Cell 2003, 15, 63–78.
  40. Shinozaki, K.; Yamaguchi-Shinozaki, K. Gene Expression and Signal Transduction in Water-Stress Response. Plant Physiol. 1997, 115, 327–334.
  41. Nakashima, K.; Kiyosue, T.; Yamaguchi-Shinozaki, K.; Shinozaki, K. A Nuclear Gene, Erd1, Encoding a Chloroplast-Targeted Clp Protease Regulatory Subunit Homolog Is Not Only Induced by Water Stress but Also Developmentally up-Regulated during Senescence in Arabidopsis thaliana. Plant J. 1997, 12, 851–861.
  42. Ré, D.A.; Capella, M.; Bonaventure, G.; Chan, R.L. Arabidopsis AtHB7 and AtHB12 Evolved Divergently to Fine Tune Processes Associated with Growth and Responses to Water Stress. BMC Plant Biol. 2014, 14, 150.
  43. Hamberg, M.; Sanz, A.; Castresana, C. α-Oxidation of Fatty Acids in Higher Plants: Identification of a Pathogen-Inducible Oxygenase (Piox) as an α-Dioxygenase and Biosynthesis of 2-Hydroperoxylinolenic Acid. J. Biol. Chem. 1999, 274, 24503–24513.
  44. De León, I.P.; Sanz, A.; Hamberg, M.; Castresana, C. Involvement of the Arabidopsis α-DOX1 Fatty Acid Dioxygenase in Protection against Oxidative Stress and Cell Death. Plant J. 2002, 29, 61–72.
  45. Vicente, J.; Cascón, T.; Vicedo, B.; García-Agustín, P.; Hamberg, M.; Castresana, C. Role of 9-Lipoxygenase and α-Dioxygenase Oxylipin Pathways as Modulators of Local and Systemic Defense. Mol. Plant 2012, 5, 914–928.
  46. Denekamp, M.; Smeekens, S.C. Integration of Wounding and Osmotic Stress Signals Determines the Expression of the AtMYB102 Transcription Factor Gene. Plant Physiol. 2003, 132, 1415–1423.
  47. De Vos, M.; Denekamp, M.; Dicke, M.; Vuylsteke, M.; Van Loon, L.C.; Smeekens, S.C.; Pieterse, C. The Arabidopsis thaliana Transcription Factor AtMYB102 Functions in Defense Against The Insect Herbivore Pieris rapae. Plant Signal. Behav. 2006, 1, 305–311.
  48. Ortiz-García, P.; Pérez-Alonso, M.-M.; González Ortega-Villaizán, A.; Sánchez-Parra, B.; Ludwig-Müller, J.; Wilkinson, M.D.; Pollmann, S. The Indole-3-Acetamide-Induced Arabidopsis Transcription Factor MYB74 Decreases Plant Growth and Contributes to the Control of Osmotic Stress Responses. Front. Plant Sci. 2022, 13, 928386.
  49. Cheng, Y.; Zhao, Y. A Role for Auxin in Flower Development. J. Integr. Plant Biol. 2007, 49, 99–104.
  50. Cao, J.; Li, G.; Qu, D.; Li, X.; Wang, Y. Into the Seed: Auxin Controls Seed Development and Grain Yield. Int. J. Mol. Sci. 2020, 21, 1662.
  51. Diener, A.C.; Gaxiola, R.A.; Fink, G.R. Arabidopsis ALF5, a Multidrug Efflux Transporter Gene Family Member, Confers Resistance to Toxins. Plant Cell 2001, 13, 1625–1638.
  52. Li, L.; He, Z.; Pandey, G.K.; Tsuchiya, T.; Luan, S. Functional Cloning and Characterization of a Plant Efflux Carrier for Multidrug and Heavy Metal Detoxification. J. Biol. Chem. 2002, 277, 5360–5368.
  53. Alengebawy, A.; Abdelkhalek, S.T.; Qureshi, S.R.; Wang, M.-Q. Heavy Metals and Pesticides Toxicity in Agricultural Soil and Plants: Ecological Risks and Human Health Implications. Toxics 2021, 9, 42.
  54. Rose, J.K.C.; Braam, J.; Fry, S.C.; Nishitani, K. The XTH Family of Enzymes Involved in Xyloglucan Endotransglucosylation and Endohydrolysis: Current Perspectives and a New Unifying Nomenclature. Plant Cell Physiol. 2002, 43, 1421–1435.
  55. Ishida, K.; Yokoyama, R. Reconsidering the Function of the Xyloglucan Endotransglucosylase/Hydrolase Family. J. Plant Res. 2022, 135, 145–156.
  56. Shintani, M.; Tamura, K.; Bono, H. Meta-Analysis of Public RNA Sequencing Data of Abscisic Acid-Related Abiotic Stresses in Arabidopsis thaliana. Front. Plant Sci. 2024, 15, 1343787.
  57. Tamura, K.; Bono, H. Meta-Analysis of RNA Sequencing Data of Arabidopsis and Rice under Hypoxia. Life 2022, 12, 1079.
Figure 1. Gene expression data of wild relatives and domesticated species of rice, tomatoes, and soybeans were collected from public databases and manually curated. To ensure comparability, data only from wild and domesticated species included in the same research project were selected and paired. Subsequently, gene expression levels were quantified and differentially expressed genes between wild relatives and domesticated species were identified.
Figure 2. Number of plant samples and their tissue types used in this analysis. All tomato plant tissue samples were leaves, totaling 36 pairs. For soybean, the tissues included first trifoliates, seeds, and roots, with a total of 56 pairs. For rice, the tissues consisted of roots, seedlings, panicles, and seeds, totaling 21 pairs. Each pair comprised one sample from a wild relative and one sample from a domesticated species. Each sample was paired such that the wild and domesticated species shared the same tissue type and experimental conditions.
Figure 3. Enrichment analysis and DW score scatter plot of gene groups with upregulated expression in Oryza sativa japonica and Oryza rufipogon. (a) The scatter plot represents the DW score of each gene, where positive and negative scores indicate upregulated expression in Oryza sativa japonica and Oryza rufipogon, respectively. The red, yellow, and blue dots represent the top 1%, 1–3%, and 3–5% of scores, respectively. (b) Enrichment analysis results of gene groups with upregulated expression in Oryza rufipogon. The analysis was based on DW score, utilizing approximately the top 3% of genes, which included 898 genes (−20 ≦ DW score ≦ −7). (c) Enrichment analysis results of gene groups with upregulated expression in Oryza sativa japonica. The analysis was based on DW score, utilizing approximately the top 3% of genes, which included 1073 genes (9 ≦ DW score ≦ 20).
Figure 4. Enrichment analysis and DW score scatter plot of gene groups with upregulated expression in Solanum lycopersicum, Solanum pennellii, and Solanum arcanum. (a) The scatter plot represents the DW score of each gene, where positive scores indicate upregulated expression in Solanum lycopersicum and negative scores indicate upregulated expression in Solanum pennellii and Solanum arcanum. The red, yellow, and blue dots represent the top 1%, 1–3%, and 3–5% of scores, respectively. (b) Enrichment analysis results of gene groups with upregulated expression in Solanum pennellii and Solanum arcanum. The analysis was based on DW score, utilizing approximately the top 1% of genes, which included 391 genes (−36 ≦ DW score ≦ −27). (c) Enrichment analysis results of gene groups with upregulated expression in Solanum lycopersicum. The analysis was based on DW score, utilizing approximately the top 1% of genes, which included 382 genes (28 ≦ DW score ≦ 36).
Figure 5. Enrichment analysis and DW score scatter plot of gene groups with upregulated expression in Glycine max and Glycine soja. (a) The scatter plot represents the DW score of each gene, where positive and negative scores indicate upregulated expression in Glycine max and Glycine soja, respectively. The red, yellow, and blue dots represent the top 1%, 1–3%, and 3–5% of scores, respectively. (b) Enrichment analysis results of gene groups with upregulated expression in Glycine soja. The analysis was based on DW score, utilizing approximately the top 1% of genes, which included 582 genes (−54 ≦ DW score ≦ −20). (c) Enrichment analysis results of gene groups with upregulated expression in Glycine max. The analysis was based on DW score, utilizing approximately the top 1% of genes, which included 590 genes (20 ≦ DW score ≦ 53).
Figure 6. (a) UpSet plot of top 5% genes in wild relatives of rice, tomato, and soybean. (b) UpSet plot of top 5% genes in domesticated species of rice, tomato, and soybean. Enrichment analysis was performed on the top 5% of genes in both wild and domesticated species. (c) Enrichment analysis results for the 18 genes commonly upregulated in wild relatives. (d) Enrichment analysis results for the 36 genes commonly upregulated in domesticated species.
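The overlap underlying the UpSet plots can be sketched as a set intersection over Arabidopsis gene IDs. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names `top_fraction` and `common_orthologs`, the dictionary layout, and the one-to-one gene-to-Arabidopsis mapping are all hypothetical. Genes without an Arabidopsis counterpart are dropped, which mirrors the matching limitation noted in the Discussion.

```python
def top_fraction(scores, fraction=0.05, wild=False):
    """Return the gene IDs in the top `fraction` of DW scores.

    scores : dict mapping gene ID -> DW score
    wild   : if True, take the most negative scores (upregulated in wild
             relatives); otherwise take the most positive scores.
    """
    n = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=not wild)
    return set(ranked[:n])


def common_orthologs(top_sets, to_arabidopsis):
    """Intersect per-species gene sets after mapping to Arabidopsis IDs.

    top_sets       : dict mapping species -> set of species-specific gene IDs
    to_arabidopsis : dict mapping species -> {gene ID: Arabidopsis gene ID}
                     (assumed one-to-one here for simplicity)
    """
    mapped = []
    for species, genes in top_sets.items():
        mapping = to_arabidopsis[species]
        # Genes with no Arabidopsis counterpart are silently dropped.
        mapped.append({mapping[g] for g in genes if g in mapping})
    return set.intersection(*mapped)


# Toy example with made-up species-specific gene IDs.
toy_top = {
    "rice": {"r1", "r2"},
    "tomato": {"t1", "t3"},
    "soybean": {"s1"},
}
toy_map = {
    "rice": {"r1": "AT2G36870", "r2": "AT1G04240"},
    "tomato": {"t1": "AT2G36870"},  # t3 has no mapped counterpart
    "soybean": {"s1": "AT2G36870"},
}
shared = common_orthologs(toy_top, toy_map)
```

In this toy input only AT2G36870 is present in all three mapped sets, so it is the only gene that would appear in the three-way intersection.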
Table 1. Genes commonly upregulated in wild relatives (Oryza rufipogon, Solanum pennellii, Solanum arcanum, and Glycine soja).
| Gene Stable ID | Gene Name | Description |
|---|---|---|
| AT1G10400 | AT1G10400 | UDP-glycosyltransferase superfamily protein |
| AT1G10810 | AT1G10810 | NAD(P)-linked oxidoreductase superfamily protein |
| AT1G23800 | ALDH2B7 | Aldehyde dehydrogenase 2B7 |
| AT1G60680 | AT1G60680 | NAD(P)-linked oxidoreductase superfamily protein |
| AT1G60690 | AT1G60690 | NAD(P)-linked oxidoreductase superfamily protein |
| AT1G60710 | ATB2 | NAD(P)-linked oxidoreductase superfamily protein |
| AT1G60730 | AT1G60730 | NAD(P)-linked oxidoreductase superfamily protein |
| AT1G60750 | AT1G60750 | NAD(P)-linked oxidoreductase superfamily protein |
| AT2G16890 | AT2G16890 | UDP-glycosyltransferase superfamily protein |
| AT2G46370 | JAR1 | Auxin-responsive GH3 family protein |
| AT2G46680 | HB-7 | Homeobox 7 |
| AT3G01420 | DOX1 | Peroxidase superfamily protein |
| AT3G61890 | HB-12 | Homeobox 12 |
| AT4G05100 | MYB74 | MYB domain protein 74 |
| AT4G10310 | HKT1 | High-affinity K+ transporter 1 |
| AT4G21440 | MYB102 | MYB-like 102 |
| AT5G14860 | AT5G14860 | UDP-glycosyltransferase superfamily protein |
| AT5G25610 | RD22 | BURP domain-containing protein |
Table 2. Genes commonly upregulated in domesticated species (Oryza sativa japonica, Solanum lycopersicum, and Glycine max).
| Gene Stable ID | Gene Name | Description |
|---|---|---|
| AT1G04240 | SHY2 | AUX/IAA transcriptional regulator family protein |
| AT1G04250 | AXR3 | AUX/IAA transcriptional regulator family protein |
| AT1G10960 | FD1 | Ferredoxin 1 |
| AT1G20930 | CDKB2;2 | Cyclin-dependent kinase B2;2 |
| AT1G60950 | FD2 | 2Fe-2S ferredoxin-like superfamily protein |
| AT1G64590 | AT1G64590 | NAD(P)-binding Rossmann-fold superfamily protein |
| AT1G64820 | AT1G64820 | MATE efflux family protein |
| AT1G66760 | AT1G66760 | MATE efflux family protein |
| AT1G66780 | AT1G66780 | MATE efflux family protein |
| AT1G76540 | CDKB2;1 | Cyclin-dependent kinase B2;1 |
| AT2G04040 | DTX1 | MATE efflux family protein |
| AT2G04050 | AT2G04050 | MATE efflux family protein |
| AT2G04066 | AT2G04066 | MATE efflux family protein |
| AT2G04070 | AT2G04070 | MATE efflux family protein |
| AT2G04080 | AT2G04080 | MATE efflux family protein |
| AT2G04090 | AT2G04090 | MATE efflux family protein |
| AT2G04100 | AT2G04100 | MATE efflux family protein |
| AT2G13610 | ABCG5 | ABC-2 type transporter family protein |
| AT2G16850 | PIP2;8 | Plasma membrane intrinsic protein 2;8 |
| AT2G36870 | XTH32 | Xyloglucan endotransglucosylase/hydrolase 32 |
| AT2G42380 | BZIP34 | Basic-leucine zipper (bZIP) transcription factor family protein |
| AT2G48020 | AT2G48020 | Major facilitator superfamily protein |
| AT3G23030 | IAA2 | Indole-3-acetic acid inducible 2 |
| AT3G23050 | IAA7 | Indole-3-acetic acid 7 |
| AT3G58120 | BZIP61 | Basic-leucine zipper (bZIP) transcription factor family protein |
| AT4G14550 | IAA14 | Indole-3-acetic acid inducible 14 |
| AT4G14560 | IAA1 | Indole-3-acetic acid inducible |
| AT4G17340 | TIP2;2 | Tonoplast intrinsic protein 2;2 |
| AT4G25000 | AMY1 | Alpha-amylase-like protein |
| AT4G35100 | PIP3 | Plasma membrane intrinsic protein 3 |
| AT5G10130 | AT5G10130 | Pollen Ole e 1 allergen and extensin family protein |
| AT5G10180 | SULTR2;1 | Sulfate transporter 2;1 |
| AT5G10570 | AT5G10570 | Basic helix-loop-helix (bHLH) DNA-binding superfamily protein |
| AT5G43700 | ATAUX2-11 | AUX/IAA transcriptional regulator family protein |
| AT5G47450 | TIP2;3 | Tonoplast intrinsic protein 2;3 |
| AT5G65640 | bHLH093 | Beta HLH protein 93 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
