Article

Alternative Characterizations of Methyl Lucidone-Responsive Differentially Expressed Genes in Drosophila melanogaster Using DEG-by-Index Ratio Transformation

1 Core Research Facility & Analysis Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
2 Department of Microbiology and Molecular Biology, Chungnam National University, Daejeon 34134, Republic of Korea
3 Department of Southern Area Crop Science, National Institute of Crop Science, Rural Development Administration, Miryang 50424, Republic of Korea
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Insects 2025, 16(9), 898; https://doi.org/10.3390/insects16090898
Submission received: 1 July 2025 / Revised: 18 August 2025 / Accepted: 26 August 2025 / Published: 27 August 2025
(This article belongs to the Special Issue Insect Transcriptomics)

Simple Summary

Standard normalization methods—such as relative log expression (RLE) via DESeq2 and trimmed mean of M-values (TMM) via edgeR—failed to consistently identify differentially expressed genes (DEGs) in our five primary RNA-Seq experiments and four validation datasets. This limitation prompted us to develop the DEG-by-index ratio transformation (DiRT), which is a novel normalization approach that significantly improves DEG detection and enables reliable validation across independent experiments.

Abstract

Identifying robust differentially expressed genes (DEGs) in RNA-Seq data remains challenging under variable experimental conditions. To address this, we performed five independent RNA-Seq experiments using Drosophila melanogaster larvae treated with methyl lucidone—a putative juvenile hormone disruptor—and compared conventional normalization methods (relative log expression [RLE] via DESeq2 and trimmed mean of M-values [TMM] via edgeR) against our novel DEG-by-index ratio transformation (DiRT). DESeq2 identified two significant DEGs, while edgeR detected none; both methods showed limited validation across four additional independent experiments. In contrast, DiRT identified a distinct set of numerous DEGs with improved reproducibility and reliable validation. KEGG pathway analysis revealed that DiRT-derived DEGs were functionally enriched in pathways related to methyl lucidone detoxification, including the proteasome, drug metabolism, and xenobiotic metabolism mediated by cytochrome P450 and other enzymes. Although DESeq2 and edgeR remain widely used standard methods, DiRT offers a novel complementary approach to enhance DEG characterization in RNA-Seq studies affected by experimental variability.

1. Introduction

Methyl lucidone, a plant-derived diterpene from species like Lindera erythrocarpa, acts as a juvenile hormone (JH) disruptor in insects [1,2,3]. JHs are critical regulators of insect development, metamorphosis, and reproduction [4]. Disruption of JH signaling can lead to developmental arrest and mortality in insects [5].
RNA sequencing (RNA-Seq) has emerged as a powerful alternative to RT-qPCR for transcriptome-wide analysis [6], enabling comprehensive detection of differentially expressed genes (DEGs). However, RNA-Seq data exhibit substantial technical and biological variation, necessitating normalization for reliable DEG characterization. Widely used methods include DESeq2 (relative log expression, RLE) [7], edgeR (trimmed mean of M-values, TMM) [8], and limma/voom (precision weights derived from nonparametric mean–variance modeling) [9,10].
While traditional RNA-Seq workflows, which typically involve three or more independent experiments with RLE/TMM normalization and RT-qPCR validation, have successfully identified DEGs in many studies, this approach failed in our investigation of methyl lucidone effects. Despite five independent RNA-Seq experiments using Drosophila melanogaster larvae, conventional methods identified only two DEGs (DESeq2) or none (edgeR), and none were validated in four additional RNA-Seq datasets.
To address this limitation, we developed the DEG-by-index ratio transformation (DiRT): a novel method grounded in compositional data analysis (CDA) principles [11,12,13]. In DiRT, an index gene has an expression profile that is highly correlated with a target DEG under control (untreated) conditions, thereby serving as a stable internal reference unique to that specific DEG. To identify candidate index genes, we calculated the normalized standard deviation (NSD) of the expression ratio between each gene and every other gene across the control samples and ranked all ~10,000 genes accordingly. For each gene, the 10 candidates with the lowest NSD values—indicating the greatest similarity in expression under control conditions—were selected. We empirically observed that DEG–index gene pairings drawn from this low-NSD set consistently produced DiRT ratios that maximized discrimination between control and treatment groups. From the resulting ~100,000 (10 × 10,000) DEG–index gene combinations, the optimal pairing for each DEG was chosen based on the highest separation between groups. Unlike traditional single-reference normalization methods, which apply the same housekeeping or reference gene to all targets, the index gene in DiRT is selected individually for each DEG to preserve DEG-specific co-expression patterns. Crucially, the index gene must remain non-responsive to the experimental perturbation—in this case, methyl lucidone treatment—ensuring that changes in the DEG’s expression reflect true treatment effects rather than shared regulation or technical noise. This DEG-specific pairing provides a stable denominator for ratio-based normalization, thereby minimizing inter-sample variability and enhancing reproducibility across independent experiments.
We hypothesized that this gene-specific normalization strategy would enhance the robustness and reproducibility of DEG detection, particularly under conditions of high experimental variability, as supported by KEGG pathway analysis. Our results demonstrate that this approach can reveal reproducible, biologically relevant DEGs that remain undetected using conventional global normalization methods.

2. Materials and Methods

2.1. Drosophila melanogaster Sample Acquisition and Preparation for RNA-Seq

Twenty male and twenty female Drosophila melanogaster adults were placed in individual vials, each containing 3 g of artificial diet mixed with either 0.5% methyl lucidone (w/v) or 0.5% ethanol (w/v) as a control. After 2 days of oviposition, adult flies were removed, and eggs were allowed to develop. Second-instar larvae (4–5 days post-oviposition) were collected from each vial. Total RNA was isolated using the RNeasy isolation kit (Qiagen, Hilden, Germany) following the manufacturer’s protocols. RNA libraries were prepared for Illumina sequencing (San Diego, CA, USA).

2.2. Data Processing of 18 D. melanogaster RNA-Seqs

After sequencing, raw FASTQ files from all batches were processed through Galaxy server [14,15] pipelines. Single-end reads underwent quality control and adapter trimming with fastp (Galaxy Version 1.0.1+galaxy1) [16], followed by alignment to the dm6 genome [17] using HISAT2 (Galaxy Version 2.2.1+galaxy1) [18]. The mapped BAM files were then quantified with featureCounts (Galaxy Version 2.1.1+galaxy0) [19] to obtain the read count (RC) of each transcript (dm6 NCBI reference genes). General information on the 18 RNA-Seq datasets, including NCBI SRA accession numbers, is shown in Table 1.

2.3. DiRT Analysis Workflow

For the DiRT analysis, we selected the 10,000 most abundantly expressed genes from 17,868 annotated D. melanogaster (dm6) transcriptome genes based on their RCs across 10 RNA-Seq datasets (five control: C1–C5; five treated: T1–T5). This threshold minimized the inclusion of low-expression genes, which could produce unstable RC ratios and result in randomly selected or unreliable index genes.
For each target gene, the RC ratios were calculated by dividing its read count by that of every other gene in the dataset. From these ratios, we identified 10 gene pairs with the lowest normalized standard deviation (NSD) scores in the control samples as DEG–index or index–DEG pair candidates. This approach ensured the stability of the RC ratio under baseline conditions.
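The index-gene search described above can be sketched in a few lines. This is a minimal illustration, not the authors' script: it assumes that NSD means the sample standard deviation of the ratios divided by their mean, and the gene names, read counts, and the helper `lowest_nsd_partners` are hypothetical.

```python
from statistics import mean, stdev

def nsd(values):
    """Normalized standard deviation: sample SD divided by the mean (assumed definition)."""
    return stdev(values) / mean(values)

def lowest_nsd_partners(control_counts, target, k=10):
    """Return up to k genes whose RC ratio with `target` is most stable across controls."""
    target_rc = control_counts[target]
    scored = []
    for gene, rc in control_counts.items():
        if gene == target:
            continue
        # Ratio target/partner in each control sample; assumes no zero counts,
        # which restricting to abundantly expressed genes is meant to ensure.
        ratios = [t / p for t, p in zip(target_rc, rc)]
        scored.append((nsd(ratios), gene))
    scored.sort()
    return [gene for _, gene in scored[:k]]

# Invented read counts for five control samples (C1-C5).
control_counts = {
    "geneA": [100, 200, 150, 120, 180],
    "geneB": [50, 100, 75, 60, 90],      # exactly proportional to geneA
    "geneC": [80, 90, 300, 40, 100],     # unrelated to geneA
    "geneD": [210, 390, 310, 250, 350],  # roughly proportional to geneA
}
partners = lowest_nsd_partners(control_counts, "geneA", k=2)
```

In this toy input, the gene that is exactly proportional to the target has an NSD of zero and is ranked first, followed by the roughly proportional one.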
Using a custom Python 3 script (https://github.com/shinwoongg/DiRT-normalization), we generated a comprehensive DiRT candidate database comprising 100,000 columns. DiRT-normalized values (RC_target gene/RC_another gene, where RC denotes read count) were used to compute p-values using a two-sample, two-tailed t-test, assuming equal variance between the control and treated groups.
To reduce false positives from DEG–DEG pairs, we excluded DiRTs whose denominator genes showed differential expression (CPM-based p-value < 0.1). When multiple DiRT values were available for a single target gene, we retained only the entry with the lowest adjusted p-value (Benjamini–Hochberg correction, rank = 9988). This yielded a final dataset consisting of 9988 DiRT-normalized expression levels (see Supplementary Data S1).
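As a rough illustration of this step, the sketch below computes per-sample ratios for one target/index pair and an equal-variance two-sample t-test using only the standard library. The read counts and function names are invented. The p-value routine uses the closed-form series for the t distribution that holds for even degrees of freedom (5 vs. 5 samples gives df = 8); in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=True)` would normally be used instead.

```python
from math import atan, cos, sin, sqrt
from statistics import mean, variance

def dirt_values(target_rc, index_rc):
    """Per-sample ratio of target read counts to index-gene read counts."""
    return [t / i for t, i in zip(target_rc, index_rc)]

def pooled_t(a, b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

def two_tailed_p(t, df):
    """Two-tailed p-value of the t distribution; this sketch handles even df only."""
    if df < 2 or df % 2:
        raise ValueError("this sketch only handles even degrees of freedom")
    theta = atan(abs(t) / sqrt(df))
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= (2 * j - 1) / (2 * j) * cos(theta) ** 2
        total += term
    # Clamp at zero to guard against floating-point rounding for very large |t|.
    return max(0.0, 1.0 - sin(theta) * total)

# Invented read counts: five controls and five treated samples.
control = dirt_values([100, 200, 150, 120, 180], [50, 100, 75, 60, 90])
treated = dirt_values([60, 110, 95, 70, 100], [52, 98, 80, 62, 88])
t_stat, df = pooled_t(control, treated)
p_value = two_tailed_p(t_stat, df)
```

Here the index gene tracks the target closely in controls but not after treatment, so the ratio separates the two groups and the p-value is small.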
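The filtering and selection logic can be sketched as follows. This is one possible reading of the text: the ordering (filter denominators, keep the lowest raw p-value per target, then adjust across the retained entries) is an assumption. The adjustment shown is the standard Benjamini–Hochberg step-up procedure with monotonicity enforced and values capped at 1, which may differ in detail from the authors' script. All gene names and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Standard BH adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_dirts(candidates, denominator_p, cutoff=0.1):
    """candidates: iterable of (target, index_gene, raw_p).

    Drops pairs whose denominator gene looks differentially expressed,
    keeps the best remaining pair per target, then BH-adjusts.
    """
    best = {}
    for target, index_gene, p in candidates:
        if denominator_p[index_gene] < cutoff:
            continue
        if target not in best or p < best[target][1]:
            best[target] = (index_gene, p)
    targets = sorted(best)
    adjusted = benjamini_hochberg([best[t][1] for t in targets])
    return {t: (best[t][0], best[t][1], a) for t, a in zip(targets, adjusted)}

# Hypothetical candidates and denominator-gene p-values.
candidates = [
    ("g1", "i1", 0.001), ("g1", "i2", 0.0005),
    ("g2", "i1", 0.2), ("g2", "i3", 0.03),
]
denominator_p = {"i1": 0.5, "i2": 0.05, "i3": 0.8}
selected = select_dirts(candidates, denominator_p)
```

In the toy input, the pair g1/i2 has the smallest raw p-value but is discarded because its denominator falls below the 0.1 cutoff.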

2.4. DESeq2/edgeR Analyses

We used R version 4.3.1 together with the Bioconductor packages DESeq2 (v1.40.2) and edgeR (v3.42.4) for differential expression analyses. Parallel to the DiRT pipeline, we selected the same 10,000 most abundantly expressed genes from the D. melanogaster (dm6) transcriptome to ensure consistency across methods.
For the DESeq2 analysis, the raw RCs were normalized using the RLE method. Differential expression was assessed using the Wald test, and p-values were adjusted for multiple testing using the Benjamini–Hochberg false discovery rate (FDR) correction, as implemented within the DESeq2 framework (see Supplementary Data S2).
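For readers unfamiliar with RLE, the median-of-ratios idea behind it can be illustrated with a short sketch. This is a simplified standalone illustration, not the DESeq2 implementation; the counts and the function name `rle_size_factors` are invented.

```python
from math import exp, log
from statistics import median

def rle_size_factors(counts):
    """counts: dict gene -> list of raw counts per sample; returns one factor per sample."""
    n_samples = len(next(iter(counts.values())))
    # Keep only genes with no zero counts so the log geometric mean is finite.
    usable = [rc for rc in counts.values() if all(c > 0 for c in rc)]
    log_geo_means = [sum(log(c) for c in rc) / n_samples for rc in usable]
    return [
        exp(median(log(rc[j]) - g for rc, g in zip(usable, log_geo_means)))
        for j in range(n_samples)
    ]

# Invented counts: sample 2 was sequenced exactly twice as deeply as sample 1.
counts = {"a": [10, 20], "b": [100, 200], "c": [5, 10], "z": [0, 3]}
size_factors = rle_size_factors(counts)
normalized = {g: [c / s for c, s in zip(rc, size_factors)] for g, rc in counts.items()}
```

Because the factors are estimated from all samples jointly, adding or removing samples changes the geometric means and therefore every sample's factor.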
For the edgeR analysis, normalization was performed using the trimmed mean of M-values (TMM) method. Differential expression was assessed using a two-sample, two-tailed t-test assuming equal variance, identical to the statistical approach applied in the DiRT analysis. p-values were adjusted using the Benjamini–Hochberg method, ranking 10,000 genes. Adjusted p-values greater than 1 were indicated as NA (see Supplementary Data S3).

2.5. Heatmaps

Heatmaps were generated to visualize the expression profiles of 50 genes selected by three different normalization methods: DiRT, DESeq2, and edgeR. For the top 50 DiRTs, numerator genes with the lowest adjusted p-values were used; for DESeq2, genes were selected based on the lowest p-values obtained from the Wald test; and for edgeR, genes with the lowest p-values from the t-test were chosen. Expression values were normalized relative to the mean expression across all RNA-Seq samples, and heatmaps were created using Python libraries including matplotlib, seaborn, and pandas. Red indicates expression values greater than 1 (above average) and blue represents values lower than 1 (below average).
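The scaling used for the heatmaps amounts to dividing each value by the mean across samples. A minimal sketch follows, with invented values and hypothetical helper names; the plotting itself (matplotlib/seaborn) is omitted.

```python
from statistics import mean

def relative_to_mean(values):
    """Divide each value by the mean across all samples for that gene or DiRT."""
    m = mean(values)
    return [v / m for v in values]

def colour(value):
    """Colour convention described in the text: red above 1, blue below 1."""
    if value > 1:
        return "red"
    if value < 1:
        return "blue"
    return "neutral"

scaled = relative_to_mean([50, 100, 150])
colours = [colour(v) for v in scaled]
```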

2.6. Methods for p-Value Calculation and Adjustment

p-values were calculated using a two-tailed, two-sample t-test assuming equal variance; the control samples (C1–C5) were compared to treated samples (T1–T5) based on either TMM-normalized values (edgeR) or DiRT-normalized values. Multiple testing corrections were performed using the Benjamini–Hochberg procedure. The number of genes ranked for the adjustment was 9988 for DiRT. For the DESeq2 package, p-values and adjusted p-values were calculated using the internal Wald test. None of the adjusted p-values from the edgeR analysis and only two genes from the DESeq2 analysis reached significance at a threshold of 0.05; therefore, unadjusted p-values were used for subsequent analyses.

2.7. KEGG Pathway Analysis of DiRT-Derived DEGs

From the filtered DiRTs with validation-adjusted p-values below 0.05 (Supplementary Data S4), we manually identified the apparent DEGs. These DEGs were either consistently upregulated or downregulated across all nine independent experiments (Supplementary Data S5). The remaining putative DEGs in the DiRT set lacked complete directional concordance. KEGG pathway analysis was then performed separately for the upregulated DEGs and downregulated DEGs, using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) [20].
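The directional-concordance criterion can be expressed as a simple check. The sketch below assumes that each experiment contributes one control and one treated value that are compared pairwise (C1 with T1, and so on); that pairing, the values, and the function name are assumptions for illustration.

```python
def direction(control, treated):
    """Classify a gene as 'up', 'down', or 'mixed' across paired experiments."""
    diffs = [t - c for c, t in zip(control, treated)]
    if all(d > 0 for d in diffs):
        return "up"
    if all(d < 0 for d in diffs):
        return "down"
    return "mixed"

# Invented normalized expression values for nine experiments.
control = [10, 12, 11, 9, 10, 13, 12, 11, 10]
always_lower = [7, 8, 9, 6, 8, 9, 10, 8, 7]
one_exception = [7, 8, 9, 6, 8, 9, 10, 8, 12]

strict_call = direction(control, always_lower)
relaxed_case = direction(control, one_exception)
```

Under this strict definition a single discordant experiment is enough to exclude a gene, which is consistent with the small number of strictly defined DEGs reported.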

3. Results

3.1. Analyses of Methyl-Lucidone-Treated D. melanogaster RNA-Seq Using DESeq2 and edgeR

We conducted RNA-Seq on second-instar D. melanogaster larvae treated with either methyl lucidone or ethanol (as a control) across five independent experiments. DESeq2 identified the following two DEGs at adjusted values of p < 0.05: CG14265 and Gyc76C (Supplementary Data S2). EdgeR detected no significant DEGs under the same threshold (Supplementary Data S3).
Validation using four additional independent RNA-Seq experiments (four control and four treated datasets) showed no significant differential expression for CG14265 or Gyc76C (Figure 1a,b). Similarly, the two edgeR candidates with the lowest p-values (CG7414 and CG13197) showed no significance during validation (Figure 1c,d).

3.2. DEG-by-Index Ratio Transformation (DiRT) Normalization Identifies Reproducible DEGs

In contrast to the limited number of DEGs identified using DESeq2 and edgeR, DiRT normalization identified 1608 DiRTs with the ability to distinguish between control and treatment groups out of 9988 total DiRTs (Supplementary Data S1; adjusted p < 0.05). Heatmap analysis revealed clear separation between control and treated samples using DiRT-normalized values (Figure 2a); this pattern was not observed with DESeq2 (Figure 2b) or edgeR (Figure 2c). Moreover, the top 50 DiRT-derived DEGs—defined as numerator DEG–denominator index gene pairs—showed minimal overlap with the top candidates identified using DESeq2 or edgeR (Figure 2d).
Among the 1608 DiRTs, 280 met the validation criterion with adjusted p-values below 0.05 (Supplementary Data S4). DiRT normalization followed by validation thus identified methyl lucidone-responsive DEGs that were undetectable using conventional methods. Figure 3 shows two examples of DiRTs with the lowest adjusted p-values that also satisfied the validation criterion (p < 0.01): (a) CG9259/bond and (b) tim/MTF-1. Both DiRTs were validated with complete consistency across the four control and four treated samples (Figure 3).
The DEGs associated with these DiRTs were CG9259 (Figure 4a) and timeless (tim) (Figure 4d), both of which showed consistent downregulation across all nine RNA-Seq datasets (five discovery and four validation datasets). In contrast, their respective index genes, bond (Figure 4b) and MTF-1 (Figure 4e), displayed inconsistent or negligible differences in expression between control and treated samples. Control samples demonstrated strong correlations in expression between the DEG–index gene pairs, including the validated controls and discovery controls (e.g., CG9259/bond: Figure 4c; tim/MTF-1: Figure 4f).
Among the top 10 DiRTs with the lowest adjusted p-values meeting the validation criterion (p < 0.01), we identified eight DiRT-associated DEGs that exhibited consistent up- or downregulation across all nine independent RNA-Seq experiments (Supplementary Data S5; indicated in yellow). The respective index genes of these DiRTs displayed inconsistent or negligible differences in expression between control and treated samples (Supplementary Data S5). In these eight cases, the DiRT values of the DEG–index pairs were derived from either RC_DEG/RC_index or RC_index/RC_DEG calculations. The remaining two DiRTs consisted of DEG–DEG pairs: one showed consistent expression changes across all nine datasets (shown in green), and the other showed consistency in eight datasets (shown in orange). Notably, in both DEG–DEG pairs, the two genes were oppositely regulated.

4. Discussion

As this study’s primary goal was technical validation of DiRT, we did not pursue the deep biological interpretation of individual DEGs. Nevertheless, the DEGs identified via DiRT normalization exhibited distinct molecular characteristics.
From the 280 validated DiRTs, we manually identified 109 strictly defined DEGs. These DEGs exhibited consistent regulation across all nine independent experiments (86 entirely upregulated and 23 entirely downregulated genes; Supplementary Data S5). KEGG pathway analysis of the upregulated DEGs using DAVID revealed significant (adjusted p < 0.05) enrichment in pathways related to methyl lucidone detoxification, including the proteasome, drug metabolism, and xenobiotic metabolism via cytochrome P450 or other enzymes (Table 2).
KEGG analysis of the 23 downregulated DEGs did not reveal significant pathway enrichment, likely due to the limited number of genes in this set. The number of downregulated genes—and thus the potential for detecting enriched pathways—could increase substantially if less stringent criteria for DEG definition were applied.
In insects, JH signaling is mediated by the Methoprene-tolerant bHLH-PAS receptor (Met) together with the coactivator Taiman (also known as SRC or FISC), and this complex activates JH-responsive genes such as Kr-h1 [21]. In the mosquito Aedes aegypti, a MET/CYCLE (CYC) bHLH-PAS heterodimer links JH to circadian outputs in a light-dependent manner [22]. In bumble bees (Bombus terrestris), JH was shown to strengthen circadian rhythms and accelerate rhythm development in young workers [23]. In Drosophila, tim encodes a core clock protein that forms a complex with PERIOD (PER) to repress the transcriptional activators CLOCK–CYCLE (CLK–CYC) [24]. Our finding that tim is downregulated by the JH disruptor methyl lucidone suggests that JH signaling can intersect with the circadian clock through shared bHLH-PAS transcription factors and hormone-dependent modulation of clock gene expression, as seen in MET/CYC-mediated regulation in mosquitoes and hormone-induced changes in circadian outputs in other insects.
We observed that DEG–DEG pairs with opposite regulation could generate DiRT values with low adjusted p-values (Supplementary Data S6). Such combinations may represent artificial pairings that coincidentally satisfy the selection criteria. To minimize their occurrence, we applied a Wald test p-value threshold (>0.1) to the index gene; however, this approach did not fully eliminate DEG–DEG pairings. Nonetheless, only two of the top 10 DiRTs fell into this category, indicating that the majority of DiRTs are composed of genuine DEG–index or index–DEG pairs. This phenomenon likely occurs because dividing the expression of an upregulated gene by that of a downregulated gene (or vice versa) can amplify the treatment–control ratio, producing exceptionally small adjusted p-values even in the absence of a stable reference. While such artificial pairings are statistically possible, their low frequency in our dataset suggests minimal impact on overall DiRT performance. In many practical applications, the inclusion of a small number of these pairs is unlikely to affect the main biological conclusions, even though they can rank among the top candidates. Nevertheless, awareness of their potential presence remains important for accurate interpretation of DiRT results.
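The amplification effect can be made concrete with a small numerical sketch. The mean expression values below are invented: one gene rises 1.5-fold, one falls 1.5-fold, and a hypothetical index gene stays unchanged.

```python
def ratio_fold(numerator, denominator):
    """Treated/control fold change of a ratio; each argument is (control_mean, treated_mean)."""
    control_ratio = numerator[0] / denominator[0]
    treated_ratio = numerator[1] / denominator[1]
    return treated_ratio / control_ratio

up_gene = (100.0, 150.0)    # upregulated 1.5-fold
down_gene = (90.0, 60.0)    # downregulated 1.5-fold
index_gene = (80.0, 80.0)   # unresponsive to treatment

deg_index_fold = ratio_fold(up_gene, index_gene)  # reflects the DEG alone
deg_deg_fold = ratio_fold(up_gene, down_gene)     # product of both effects
```

With a stable index gene, the ratio changes by the DEG's own 1.5-fold; with an oppositely regulated denominator, it changes by 1.5 × 1.5 = 2.25-fold, which is why such pairs can reach very small p-values.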
An additional consideration is pleiotropy, where a single DEG may influence multiple biological processes or pathways simultaneously [25]. In the context of methyl lucidone–responsive DEGs, such pleiotropic effects could mean that the observed transcriptional changes are driven by overlapping roles in different pathways, making direct functional attributions more challenging. Although this complexity does not alter the statistical robustness of DiRT, it highlights the need for careful biological interpretation, particularly for DEGs with known multifunctional roles.
The differences between Figure 1 and Figure 4 reflect the effect of normalizing with a different number of RNA-Seq datasets—10 discovery datasets in the former versus all 18 datasets in the latter. For conventional normalization methods such as RLE and TMM, scaling factors are estimated from the entire set of samples under analysis. The addition of new datasets can alter these scaling factors, particularly if the added samples differ in sequencing depth, library composition, or batch-specific biases, leading to subtle shifts in normalized values for all samples. In contrast, DiRT normalization is based on the DEG–index gene ratios computed within each dataset, which reduces the influence of unrelated samples and minimizes the propagation of batch effects from additional datasets. Although both approaches are influenced by sample composition, DiRT remained relatively stable across dataset expansions. This highlights its potential advantage in multi-batch RNA-Seq studies.
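One part of this argument is purely algebraic: any per-sample scaling factor cancels in a within-sample ratio. The sketch below illustrates only that point, using invented read counts and two arbitrary size factors. It does not model gene-specific batch effects, which a ratio does not remove.

```python
def rescale(sample, size_factor):
    """Apply a global, per-sample scaling factor to every gene."""
    return {gene: count / size_factor for gene, count in sample.items()}

def ratio(sample, numerator, denominator):
    return sample[numerator] / sample[denominator]

# Invented read counts for a single sample.
sample = {"tim": 120.0, "MTF-1": 60.0, "other": 800.0}

# The same sample under two different size factors, e.g. as estimated from
# 10 datasets versus 18 datasets (factor values are arbitrary).
scaled_a = rescale(sample, 0.8)
scaled_b = rescale(sample, 1.3)

ratio_a = ratio(scaled_a, "tim", "MTF-1")
ratio_b = ratio(scaled_b, "tim", "MTF-1")
```

The individual normalized values differ between the two scalings (150 versus about 92 for tim), while the tim/MTF-1 ratio is 2 in both.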
DiRT normalization identified numerous DEGs with successful validation, demonstrating its capability to detect methyl lucidone-responsive genes. As the primary focus of this study was the technical validation of DiRT, we did not pursue an in-depth biological interpretation of individual DEGs. Nevertheless, DiRT normalization identified 1608 putative DEGs across five discovery RNA-Seq experiments; 280 of these DiRTs were validated using four independent validation datasets. We expect that a substantial portion of these 280 DiRTs represent genuine DEGs, given that DEGs from 8 of the top 10 DiRTs showed consistent regulation across all nine RNA-Seq experiments. The potential biological relevance of these DEGs to methyl lucidone treatment remains to be explored in future studies.
The partial or failed validation observed for some DiRTs likely stems from the limited scope of the initial discovery experiments, which may not fully capture biological and technical variability. Nevertheless, DiRT yielded markedly higher DEG discovery rates compared to conventional methods. Unlike global normalization approaches, DiRT’s gene-specific strategy effectively mitigated batch effects in cases where DESeq2 and edgeR failed.

5. Conclusions

In conclusion, we developed the DiRT method, a gene-specific normalization approach that identifies robust DEGs in D. melanogaster RNA-seq datasets under various experimental conditions. By leveraging pairwise normalization with dynamically selected index references, DiRT effectively minimizes background noise and significantly improves the reproducibility of DEG detection across datasets. Compared to conventional normalization methods such as DESeq2 and edgeR, DiRT identified distinct sets of DEGs with validation and clearer expression patterns, demonstrating its applicability to complex systems where experimental heterogeneity obscures transcriptional responses. To further establish its generality, DiRT should be tested using RNA-seq datasets from additional species and diverse experimental designs. Moreover, functional studies—such as targeted knockdowns of top-ranked DiRT-identified DEGs using RNA interference or CRISPR, followed by phenotypic analysis or biochemical assays—could provide direct evidence linking transcriptional changes to functional outcomes.

Supplementary Materials

The following supporting information can be downloaded from https://www.mdpi.com/article/10.3390/insects16090898/s1, Data S1. DiRT-normalized measures of 10 RNA-Seq; Data S2. RLE-normalized measures of 10 RNA-Seq; Data S3. TMM-normalized measures of 10 RNA-Seq; Data S4. List of 280 validated DiRTs meeting the validation criterion (p-value < 0.01) and their DiRT values in nine RNA-Seq experiments; Data S5. 109 strictly defined DiRT-derived DEGs which exhibited consistent regulation—86 entirely upregulated and 23 entirely downregulated genes—across all nine independent experiments; Data S6. DEG/index or index/DEG pairs and their RLE-normalized expression values for the top 10 DiRTs with the lowest adjusted p-values.

Author Contributions

Conceptualization, S.W.S. and H.-W.O.; software, S.W.S.; formal analysis, S.W.S., J.A.K., J.H.J., K.P. and S.L.; writing—original draft preparation, S.W.S.; writing—review and editing, J.A.K., S.L., J.H.J., K.P. and H.-W.O.; project administration, S.W.S. and H.-W.O.; funding acquisition, H.-W.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Korea Research Institute of Bioscience and Biotechnology (KRIBB) Research Initiative Program (KGM1022511) grant awarded to H.-W.O., and the Brain Pool Program funded by the Ministry of Science and ICT through the National Research Foundation of Korea (2022H1D3A2A01053247) grant awarded to H.-W.O.

Data Availability Statement

The RNA-Seq datasets were deposited in the NCBI SRA database. The accession numbers are listed in Table 1. The Python script used to generate the DiRT candidate databases is available in the GitHub repository [https://github.com/shinwoongg/SHIN].

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DEG: differentially expressed gene
RLE: relative log expression
TMM: trimmed mean of M-values
DiRT: DEG-by-index ratio transformation
JH: juvenile hormone
RNA-Seq: RNA sequencing
CDA: compositional data analysis
NSD: normalized standard deviation

References

  1. Lee, S.H.; Oh, H.W.; Fang, Y.; An, S.B.; Park, D.S.; Song, H.H.; Oh, S.R.; Kim, S.Y.; Kim, S.; Kim, N.; et al. Identification of plant compounds that disrupt the insect juvenile hormone receptor complex. Proc. Natl. Acad. Sci. USA 2015, 112, 1733–1738.
  2. Shin, S.W.; Jeon, J.H.; Jeong, S.A.; Kim, J.-A.; Park, D.-S.; Shin, Y.; Oh, H.-W. A plant diterpene counteracts juvenile hormone-mediated gene regulation during Drosophila melanogaster larval development. PLoS ONE 2018, 13, e0200706.
  3. Shin, S.-W.; Jeon, J.-H.; Kim, J.-A.; Park, D.-S.; Shin, Y.-J.; Oh, H.-W. Inducible expression of several Drosophila melanogaster genes encoding juvenile hormone binding proteins by a plant diterpene secondary metabolite, methyl lucidone. Insects 2022, 13, 420.
  4. Riddiford, L.M. How does juvenile hormone control insect metamorphosis and reproduction? Gen. Comp. Endocrinol. 2012, 179, 477–484.
  5. Jindra, M.; Bellés, X.; Shinoda, T. Molecular basis of juvenile hormone signaling. Curr. Opin. Insect Sci. 2015, 11, 39–46.
  6. Byron, S.A.; Van Keuren-Jensen, K.R.; Engelthaler, D.M.; Carpten, J.D.; Craig, D.W. Translating RNA sequencing into clinical diagnostics: Opportunities and challenges. Nat. Rev. Genet. 2016, 17, 257–271.
  7. Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550.
  8. Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010, 26, 139–140.
  9. Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C.W.; Shi, W.; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47.
  10. Law, C.W.; Chen, Y.; Shi, W.; Smyth, G.K. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014, 15, R29.
  11. Aitchison, J. The statistical analysis of compositional data. J. R. Stat. Soc. Ser. B Methodol. 1982, 44, 139–177.
  12. Gloor, G.B.; Macklaim, J.M.; Vu, M.; Fernandes, A.D. Compositional uncertainty should not be ignored in high-throughput sequencing data analysis. Austrian J. Stat. 2016, 45, 73–87.
  13. Quinn, T.P.; Erb, I.; Gloor, G.; Notredame, C.; Richardson, M.F.; Crowley, T.M. A field guide for the compositional analysis of any-omics data. Gigascience 2019, 8, giz107.
  14. Hoyt, J.M.; Wilson, S.K.; Kasa, M.; Rise, J.S.; Topalidou, I.; Ailion, M. The SEK-1 p38 MAP kinase pathway modulates Gq signaling in Caenorhabditis elegans. G3 2017, 7, 2979–2989.
  15. Bhargava, M.; Viken, K.; Wang, Q.; Jatap, P.; Bitterman, P.; Ingbar, D.; Wendt, C. Bronchoalveolar lavage fluid protein expression in acute respiratory distress syndrome provides insights into pathways activated in subjects with different outcomes. Sci. Rep. 2017, 7, 7464.
  16. Chen, S.; Zhou, Y.; Chen, Y.; Gu, J. fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 2018, 34, i884–i890.
  17. dos Santos, G.; Schroeder, A.J.; Goodman, J.L.; Strelets, V.B.; Crosby, M.A.; Thurmond, J.; Emmert, D.B.; Gelbart, W.M.; Flybase Consortium. FlyBase: Introduction of the Drosophila melanogaster Release 6 reference genome assembly and large-scale migration of genome annotations. Nucleic Acids Res. 2015, 43, D690–D697.
  18. Sirén, J.; Välimäki, N.; Mäkinen, V. Indexing graphs for path queries with applications in genome research. IEEE ACM Trans. Comput. Biol. Bioinform. 2014, 11, 375–388.
  19. Liao, Y.; Smyth, G.K.; Shi, W. featureCounts: An efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 2014, 30, 923–930.
  20. Dennis, G.; Sherman, B.T.; Hosack, D.A.; Yang, J.; Gao, W.; Lane, H.C.; Lempicki, R.A. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 2003, 4, R60.
  21. Zhang, X.; Li, S.; Liu, S. Juvenile Hormone Studies in Drosophila melanogaster. Front. Physiol. 2022, 12, 785320.
  22. Shin, S.W.; Zou, Z.; Saha, T.T.; Raikhel, A.S. bHLH-PAS heterodimer of methoprene-tolerant and Cycle mediates circadian expression of juvenile hormone-induced mosquito genes. Proc. Natl. Acad. Sci. USA 2012, 109, 16576–16581.
  23. Pandey, A.; Motro, U.; Bloch, G. Juvenile Hormone Affects the Development and Strength of Circadian Rhythms in Young Bumble Bee (Bombus terrestris) Workers. Neurobiol. Sleep Circadian Rhythm. 2020, 9, 100056.
  24. Panda, S.; Hogenesch, J.B.; Kay, S.A. Circadian rhythms from flies to human. Nature 2002, 417, 329–335.
  25. Li, Y.; Zhang, J. Transcriptomic and proteomic effects of gene deletion are not evolutionarily conserved. Genome Res. 2025, 35, 512–521.
Figure 1. Validation of top DEG candidates identified using DESeq2 and edgeR. Expression profiles of the two most significant DEGs from (a,b) DESeq2 (CG14265, Gyc76C) and (c,d) edgeR (CG7414, CG13197) analyses. Initial analysis used 10 RNA-Seq datasets (C1–C5 controls, T1–T5 treated with methyl lucidone; blue/red bars). Validation used 8 independent datasets (C6–C9 controls, T6–T9 treated; light blue/orange bars). RLE: relative log expression (used in DESeq2); TMM: trimmed mean of M-values (used in edgeR); DEG: differentially expressed gene.
Figure 2. Heatmaps of the top 50 measures identified using DiRT, DESeq2, and edgeR, showing minimal overlap between DiRT-derived DEGs and the genes with the lowest p-values obtained using DESeq2/edgeR. (a) Heatmap of the top 50 DiRT-normalized measures, ranked by lowest p-value. (b) Heatmap of the top 50 genes identified using DESeq2, ranked by lowest p-value. (c) Heatmap of the top 50 genes identified using edgeR, ranked by lowest p-value. (d) Venn diagram showing minimal overlap between the top 50 DiRT-derived DEGs and the top 50 genes identified using DESeq2 or edgeR, each ranked by lowest p-value. All measures in (a–c) were normalized to the average values across 10 RNA-Seq samples (C1–C5 for controls; T1–T5 for treated samples). Red indicates values greater than 1, and blue indicates values less than 1. DEG: differentially expressed gene; DiRT: DEG-by-index ratio transformation.
Figure 3. Discovery and validation of DiRTs. DiRT values—calculated as the ratio of read counts between a putative DEG and its corresponding index gene—were determined using five initial RNA-Seq experiments and subsequently validated in four independent RNA-Seq datasets. Both panels (a,b) illustrate cases of complete validation, with dotted lines representing the predefined threshold for validation. The Y-axis denotes DiRT values. DEG: differentially expressed gene; DiRT: DEG-by-index ratio transformation.
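The ratio described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes only what the caption states, namely that a DiRT value is the read count of a putative DEG divided by the read count of its index gene in the same sample. The function name `dirt_values` and all read counts below are hypothetical and are not data from the study.

```python
def dirt_values(deg_counts, index_counts):
    """Return per-sample ratios of DEG read counts to index-gene read counts."""
    if len(deg_counts) != len(index_counts):
        raise ValueError("count lists must have the same length")
    ratios = []
    for deg, idx in zip(deg_counts, index_counts):
        if idx <= 0:
            raise ValueError("index gene count must be positive")
        ratios.append(deg / idx)
    return ratios


# Hypothetical read counts for one DEG/index-gene pair (illustration only).
control_deg = [400, 380, 420, 410, 390]
control_index = [200, 190, 210, 205, 195]
treated_deg = [210, 190, 200, 220, 180]
treated_index = [205, 195, 200, 210, 190]

control_ratio = dirt_values(control_deg, control_index)
treated_ratio = dirt_values(treated_deg, treated_index)

print("control DiRT values:", [round(r, 3) for r in control_ratio])
print("treated DiRT values:", [round(r, 3) for r in treated_ratio])
```

In this toy example the index gene tracks the DEG in controls, so the control ratios stay constant even though raw counts vary between samples, while the treated ratios shift downward.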
Figure 4. Expression patterns of DiRT-identified DEGs and their corresponding index genes. Expression profiles of two DiRT-associated DEGs and their corresponding index genes, selected as the top two DiRTs with the lowest adjusted p-values from the initial screening performed across 10 RNA-Seq datasets; they were validated with a p-value threshold of <0.01 using eight additional RNA-Seq datasets. Panels (a) and (d) show the RLE-normalized expression of DEGs CG9259 and tim, respectively, while panels (b,e) show the RLE-normalized expression of their corresponding index genes bond and MTF-1. Blue bars represent control samples from both the discovery set (C1–C5) and the validation set (C6–C9); red bars represent methyl lucidone-treated samples from the discovery set (T1–T5) and validation set (T6–T9). Panels (c,f) illustrate strong expression correlations between DEG–index gene pairs across all nine control samples, evident in both discovery and validation controls. To evaluate these correlations (y-axis: normalized DESeq2), RLE-normalized expression values were further scaled relative to the average RLE values computed across nine independent control RNA-Seq datasets. RLE: relative log expression (used in DESeq2); DEG: differentially expressed gene; DiRT: DEG-by-index ratio transformation.
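The scaling and correlation step described for panels (c,f) of Figure 4 can be sketched as follows. This is an illustration under stated assumptions: expression values are divided by their mean across the control samples, and the strength of the DEG–index relationship is summarized with a Pearson correlation coefficient. The caption does not name the correlation measure, so Pearson is an assumption here, and the nine values per gene are hypothetical.

```python
from math import sqrt
from statistics import mean


def scale_to_mean(values):
    """Divide each value by the mean of the list, so the scaled mean is 1."""
    m = mean(values)
    return [v / m for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical normalized expression in nine control samples (illustration only).
deg_expr = [120, 95, 140, 110, 130, 100, 150, 105, 125]
index_expr = [60, 48, 71, 54, 66, 51, 74, 53, 62]

deg_scaled = scale_to_mean(deg_expr)
index_scaled = scale_to_mean(index_expr)
r = pearson_r(deg_scaled, index_scaled)
print("Pearson r across controls:", round(r, 3))
```

Dividing by the mean changes only the scale of each axis, so the correlation coefficient is the same before and after scaling; the scaling serves to put both genes on a comparable axis for plotting.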
Table 1. General information and SRA accession numbers for the 18 Drosophila melanogaster RNA-Seq datasets.
Sample ID   Reads Assigned to dm6-Annotated Genes   NCBI SRA Accession   Condition
C1          67,399,437                              SRR22894279          EtOH-treated
C2          94,044,531                              SRR22891343          EtOH-treated
C3          92,115,653                              SRR22891368          EtOH-treated
C4          68,365,272                              SRR22891367          EtOH-treated
C5          74,929,527                              SRR22891357          EtOH-treated
T1          50,864,597                              SRR22891355          Methyl lucidone-treated
T2          91,480,523                              SRR22891354          Methyl lucidone-treated
T3          92,471,264                              SRR22891353          Methyl lucidone-treated
T4          67,774,651                              SRR22891366          Methyl lucidone-treated
T5          76,983,129                              SRR22891362          Methyl lucidone-treated
C6          65,937,883                              SRR22891360          EtOH-treated
C7          62,985,695                              SRR22891359          EtOH-treated
C8          63,522,838                              SRR22891358          EtOH-treated
C9          69,304,669                              SRR22891356          EtOH-treated
T6          66,503,607                              SRR22891364          Methyl lucidone-treated
T7          66,182,329                              SRR22891363          Methyl lucidone-treated
T8          69,941,736                              SRR22891361          Methyl lucidone-treated
T9          65,816,863                              SRR22891365          Methyl lucidone-treated
Table 2. KEGG pathway analysis of 86 DiRT-derived downregulated DEGs.
KEGG Pathway                                    Gene Count   p-Value       Benjamini
Proteasome                                      8            2.8 × 10^-6   0.00015
Drug metabolism - cytochrome P450               8            0.000015      0.00033
Metabolism of xenobiotics via cytochrome P450   8            0.000019      0.00033
Drug metabolism - other enzymes                 7            0.001         0.014
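The "Benjamini" column in Table 2 reports multiplicity-adjusted p-values. The sketch below shows the standard Benjamini–Hochberg step-up adjustment as a general illustration. It is not a reconstruction of how the table was produced: the total number of pathways tested is not given here, so the table values cannot be reproduced from the four listed p-values alone, and the input p-values in the example are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values (illustration only).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print([round(a, 4) for a in adj])
```

Because of the monotonicity step, neighbouring ranks can end up with identical adjusted values even when their raw p-values differ.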

Citation

Shin, S.W.; Kim, J.A.; Jeon, J.H.; Park, K.; Lee, S.; Oh, H.-W. Alternative Characterizations of Methyl Lucidone-Responsive Differentially Expressed Genes in Drosophila melanogaster Using DEG-by-Index Ratio Transformation. Insects 2025, 16, 898. https://doi.org/10.3390/insects16090898
