Microarray data analysis typically consists in identifying a list of differentially expressed genes (DEG), i.e., the genes that are differentially expressed between two experimental conditions. Variance shrinkage methods have been considered a better choice than the standard t-test for selecting the DEG because they correct the dependence of the error with the expression level. This dependence is mainly caused by errors in background correction, which more severely affects genes with low expression values. Here, we propose a new method for identifying the DEG that overcomes this issue and does not require background correction or variance shrinkage. Unlike current methods, our methodology is easy to understand and implement. It consists of applying the standard t-test directly on the normalized intensity data, which is possible because the probe intensity is proportional to the gene expression level and because the t-test is scale- and location-invariant. This methodology considerably improves the sensitivity and robustness of the list of DEG when compared with the t-test applied to preprocessed data and to the most widely used shrinkage methods, Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA). Our approach is useful especially when the genes of interest have small differences in expression and therefore get ignored by standard variance shrinkage methods.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited