# t-Test at the Probe Level: An Alternative Method to Identify Statistically Significant Genes for Microarray Data

## Abstract

## 1. Introduction

## 2. Methodology

## 3. Results

#### 3.1. Sensitivity

#### 3.2. Robustness

**Figure 1.** ROC curve comparing the performance (sensitivity) of different ranking methodologies in identifying differentially expressed genes in spike-in data. (**A**) Comparison of our approach (median t-values) with different preprocessing methods used with t-test as ranking method to select the differentially expressed genes. (**B**) Comparison of our approach (median t-values) with the best performance of other ranking methods: t-test, SAM and LIMMA. The best performance of t-test, SAM and LIMMA is obtained when using RMA as a preprocessing algorithm.
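As a reading aid for the ROC comparison, the following is a minimal sketch of how ROC points can be derived from any gene ranking when the truly changed genes are known, as they are in spike-in data. The function names `roc_points` and `auc` and the toy scores are hypothetical, ties between scores are not treated specially, and this is not necessarily the procedure used to draw Figure 1.

```python
import numpy as np


def roc_points(scores, is_spiked):
    """Return (fpr, tpr) arrays obtained by moving a cutoff down the ranking.

    scores: one value per gene; larger means "more likely differentially expressed".
    is_spiked: truth labels, True for a gene known to be differentially expressed.
    """
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(is_spiked, dtype=bool)
    order = np.argsort(-scores, kind="stable")  # best-ranked gene first
    hits = truth[order]
    # After admitting the top i genes: fraction of true genes found (TPR)
    # and fraction of unchanged genes wrongly admitted (FPR).
    tpr = np.concatenate([[0.0], np.cumsum(hits) / truth.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~hits) / (~truth).sum()])
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))


# Toy example (illustrative values only): four genes, two of them spiked.
toy_fpr, toy_tpr = roc_points([0.9, 0.8, 0.3, 0.1], [True, False, True, False])
toy_auc = auc(toy_fpr, toy_tpr)
```

Any ranking statistic can be passed as `scores`, for example the absolute value of a per-gene test statistic, so the same routine can compare several ranking methods on one set of truth labels.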

**Figure 2.** Average of the fraction of genes shared by two lists of differentially expressed genes (overlap) as a function of the sample size using the Leukemia dataset. Each list of differentially expressed genes is composed of the top 100 genes chosen according to different ranking methods, i.e., t-test, SAM and LIMMA (preprocessed by the RMA method), and our approach (median t-value), which does not require a preprocessing algorithm. The average value of the overlap between the lists is calculated over 100 lists chosen randomly.
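The overlap measure in this caption can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes that each list comes from an independent random subsample with the same number of samples per class, and it uses the absolute Welch t-statistic as a stand-in ranking score. The names `welch_t`, `top_k`, `draw` and `mean_overlap` are hypothetical.

```python
import numpy as np


def welch_t(x, y):
    """Absolute Welch t-statistic per row; x and y are (genes, samples) arrays."""
    mx, my = x.mean(axis=1), y.mean(axis=1)
    vx, vy = x.var(axis=1, ddof=1), y.var(axis=1, ddof=1)
    return np.abs(mx - my) / np.sqrt(vx / x.shape[1] + vy / y.shape[1])


def top_k(data, labels, idx, k):
    """Indices of the k highest-scoring genes, using only the samples in idx."""
    sub, lab = data[:, idx], labels[idx]
    scores = welch_t(sub[:, lab == 1], sub[:, lab == 0])
    return set(np.argsort(scores)[-k:].tolist())


def draw(labels, n_per_class, rng):
    """Draw n_per_class sample indices from each class, without replacement."""
    pos = rng.choice(np.flatnonzero(labels == 1), n_per_class, replace=False)
    neg = rng.choice(np.flatnonzero(labels == 0), n_per_class, replace=False)
    return np.concatenate([pos, neg])


def mean_overlap(data, labels, n_per_class, k=100, repeats=100, seed=0):
    """Average fraction of genes shared by two top-k lists from random subsamples."""
    rng = np.random.default_rng(seed)
    fractions = []
    for _ in range(repeats):
        a = top_k(data, labels, draw(labels, n_per_class, rng), k)
        b = top_k(data, labels, draw(labels, n_per_class, rng), k)
        fractions.append(len(a & b) / k)
    return float(np.mean(fractions))
```

Calling `mean_overlap` for a range of `n_per_class` values gives one curve of overlap against sample size; swapping `welch_t` for another score would give the curve for a different ranking method.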

**Figure 3.** Average of the fraction of genes shared by two lists of differentially expressed genes (overlap) as a function of the sample size using the Multiple Myeloma dataset divided according to (**A**) Overall Survival Milestone Outcome (OS-MO) and (**B**) Event Free Survival Milestone Outcome (EFS-MO). Each list of differentially expressed genes is composed of the top 100 genes chosen according to different ranking methods, i.e., t-test, SAM and LIMMA (preprocessed by the RMA method), and our approach (median t-value), which does not require a preprocessing algorithm. The average value of the overlap between the lists is calculated over 100 lists chosen randomly.

**Figure 4.** Average of the fraction of genes shared by two lists of differentially expressed genes (overlap) as a function of the sample size using the Breast Cancer dataset divided according to (**A**) pre-operative treatment response (pCR, pathologic complete response) and (**B**) estrogen receptor (ER) endpoint. Each list of differentially expressed genes is composed of the top 100 genes chosen according to different ranking methods, i.e., t-test, SAM and LIMMA (preprocessed by the RMA method), and our approach (median t-value), which does not require a preprocessing algorithm. The average value of the overlap between the lists is calculated over 100 lists chosen randomly.

## 4. Discussion

## Acknowledgments

## Author Contributions

## Appendices

**Table A1.** Latin Square design, which consists of 14 spike-in gene groups in 14 experimental groups with 3 repetitions.

| gene/exp | 1–3 | 4–6 | 7–9 | 10–12 | 13–15 | 16–18 | 19–21 | 22–24 | 25–27 | 28–30 | 31–33 | 34–36 | 37–39 | 40–42 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1–3 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 |
| 4–6 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 0 |
| 7–9 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 0 | 0.125 |
| 10–12 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 0 | 0.125 | 0.25 |
| 13–15 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 |
| 16–18 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 |
| 19–21 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 |
| 22–24 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 |
| 25–27 | 16 | 32 | 64 | 128 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 |
| 28–30 | 32 | 64 | 128 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 |
| 31–33 | 64 | 128 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 |
| 34–36 | 128 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 | 64 |
| 37–39 | 256 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 |
| 40–42 | 512 | 0 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 |
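The layout of Table A1 follows a cyclic pattern: each row is the first row rotated one position to the left, so every concentration appears exactly once per gene group and once per experimental group. A minimal sketch that regenerates the table is given below; the names `CONCENTRATIONS`, `latin_square` and `group_label` are chosen here for illustration only.

```python
# Spike-in concentrations in the order they appear in the first row of Table A1.
CONCENTRATIONS = [0, 0.125, 0.25, 0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512]


def latin_square(levels):
    """Return a cyclic Latin square: row i is `levels` rotated left by i places."""
    n = len(levels)
    return [[levels[(row + col) % n] for col in range(n)] for row in range(n)]


def group_label(index, size=3):
    """Label for a block of `size` consecutive items, e.g. index 0 -> '1–3'."""
    start = index * size + 1
    return f"{start}–{start + size - 1}"


square = latin_square(CONCENTRATIONS)

# Print the rows in the same pipe-separated form as the table.
for i, row in enumerate(square):
    print(group_label(i), *row, sep=" | ")
```

Reading a cell as `square[g][e]` gives the concentration of gene group `g` in experimental group `e`, both counted from zero.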

**Table A2.** Class distribution of the datasets. The Breast Cancer and Multiple Myeloma datasets were obtained from the MAQC consortium [25]. The training and validation datasets were generated by different experimental groups. In our analysis we considered all samples: both training and validation samples.

| Dataset | Training Set: Number of Samples | Training Set: Positive | Training Set: Negative | Validation Set: Number of Samples | Validation Set: Positive | Validation Set: Negative |
|---|---|---|---|---|---|---|
| Breast Cancer (pCR) | 130 | 33 | 97 | 100 | 15 | 85 |
| Breast Cancer (ER) | 130 | 80 | 50 | 100 | 61 | 39 |
| Multiple Myeloma (OS-MO) | 340 | 51 | 289 | 214 | 27 | 187 |
| Multiple Myeloma (EFS-MO) | 340 | 84 | 256 | 214 | 34 | 180 |

## Conflicts of Interest

© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

