Background: Blood has advantages over tissue samples as a diagnostic tool, and blood mRNA transcriptomics is an exciting research field. Realizing the full potential of blood transcriptomic investigations requires improved methods for gene expression measurement and data interpretation that can detect biological signatures within the “noisy” variability of whole blood. Methods: We demonstrate collection tube bias compensation during the process of identifying a liver cancer-specific gene signature. The list of candidate probe sets for liver cancer was filtered based on repeatability performance previously obtained from technical replicates. We built a prediction model using differential pairs to reduce the impact of confounding factors. We compared its prediction performance on an independent test set against that of an alternative model derived with Weka. The method was then applied to an independent set of 157 blood samples collected in PAXgene tubes. Results: The model discriminated liver cancer equally well in both EDTA- and PAXgene-collected samples, whereas the Weka-derived model (using default settings) was not able to compensate for collection tube bias. Cross-validation results show that our procedure predicted the membership of each sample within the disease groups and healthy controls. Conclusion: Our versatile method for blood transcriptomic investigation overcomes several limitations hampering research in blood-based gene tests.
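The abstract does not spell out how the differential pairs are built, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes, purely for illustration, that a collection tube shifts the log-scale signal of both probes in a pair by the same amount; real tube effects may be probe-specific. The probe names, expression values, offset, and thresholds are all hypothetical.

```python
# Illustrative sketch only: why a within-sample probe difference can be
# less sensitive to a collection-tube shift than a single-probe value.
# ASSUMPTION: the tube effect is one additive offset on the log2 scale,
# shared by both probes of a pair. All names and numbers are hypothetical.


def pair_difference(sample, probe_a, probe_b):
    """Return the log-scale difference between two probes in one sample."""
    return sample[probe_a] - sample[probe_b]


def add_tube_offset(sample, offset):
    """Simulate a tube effect as a constant shift applied to every probe."""
    return {probe: value + offset for probe, value in sample.items()}


def pair_call(sample, probe_a, probe_b, threshold):
    """Call 'cancer' when the pair difference exceeds the threshold."""
    diff = pair_difference(sample, probe_a, probe_b)
    return "cancer" if diff > threshold else "control"


def single_probe_call(sample, probe, threshold):
    """Call 'cancer' when one probe's absolute level exceeds the threshold."""
    return "cancer" if sample[probe] > threshold else "control"


# Hypothetical log2 expression values measured in EDTA tubes.
edta_cancer = {"probe_up": 9.5, "probe_ref": 7.0}
edta_control = {"probe_up": 7.5, "probe_ref": 7.0}

# The same samples with a hypothetical PAXgene shift of +1.25 on every probe.
paxgene_cancer = add_tube_offset(edta_cancer, 1.25)
paxgene_control = add_tube_offset(edta_control, 1.25)

PAIR_THRESHOLD = 1.5    # hypothetical cut-off on the pair difference
SINGLE_THRESHOLD = 8.5  # hypothetical cut-off on the single probe

samples = [
    ("EDTA cancer", edta_cancer),
    ("EDTA control", edta_control),
    ("PAXgene cancer", paxgene_cancer),
    ("PAXgene control", paxgene_control),
]

for label, sample in samples:
    by_pair = pair_call(sample, "probe_up", "probe_ref", PAIR_THRESHOLD)
    by_single = single_probe_call(sample, "probe_up", SINGLE_THRESHOLD)
    print(f"{label}: pair={by_pair}, single={by_single}")
```

In this toy setting the pair difference is identical in both tube types, so the pair-based call does not change, while the single-probe threshold misclassifies the shifted control. Whether a real probe pair behaves this way depends on how similar the tube effect is for the two probes, which is an empirical question the sketch does not address.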
This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.