Yadong Yang, Tao Zhang, Rudan Xiao, Xiaopeng Hao, Huiqiang Zhang, Hongzhu Qu, Bingbing Xie, Tao Wang, Xiangdong Fang.
Abstract
Methods that distinguish healthy individuals from cancer patients based on peripheral blood gene expression intensity are limited by their sensitivity to batch effects and data normalization and by variability between expression profiling assays. To improve the robustness and precision of blood gene expression-based tumour detection, molecular diagnostic tests need a more stable approach. Taking breast cancer as an example, we propose a machine learning-based framework that distinguishes breast cancer patients from healthy subjects by pairwise rank transformation of gene expression intensity in each sample. We demonstrated the diagnostic potential of the method by performing RNA-seq on 37 peripheral blood samples from breast cancer patients and by collecting RNA-seq data from healthy donors in the Genotype-Tissue Expression project and microarray mRNA expression datasets from the Gene Expression Omnibus. The framework was insensitive to experimental batch effects and data normalization, and it can also be applied to the prediction of new samples.
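As an illustration of why a within-sample pairwise rank transformation is stable, the sketch below encodes each gene pair as 1 when the first gene is expressed more highly than the second in that sample, and 0 otherwise. The abstract does not specify the exact encoding or the classifier used, so the function name `pairwise_rank_features`, the binary encoding, and the toy values are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np
from itertools import combinations


def pairwise_rank_features(expr):
    """Hypothetical helper: turn an (n_samples, n_genes) expression matrix
    into binary within-sample gene-pair comparisons.

    Feature k is 1 if gene i is expressed more highly than gene j in that
    sample, where (i, j) is the k-th pair returned alongside the features.
    """
    expr = np.asarray(expr, dtype=float)
    pairs = list(combinations(range(expr.shape[1]), 2))
    i_idx = [i for i, _ in pairs]
    j_idx = [j for _, j in pairs]
    features = (expr[:, i_idx] > expr[:, j_idx]).astype(np.int8)
    return features, pairs


# Toy data (made up): 2 samples, 3 genes.
raw = np.array([[5.0, 1.0, 3.0],
                [2.0, 8.0, 4.0]])
features_raw, pairs = pairwise_rank_features(raw)

# Simulate a batch effect or normalization step: a different positive scale
# per sample followed by a log transform. Both are monotone within a sample,
# so the order of genes inside each sample is unchanged.
rescaled = np.log2(raw * np.array([[10.0], [0.5]]) + 1.0)
features_rescaled, _ = pairwise_rank_features(rescaled)

print(pairs)
print(features_raw)
print(np.array_equal(features_raw, features_rescaled))
```

Because each feature depends only on the relative order of two genes within one sample, any per-sample monotone rescaling leaves the features unchanged, and a new sample can be encoded without reference to the rest of the cohort. The resulting binary matrix could then be passed to any standard classifier.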
Keywords: blood; cancer detection; expression; framework; rank
Year: 2020 PMID: 30895303 DOI: 10.1093/bib/bbz027
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622