
EBT: a statistic test identifying moderate size of significant features with balanced power and precision for genome-wide rate comparisons.

Xinjie Hui, Yueming Hu, Ming-An Sun, Xingsheng Shu, Rongfei Han, Qinggang Ge, Yejun Wang.

Abstract

MOTIVATION: In genome-wide rate comparison studies, objectively identifying an appropriate number of significant features is a major challenge: traditional statistical comparisons without multiple-testing correction can generate a large number of false positives, while multiple-testing correction greatly reduces statistical power.
RESULTS: In this study, we propose a new exact test based on translating the rate comparison into two binomial distributions. With modeled and real datasets, the exact binomial test (EBT) showed an advantage in balancing statistical precision and power, providing an appropriately sized set of significant features for further study. Both correlation analysis and bootstrapping tests demonstrated that EBT is as robust as the typical rate-comparison methods, e.g. the χ² test, Fisher's exact test and the binomial test. A performance comparison among machine learning models built with features identified by different statistical tests further demonstrated the advantage of EBT. The new test was also applied to analyze the genome-wide difference in somatic gene mutation rates between lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), the two main lung cancer subtypes, and a list of new markers was identified that could be lineage-specifically associated with carcinogenesis of LUAD and LUSC, respectively. Interestingly, three cilia genes were found to have selectively high mutation rates in LUSC, possibly implying the importance of cilia dysfunction in carcinogenesis.
AVAILABILITY AND IMPLEMENTATION: An R package implementing EBT can be downloaded freely from http://www.szu-bioinf.org/EBT.
CONTACT: wangyj@szu.edu.cn.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
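
The abstract does not spell out the EBT procedure, so the sketch below does not attempt to reproduce it. It only illustrates the trade-off described under MOTIVATION, using Fisher's exact test (one of the baseline methods named under RESULTS) followed by a Bonferroni adjustment. The choice of Bonferroni is an assumption, since the abstract does not say which multiple-testing correction was considered, and the gene names and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are (feature present, feature absent).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def pmf(x):
        # Hypergeometric probability of seeing x "present" cases in group 1
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = pmf(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in (pmf(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical counts per gene:
    # (mutated in group A, unmutated in A, mutated in group B, unmutated in B)
    tables = {
        "gene_1": (30, 170, 10, 190),
        "gene_2": (12, 188, 9, 191),
        "gene_3": (3, 1, 1, 3),
    }
    raw = [fisher_exact_two_sided(*t) for t in tables.values()]
    adjusted = bonferroni(raw)
    for name, p, q in zip(tables, raw, adjusted):
        print(f"{name}: raw p = {p:.4g}, Bonferroni p = {q:.4g}")
```

With thousands of genes instead of three, the multiplier in `bonferroni` grows accordingly, which is the loss of power that motivates a test such as EBT.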

Year:  2017        PMID: 28472273     DOI: 10.1093/bioinformatics/btx294

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  5 in total

1.  Association of specific gene mutations derived from machine learning with survival in lung adenocarcinoma.

Authors:  Han-Jun Cho; Soonchul Lee; Young Geon Ji; Dong Hyeon Lee
Journal:  PLoS One       Date:  2018-11-12       Impact factor: 3.240

2.  A Multi-Gene Model Effectively Predicts the Overall Prognosis of Stomach Adenocarcinomas With Large Genetic Heterogeneity Using Somatic Mutation Features.

Authors:  Xianming Liu; Xinjie Hui; Huayu Kang; Qiongfang Fang; Aiyue Chen; Yueming Hu; Desheng Lu; Xianxiong Chen; Yejun Wang
Journal:  Front Genet       Date:  2020-08-26       Impact factor: 4.599

3.  Combination of Genetic Markers and Age Effectively Facilitates the Identification of People with High Risk of Preeclampsia in the Han Chinese Population.

Authors:  Lu Zhou; Xinjie Hui; Huijuan Yuan; Yinglin Liu; Yejun Wang
Journal:  Biomed Res Int       Date:  2018-07-19       Impact factor: 3.411

4.  T1SEstacker: A Tri-Layer Stacking Model Effectively Predicts Bacterial Type 1 Secreted Proteins Based on C-Terminal Non-repeats-in-Toxin-Motif Sequence Features.

Authors:  Zewei Chen; Ziyi Zhao; Xinjie Hui; Junya Zhang; Yixue Hu; Runhong Chen; Xuxia Cai; Yueming Hu; Yejun Wang
Journal:  Front Microbiol       Date:  2022-02-08       Impact factor: 5.640

5.  Improvement in prediction of prostate cancer prognosis with somatic mutational signatures.

Authors:  Shengping Zhang; Yafei Xu; Xinjie Hui; Fei Yang; Yueming Hu; Jianlin Shao; Hui Liang; Yejun Wang
Journal:  J Cancer       Date:  2017-09-15       Impact factor: 4.207
