Yan Huang1, Dianshuang Zhou1, Yihan Wang2, Xingda Zhang2, Mu Su1, Cong Wang1, Zhongyi Sun1, Qinghua Jiang1, Baoqing Sun3, Yan Zhang1,3
1. School of Life Science & Technology, Computational Biology Research Center, Harbin Institute of Technology, Harbin 150001, China.
2. College of Bioinformatics Science & Technology, Harbin Medical University, Harbin 150081, China.
3. Department of Allergy & Clinical Immunology, Guangzhou Institute of Respiratory Health, State Key Laboratory of Respiratory Disease, National Clinical Research Center of Respiratory Disease, First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.
Abstract
Aim: We aimed to predict transcription factor (TF) binding events from gene expression and epigenetic modification data. Materials & methods: TF-binding events derived from ENCODE project and The Cancer Genome Atlas data were modeled with the random forest method. Results: The TF-binding predictive models performed well in the GM12878, HeLa, HepG2 and K562 cell lines and were then applied to other cell lines and tissues. The genes bound by the top TFs (MAX and MAZ) were significantly associated with cancer-related processes such as cell proliferation and DNA repair. Conclusion: We successfully constructed TF-binding predictive models in cell lines and applied them to tissues.
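To make the modeling setup concrete, the following is a minimal sketch of a random forest classifier that predicts a binary TF-binding label from expression and epigenetic features. It is not the authors' pipeline: the data are synthetic, the feature names (`expression`, `H3K4me3`, `H3K27ac`, `DNA_methylation`) and the rule used to generate labels are assumptions made purely for illustration, and scikit-learn is assumed as the implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical per-site features; the abstract only says "gene expression
# and epigenetic modifications", so these specific names are assumptions.
feature_names = ["expression", "H3K4me3", "H3K27ac", "DNA_methylation"]
n_sites = 2000
X = rng.normal(size=(n_sites, len(feature_names)))

# Synthetic labels (assumption): binding is more likely with high expression
# and H3K4me3 and low DNA methylation; H3K27ac carries no signal here.
score = 1.5 * X[:, 0] + 1.0 * X[:, 1] - 1.5 * X[:, 3]
y = (score + rng.normal(scale=0.5, size=n_sites) > 0).astype(int)

# Hold out a quarter of the sites to estimate predictive performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Area under the ROC curve on held-out sites, using predicted probabilities
# for the "bound" class (column 1).
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Impurity-based importances indicate which features the forest relied on.
importances = dict(zip(feature_names, model.feature_importances_))

print(f"held-out AUC: {auc:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

In a real analysis the feature matrix would come from measured signals at candidate binding sites, and a model trained in one cell line would be evaluated on another to test transferability, as the abstract describes for other cell lines and tissues.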