pDHS-ELM: computational predictor for plant DNase I hypersensitive sites based on extreme learning machines.

Shanxin Zhang, Minjun Chang, Zhiping Zhou, Xiaofeng Dai, Zhenghong Xu.

Abstract

DNase I hypersensitive sites (DHSs) are hallmarks of chromatin zones containing transcriptional regulatory elements, making them critical in understanding the regulatory mechanisms of gene expression. Although large numbers of DHSs in plant genomes have been identified by high-throughput techniques, the DHSs obtained experimentally so far cover only a fraction of plant species and cell processes. Furthermore, these experimental methods are both time-consuming and expensive. Hence, automated computational methods are urgently needed to predict DHSs in plant genomes efficiently and accurately. Several methods have recently been proposed to predict DHSs. However, all of them take a long time to build the model, making them unsuitable for very large datasets. In the present work, a new ensemble extreme learning machine (ELM)-based model called pDHS-ELM is proposed to predict DHSs in plant genomes by fusing two different modes of pseudo-nucleotide composition. Two kinds of features, reverse complement k-mer and pseudo-nucleotide composition, were used to represent the DHSs. ELM models were used as the base classifiers, and an ensemble framework combined their outputs. When applied to DHSs in the Arabidopsis thaliana and rice (Oryza sativa) genomes, the proposed method achieved accuracies of up to 88.48% and 87.58%, respectively. Compared with state-of-the-art techniques, pDHS-ELM achieved higher sensitivity, specificity, and Matthews correlation coefficient with much less training and test time. Using pDHS-ELM, we identified 42,370 and 103,979 DHSs in the A. thaliana and rice genomes, respectively. The predicted DHSs were depleted of bulk nucleosomes and were tightly associated with transcription factors. Approximately 90% of the predicted DHSs overlapped with transcription factors. We also demonstrated that the predicted DHSs were associated with DNA methylation, nucleosome positioning/occupancy, and histone modification. These results suggest that pDHS-ELM is a promising and powerful new tool for the analysis of transcriptional regulatory elements. Our pDHS-ELM tool is available at https://github.com/shanxinzhang/pDHS-ELM/ .
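
The pipeline described in the abstract (sequence features, ELM base classifiers, an ensemble over their outputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the function and class names, the sigmoid hidden layer, the hidden-layer size, and the averaging rule used to combine base classifiers are all assumptions chosen for illustration, and only the reverse complement k-mer feature is shown (pseudo-nucleotide composition is omitted).

```python
import itertools
import numpy as np

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def revc_kmer_features(seq, k=2):
    """Normalized k-mer frequencies, merging each k-mer with its reverse complement.

    Each k-mer is mapped to a canonical form: the lexicographically smaller
    of the k-mer and its reverse complement (an illustrative convention).
    """
    kmers = ("".join(p) for p in itertools.product("ACGT", repeat=k))
    canonical = sorted({min(m, reverse_complement(m)) for m in kmers})
    index = {m: i for i, m in enumerate(canonical)}
    counts = np.zeros(len(canonical))
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows with ambiguous bases
            counts[index[min(kmer, reverse_complement(kmer))]] += 1
    total = counts.sum()
    return counts / total if total else counts


class ELM:
    """Single-hidden-layer ELM: random fixed hidden weights, least-squares output."""

    def __init__(self, n_hidden=30, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random projection (assumed activation).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Hidden weights are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights come from a pseudo-inverse (least-squares) solve.
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def decision(self, X):
        return self._hidden(X) @ self.beta


def ensemble_predict(models, X, threshold=0.5):
    """Average the base ELM outputs and threshold (assumed combination rule)."""
    scores = np.mean([m.decision(X) for m in models], axis=0)
    return (scores >= threshold).astype(int)
```

Because the only fitted parameters are the output weights, training reduces to one pseudo-inverse computation per base classifier, which is the usual reason ELMs train quickly. A typical use would build a feature matrix with `revc_kmer_features`, fit several `ELM` instances with different seeds, and pass them to `ensemble_predict`.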

Keywords:  DNase I hypersensitive sites; Ensemble; Extreme learning machine; Plant; Prediction

Year: 2018; PMID: 29594496; DOI: 10.1007/s00438-018-1436-3

Source DB: PubMed; Journal: Mol Genet Genomics; ISSN: 1617-4623; Impact factor: 3.291

