Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes.

Hui Peng, Chaowang Lan, Yuansheng Liu, Tao Liu, Michael Blumenstein, Jinyan Li.

Abstract

Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector representation of diseases and applies the vectorized data to a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. The representation consists of two sub-vectors. The first has 45 elements that characterize the information entropies of the distribution of disease genes over 45 chromosome substructures. It is motivated by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein-coding genes, while others (e.g., the chromosome 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional and characterizes the distribution of disease-gene-enriched KEGG pathways relative to our manually created pathway groups. It complements the first sub-vector in differentiating between diseases. Our prediction method outperforms state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized lncRNA genes.
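
To make the first sub-vector concrete, the following is a minimal Python sketch of one way an entropy-based vector over chromosome substructures could be computed for a single disease. The abstract does not give the exact formula, so the per-element form -p * log2(p), the function name `entropy_subvector`, and the substructure labels are all assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def entropy_subvector(gene_locations, substructures):
    """Return one entropy term per chromosome substructure for one disease.

    gene_locations: the substructure label of each gene linked to the disease.
    substructures: ordered list of all substructure labels (45 in the paper;
    the labels used below are hypothetical).
    """
    counts = Counter(gene_locations)
    total = sum(counts[s] for s in substructures)
    vector = []
    for s in substructures:
        p = counts[s] / total if total else 0.0
        # Assumed per-element form: -p * log2(p); empty substructures give 0.
        vector.append(-p * math.log2(p) if p > 0 else 0.0)
    return vector


# Toy example with 4 hypothetical substructures instead of 45.
labels = ["1p", "1q", "6p", "21p"]
genes = ["6p", "6p", "1q", "6p"]
vec = entropy_subvector(genes, labels)
```

In this toy case three of the four disease genes fall on the 6 p-arm, so only the `1q` and `6p` positions are non-zero. A full representation would concatenate a vector like this with the 30-dimensional pathway sub-vector described above.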

Keywords:  chromosome preference; long noncoding RNA; vectorization

Year:  2017        PMID: 29108274      PMCID: PMC5668007          DOI: 10.18632/oncotarget.20481

Source DB:  PubMed          Journal:  Oncotarget        ISSN: 1949-2553


  1 in total

1.  Laplacian normalization and bi-random walks on heterogeneous networks for predicting lncRNA-disease associations.

Authors:  Yaping Wen; Guosheng Han; Vo V Anh
Journal:  BMC Syst Biol       Date:  2018-12-31