Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes.

Hui Peng¹, Chaowang Lan¹, Yuansheng Liu¹, Tao Liu², Michael Blumenstein³, Jinyan Li¹.

Abstract

Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector representation of diseases and applies the vectorized data to a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. The representation consists of two sub-vectors. The first has 45 elements that characterize the information entropies of the disease gene distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein-coding genes, while others (e.g., the chromosome 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional and characterizes the distribution of disease-gene-enriched KEGG pathways in comparison with our manually created pathway groups. It complements the first sub-vector in differentiating between diseases. Our prediction method outperforms state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.
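
The abstract does not give the exact formula for the entropy-based sub-vector, so the following is only a minimal sketch of one plausible reading: each element is the entropy term -p log2(p), where p is the fraction of a disease's known genes located in a given chromosome substructure. The function name `entropy_subvector`, the substructure labels, and the toy gene list are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy_subvector(gene_locations, substructures):
    """Return one entropy term per substructure for a single disease.

    gene_locations: substructure label of each known disease gene
                    (hypothetical input format, e.g. "6p").
    substructures:  ordered list of all substructure labels; the paper
                    uses 45 of them, this sketch accepts any number.
    """
    counts = Counter(gene_locations)
    total = sum(counts[s] for s in substructures)
    vector = []
    for s in substructures:
        # Fraction of the disease's genes that fall in this substructure.
        p = counts[s] / total if total else 0.0
        # Assumed element definition: -p * log2(p), with 0 for empty bins.
        vector.append(-p * math.log2(p) if p > 0 else 0.0)
    return vector


# Toy example with four substructures instead of 45.
substructures = ["1p", "1q", "6p", "6q"]
genes = ["6p", "6p", "1q", "6q"]
vec = entropy_subvector(genes, substructures)
print(vec)
```

Under this assumption, a disease whose genes concentrate in a few substructures produces a sparse vector, while a disease with genes spread across the genome produces many small non-zero terms. The 30-dimensional pathway sub-vector would be concatenated to this one, but its construction depends on the authors' manually created pathway groups and is not sketched here.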

Keywords: chromosome preference; long noncoding RNA; vectorization

Year: 2017 PMID: 29108274 PMCID: PMC5668007 DOI: 10.18632/oncotarget.20481

Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
