Mining influential genes based on deep learning.

Lingpeng Kong, Yuanyuan Chen, Fengjiao Xu, Mingmin Xu, Zutan Li, Jingya Fang, Liangyun Zhang, Cong Pian.

Abstract

BACKGROUND: Large-scale gene expression profiling has been successfully applied to discovering functional connections among diseases, genetic perturbations, and drug actions. To reduce the cost of ever-expanding gene expression profiling, a new, low-cost, high-throughput reduced-representation expression profiling method called L1000 was proposed, with which one million profiles were produced. Although a set of ~1000 carefully chosen landmark genes that can capture ~80% of the information in the whole genome has been identified for use in L1000, the robustness of using these landmark genes to infer target genes is not satisfactory. Therefore, more efficient computational methods are still needed to mine the influential genes in the genome more deeply.
RESULTS: Here, we propose a computational framework based on deep learning to mine a subset of genes that covers more genomic information. Specifically, an AutoEncoder framework is first constructed to learn the non-linear relationships between genes, and DeepLIFT is then applied to calculate gene importance scores. Using this data-driven approach, we obtained a new landmark gene set. The results show that our landmark genes predict target genes more accurately and robustly than those of L1000 on two metrics: mean absolute error (MAE) and Pearson correlation coefficient (PCC). This indicates that the landmark genes detected by our method contain more genomic information.
CONCLUSIONS: We believe that our proposed framework is well suited to the analysis of large-scale biological data. Furthermore, the landmark genes inferred in this study can be used to greatly expand the number of available gene expression profiles and thereby facilitate research into functional connections.
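
The two evaluation metrics named in the abstract, MAE and PCC, can be illustrated with a minimal Python sketch. It assumes expression matrices laid out as samples (rows) by target genes (columns) and scores each gene across samples; that layout, the function names, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Per-gene MAE: mean of |prediction - truth| over samples (rows)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_pred - y_true), axis=0)


def pearson_correlation(y_true, y_pred):
    """Per-gene Pearson correlation between truth and prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Center each gene (column) on its mean over samples.
    t = y_true - y_true.mean(axis=0)
    p = y_pred - y_pred.mean(axis=0)
    # Covariance divided by the product of standard deviations;
    # the 1/n factors cancel, so plain sums are used.
    denom = np.sqrt((t ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    return (t * p).sum(axis=0) / denom


# Toy data (hypothetical): 50 samples, 3 target genes, noisy predictions.
rng = np.random.default_rng(0)
truth = rng.normal(size=(50, 3))
pred = truth + 0.1 * rng.normal(size=(50, 3))

mae = mean_absolute_error(truth, pred)
pcc = pearson_correlation(truth, pred)
print("MAE per gene:", mae)
print("PCC per gene:", pcc)
```

A lower MAE and a higher PCC both indicate that target-gene expression is recovered more faithfully from the landmark genes.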

Keywords:  AutoEncoder; Deep learning; DeepLIFT; Landmark genes

Year:  2021        PMID: 33482718      PMCID: PMC7821411          DOI: 10.1186/s12859-021-03972-5

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


