Bijun Zhang, Ting Fan.
Abstract
Introduction: Deep learning has been widely used in genetic research because of its strengths in computation, statistical analysis, and prediction. Herein, we aimed to summarize standardized knowledge and potentially innovative approaches for deep learning applications in genetics by evaluating publications, in order to encourage further research.
Keywords: bibliometric; deep learning; genetics; knowledge graph; machine learning
Year: 2022 PMID: 36081985 PMCID: PMC9445221 DOI: 10.3389/fgene.2022.951939
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Number of published articles on deep learning applications in genetics research from 2000 to 2021.
Top 10 countries, institutions, and journals.
| Rank | Country | Count | H-index | Institution | Count | H-index | Cited journal | Count | IF (2021) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | United States | 728 | 65 | Chinese Academy of Sciences | 41 | 13 | BIOINFORMATICS | 1,175 | 6.93 |
| 2 | China | 407 | 35 | Harvard Medical School | 37 | 12 | NATURE | 1,082 | 49.96 |
| 3 | Germany | 118 | 33 | Stanford University | 29 | 18 | NUCLEIC ACIDS RES | 1,075 | 16.97 |
| 4 | England | 101 | 35 | University of Pennsylvania | 28 | 11 | P NATL ACAD SCI USA | 868 | 11.20 |
| 5 | Canada | 92 | 22 | Harvard University | 26 | 25 | PLOS ONE | 856 | 3.24 |
| 6 | Australia | 54 | 19 | University of Toronto | 25 | 12 | SCIENCE | 780 | 47.72 |
| 7 | India | 48 | 11 | Columbia University | 25 | 11 | BMC BIOINFORMATICS | 764 | 3.16 |
| 8 | France | 43 | 16 | Yale University | 24 | 14 | NAT GENET | 734 | 38.33 |
| 9 | Italy | 41 | 15 | University of Washington | 24 | 13 | CELL | 683 | 41.58 |
| 10 | Japan | 41 | 16 | Shanghai Jiao Tong University | 22 | 9 | GENOME BIOL | 622 | 13.58 |
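The H-index columns above follow the usual definition: the largest h such that h publications each have at least h citations. A minimal sketch of that calculation, assuming a plain list of per-paper citation counts (the function name `h_index` and the sample numbers are illustrative and not taken from the study):

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break  # all later papers have even fewer citations
    return h


if __name__ == "__main__":
    # Hypothetical citation counts for five papers.
    print(h_index([10, 8, 5, 4, 3]))
```

Because the index depends on both output and citations, an institution with fewer papers can still rank higher on H-index than one with more papers, as several rows in the table show.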
FIGURE 2. Cooperation of countries in the field of deep learning applications in genetics research from 2000 to 2021.
FIGURE 3. Cooperation of institutions that contributed to publications on deep learning applications in genetics research.
FIGURE 4. Network map of cited journals in the field of deep learning applications in genetics research from 2000 to 2021.
FIGURE 5. Dual-map overlay of journals in the field of deep learning applications in genetics research from 2000 to 2021.
Top 10 cited references on deep learning applications in genetics research.
| Rank | DOI | Title of cited reference | Count | Centrality | Interpretation of the findings | Year |
|---|---|---|---|---|---|---|
| 1 | 10.1038/nature14539 | Deep learning | 89 | 0.01 | This article discussed deep learning methods such as deep convolutional nets and recurrent nets, which have dramatically improved speech and visual recognition and brought about breakthroughs in other domains such as drug discovery and genomics | 2015 |
| 2 | | Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning | 77 | 0.08 | This study built stand-alone software that, using a diverse array of experimental data and evaluation metrics, ascertains sequence specificities, which are essential for identifying causal disease variants | 2015 |
| 3 | 10.1038/nmeth.3547 | Predicting effects of noncoding variants with a deep learning-based sequence model | 65 | 0.07 | This study developed a deep learning-based algorithmic framework that predicts the effects of noncoding variants | 2015 |
| 4 | 10.1145/2939672.2939785 | Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining | 54 | 0 | This study described a highly effective, scalable tree boosting machine learning method and proposed a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning | 2016 |
| 5 | 10.1101/gr.200535.115 | Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks | 47 | 0.1 | This study offered a powerful computational approach to annotating and interpreting the noncoding genome. Using the CNN, researchers can perform a single sequencing assay and annotate every mutation in the genome with its influence on present accessibility and latent potential for accessibility | 2016 |
| 6 | 10.1038/nature19057 | Analysis of protein-coding genetic variation in 60,706 humans | 44 | 0.01 | This study analyzed protein-coding genetic variation in 60,706 humans; the resulting data enable efficient filtering of candidate disease-causing variants and the discovery of human 'knockout' variants in protein-coding genes | 2016 |
| 7 | 10.1093/nar/gkw226 | DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences | 36 | 0.05 | This study proposed a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function | 2014 |
| 8 | 10.1038/nature14248 | Integrative analysis of 111 reference human epigenomes | 36 | 0.08 | The article described the integrative analysis of 111 reference human epigenomes generated and profiled for histone modification patterns, DNA accessibility, DNA methylation, and RNA expression | 2015 |
| 9 | 10.15252/msb.20156651 | Deep learning for computational biology | 34 | 0.03 | This study reviewed the applications of deep learning in regulatory genomics and cellular imaging | 2014 |
| 10 | | A general framework for estimating the relative pathogenicity of human genetic variants | 34 | 0.05 | This study discussed a framework that objectively integrates many diverse annotations into a single, quantitative score to differentiate 14.7 million simulated variants | 2015 |
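The Centrality column is not defined in this excerpt. Assuming it refers to normalized betweenness centrality, as is common in co-citation network analysis, the following sketch shows how such a score could be computed for an undirected, unweighted network using Brandes' algorithm. The function name, the adjacency-dictionary input, and the toy graph are illustrative assumptions, not the authors' pipeline.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalized betweenness for an undirected, unweighted graph.

    `adjacency` maps each node to an iterable of its neighbours
    (every edge must be listed in both directions).
    """
    nodes = list(adjacency)
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}   # number of shortest paths to v
        dist = {v: -1 for v in nodes}   # -1 marks "not yet visited"
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in nodes}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(nodes)
    # Each unordered pair was counted twice, so dividing by (n-1)(n-2)
    # maps the scores into the range [0, 1].
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 0.0
    return {v: score[v] * scale for v in nodes}


if __name__ == "__main__":
    # Toy path graph a - b - c: only b lies between other nodes.
    toy = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(betweenness_centrality(toy))
```

Under this reading, a reference with many citations but centrality near 0 (such as rank 4) is heavily cited yet rarely bridges otherwise separate groups of references.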
FIGURE 6. Reference co-citation map of publications from 2000 to 2021.
Occurrence and total link strength of the top 20 keywords.
| Rank | Keyword | Occurrence | Total link strength | Rank | Keyword | Occurrence | Total link strength |
|---|---|---|---|---|---|---|---|
| 1 | Machine learning | 553 | 481 | 11 | DNA methylation | 36 | 44 |
| 2 | Deep learning | 201 | 194 | 12 | Support vector machine | 32 | 44 |
| 3 | Classifications | 54 | 90 | 13 | Prediction | 31 | 43 |
| 4 | Random forest | 52 | 69 | 14 | Biomarker | 30 | 45 |
| 5 | Bioinformatics | 48 | 68 | 15 | Cancer | 30 | 46 |
| 6 | Gene expression | 48 | 83 | 16 | RNA-seq | 24 | 38 |
| 7 | Feature selection | 46 | 80 | 17 | Genomic prediction | 23 | 42 |
| 8 | Artificial intelligence | 43 | 80 | 18 | Breast cancer | 22 | 32 |
| 9 | Genomics | 37 | 56 | 19 | Gene regulation | 19 | 26 |
| 10 | Convolutional neural | 36 | 24 | 20 | Neural network | 19 | 24 |
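As a rough illustration of the two measures in this table, the sketch below counts keyword occurrences and total link strength from per-article keyword lists. It assumes full counting, where each pair of keywords that co-occurs in an article adds 1 to the strength of the link between them, and a keyword's total link strength is the sum over all its links. The function name and the toy records are hypothetical, and the counting scheme used in the study may differ.

```python
from collections import Counter
from itertools import combinations


def keyword_stats(documents):
    """Return (occurrences, total_link_strength) for keyword lists.

    `documents` is an iterable of keyword lists, one list per article.
    """
    occurrences = Counter()
    link_strength = Counter()
    for keywords in documents:
        # Normalize case and drop duplicates within one article.
        unique = sorted({k.lower() for k in keywords})
        occurrences.update(unique)
        # Each co-occurring pair strengthens its link by 1 (full counting).
        for a, b in combinations(unique, 2):
            link_strength[(a, b)] += 1
    total = Counter()
    for (a, b), weight in link_strength.items():
        total[a] += weight
        total[b] += weight
    return occurrences, total


if __name__ == "__main__":
    # Hypothetical keyword lists for three articles.
    docs = [
        ["Machine learning", "Deep learning", "Genomics"],
        ["Machine learning", "Random forest"],
        ["Deep learning", "Genomics"],
    ]
    occ, tls = keyword_stats(docs)
    for kw in occ:
        print(kw, occ[kw], tls[kw])
```

This also shows why total link strength can exceed occurrence: a keyword that appears in few articles but alongside many other keywords accumulates several links per article.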
FIGURE 7. Network map of keywords, divided into six clusters.
FIGURE 8. Keywords with the strongest citation bursts in publications from 2000 to 2021.