Abstract
Structural variations in the genome are closely linked to human health and to the onset and progression of many diseases. Detecting such variations is critical for understanding disease mechanisms, finding pathogenic targets, and delivering personalized precision medicine. The rapid development of high-throughput sequencing technologies has accelerated the accumulation of genomic mutation data, including synonymous mutations. Identifying, among all the available mutation data, the pathogenic synonymous mutations that play important roles in disease onset and progression is therefore of great importance. In this paper, machine learning theories and methods are reviewed, efficient and accurate methods for predicting pathogenic synonymous mutations are developed, and a standardized three-level variant analysis framework is constructed. In addition, multiple variation tolerance prediction models are studied and integrated, and new ideas for structural variation detection based on deep information mining are explored.
Keywords: genome; genomic mutation; machine learning; prediction; variation
Year: 2021 PMID: 34901036 PMCID: PMC8656232 DOI: 10.3389/fcell.2021.795883
Source DB: PubMed Journal: Front Cell Dev Biol ISSN: 2296-634X
FIGURE 1. Idea map of the paper.
Summary of genomic variation prediction methods.
| Type | Method | Algorithm |
|---|---|---|
| Pathogenic synonymous mutations | SilVA | Random forest |
| | DDIG-SN | Support vector machine |
| | regSNPs-splicing | Random forest |
| | Syntool | - |
| | TraP | Random forest |
| Genome sequencing | CADD | Support vector machine |
| | MutationTaster2 | Naive Bayes |
| | Mut-Pred | Random forest |
| | PolyPhen-2 | Naive Bayes |
| | PON-P2 | Random forest |
| | VEST | Random forest |
| Deep mining of structural variation information | DeepBind | Deep learning |
| | DeepVariant | Deep neural networks |
| | DeepCpG | Deep neural networks |
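The first group of tools in the table above works on synonymous mutations, that is, nucleotide changes that leave the encoded amino acid unchanged. As a minimal sketch of that prerequisite step only (not of any listed tool's pathogenicity model), the following Python labels a single-nucleotide change within one codon using the standard genetic code. The names `CODON_TABLE` and `classify_snv` are hypothetical and introduced here for illustration; the sketch assumes a DNA codon on the coding strand and a 0-based position within it.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(codon: str, position: int, alt_base: str) -> str:
    """Label a single-nucleotide change inside one coding-strand codon.

    `position` is 0-based within the codon. Hypothetical helper: it only
    compares the encoded amino acids and says nothing about pathogenicity.
    """
    codon = codon.upper()
    alt_base = alt_base.upper()
    if codon not in CODON_TABLE:
        raise ValueError(f"not a valid DNA codon: {codon!r}")
    if position not in (0, 1, 2):
        raise ValueError("position must be 0, 1, or 2")
    if alt_base not in BASES:
        raise ValueError(f"not a valid DNA base: {alt_base!r}")
    if codon[position] == alt_base:
        raise ValueError("alternate base equals the reference base")

    mutant = codon[:position] + alt_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutant]

    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid (or stop) before and after
    if alt_aa == "*":
        return "nonsense"     # change introduces a stop codon
    if ref_aa == "*":
        return "stop-loss"    # change removes a stop codon
    return "missense"         # different amino acid


if __name__ == "__main__":
    print(classify_snv("CTG", 2, "A"))  # CTG and CTA both encode leucine
    print(classify_snv("GAG", 1, "T"))  # GAG (Glu) to GTG (Val)
```

A variant labeled `synonymous` here would be a candidate input for the synonymous-mutation predictors in the table; whether it is pathogenic depends on features such as splicing or expression effects that this sketch does not model.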
FIGURE 2. Diagram of deep mining of structural variation information.
Summary of genomic variations associated with disease.
| Disease | Cause | Result |
|---|---|---|
| Type 2 diabetes | 139 common gene variants and 4 rare gene variants | Availability of inhaled insulin promotes greater perceived acceptance of insulin therapy in patients with type 2 diabetes |
| Neonatal epilepsy | Whole-gene duplications of SCN2A and SCN3A | An extra copy of SCN2A has an effect on epilepsy pathogenesis |
| Bladder cancer | Copy number variation in the GSTM1 gene | Loss of 9p21 was less predictive for detecting bladder cancer |
| Lung cancer | CNV-67048 variation in WWOX | May be related to altered WWOX gene expression and missing exons |
| A wide variety of tumors | BAP1 mutation | BAP1 is the candidate gene in only a small subset of hereditary UM, suggesting the contribution of other candidate genes |