Mahaly Baptiste, Sarah Shireen Moinuddeen, Courtney Lace Soliz, Hashimul Ehsan, Gen Kaneko.
Abstract
Precision medicine is a medical approach that provides patients with a tailored dose of treatment by taking into consideration individual variability in genes, environment, and lifestyle. The accumulation of big omics sequence data has led to the development of various genetic databases on which clinical stratification of high-risk populations may be conducted. In addition, because cancers are generally caused by tumor-specific mutations, large-scale systematic identification of single nucleotide polymorphisms (SNPs) in various tumors has driven significant progress in tailored tumor treatment (i.e., precision oncology). Machine learning (ML), a subfield of artificial intelligence in which computers learn through experience, has great potential in precision oncology, chiefly in helping physicians make diagnostic decisions based on tumor images. A promising avenue for ML in precision oncology is the integration of all available data, from images to multi-omics big data, for the holistic care of patients and high-risk healthy subjects. In this review, we provide a focused overview of precision oncology and ML, with attention to breast cancer and glioma as well as to Bayesian networks, which are flexible and able to work with incomplete information. We also introduce some state-of-the-art attempts to use and incorporate ML and genetic information in precision oncology.
Keywords: breast cancer; glioma; precision medicine; single nucleotide polymorphisms
Year: 2021; PMID: 34065872; PMCID: PMC8151328; DOI: 10.3390/genes12050722
Source DB: PubMed; Journal: Genes (Basel); ISSN: 2073-4425; Impact factor: 4.096
Figure 1. Overview of precision oncology.
Representative genes responsible for breast cancer and found in glioma.
| Type | Gene | Accession No. | Function |
|---|---|---|---|
| Breast | | NM_007294.4 | Transcriptional regulator of DNA repair genes and tumor suppressor genes. |
| | | NM_000051.4 | Encodes a cell cycle checkpoint kinase that belongs to the PI3/PI4-kinase family. Its normal function is to help repair DNA damage or to kill the cell if the damaged DNA cannot be fixed. |
| | | NM_000546.6 | Halts the growth of cells with damaged DNA. |
| | | NM_007194.4 | The CHEK2 protein is a cell cycle checkpoint regulator and a possible tumor suppressor that is known to phosphorylate BRCA1. Mutations in this gene have been correlated with the development of Li-Fraumeni syndrome. These mutations increase predisposition to sarcomas, breast cancer, and brain tumors. |
| | | NM_001304717.5 | Tumor suppressor gene that is mutated at high frequency in a large number of cancers. Helps regulate cell growth. |
| | | NM_001317185.2 | Encodes epithelial cadherin or E-cadherin. When individuals inherit the mutated form of this gene, it causes |
| | | NM_000455.5 | Encodes serine/threonine kinase 11 that regulates cell polarity and acts as a tumor suppressor. Mutations in |
| | | NM_024675.4 | Encodes a protein that binds to |
| | | NM_000465.4 | Encodes a protein that interacts with the N-terminal of BRCA1. Shares homology with the two most conserved regions of BRCA1, the N-terminal RING motif and the C-terminal BRCT domain. The RING motif is typically found in proteins that regulate cell growth. The protein encoded by |
| | | NM_032043.3 | The protein interacts with the BRCT repeats of |
| | | NM_001372051.1 | Encodes a member of the cysteine-aspartic acid protease (caspase) family. This protein allows for the apoptosis induced by Fas. Associated with the risk of developing cancer. |
| | | NM_005214.5 | A member of the immunoglobulin gene superfamily. Encodes a protein that sends an inhibitory signal to T cells. Expressed in some cancer cells. |
| | | NM_000141.5 | The protein encoded by this gene is a member of the fibroblast growth factor receptor family, whose amino acid sequence is highly conserved. Aberrations in |
| | | NR_002196.2 | Gene expressed only from the maternally inherited chromosome. Encodes a non-coding RNA that functions as a tumor suppressor. Mutations in |
| | | NM_005591.4 | Encodes a nuclear protein involved in homologous recombination, telomere length maintenance, and DNA double-strand break repair. This protein is a member of the MRE11/RAD50 double-strand break repair complex composed of 5 proteins. |
| | | NM_002485.5 | Mutations in |
| | | NM_002875.5 | Encodes a protein important for repairing damaged DNA. The protein is a member of the RAD51 family. It interacts with single-strand DNA-binding protein RPA and RAD52. This protein is also thought to be involved in homologous pairing and strand transfer of DNA. It interacts with |
| | | NM_198253.3 | Encodes one subunit of the enzyme telomerase, which lengthens telomeres at the ends of chromosomes. Lengthening of telomeres allows cancer cells to survive continually. |
| | | NM_001080430.4 | Encodes a protein that contains an HMG-box. Because of the HMG-box, this protein is possibly engaged in bending and unwinding DNA and in modulating chromatin structure. This gene's minor allele has been associated with a higher risk of developing breast cancer. |
| Glioma | | NM_006576 | Encodes a member of the gelsolin/villin family of actin regulatory proteins. May contribute to the development of ganglia. |
| | | NM_004994.3 | The matrix metalloproteinase (MMP) breaks down the extracellular matrix. |
| | | NM_212482.4 | Encodes fibronectin, a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and in the extracellular matrix. Fibronectin is known to be involved in cell adhesion and migration processes such as metastasis. |
| | | NM_000090.4 | Encodes the pro-alpha1 chains of type III collagen found in skin, lungs, intestinal walls, and the walls of blood vessels. Mutations in this gene are associated with the development of Ehlers-Danlos syndrome type IV. |
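
The accession numbers in the table follow the NCBI RefSeq naming pattern, in which the prefix indicates the molecule type (`NM_` for protein-coding transcripts, `NR_` for non-coding RNA) and an optional suffix gives the record version. The following is a minimal sketch of how such identifiers could be checked before being used as database keys. The function name `parse_accession`, the two-entry prefix map, and the digit-count rule (six or nine digits) are assumptions made for illustration and are not part of the review.

```python
import re
from typing import NamedTuple, Optional

# Small, non-exhaustive subset of RefSeq prefixes (illustrative assumption).
PREFIX_MEANING = {
    "NM": "protein-coding transcript (mRNA)",
    "NR": "non-coding RNA transcript",
}

# Two-letter prefix, underscore, six or nine digits, optional ".version".
ACCESSION_RE = re.compile(r"^([A-Z]{2})_(\d{6}|\d{9})(?:\.(\d+))?$")


class Accession(NamedTuple):
    prefix: str
    number: str
    version: Optional[int]  # None when the accession carries no version suffix
    molecule: str


def parse_accession(text: str) -> Optional[Accession]:
    """Return the parsed accession, or None if the text does not fit the pattern."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        return None
    prefix, number, version = match.groups()
    return Accession(
        prefix=prefix,
        number=number,
        version=int(version) if version else None,
        molecule=PREFIX_MEANING.get(prefix, "unknown"),
    )


if __name__ == "__main__":
    for raw in ("NM_007294.4", "NR_002196.2", "NM_006576", "NM_12345.1"):
        print(raw, "->", parse_accession(raw))
```

A check of this kind catches identifiers with the wrong number of digits and makes the distinction between coding and non-coding entries explicit, which matters when rows are joined against external sequence databases.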
Representative machine learning algorithms used in precision oncology.
| Algorithm | Characteristics |
|---|---|
| K-nearest neighbor (KNN) | Often described as the simplest ML algorithm; no training phase is required. |
| Support vector machine (SVM) | Simple structure and high generalization capability; works well even with limited training data. |
| Artificial neural network (ANN) | Mimics a neuronal network, in which the connection strengths between nodes change with experience. |
| Decision tree (DT) learning | A popular tree-based method for classification and regression, in which the learned model is represented as a decision tree. |
| Naive Bayes (NB) | Probabilistic classifier that treats the feature variables as independent of one another given the class. |
| Bayesian network (BN) | A probabilistic graphical model in which a directed acyclic graph represents potential causal relationships between variables. |
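
Two of the characteristics listed in the table can be made concrete with a short sketch: KNN classifies directly from stored samples without a training phase, whereas naive Bayes first estimates per-class statistics and then combines features as if they were independent. The code below is an illustration only. The six samples, the labels, and the function names are invented, and a Gaussian likelihood is assumed for the naive Bayes part.

```python
import math
from collections import Counter

# Toy, made-up samples: (feature vector, label). Not real patient data.
SAMPLES = [
    ((1.0, 1.2), "benign"),
    ((0.8, 0.9), "benign"),
    ((1.1, 0.7), "benign"),
    ((3.0, 3.1), "malignant"),
    ((3.2, 2.8), "malignant"),
    ((2.9, 3.4), "malignant"),
]


def knn_predict(query, samples, k=3):
    """Classify `query` by majority vote among its k nearest samples."""
    # Nothing is fitted beforehand: the stored samples themselves are the model.
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fit_gaussian_nb(samples):
    """Estimate, per class, the prior and each feature's mean and variance."""
    by_class = {}
    for features, label in samples:
        by_class.setdefault(label, []).append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(samples), means, variances)
    return model


def nb_predict(query, model):
    """Return the class with the highest log-posterior under naive Bayes."""
    def log_posterior(label):
        prior, means, variances = model[label]
        # Independence assumption: per-feature log-likelihoods are simply summed.
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
            for x, mu, var in zip(query, means, variances)
        )
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)


if __name__ == "__main__":
    nb_model = fit_gaussian_nb(SAMPLES)
    for point in ((1.0, 1.0), (3.0, 3.0)):
        print(point, knn_predict(point, SAMPLES), nb_predict(point, nb_model))
```

In practice a library implementation would normally be preferred; the hand-written version is kept here only so that each step of the two algorithms stays visible.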
Figure 2. Simple pictorial representation of a Bayesian network without probabilities.
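
To illustrate how a Bayesian network works with incomplete information, the sketch below attaches probabilities to a small directed acyclic graph and computes a posterior by enumeration, summing over any variable that was not observed. The network structure (Mutation → Tumor → Imaging and Biomarker), all probability values, and the function names are hypothetical and chosen only for illustration. They are not taken from the figure or from the review.

```python
from itertools import product

# Hypothetical network: Mutation -> Tumor -> {Imaging, Biomarker}.
# Each node lists its parents; parents appear before their children.
PARENTS = {
    "Mutation": (),
    "Tumor": ("Mutation",),
    "Imaging": ("Tumor",),
    "Biomarker": ("Tumor",),
}

# CPT[node][parent values] = P(node is True | parents). Values are invented.
CPT = {
    "Mutation": {(): 0.05},
    "Tumor": {(True,): 0.40, (False,): 0.02},
    "Imaging": {(True,): 0.85, (False,): 0.10},
    "Biomarker": {(True,): 0.70, (False,): 0.05},
}


def joint(assignment):
    """Joint probability as the product of each node's conditional probability."""
    probability = 1.0
    for node, parents in PARENTS.items():
        p_true = CPT[node][tuple(assignment[p] for p in parents)]
        probability *= p_true if assignment[node] else 1.0 - p_true
    return probability


def posterior(query, evidence):
    """P(query is True | evidence); unobserved nodes are summed out."""
    hidden = [n for n in PARENTS if n != query and n not in evidence]
    totals = {True: 0.0, False: 0.0}
    for query_value in (True, False):
        for values in product((True, False), repeat=len(hidden)):
            assignment = dict(evidence)
            assignment[query] = query_value
            assignment.update(zip(hidden, values))
            totals[query_value] += joint(assignment)
    return totals[True] / (totals[True] + totals[False])


if __name__ == "__main__":
    # Only the imaging result is known; mutation and biomarker status are missing.
    print(posterior("Tumor", {"Imaging": True}))
    # Adding the biomarker result refines the same query.
    print(posterior("Tumor", {"Imaging": True, "Biomarker": True}))
```

Enumeration scales exponentially with the number of unobserved variables, so it is suitable only for very small networks; it is used here because it shows directly that missing observations are marginalized rather than imputed.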