
Eye-color and Type-2 diabetes phenotype prediction from genotype data using deep learning methods.

Muhammad Muneeb, Andreas Henschel.

Abstract

BACKGROUND: Genotype-phenotype prediction is of great importance in genetics. Such predictions can help identify the genetic mutations that cause variation among humans. Approaches for finding these associations fall broadly into two classes: statistical techniques and machine learning. Statistical techniques are well suited to finding the actual SNPs that cause variation, whereas machine learning techniques are well suited to classifying people into categories. In this article, we examine the eye-color and Type-2 diabetes phenotypes. The proposed technique is a hybrid approach that combines elements of statistical techniques with machine learning.
RESULTS: The main dataset for the eye-color phenotype consists of 806 people: 404 have blue-green eyes and 402 have brown eyes. After preprocessing, we generated 8 datasets containing different numbers of SNPs, using the mutation difference and thresholding at each individual SNP. We calculated three types of mutation at each SNP: no mutation, partial mutation, and full mutation. The data were then transformed for the machine learning algorithms. We used nine classifiers: RandomForest, Extreme Gradient boosting, ANN, LSTM, GRU, BILSTM, 1DCNN, ensembles of ANN, and ensembles of LSTM, which gave best accuracies of 0.91, 0.9286, 0.945, 0.94, 0.94, 0.92, 0.95, and 0.96, respectively. Stacked ensembles of LSTM outperformed the other algorithms on 1560 SNPs, with an overall accuracy of 0.96, AUC = 0.98 for brown eyes, and AUC = 0.97 for blue-green eyes. The main dataset for Type-2 diabetes consists of 107 people, of whom 30 are classified as cases and 74 as controls. We used different linear thresholds to find the optimal number of SNPs for classification. The final model gave an accuracy of 0.97.
CONCLUSION: Genotype-phenotype predictions are very useful, especially in forensics. They can help identify associations between SNP variants and traits or diseases. Given more datasets, the prediction accuracy of machine learning models can be improved. Moreover, the non-linearity of the machine learning model and the combination of SNP mutations during training improve prediction. We considered binary classification problems, but the proposed approach can be extended to multi-class classification.
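
The preprocessing outlined in RESULTS (three mutation levels per SNP, followed by selection on a mutation difference with a threshold) could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the encoding or the difference measure, so the 0/1/2 allele count, the per-class mean difference, and the names `encode_genotype` and `select_snps` are all hypothetical, and the toy genotypes are invented for the example.

```python
import numpy as np


def encode_genotype(genotype: str, ref: str) -> int:
    """Count non-reference alleles in a two-letter genotype.

    Assumed encoding: 0 = no mutation, 1 = partial mutation, 2 = full mutation.
    """
    return sum(1 for allele in genotype if allele != ref)


def select_snps(X: np.ndarray, y: np.ndarray, threshold: float) -> np.ndarray:
    """Return indices of SNPs whose mean mutation level differs between
    the two classes by at least `threshold` (assumed selection rule)."""
    diff = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
    return np.flatnonzero(diff >= threshold)


# Toy data (invented): 4 people, 3 SNPs, one reference allele per SNP.
ref_alleles = ["A", "C", "T"]
genotypes = [
    ["AA", "CC", "TT"],  # class 0
    ["AG", "CC", "TC"],  # class 0
    ["GG", "CC", "TT"],  # class 1
    ["GG", "CT", "TC"],  # class 1
]
y = np.array([0, 0, 1, 1])

# Encode every genotype as a mutation level to get a people x SNPs matrix.
X = np.array(
    [[encode_genotype(g, r) for g, r in zip(person, ref_alleles)]
     for person in genotypes]
)

# Keep only SNPs whose class-wise mean mutation level differs by >= 1.0.
selected = select_snps(X, y, threshold=1.0)
X_selected = X[:, selected]
print(selected, X_selected.shape)
```

Varying `threshold` yields feature sets of different sizes, which is one plausible reading of how several datasets with different numbers of SNPs could be produced before training a classifier.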

Keywords:  Bioinformatics; Eye color; Genotype–phenotype; Machine learning; Type-2 diabetes

Year:  2021        PMID: 33874881     DOI: 10.1186/s12859-021-04077-9

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


  3 in total

1.  Correction to: Eye‑color and Type‑2 diabetes phenotype prediction from genotype data using deep learning methods.

Authors:  Muhammad Muneeb; Andreas Henschel
Journal:  BMC Bioinformatics       Date:  2021-06-11       Impact factor: 3.169

2.  Development and validation of immune-based biomarkers and deep learning models for Alzheimer's disease.

Authors:  Yijie He; Lin Cong; Qinfei He; Nianping Feng; Yun Wu
Journal:  Front Genet       Date:  2022-08-22       Impact factor: 4.772

3.  LSTM input timestep optimization using simulated annealing for wind power predictions.

Authors:  Muhammad Muneeb
Journal:  PLoS One       Date:  2022-10-07       Impact factor: 3.752
