Identification of Target Chicken Populations by Machine Learning Models Using the Minimum Number of SNPs.

Dongwon Seo1,2, Sunghyun Cho1,2, Prabuddha Manjula1, Nuri Choi3, Young-Kuk Kim2,4, Yeong Jun Koh2,4, Seung Hwan Lee1,2, Hyung-Yong Kim5, Jun Heon Lee1,2.   

Abstract

A marker combination that can classify a specific chicken population could raise commercial value by increasing consumer confidence in the population's origin, and would help protect native genetic resources in each country's market. In this study, 283 samples from 20 lines, comprising Korean native chickens, commercial native chickens, and commercial broilers with a layer population, were analyzed using a 600K high-density single nucleotide polymorphism (SNP) array to determine the optimal marker combination with the minimum number of markers. Machine learning algorithms, a genome-wide association study (GWAS), linkage disequilibrium (LD) analysis, and principal component analysis (PCA) were used to distinguish a target (case) group from control chicken groups. During marker selection, a total of 47,303 SNPs were used for classifying chicken populations; 96 LD-pruned SNPs (50 SNPs per LD block) served as the best marker combination for target chicken classification. Moreover, 36, 44, and 8 SNPs were selected as the minimum numbers of markers by the AdaBoost (AB), Random Forest (RF), and Decision Tree (DT) machine learning classification models, with accuracy rates of 99.6%, 98.0%, and 97.9%, respectively. The selected marker combinations increased the genetic distance and fixation index (Fst) values between the case and control groups and reduced the number of genetic components required, confirming that a small marker set can classify the groups efficiently. In a verification study with additional chicken breeds and samples (12 lines and 182 samples), the accuracy did not change significantly, and the target chicken group could be clearly distinguished from the other populations.
The GWAS, PCA, and machine learning algorithms used in this study can be applied efficiently to determine, from a large number of SNP markers, the optimal combination with the minimum number of markers that distinguishes the target population.
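
The idea of ranking SNPs by how strongly they separate the case and control groups, then keeping the smallest set that still classifies well, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy genotypes, the function names, the basic Wright formula Fst = (Ht - Hs) / Ht with equal group weights, and the nearest-centroid classifier, which stands in for the AdaBoost, Random Forest, and Decision Tree models named in the abstract. The abstract does not state which Fst estimator or selection procedure the authors used, and accuracy below is measured on the training data only.

```python
from typing import List, Sequence

# Rows are birds, columns are SNPs, values are alternate-allele counts (0, 1, 2).
Genotypes = List[List[int]]


def allele_freq(column: Sequence[int]) -> float:
    """Alternate-allele frequency of one SNP from 0/1/2 genotype codes."""
    return sum(column) / (2 * len(column))


def fst(case_col: Sequence[int], control_col: Sequence[int]) -> float:
    """Wright's Fst = (Ht - Hs) / Ht for one biallelic SNP and two groups.

    Groups are weighted equally (an illustrative assumption)."""
    p1, p2 = allele_freq(case_col), allele_freq(control_col)
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)                        # pooled heterozygosity
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2    # mean within-group
    return 0.0 if ht == 0 else (ht - hs) / ht


def rank_snps_by_fst(case: Genotypes, control: Genotypes) -> List[int]:
    """SNP column indices sorted from highest to lowest Fst."""
    n_snps = len(case[0])
    scores = [
        fst([row[j] for row in case], [row[j] for row in control])
        for j in range(n_snps)
    ]
    return sorted(range(n_snps), key=lambda j: scores[j], reverse=True)


def accuracy(case: Genotypes, control: Genotypes, snps: Sequence[int]) -> float:
    """Training-set accuracy of a nearest-centroid classifier on chosen SNPs."""
    c_case = [sum(r[j] for r in case) / len(case) for j in snps]
    c_ctrl = [sum(r[j] for r in control) / len(control) for j in snps]

    def is_case(row: Sequence[int]) -> bool:
        d_case = sum((row[j] - m) ** 2 for j, m in zip(snps, c_case))
        d_ctrl = sum((row[j] - m) ** 2 for j, m in zip(snps, c_ctrl))
        return d_case < d_ctrl

    correct = sum(is_case(r) for r in case) + sum(not is_case(r) for r in control)
    return correct / (len(case) + len(control))


def minimum_panel(case: Genotypes, control: Genotypes, target: float = 1.0) -> List[int]:
    """Smallest prefix of the Fst ranking that reaches the target accuracy."""
    ranked = rank_snps_by_fst(case, control)
    for k in range(1, len(ranked) + 1):
        if accuracy(case, control, ranked[:k]) >= target:
            return ranked[:k]
    return ranked


# Toy data (hypothetical): only the third SNP differs between the groups.
case_birds = [[1, 0, 2], [0, 1, 2], [1, 1, 2]]
control_birds = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
panel = minimum_panel(case_birds, control_birds)
```

On real array data, the accuracy would need to be estimated on held-out samples, as the verification study in the abstract does with additional breeds, rather than on the samples used for selection.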

Keywords:  genome-wide association study (GWAS); linkage disequilibrium (LD); machine learning; principal component analysis (PCA); single nucleotide polymorphism (SNP)

Year:  2021        PMID: 33477975      PMCID: PMC7835996          DOI: 10.3390/ani11010241

Source DB:  PubMed          Journal:  Animals (Basel)        ISSN: 2076-2615            Impact factor:   2.752


