Neo Christopher Chung, Joanna Szyda, Magdalena Frąszczak.
Abstract
Since domestication, population bottlenecks, breed formation, and selective breeding have radically shaped the genealogy and genetics of Bos taurus. In turn, characterization of population structure among diverse bull (males of Bos taurus) genomes enables detailed assessment of genetic resources and origins. By analyzing 432 unrelated bull genomes from 13 breeds and 16 countries, we demonstrate genetic diversity and structural complexity among the European/Western cattle population. Importantly, we relaxed the strong assumption of discrete or admixed populations by adapting latent variable models for individual-specific allele frequencies that directly capture a wide range of complex structure from genome-wide genotypes. As measured by the magnitude of differentiation, selection pressure on SNPs within genes is substantially greater than that on intergenic regions. Additionally, broad regions of chromosome 6 harboring the largest genetic differentiation suggest positive selection underlying population structure. We carried out gene set analysis using SNP annotations to identify enriched functional categories such as energy-related processes and multiple development stages. Our population structure analysis of bull genomes can support genetic management strategies that capture structural complexity and promote sustainable genetic breadth.
Year: 2017 PMID: 28084449 PMCID: PMC5234001 DOI: 10.1038/srep40688
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Average sequencing coverage of 432 bull samples.
Samples with average sequencing coverage >5 are removed in a preprocessing step.
Figure 2. Bar plot of cattle breeds, with the number of samples per breed colored by country of origin.
Figure 3. Hierarchical clustering of 432 bull genomes.
Genome-wide SNPs are clustered using Manhattan distances and samples are colored by breeds.
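As a rough illustration of the clustering described in this caption, the sketch below computes Manhattan distances between genotype vectors and merges samples agglomeratively. The toy genotype matrix, the function names, and the choice of average linkage are assumptions made for the example; the caption specifies only the distance measure.

```python
from itertools import combinations

def manhattan(a, b):
    """Sum of absolute genotype differences between two samples."""
    return sum(abs(x - y) for x, y in zip(a, b))

def average_linkage(genotypes, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest average pairwise Manhattan distance."""
    clusters = [[i] for i in range(len(genotypes))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = sum(manhattan(genotypes[i], genotypes[j])
                    for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best  # a < b, so deleting index b leaves index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical toy data: 4 samples x 6 SNPs, coded as 0, 1, or 2 allele copies.
toy = [
    [0, 0, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 0],
    [2, 2, 0, 0, 1, 2],
    [2, 2, 0, 0, 0, 2],
]
groups = average_linkage(toy, n_clusters=2)
print(groups)
```

This naive loop recomputes every cluster distance at each merge, so it is only suitable for small examples; a genome-wide analysis would need an optimized library routine.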
Figure 4. Scatterplots of the top two logistic factors (LFs).
Data points corresponding to 432 bull genomes are colored by 13 breeds. Other scatterplots and interactive visualization are available at https://nnnn.shinyapps.io/bullstructure/.
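To make the link between logistic factors and individual-specific allele frequencies concrete, here is a minimal sketch that assumes a logit-linear model, in which a SNP's allele frequency for an individual is the inverse logit of a weighted sum of that individual's LF coordinates. The record itself does not spell out the model form, and all numeric values and names below are hypothetical.

```python
import math

def allele_frequency(loadings, factors):
    """Inverse logit of the linear predictor sum(a_k * h_k)."""
    eta = sum(a * h for a, h in zip(loadings, factors))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical SNP loadings: intercept, LF1 weight, LF2 weight.
snp_loadings = [0.0, 1.5, -0.5]

# Hypothetical individuals: leading 1.0 for the intercept, then LF1 and LF2.
individual_a = [1.0, 1.0, 0.0]
individual_b = [1.0, -1.0, 0.0]

p_a = allele_frequency(snp_loadings, individual_a)
p_b = allele_frequency(snp_loadings, individual_b)
print(round(p_a, 3), round(p_b, 3))
```

Under this assumed model, two individuals at opposite ends of LF1 receive very different allele frequencies for the same SNP, without being assigned to any discrete population.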
Figure 5. Genome-wide pseudo R² measures with respect to logistic factors (LFs).
The distribution is highly skewed towards 0, which leads to overplotting in the low range (see the inset for a genome-wide histogram). Overall, the median and mean are 0.070 and 0.087, respectively.
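For readers unfamiliar with pseudo R², the sketch below computes one common variant, McFadden's, for a single SNP. It compares the log-likelihood under individual-specific allele frequencies with that under a single population-wide frequency. Whether this is the exact definition used for the figure is an assumption, and the genotypes and fitted frequencies are made-up values.

```python
import math

def allele_loglik(genotypes, freqs):
    """Log-likelihood treating each genotype (0/1/2) as two Bernoulli
    allele draws with individual-specific frequency p."""
    return sum(g * math.log(p) + (2 - g) * math.log(1 - p)
               for g, p in zip(genotypes, freqs))

def mcfadden_r2(genotypes, fitted_freqs):
    """1 - logLik(structured model) / logLik(single-frequency null).
    Assumes a polymorphic SNP and frequencies strictly between 0 and 1."""
    p_null = sum(genotypes) / (2 * len(genotypes))
    null = allele_loglik(genotypes, [p_null] * len(genotypes))
    full = allele_loglik(genotypes, fitted_freqs)
    return 1.0 - full / null

# Hypothetical SNP that differs sharply between two groups of individuals.
genotypes = [2, 2, 0, 0]
fitted = [0.9, 0.9, 0.1, 0.1]
r2 = mcfadden_r2(genotypes, fitted)
print(round(r2, 3))
```

A value near 0 means the structured frequencies explain the genotypes no better than one shared frequency, which matches the skew towards 0 described in the caption.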
Enriched functional clusters for genes associated with R² > 0.5.
| Category | Term | Count | % | P Value |
|---|---|---|---|---|
| INTERPRO | IPR018247:EF-HAND 1 | 6 | 2.098 | 0.035 |
| INTERPRO | IPR018249:EF-HAND 2 | 6 | 2.098 | 0.038 |
| INTERPRO | IPR011992:EF-Hand type | 6 | 2.098 | 0.045 |
| GOTERM_MF_FAT | GO:0004198 ~ calcium-dependent cysteine-type endopeptidase activity | 3 | 1.049 | 0.011 |
| GOTERM_MF_FAT | GO:0008234 ~ cysteine-type peptidase activity | 4 | 1.399 | 0.066 |
| GOTERM_MF_FAT | GO:0004197 ~ cysteine-type endopeptidase activity | 3 | 1.049 | 0.106 |
| PIR_SUPERFAMILY | PIRSF000045:cytochrome P450 CYP2D6 | 3 | 1.049 | 0.013 |
| INTERPRO | IPR002401:Cytochrome P450, E-class, group I | 3 | 1.049 | 0.068 |
| INTERPRO | IPR017973:Cytochrome P450, C-terminal region | 3 | 1.049 | 0.080 |
| INTERPRO | IPR017972:Cytochrome P450, conserved site | 3 | 1.049 | 0.084 |
| SP_PIR_KEYWORDS | heme | 4 | 1.399 | 0.091 |
| INTERPRO | IPR001128:Cytochrome P450 | 3 | 1.049 | 0.107 |
| SP_PIR_KEYWORDS | Monooxygenase | 3 | 1.049 | 0.124 |
| COG_ONTOLOGY | Secondary metabolites biosynthesis, transport, and catabolism | 3 | 1.049 | 0.148 |
| GOTERM_MF_FAT | GO:0020037 ~ heme binding | 4 | 1.399 | 0.159 |
| GOTERM_MF_FAT | GO:0046906 ~ tetrapyrrole binding | 4 | 1.399 | 0.176 |
| GOTERM_MF_FAT | GO:0009055 ~ electron carrier activity | 4 | 1.399 | 0.301 |
| SP_PIR_KEYWORDS | iron | 4 | 1.399 | 0.399 |
| GOTERM_MF_FAT | GO:0005506 ~ iron ion binding | 4 | 1.399 | 0.614 |
| UP_SEQ_FEATURE | signal peptide | 19 | 6.643 | 0.048 |
| SP_PIR_KEYWORDS | signal | 19 | 6.643 | 0.111 |
| SP_PIR_KEYWORDS | glycoprotein | 16 | 5.594 | 0.492 |
| GOTERM_BP_FAT | GO:0045137 ~ development of primary sexual characteristics | 3 | 1.049 | 0.117 |
| GOTERM_BP_FAT | GO:0003006 ~ reproductive developmental process | 4 | 1.399 | 0.151 |
| GOTERM_BP_FAT | GO:0007548 ~ sex differentiation | 3 | 1.049 | 0.180 |
| GOTERM_MF_FAT | GO:0043167 ~ ion binding | 40 | 13.986 | 0.130 |
| GOTERM_MF_FAT | GO:0046872 ~ metal ion binding | 38 | 13.287 | 0.190 |
| GOTERM_MF_FAT | GO:0043169 ~ cation binding | 38 | 13.287 | 0.213 |
| GOTERM_BP_FAT | GO:0030324 ~ lung development | 3 | 1.049 | 0.145 |
| GOTERM_BP_FAT | GO:0030323 ~ respiratory tube development | 3 | 1.049 | 0.145 |
| GOTERM_BP_FAT | GO:0060541 ~ respiratory system development | 3 | 1.049 | 0.150 |
| GOTERM_BP_FAT | GO:0035295 ~ tube development | 3 | 1.049 | 0.400 |
| GOTERM_MF_FAT | GO:0004175 ~ endopeptidase activity | 8 | 2.797 | 0.129 |
| GOTERM_MF_FAT | GO:0070011 ~ peptidase activity, acting on L-amino acid peptides | 9 | 3.147 | 0.190 |
| GOTERM_MF_FAT | GO:0008233 ~ peptidase activity | 9 | 3.147 | 0.215 |
| GOTERM_BP_FAT | GO:0006508 ~ proteolysis | 12 | 4.196 | 0.242 |
| UP_SEQ_FEATURE | calcium-binding region:2 | 3 | 1.049 | 0.126 |
| INTERPRO | IPR002048:Calcium-binding EF-hand | 4 | 1.399 | 0.148 |
| UP_SEQ_FEATURE | calcium-binding region:1 | 3 | 1.049 | 0.157 |
| SMART | SM00054:EFh | 4 | 1.399 | 0.187 |
| UP_SEQ_FEATURE | domain:EF-hand 1 | 3 | 1.049 | 0.258 |
| UP_SEQ_FEATURE | domain:EF-hand 2 | 3 | 1.049 | 0.258 |
| INTERPRO | IPR018248:EF hand | 3 | 1.049 | 0.333 |
| GOTERM_BP_FAT | GO:0001824 ~ blastocyst development | 3 | 1.049 | 0.082 |
| GOTERM_BP_FAT | GO:0001701 ~ in utero embryonic development | 4 | 1.399 | 0.165 |
| GOTERM_BP_FAT | GO:0043009 ~ chordate embryonic development | 4 | 1.399 | 0.397 |
| GOTERM_BP_FAT | GO:0009792 ~ embryonic development ending in birth or egg hatching | 4 | 1.399 | 0.400 |
| KEGG_PATHWAY | bta05412:Arrhythmogenic right ventricular cardiomyopathy (ARVC) | 3 | 1.049 | 0.240 |
| KEGG_PATHWAY | bta05410:Hypertrophic cardiomyopathy (HCM) | 3 | 1.049 | 0.277 |
| KEGG_PATHWAY | bta05414:Dilated cardiomyopathy | 3 | 1.049 | 0.304 |
| GOTERM_MF_FAT | GO:0004672 ~ protein kinase activity | 9 | 3.147 | 0.213 |
| GOTERM_BP_FAT | GO:0006468 ~ protein amino acid phosphorylation | 9 | 3.147 | 0.291 |
| GOTERM_BP_FAT | GO:0016310 ~ phosphorylation | 9 | 3.147 | 0.447 |
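
The Count and % columns can be cross-checked against each other. Assuming % is the share of the submitted gene list annotated with each term (the table does not define the column, so this is an assumption), dividing Count by the fraction recovers the implied list size. The short check below uses four rows copied from the table.

```python
# (Count, %) pairs taken from four rows of the table above.
rows = [(6, 2.098), (3, 1.049), (19, 6.643), (40, 13.986)]

# Implied gene list size if % = 100 * Count / list size.
implied = [round(count / (pct / 100)) for count, pct in rows]
print(implied)
```

All four rows point to the same list size, which suggests the percentages share a single denominator of roughly 286 genes.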