Lingyang Xu, Ryan J Haasl, Jiajie Sun, Yang Zhou, Derek M Bickhart, Junya Li, Jiuzhou Song, Tad S Sonstegard, Curtis P Van Tassell, Harris A Lewin, George E Liu.
Abstract
Short tandem repeats (STRs), or microsatellites, are genetic variants with repetitive 2–6 base pair motifs found in many mammalian genomes. Using high-throughput sequencing and experimental validation, we systematically profiled STRs in five Holsteins. We identified a total of 60,106 microsatellites and generated the first high-resolution STR map, representing a substantial pool of polymorphism in dairy cattle. We observed significant overlap of STRs with functional genes and quantitative trait loci (QTL). We performed evolutionary and population genetic analyses using over 20,000 common dinucleotide STRs. Besides corroborating the well-established positive correlation between allele size and variance in allele size, these analyses identified dozens of outlier STRs based on two anomalous relationships that run counter to the expected characteristics of neutral evolution. In addition, one STR locus overlaps a significant region of a summary statistic designed to detect STR-related selection. Our results also showed that only 57.1% of STRs were located within SNP-based linkage disequilibrium (LD) blocks, whereas the other 42.9% fell outside the blocks. Therefore, a substantial number of STRs are not tagged by SNPs in the cattle genome, likely because of the distinct mutation mechanism and elevated polymorphism of STRs. This study provides the foundation for future STR-based studies of cattle genome evolution and selection.
Year: 2017 PMID: 28172841 PMCID: PMC5381564 DOI: 10.1093/gbe/evw256
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. Genomic landscape of STRs on autosomes in five Holsteins. Tracks from outside to inside are: STR frequencies across five Holsteins; chromosomes in different colors; frequencies of 11,676 STRs overlapping genes; selected polymorphic genes; STR counts in each of 4,213 genes; allele size plot for 100 invariant microsatellites with allele size ≥20; variance in allele size for 44 dinucleotide loci that are diallelic with a maximum allele size at least twice the size of the alternate allele.
PCR Sanger Sequencing Results vs. lobSTR Results for Selected STRs
| No | STR | UMD3.1 Chr | UMD3.1 Begin | UMD3.1 End | UMD3.1 Motif | UMD3.1 (bp) | Animal | PCR A1 (bp) | PCR A2 (bp) | lobSTR A1 (bp) | lobSTR A2 (bp) | Call | Coverage | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | BM1818 | 23 | 39,294,224 | 39,294,249 | GT | 37 | Blackstar | 33 | 37 | 33 | 33 | P | 4 | 0.69 |
| 2 | | | | | | | Elevation | 35 | 37 | 37 | 37 | P | 10 | 2.39 |
| 3 | | | | | | | Ivanhoe | 33 | 37 | 37 | 37 | P | 2 | 0.30 |
| 4 | | | | | | | Chairman | 33 | 37 | 33 | 37 | Y | 30 | 0.00 |
| 5 | BM1824 | 1 | 132,498,006 | 132,498,034 | CA | 29 | Blackstar | 29 | 35 | 33 | 35 | P | 4 | 2.35 |
| 6 | | | | | | | Elevation | 29 | 35 | 27 | 35 | P | 10 | 2.39 |
| 7 | BM2113 | 2 | 127,591,877 | 127,591,917 | AC | 42 | Elevation | 26 | 48 | 28 | 38 | N | 14 | 5.52 |
| 8 | | | | | | | Blackstar | 28 | 36 | 28 | 40 | P | 6 | 3.88 |
| 9 | ETH10 | 5 | 56,657,954 | 56,657,996 | CA | 41 | Elevation | 39 | 43 | 37 | 43 | P | 6 | 4.09 |
| 10 | ETH152 | 5 | 114,885,382 | 114,885,416 | TG | 37 | Blackstar | 33 | 35 | 35 | 35 | P | 14 | 4.17 |
| 11 | | | | | | | Elevation | 33 | 35 | 35 | 35 | P | 6 | 1.79 |
| 12 | ETH225 | 9 | 10,858,165 | 10,858,199 | CA | 34 | Elevation | 32 | 34 | 30 | 32 | N | 8 | 0.69 |
| 13 | | | | | | | Blackstar | 26 | 38 | 38 | 38 | P | 2 | 0.25 |
| 14 | ETH3 | 19 | 56,648,417 | 56,648,461 | AC | 46 | Blackstar | 42 | 52 | 44 | 52 | P | 6 | 6.00 |
| 15 | HAUT27 | 26 | 29,127,336 | 29,127,370 | GT | 39 | Elevation | 29 | 41 | 31 | 31 | N | 2 | 0.37 |
| 16 | ILSTS006 | 7 | 96,709,240 | 96,709,279 | TG | 43 | Elevation | 39 | 43 | 35 | 41 | N | 6 | 4.12 |
| 17 | | | | | | | Blackstar | 39 | 43 | 43 | 43 | P | 2 | 0.37 |
| 18 | INRA023 | 3 | 33,011,005 | 33,011,044 | CA | 43 | Chairman | 31 | 37 | 31 | 31 | P | 28 | 3.62 |
| 19 | | | | | | | Blackstar | 31 | 35 | 31 | 35 | Y | 6 | 2.82 |
| 20 | | | | | | | Elevation | 35 | 39 | 35 | 39 | Y | 6 | 2.87 |
| 21 | | | | | | | Ivanhoe | 31 | 39 | 31 | 39 | Y | 18 | 4.06 |
| 22 | INRA037 | 10 | 76,365,534 | 76,365,555 | TG | 32 | Blackstar | 31 | 38 | 31 | 31 | P | 8 | 1.94 |
| 23 | INRA063 | 18 | 40,699,867 | 40,699,892 | GT | 35 | Blackstar | 27 | 27 | 27 | 27 | Y | 4 | 0.80 |
| 24 | STR_chr8 | 8 | 38,693,251 | 38,693,297 | ACT | 46 | Blackstar | 39 | 39 | 39 | 39 | Y | 4 | 3.11 |
| 25 | | | | | | | Elevation | 39 | 39 | 39 | 39 | Y | 14 | 3.74 |
| 26 | STR_chr7 | 7 | 2,965,462 | 2,965,506 | ATCC | 44 | Blackstar | 44 | 44 | 44 | 44 | Y | 4 | 1.23 |
| 27 | | | | | | | Elevation | 44 | 48 | 44 | 48 | Y | 14 | 5.70 |
| 28 | STR_chr10 | 10 | 37,414,396 | 37,414,434 | ATCC | 38 | Blackstar | 38 | 38 | 38 | 38 | Y | 2 | 0.70 |
| 29 | | | | | | | Elevation | 38 | 38 | 38 | 38 | Y | 10 | 3.01 |
| 30 | STR_chr16 | 16 | 81,503,345 | 81,503,411 | ATCC | 66 | Blackstar | 30 | 66 | 30 | 66 | Y | 4 | big |
| 31 | | | | | | | Elevation | 30 | 30 | 30 | 30 | Y | 14 | 4.21 |
| 32 | STR_chr19 | 19 | 46,403,668 | 46,403,701 | ATCC | 33 | Blackstar | 33 | 37 | 33 | 37 | Y | 12 | big |
| 33 | | | | | | | Elevation | 33 | 33 | 33 | 33 | Y | 10 | 3.01 |
| 34 | STR_chr24 | 24 | 34,465,896 | 34,465,926 | AGAT | 30 | Blackstar | 30 | 30 | 30 | 30 | Y | 2 | 0.70 |
| 35 | | | | | | | Elevation | 26 | 26 | 30 | 30 | N | 8 | 2.41 |
Note.—Y: Both platforms agree. P: lobSTR reported only one allele out of two. N: lobSTR reported an allele that does not exist.
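The three call categories can be illustrated with a minimal Python sketch. The function name `classify_call` and the exact decision rule are assumptions made for illustration: the rule below is one simplified reading of the note, not the authors' procedure, and the published calls need not follow it in every case (row 12, for instance, shares one allele between platforms but is listed as N).

```python
from collections import Counter


def classify_call(pcr, lobstr):
    """Return 'Y', 'P', or 'N' for one animal at one STR locus.

    pcr and lobstr are (A1, A2) allele lengths in bp.
    Assumed rule (hypothetical, a simplified reading of the table note):
      Y: both genotypes are identical as unordered pairs
      P: the genotypes differ but share at least one allele
      N: no lobSTR allele matches any PCR allele
    """
    pcr_counts, lob_counts = Counter(pcr), Counter(lobstr)
    if pcr_counts == lob_counts:
        return "Y"
    shared = pcr_counts & lob_counts  # multiset intersection
    return "P" if shared else "N"


# Allele pairs taken from rows 4, 1, and 7 of the table.
examples = [
    ((33, 37), (33, 37)),
    ((33, 37), (33, 33)),
    ((26, 48), (28, 38)),
]
calls = [classify_call(pcr, lob) for pcr, lob in examples]
```

Comparing unordered pairs means the order in which A1 and A2 are reported does not affect the result.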
Fig. 2. Mean variance in allele size versus maximum allele size for 19,338 dinucleotide STRs with maximum allele size ≤30. Error bars are standard errors on the estimate of the mean variance in allele size.
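The summary plotted in this figure can be sketched as follows. This is a hedged illustration, not the authors' code: the function name `summarize_by_max_size`, the input layout (one list of allele sizes per locus), and the use of the sample variance are all assumptions.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev, variance


def summarize_by_max_size(loci):
    """Group loci by maximum allele size and summarize variance in allele size.

    loci: iterable of lists, each holding the allele sizes observed at one
    locus (hypothetical input format; at least two alleles per locus).
    Returns {max_size: (mean_variance, standard_error)}.
    """
    groups = defaultdict(list)
    for alleles in loci:
        # Sample variance of allele size at this locus (assumed estimator).
        groups[max(alleles)].append(variance(alleles))

    summary = {}
    for size, variances in sorted(groups.items()):
        n = len(variances)
        # Standard error of the mean; undefined for a single locus.
        se = stdev(variances) / sqrt(n) if n > 1 else float("nan")
        summary[size] = (mean(variances), se)
    return summary


# Toy data only; these values are not from the study.
toy_loci = [[10, 10, 12, 12], [10, 11, 12, 12], [8, 8, 8, 8]]
summary = summarize_by_max_size(toy_loci)
```

Each entry of `summary` corresponds to one point and its error bar in a plot of this kind.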
Fig. 3. (A) The number of alleles for loci with maximum allele sizes on the interval [20,34]. Only 1.7% of these loci show no variation, which is unexpected for alleles of this size. (B) Kernel density estimate of variance in allele size of the same loci summarized in (A). Median variance in allele size was 3.43 (vertical, dashed line). Only 1.7% of loci possessed variance in allele size of 0. (C) Kernel density estimate of variance in allele size for all 19,338 dinucleotide loci analysed. Of these, 44 (0.8%) were diallelic with a large allele size at least two times as great as the small allele size. These loci show very high variance in allele size; all 44 loci possess variance in allele size >4.9, indicated by the vertical, dashed line.
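The two outlier patterns described in this caption can be expressed as simple filters. The sketch below is illustrative only: the function names, the default thresholds as parameters, and the input layout (a list of allele sizes per locus) are assumptions, and the authors' actual filtering steps may differ.

```python
def is_invariant_long(alleles, min_size=20):
    """True if the locus shows a single allele of size >= min_size."""
    return len(set(alleles)) == 1 and alleles[0] >= min_size


def is_diallelic_size_outlier(alleles, ratio=2.0):
    """True if exactly two distinct alleles are seen and the larger one
    is at least `ratio` times the size of the smaller one."""
    distinct = sorted(set(alleles))
    return len(distinct) == 2 and distinct[1] >= ratio * distinct[0]


# Toy loci only; these values are not from the study.
toy = {
    "locus_a": [22, 22, 22, 22],  # invariant, long
    "locus_b": [8, 8, 20, 20],    # diallelic, 20 >= 2 * 8
    "locus_c": [8, 12, 20],       # three alleles, neither pattern
}
flags = {
    name: (is_invariant_long(a), is_diallelic_size_outlier(a))
    for name, a in toy.items()
}
```

Both patterns are anomalous under the positive relationship between allele size and variance in allele size: the first shows no variation where variation is expected, and the second shows very high variance from only two alleles.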
Fig. 4. Biplot of the first two principal components based on an analysis of genetic variation at 19,338 dinucleotide STR loci.