Derek M Bickhart, Lingyang Xu, Jana L Hutchison, John B Cole, Daniel J Null, Steven G Schroeder, Jiuzhou Song, Jose Fernando Garcia, Tad S Sonstegard, Curtis P Van Tassell, Robert D Schnabel, Jeremy F Taylor, Harris A Lewin, George E Liu.
Abstract
The diversity and population genetics of copy number variation (CNV) in domesticated animals are not well understood. In this study, we analysed 75 genomes of major taurine and indicine cattle breeds (including Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, and Romagnola), sequenced to 11-fold coverage, and identified 1,853 non-redundant CNV regions. Supported by high validation rates in array comparative genomic hybridization (CGH) and qPCR experiments, these CNV regions accounted for 3.1% (87.5 Mb) of the cattle reference genome, a significant increase over previous estimates of the proportion of the genome that is copy number variable (∼2%). Further population genetics and evolutionary genomics analyses based on these CNVs revealed the population structures of the taurine and indicine cattle breeds and uncovered potentially diversely selected CNVs near important functional genes, including AOX1, ASZ1, GAT, GLYAT, and KRTAP9-1. Additionally, 121 CNV gene regions were found to be either breed specific or differentially variable across breeds, such as RICTOR in dairy breeds and PNPLA3 in beef breeds. In contrast, clusters of the PRP and PAG genes were found to be duplicated in all sequenced animals, suggesting that subfunctionalization, neofunctionalization, or overdominance plays a role in diversifying those fertility-related genes. These CNV results provide a new glimpse into the diverse selection histories of cattle breeds and a basis for correlating structural variation with complex traits in the future.
Published by Oxford University Press on behalf of Kazusa DNA Research Institute 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Keywords: cattle genome; copy number variation; indicine; population sequencing; taurine
Year: 2016 PMID: 27085184 PMCID: PMC4909312 DOI: 10.1093/dnares/dsw013
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Samples and sequence data sets
| Breed | Subspecies | Purpose | Animal count | Coverage range | CNV count | Average CNVs per animalᵃ |
|---|---|---|---|---|---|---|
| Brahman (BRM) | Indicine | Beef | 7 | 5–9× | 3,836 | 548 (86) |
| Gir (GIR) | Indicine | Beef/dairy | 6 | 5–14× | 3,724 | 621 (30) |
| Nelore (NEL) | Indicine | Beef | 8 | 6–20× | 4,855 | 607 (38) |
| Angus (ANG) | Taurine | Beef | 16 | 5–30× | 11,657 | 729 (52) |
| Holstein (HOL) | Taurine | Dairy | 22 | 4–20× | 12,430 | 565 (80) |
| Jersey (JER) | Taurine | Dairy | 6 | 4–13× | 3,487 | 581 (46) |
| Limousin (LIM) | Taurine | Beef | 6 | 5–10× | 3,650 | 608 (48) |
| Romagnola (ROM) | Taurine | Beef/draft | 4 | 6–10× | 2,708 | 677 (20) |
ᵃ Numbers in parentheses indicate 1 SD.
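The last column of the table appears to be the breed's CNV count divided by its animal count, rounded to the nearest whole number. The short check below is only a sketch under that assumption; the dictionary layout and variable names are illustrative and not taken from the study.

```python
# Values copied from the table: breed code -> (animal count, CNV count, reported average).
# The dictionary layout and variable names are illustrative assumptions.
table = {
    "BRM": (7, 3836, 548),
    "GIR": (6, 3724, 621),
    "NEL": (8, 4855, 607),
    "ANG": (16, 11657, 729),
    "HOL": (22, 12430, 565),
    "JER": (6, 3487, 581),
    "LIM": (6, 3650, 608),
    "ROM": (4, 2708, 677),
}

# Assumed relationship: average CNVs per animal = CNV count / animal count.
computed = {breed: round(cnvs / animals) for breed, (animals, cnvs, _) in table.items()}
reported = {breed: avg for breed, (_, _, avg) in table.items()}
total_animals = sum(animals for animals, _, _ in table.values())

for breed in table:
    print(breed, computed[breed], reported[breed])
print("animals in total:", total_animals)
```

Under this assumption the computed averages agree with the reported ones for every breed, and the animal counts sum to the 75 genomes mentioned in the abstract.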
Figure 1. Population differentiation for copy number variation. Population differentiation, estimated by VST, is plotted along each chromosome for the taurine and indicine comparisons: (A) RefSeq genes and (B) genome-wide 1 kb windows. Example CNVs exhibiting high population differentiation are labelled. This figure is available in black and white in print and in colour at DNA Research online.
Figure 2. Population clustering based on CNV genotypes. A triangle plot showing the clustering of 69 cattle with low relatedness, assuming three ancestral populations (k = 3). The proximity of an individual to each apex of the triangle indicates the proportion of that genome that is estimated to have ancestry in each of the three inferred ancestral populations. The clustering of most indicine individuals (BRM, GIR, NEL) at the bottom right apex indicates a clear discrimination between indicine and taurine cattle. In contrast, taurine cattle are scattered along the opposing side, with the exception of ROM in the centre. ANG individuals clustered together at the upper apex, while the other taurine cattle (HOL, LIM, JER) were dispersed around the bottom left corner, suggesting a possible discrimination between beef and dairy cattle. This figure is available in black and white in print and in colour at DNA Research online.
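To make the triangle plot concrete: each individual's three ancestry proportions can be treated as barycentric weights on the three apexes, so an individual with most of its ancestry in one inferred population lands near that apex. The sketch below is a generic illustration of that mapping, not the plotting code used for the figure; the apex positions, the function name `ternary_xy`, and the example proportions are all assumptions.

```python
import math

# Assumed apex layout of an equilateral triangle: bottom left, bottom right, top.
# Which inferred population sits at which apex is arbitrary in this sketch.
APEXES = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def ternary_xy(proportions):
    """Map three ancestry proportions to an (x, y) point inside the triangle."""
    if len(proportions) != 3:
        raise ValueError("expected exactly three ancestry proportions")
    total = sum(proportions)
    if total <= 0 or min(proportions) < 0:
        raise ValueError("proportions must be non-negative with a positive sum")
    # Normalise so the weights sum to 1, then take the weighted mean of the apexes.
    weights = [p / total for p in proportions]
    x = sum(w * ax for w, (ax, _) in zip(weights, APEXES))
    y = sum(w * ay for w, (_, ay) in zip(weights, APEXES))
    return x, y


# Hypothetical individual with 80% of its ancestry in the second population.
print(ternary_xy([0.1, 0.8, 0.1]))
```

An individual with equal proportions in all three populations falls at the centroid of the triangle, which is how a breed positioned "in the centre" would be read.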
Selected copy number variable genes identified from population sequence data
| Gene name | Function | Gene UMD3.1 coordinates | VSTᵃ | Identifiedᵇ |
|---|---|---|---|---|
| | Detoxification | chr2:89517708-89589232 | 0.5094 | Hou, Bickhart, and this study |
| | Spermatogenesis | chr4:51294534-51370343 | 0.2109 | Only this study |
| | Carbonic anhydrase | chr14:79520632-79530892 | 0.3270 | Hou, Bickhart, and this study |
| | Complement factor | chr16:5486704-6172566 | 0.0483 | Hou, Bickhart, and this study |
| | Translation initiation | chr28:25376358-25399769 | 0.2375 | Only this study |
| | Translation initiation | chr29:7723699-7725004 | 0.3285 | Only this study |
| | Ubiquitin protein ligase | chr8:10095869-10128675 | 0.3558 | Bickhart and this study |
| | Nervous system | chr8:10002971-10091175 | 0.3334 | Bickhart and this study |
| | Glycolipid catalysis | chr17:71660016-71678806 | 0.2435 | Hou and this study |
| | Detoxification | chr15:83472190-83493607 | 0.4336 | Liu, Hou, Bickhart, and this study |
| | Detoxification | chr15:83455512-83469280 | 0.4083 | Liu, Hou, Bickhart, and this study |
| | Biological oxidation | chr15:83508339-83515102 | 0.2257 | Bickhart and this study |
| | Keratin family | chr19:42101853-42103421 | 0.4578 | Bickhart and this study |
| | Function unknown | chr20:38116509-38163145 | 0.2573 | Only this study |
| | Progesterone receptor | chr15:8207682-8222806 | 0.0103 | Only this study |
| | Adipose tissue regulation | chr29:50742384-50747161 | 0.0000 | Only this study |
| | Carbohydrate binding | chr15:81920283-81926082 | 0.2041 | Liu, Bickhart, and this study |
| | MHC class 1 related | chr9:88231932-88402262 | 0.0000 | Liu, Hou, Bickhart, and this study |
| | Cell growth | chr20:35376523-35514753 | 0.1048 | Only this study |
| | Vesicle transport | chr21:49489514-49555507 | 0.3050 | Only this study |
| | Protease inhibitor | chr24:62364701-62371668 | 0.2418 | Liu, Hou, Bickhart, and this study |
| | Transcriptional activation | chr20:41122022-41143914 | 0.2282 | Only this study |
| | Secretory vesicle transport | chr17:54330420-54338333 | 0.0000 | Only this study |
| | Ubiquitin | chr6:71051155-71053533 | 0.5121 | Only this study |
| | Negative regulation of p53 | chr17:51251538-51262528 | 0.2658 | Liu, Hou, and this study |
ᵃ VST was calculated from the comparison between the taurine and indicine individuals.
ᵇ Liu, Hou, and Bickhart: comparisons were made with the published CNV results based on the same bovine HapMap samples using array CGH,[10] BovineHD SNP array,[53] and individual NGS,[16] respectively.
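The table reports VST for the taurine versus indicine comparison, but this record does not give the formula. A commonly used definition is VST = (VT − VS) / VT, where VT is the variance of copy number across all individuals and VS is the size-weighted average of the within-group variances. The sketch below assumes that definition and population (not sample) variance; the function name `vst` and the copy number values are hypothetical and are not data from the study.

```python
from statistics import pvariance


def vst(group_a, group_b):
    """V_ST = (V_T - V_S) / V_T for two groups of per-individual copy numbers.

    Assumes the common definition with population variance; the study's exact
    estimator is not stated in this record.
    """
    if not group_a or not group_b:
        raise ValueError("both groups need at least one individual")
    pooled = list(group_a) + list(group_b)
    v_total = pvariance(pooled)
    if v_total == 0:
        return 0.0  # no copy number variation at all, so no differentiation
    n_a, n_b = len(group_a), len(group_b)
    # Within-group variance, weighted by the number of individuals in each group.
    v_within = (pvariance(group_a) * n_a + pvariance(group_b) * n_b) / (n_a + n_b)
    return (v_total - v_within) / v_total


# Hypothetical copy number estimates for one gene (not values from the study).
taurine = [2, 2, 2, 3, 2]
indicine = [4, 5, 4, 4, 5]
print(round(vst(taurine, indicine), 3))
```

Values near 1 indicate that most of the copy number variance lies between the two groups, while values near 0 indicate that the groups have similar copy number distributions.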
Figure 3. Cattle gene family copy number diversity and evolution. (A) The genes most stratified by copy number on the basis of VST analysis of taurine and indicine cattle. The most copy number variable genes in both taurine and indicine subspecies (legend insets denote group colours) tended to be immune system-related genes. Histograms showing the distributions of copy numbers among the unrelated individuals in each group are plotted for (B) the KRTAP9-1 gene and (C) the MCM4 gene. X-axis values indicate copy number and Y-axis values indicate sample count. Individual copy number values for each gene can be found in Supplementary Table S7. This figure is available in black and white in print and in colour at DNA Research online.
Figure 4. Haplotype networks of two loci. (A) The GAT/GLYAT locus and (B) the ASZ1 locus. Each node represents a different haplotype, with the size of the circle proportional to frequency. Circles are colour coded according to breeds. This figure is available in black and white in print and in colour at DNA Research online.