| Literature DB >> 31369641 |
Shan Lin1, Hongyan Zhang1, Yali Hou2, Lin Liu3, Wenhui Li1, Jianping Jiang1, Bo Han1, Shengli Zhang1, Dongxiao Sun1.
Abstract
We have sequenced the whole genomes of eight proven Holstein bulls from the four half-sib or full-sib families with extremely high and low estimated breeding values (EBV) for milk protein percentage (PP) and fat percentage (FP) using Illumina re-sequencing technology. Consequently, 2.3 billion raw reads were obtained with an average effective depth of 8.1×. After single nucleotide variant (SNV) calling, total 10,961,243 SNVs were identified, and 57,451 of them showed opposite fixed sites between the bulls with high and low EBVs within each family (called as common differential SNVs). Next, we annotated the common differential SNVs based on the bovine reference genome, and observed that 45,188 SNVs (78.70%) were located in the intergenic region of genes and merely 11,871 SNVs (20.67%) located within the protein-coding genes. Of them, 13,099 common differential SNVs that were within or close to protein-coding genes with less than 5 kb were chosen for identification of candidate genes for milk compositions in dairy cattle. By integrated analysis of the 2,657 genes with the GO terms and pathways related to protein and fat metabolism, and the known quantitative trait loci (QTLs) for milk protein and fat traits, we identified 17 promising candidate genes: ALG14, ATP2C1, PLD1, C3H1orf85, SNX7, MTHFD2L, CDKN2D, COL5A3, FDX1L, PIN1, FIG4, EXOC7, LASP1, PGS1, SAO, GPLD1 and MGEA5. Our findings provided an important foundation for further study and a prompt for molecular breeding of dairy cattle.Entities:
Year: 2019 PMID: 31369641 PMCID: PMC6675115 DOI: 10.1371/journal.pone.0220629
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of the sequencing result and SNV counts for 8 extreme Holstein bulls.
| Sib-family | Sample | Raw reads | Mapped reads | Mapped reads (%) | Uniquely mapped reads (%) | Genome coverage (%) | Sequencing depth(X) | SNV |
|---|---|---|---|---|---|---|---|---|
| full-sib1 | high 1 | 289,952,310 | 261,783,075 | 92.01 | 82.9 | 98.58 | 8 | 4,925,685 |
| low 1 | 286,870,238 | 252,201,294 | 90.24 | 80.88 | 98.55 | 8 | 4,636,009 | |
| full-sib2 | high 2 | 292,878,886 | 257,840,281 | 91.28 | 82.97 | 98.59 | 8 | 5,075,588 |
| low 2 | 272,948,496 | 241,748,531 | 91.48 | 81.69 | 98.52 | 8 | 4,166,954 | |
| half-sib1 | high 3 | 251,953,446 | 218,677,291 | 89.03 | 81.15 | 98.34 | 7 | 4,306,882 |
| low 3 | 337,815,303 | 314,651,325 | 96.73 | 83.75 | 98.4 | 10 | 5,403,284 | |
| half-sib2 | high 4 | 288,003,254 | 253,311,627 | 91.01 | 82.45 | 98.61 | 8 | 4,572,191 |
| low 4 | 283,359,516 | 255,124,411 | 91.89 | 85.16 | 98.57 | 8 | 3,399,107 | |
| average | 287,972,681 | 256,917,229 | 91.71 | 82.62 | 98.52 | 8.1 | 4,560,713 | |
Annotation of 10,945,507 SNVs across 8 bulls.
| Category | Number of SNP | Percentage of SNP %1 |
|---|---|---|
| intergenic | 8,715,765 | 79.63 |
| upstream | 33,515 | 0.31 |
| downstream | 36,840 | 0.34 |
| upstream and downstream | 624 | 0.01 |
| 3’ UTR | 22,495 | 0.21 |
| 5’ UTR | 3,832 | 0.04 |
| ncRNA__exonicd | 216 | 0.002 |
| ncRNA__intronic | 2357 | 0.02 |
| intronic | 2,094,358 | 19.13 |
| exonic | 0.32 | |
| nonsynonymous | 11,843 | |
| synonymous | 19,617 | |
| stop gain | 75 | |
| stop loss | 5 | |
| unknown | 3,965 |
1Percentage was calculated based on total annotated SNVs.
aupstream from the nearest gene (<1kb).
bdownstream away from the nearest gene (<1kb).
cvariant located in both upstream and downstream regions for two different genes (<1kb).
dnon-coding RNA expressed within exon of a gene
enon-coding RNA expressed within intron of a gene
Fig 1Distribution of the 10,945,507 annotated SNVs according to the functional category.
After annotation of all SNVs among eight bulls, we found 8,715,765 intergenic SNVs (away from protein-coding genes more than 1 kb), 70,979 up/downstream SNVs, 2,573 ncRNA SNVs, 2,094,358 intronic SNVs, 26,327 untranslated regions (UTRs), 19,617 synonymous SNVs, 11,843 nonsynonymous substitutions, 75 stop gain SNVs, 5 stop loss SNVs and 3,965 unknown.
Fig 2The number of common differential SNVs in each chromosome.
Each point represents the location of a SNV on chromosome and the number above every chromosome represents the counts of SNV in this chromosome.
Annotation of 57,419 common differential SNVs across 8 bulls.
| Category | Number of SNVs | Percentage of SNVs %1 |
|---|---|---|
| intergenic | 45,188 | 78.70 |
| upstream | 161 | 0.28 |
| downstream | 193 | 0.34 |
| upstream;downstream | 4 | 0.01 |
| 3’UTR | 140 | 0.24 |
| 5’UTR | 14 | 0.02 |
| ncRNA_exonic | 2 | 0.003 |
| intronic | 11,541 | 20.10 |
| exonic | 0.31 | |
| nonsynonymous | 45 | |
| synonymous | 116 | |
| unknown | 15 |
1Percentage was calculated based on annotated 57,419 common differential SNVs.
aupstream from the nearest gene (<1kb).
bdownstream away from the nearest gene (<1kb).
cvariant located in both upstream and downstream regions for two different genes (<1kb).
dnon-coding RNA expressed within exon of a gene
Fig 3Distribution of the common differential SNVs in nine functional categories.
After annotation, we found 45,188 intergenic SNVs (away from protein coding genes more than 1kb), 358 up/downstream SNVs, 2 ncRNA SNVs, 11,541 intronic SNVs, 154 untranslated regions (UTRs), 116 synonymous SNVs, 45 nonsynonymous substitutions and 15 unknown.
Annotation of 13,099 common differential SNVs that were included or nearby 2,657 protein-coding genes.
| Category | Number of SNVs | Percentage of SNVs % |
|---|---|---|
| intregenic | 916 | 6.99 |
| upstream | 161 | 1.23 |
| downstream | 190 | 1.45 |
| upstream;downstream | 4 | 0.03 |
| 3’UTR | 140 | 0.11 |
| 5’UTR | 14 | 1.07 |
| intronic | 11,498 | 87.78 |
| exonic | 1.34 | |
| nonsynonymous | 45 | |
| synonymous | 116 | |
| unknown | 15 |
1Percentage was calculated based on 13,099 SNVs.
aupstream from the nearest gene (<1kb).
bdownstream away from the nearest gene (<1kb).
cvariant located in both upstream and downstream regions for two different genes (<1kb).
Fig 4Distribution of the 13,099 SNVs in eight functional categories.
A total of 13,099 SNVs was included or nearby the protein-coding genes with less than 5 kb. Of these, 916 SNVs were located in intergenic region (away from protein coding genes more than 1kb), 355 in up/downstream, 11,498 in intron, 154 untranslated regions (UTRs), 176 in exon (116 synonymous SNVs, 45 nonsynonymous substitutions and 15 unknown).
The information about the SNV allele in high and low groups of 17 candidate genes.
| Gene | Gene start | Gene end | Chromosome | Location of SNV | Physical position of SNV | SNV allele in high group | SNV allele in low group |
|---|---|---|---|---|---|---|---|
| ALG14 | 48600669 | 48699217 | 1 | UTR5 | 48600677 | T | C |
| ATP2C1 | 140368052 | 140522627 | 1 | exonic | 140375966 | A | G |
| 1 | exonic | 140388073 | A | G | |||
| PLD1 | 96517508 | 96676253 | 1 | exonic | 96594836 | C | T |
| C3H1orf85 | 14558546 | 14561400 | 3 | upstream | 14558465 | T | C |
| SNX7 | 44657521 | 44775833 | 3 | downstream | 44656702 | A | G |
| MTHFD2L | 90842884 | 90986715 | 6 | downstream | 90987596 | A | G |
| CDKN2D | 16298106 | 16300588 | 7 | UTR3 | 16298320 | G | T |
| COL5A3 | 15768627 | 15813561 | 7 | exonic | 15769886 | G | A |
| FDX1L | 16068661 | 16073072 | 7 | UTR3 | 16068671 | C | G |
| PIN1 | 15532997 | 15545250 | 7 | UTR3 | 15544947 | C | T |
| 7 | UTR3 | 15544948 | C | T | |||
| FIG4 | 40873044 | 41046003 | 9 | exonic | 40957415 | G | A |
| EXOC7 | 56190845 | 56208380 | 19 | exonic | 56206567 | G | A |
| LASP1 | 40090994 | 40131374 | 19 | UTR3 | 40129782 | C | G |
| 19 | downstream | 40131920 | A | G | |||
| PGS1 | 54404339 | 54441590 | 19 | downstream | 54404267 | T | C |
| SAO | 43555807 | 43559785 | 19 | downstream | 43560755 | A | G |
| 19 | downstream | 43560756 | A | C | |||
| GPLD1 | 32984455 | 33036038 | 23 | upstream | 32984009 | C | T |
| MGEA5 | 22390769 | 22417775 | 26 | upstream | 22418458 | A | G |
1genomic coordinates of genes were based on Bos_taurus_UMD_3.1
The detailed information of 17 candidate genes and the related QTLs.
| Gene | Position(bp) | Position(cM) | Previously reported QTL | |||
|---|---|---|---|---|---|---|
| Distance to QTL peak (cM) | CI and peak location(cM) | Trait | Reference | |||
| Chr1:48600669–48699217 | Chr3:48.4 | 0 | 48.37–48.37(peak:48.37) | PY | Marete et al., Frontiers in genetics, 2018[ | |
| Chr1:140368052–140522627 | Chr1:134.9 | 0.3 | 134.24–135.04 (peak:134.64) | PP | Russo et al., Animal genetics, 2012[ | |
| Chr1:96517508–96676253 | Chr1:93.7 | 0.2 | 77.7–122.3(peak:93.5) | PY | Nadesalingam et al., Mammalian genome,2001[ | |
| Chr3:14558546–14561400 | Chr3:24.3 | 0.7 | 22.6–27.4(peak:25) | FY,PP | Ashwell et al., J Dairy Sci,2004[ | |
| Chr3:44657521–44775833 | Chr3:46.7 | 0.8 | 45.93–45.93(peak:45.95) | FY | Cole JB et al., BMC Genomics, 2011[ | |
| 45.93–45.93(peak:45.95) | PY | Cole JB et al., BMC Genomics, 2011[ | ||||
| Chr6:90842884–90986715 | Chr6:100 | 0.9 | 100.09–100.09 (peak:100.9) | PP | Olsen HG et al., Genetics, selection, evolution : GSE, 2016[ | |
| 0.4 | 100.28–100.49 (peak:100.38) | PP | Zhou Y et al., BMC genomics, 2018[ | |||
| Chr7:16298106–16300588 | Chr7:18.4 | 0.6 | 15.2–38.5(peak:17.8,15.9) | PP | Ron et al., Journal of dairy science, 2004[ | |
| Chr7:15768627–15813561 | Chr7:17.9 | 0.1 | 15.2–38.5(peak:17.8,15.9) | PP | Ron et al., Journal of dairy science, 2004[ | |
| Chr7:16068661–16073072 | Chr7:18.2 | 0.4 | 15.2–38.5(peak:15.9,17.8) | PP | Ron et al., Journal of dairy science,2004[ | |
| Chr7:15532997–15545250 | Chr7:17.5 | 0.3 | 15.2–38.5(peak:17.8,15.9) | PP | Ron et al., Journal of dairy science,2004[ | |
| Chr9:40873044–41046003 | Chr9:45 | 0.6 | 42.5-50(peak:44.4,49.1) | FY,PY | Schnabel et al., Animal Genetics, 2005[ | |
| Chr19:56190845–56208380 | Chr19:96.1 | 0.9 | 82.8–101.4(peak:95.2) | PY | Boichard et al., Genetics,selection,evolution:GSE, 2003[ | |
| Chr19:40090994–40131374 | Chr19:51.3 | 0.7 | 2.4-91(peak:60.4) | FP | Bennewitz et al., Genetics, 2004[ | |
| Chr19:54404339–54441590 | Chr19:93.1 | 0.7 | 60.7–106.2(PEAK:92.4) | FY | Boichard et al., Genetics,selection,evolution:GSE, 2003[ | |
| Chr19:43555807–43559785 | Chr19:74.5 | 0.5 | 70.24–77.386 (peak:75.0) | FP | Viitala SM et al., J Dairy Sci, 2003[ | |
| Chr23:32984455–33036038 | Chr23:44.2 | 0.6 | 42.4–58.2(peak:43.6) | FY | Plante et al.,J Dairy Sci,2001[ | |
| Chr26:22390769–22417775 | Chr26:32.5 | 0.8 | 31.72–31.72(peak:31.72) | PP | Cole JB, et al, BMC Genomics, 2011[ | |
1The linkage position was estimated relative to UMD3.1 and based on the QTL mapper v.2.019 at www.animalgenome.org/cgi-bin/QTLdb/.
PP: protein percentage; PY: protein yield; FP: fat percentage; FY: fat yield.