Qi Bao1,2, Xiaoming Ma1,2, Congjun Jia3, Xiaoyun Wu1,2, Yi Wu1, Guangyao Meng1,2, Pengjia Bao1,2, Min Chu1,2, Xian Guo1,2, Chunnian Liang1,2, Ping Yan1,2.
Abstract
Tianzhu white yak is a rare local yak breed in China with a pure white coat. In recent years, breeders have discovered long-haired individuals characterized by long hair on the forehead in the Tianzhu white yak, and the length and density of the hair on these two parts of the body are greater than in the normal Tianzhu white yak. To elucidate the genetic mechanism of hair length in Tianzhu white yak, we re-sequenced the whole genomes of long-haired Tianzhu white yak (LTWY) (n = 10) and normal Tianzhu white yak (NTWY) (n = 10). Five single-statistic methods, namely the fixation index (F_ST), θπ ratio, cross-population composite likelihood ratio (XP-CLR), integrated haplotype score (iHS), and cross-population extended haplotype homozygosity (XP-EHH), and one composite method, the de-correlated composite of multiple signals (DCMS), were then applied to discover loci and genes related to the long-haired trait. Based on the five single methods, we found two hotspots of 0.2 and 1.1 Mb in length on chromosome 6, annotated with two genes (FGF5, CFAP299) and four genes (ATP8A1, SLC30A9, SHISA3, TMEM33), respectively. Functional enrichment analysis of the genes in the two hotspots indicated that the Ras, MAPK, PI3K-Akt, and Rap1 signaling pathways were involved in hair length differences. In addition, the DCMS method identified four genes (ACOXL, PDPK1, MAGEL2, CDH1) associated with hair follicle development. Our work thus provides novel genetic insights into the mechanisms of hair growth in the LTWY.
Keywords: DCMS; long-haired trait; resequencing; selection signal; yak
Year: 2022 PMID: 35360871 PMCID: PMC8962741 DOI: 10.3389/fgene.2022.798076
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Summary statistics of NTWY and LTWY re-sequenced reads.
| Sample name | Number | Raw reads | Mapped reads | Properly paired reads | Average coverage | Average depth (fold) |
|---|---|---|---|---|---|---|
| NTWY | 10 | 1,544,406,470 | 1,518,525,222 | 1,428,802,730 | 98.33% | 7.58 |
| LTWY | 10 | 1,520,934,131 | 1,495,623,348 | 1,409,965,496 | 98.33% | 7.38 |
| Total | 20 | 3,065,340,601 | 3,014,148,570 | 2,838,768,226 | 98.33% | 7.48 |
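The counts in this table can be cross-checked with a few lines of arithmetic. The sketch below sums the two groups, compares the sums with the reported "Total" row, and derives the share of raw reads that were mapped and properly paired. These percentages are derived here and are not values reported in the table; the "Average coverage" column is a separate quantity. The helper names `column_totals` and `percent` are illustrative only.

```python
# Read counts copied from the summary table above.
READS = {
    "NTWY": {"raw": 1_544_406_470, "mapped": 1_518_525_222, "paired": 1_428_802_730},
    "LTWY": {"raw": 1_520_934_131, "mapped": 1_495_623_348, "paired": 1_409_965_496},
}
# The "Total" row as reported in the table.
REPORTED_TOTAL = {"raw": 3_065_340_601, "mapped": 3_014_148_570, "paired": 2_838_768_226}


def column_totals(reads):
    """Sum each read-count column over all groups."""
    keys = ("raw", "mapped", "paired")
    return {k: sum(group[k] for group in reads.values()) for k in keys}


def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


totals = column_totals(READS)

if __name__ == "__main__":
    print("Totals match the table:", totals == REPORTED_TOTAL)
    for name, row in {**READS, "Total": totals}.items():
        print(
            f"{name}: mapped {percent(row['mapped'], row['raw']):.2f}% of raw reads, "
            f"properly paired {percent(row['paired'], row['raw']):.2f}%"
        )
```

The group sums reproduce the reported totals exactly, which suggests the "Total" row is a plain sum of the NTWY and LTWY rows.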
Functional annotation of the identified single-nucleotide polymorphisms (SNPs) in NTWY and LTWY.
| Fields | | NTWY | LTWY | Total |
|---|---|---|---|---|
| Sample counts | | 10 | 10 | 20 |
| SNP count | | 15,331,905 | 15,124,083 | 16,708,655 |
| Ts/Tv ratio | | 2.497 | 2.496 | *** |
| Hom/Het | | 0.61 | 0.63 | *** |
| SNP types | | | | |
| Exon | Synonymous variant | 128,679 | 132,764 | 152,356 |
| | Initiator codon variant | 28 | 23 | 16 |
| | Start lost | 243 | 271 | 208 |
| | Start retained variant | 2 | 2 | 39 |
| | Stop gained | 1,840 | 1,780 | 1,857 |
| | Stop lost | 227 | 235 | 197 |
| | Stop retained variant | 130 | 123 | 190 |
| Splice site | Splice region variant | 25,046 | 25,556 | 26,836 |
| | Splice acceptor variant | 546 | 546 | 429 |
| | Splice donor variant | 724 | 725 | 629 |
| Intron | Intron variant | 13,948,749 | 13,869,588 | 15,421,077 |
| | Intragenic variant | 568 | 632 | 607 |
| UTR | 5 prime UTR variant | 24,657 | 25,413 | 23,110 |
| | 5 prime UTR premature start codon gain variant | 3,687 | 3,836 | 3,914 |
| | 3 prime UTR variant | 55,993 | 57,992 | 60,922 |
| Intergenic | Upstream gene variant | 1,174,257 | 1,194,925 | 1,277,565 |
| | Downstream gene variant | 1,190,326 | 1,209,439 | 1,296,616 |
| | Intergenic region | 9,907,119 | 9,727,032 | 10,736,862 |
| Functional classes | Missense | 106,774 | 107,373 | 209,965 |
| | Nonsense | 1,840 | 1,780 | 3,746 |
| | Silent | 128,813 | 132,891 | 252,509 |
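The Ts/Tv ratio in this table is the number of transitions (purine to purine or pyrimidine to pyrimidine changes) divided by the number of transversions. The minimal sketch below shows that calculation on a small, hypothetical list of (reference, alternate) alleles. It is not the annotation tool used to produce the table, and the function names are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True if ref -> alt stays within purines or within pyrimidines."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(snps):
    """Ts/Tv for an iterable of (ref, alt) single-nucleotide changes."""
    ts = tv = 0
    for ref, alt in snps:
        if ref.upper() == alt.upper():
            continue  # not a variant; ignore
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; ratio undefined")
    return ts / tv


# Hypothetical toy data: three transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
toy_ratio = ts_tv_ratio(toy_snps)  # 3 / 2
```

A ratio near 2.5, as reported for both groups, means transitions outnumber transversions by roughly five to two.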
FIGURE 1. Principal component analysis of NTWY and LTWY populations.
FIGURE 2. (A) Manhattan plot of five selection signals on chromosome 6. The two hotspots are marked with shading. (B) Genes identified in two hotspots. (C) Functional enrichment analysis of the 0.2-Mb hotspot. (D) Functional enrichment analysis of the 1.1-Mb hotspot.
Allele frequencies for missense mutations in the candidate genes identified in NTWY and LTWY.
| Sites | Gene | Amino acid variation | Allele frequency, NTWY (before mutation) | Allele frequency, NTWY (after mutation) | Allele frequency, LTWY (before mutation) | Allele frequency, LTWY (after mutation) | Genotype | Genotype frequency (NTWY) | Genotype frequency (LTWY) |
|---|---|---|---|---|---|---|---|---|---|
| c.302G>C | | Ser101Thr | 1.00 | 0.00 | 0.78 | 0.22 | CC | 1.00 | 0.67 |
| | | | | | | | CG | 0.00 | 0.22 |
| | | | | | | | GG | 0.00 | 0.11 |
| c.199G>A | | Ala67Thr | 0.90 | 0.10 | 1.00 | 0.00 | CC | 0.90 | 1.00 |
| | | | | | | | CT | 0.00 | 0.00 |
| | | | | | | | TT | 0.10 | 0.00 |
| c.958G>A | | Asp320Asn | 0.95 | 0.05 | 1.00 | 0.00 | CC | 0.90 | 1.00 |
| | | | | | | | CT | 0.10 | 0.00 |
| | | | | | | | TT | 0.00 | 0.00 |
| c.1360G>T | | Val454Leu | 1.00 | 0.00 | 0.85 | 0.15 | CC | 1.00 | 0.80 |
| | | | | | | | CA | 0.00 | 0.10 |
| | | | | | | | AA | 0.00 | 0.10 |
| c.2274C>A | | His758Gln | 0.00 | 1.00 | 0.06 | 0.94 | GG | 0.00 | 0.00 |
| | | | | | | | GT | 0.00 | 0.11 |
| | | | | | | | TT | 1.00 | 0.89 |
| c.325A>G | | Met109Val | 0.86 | 0.14 | 1.00 | 0.00 | TT | 0.90 | 1.00 |
| | | | | | | | TC | 0.00 | 0.00 |
| | | | | | | | CC | 0.10 | 0.00 |
| c.1579T>C | | Ser527Pro | 0.95 | 0.05 | 1.00 | 0.00 | AA | 0.90 | 1.00 |
| | | | | | | | AG | 0.10 | 0.00 |
| | | | | | | | GG | 0.00 | 0.00 |
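The allele frequencies in this table follow from the genotype frequencies: each homozygote contributes fully and each heterozygote contributes half. The sketch below reproduces that step for the c.302G>C site in LTWY, then computes a simple two-population F_ST from the resulting frequencies. The F_ST form used, (H_T - H_S) / H_T with equal weighting of the two groups, is an assumption chosen for illustration and is not necessarily the estimator the authors applied. The function names are hypothetical.

```python
def allele_freqs(hom_a, het, hom_b):
    """Allele frequencies (A, B) from genotype frequencies AA, AB, BB.

    Inputs are renormalised so rounded table values need not sum to 1.
    """
    total = hom_a + het + hom_b
    if total <= 0:
        raise ValueError("genotype frequencies must sum to a positive value")
    p = (hom_a + het / 2) / total
    return p, 1 - p


def fst_two_pops(p1, p2):
    """Simple F_ST = (H_T - H_S) / H_T for one biallelic site.

    Assumes two equally weighted populations with allele frequencies p1, p2.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # site is monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-group
    return (h_t - h_s) / h_t


# c.302G>C: genotype frequencies taken from the table above.
p_ntwy, _ = allele_freqs(1.00, 0.00, 0.00)
p_ltwy, _ = allele_freqs(0.67, 0.22, 0.11)
site_fst = fst_two_pops(p_ntwy, p_ltwy)
```

For this site the derived LTWY allele frequency is 0.78, which agrees with the value listed in the table.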
FIGURE 3. (A) Manhattan plot of the de-correlated composite of multiple signals (DCMS) q-values of NTWY (sky blue) and LTWY (yellow) populations. (B) Overlapping genes identified by the DCMS method between NTWY and LTWY. (C) Functional enrichment analysis of overlapping genes.
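As a rough illustration of how a DCMS score combines several selection statistics, the sketch below follows the commonly cited form of the method: each statistic contributes the log odds of its per-window p-value, down-weighted by the sum of its absolute correlations with all statistics (itself included). Several points here are assumptions: the correlations are computed on the p-values themselves, the natural logarithm is used, and the p-values are hypothetical. It does not reproduce the authors' pipeline, and it omits the conversion of scores to q-values.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def dcms(pvalues):
    """DCMS-style score per window.

    pvalues maps a statistic name to its list of per-window p-values,
    each strictly between 0 and 1; all lists must have the same length.
    """
    names = list(pvalues)
    n_windows = len(pvalues[names[0]])
    for name in names:
        if len(pvalues[name]) != n_windows:
            raise ValueError("all statistics need the same number of windows")
        if any(not 0 < p < 1 for p in pvalues[name]):
            raise ValueError("p-values must lie strictly between 0 and 1")
    # Weight of statistic t: 1 / sum_i |r_it| (the i == t term equals 1).
    weights = {
        t: 1.0 / sum(abs(pearson(pvalues[i], pvalues[t])) for i in names)
        for t in names
    }
    return [
        sum(weights[t] * math.log((1 - pvalues[t][w]) / pvalues[t][w]) for t in names)
        for w in range(n_windows)
    ]


# Hypothetical p-values for three windows and two statistics.
example = {
    "stat_a": [0.50, 0.20, 0.01],
    "stat_b": [0.40, 0.30, 0.02],
}
example_scores = dcms(example)
```

Under this weighting, two perfectly correlated statistics count as one, so redundant signals do not inflate the composite score.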