Lele Wang, Yanjun Zhang, Meng Zhao, Ruijun Wang, Rui Su, Jinquan Li.
Abstract
The goat (Capra hircus) is one of several economically important livestock species in China. Advances in molecular genetics have led to the identification of several single nucleotide variation markers associated with genes affecting economic traits. Validating single nucleotide variations identified by whole-transcriptome sequencing is critical for interpreting this molecular genetic information. In this study, we aimed to develop a large set of reliable single nucleotide polymorphisms (SNPs) for the Cashmere goat through transcriptome sequencing. The transcriptomes of Cashmere goat skin at four stages were profiled using RNA-sequencing, and 90% to 92% of total mapped reads were uniquely mapped. A total of 56,231 putative SNPs distributed among 10,057 genes were identified. The average minor allele frequency across all SNPs was 18%. GO and KEGG pathway analyses were conducted on the genes containing SNPs. Follow-up biological validation showed that 64% of the SNPs were true SNPs. Our results show that RNA-sequencing is a fast and efficient method for identifying a large number of SNPs. This work provides significant genetic resources for further research on Cashmere goats, especially for high-density linkage map construction and genome-wide association studies.
Keywords: Capra hircus; Goats; RNA Sequencing; Single Nucleotide Polymorphism; Transcriptome
Year: 2015 PMID: 26323515 PMCID: PMC4554862 DOI: 10.5713/ajas.15.0172
Source DB: PubMed Journal: Asian-Australas J Anim Sci ISSN: 1011-2367 Impact factor: 2.509
Quality control results and the high-quality clean reads of each sample
| Sample | QC | Raw data | Clean reads | Total mapped | Unique mapped |
|---|---|---|---|---|---|
| March | Q>30 | 37,009,564 | 30,271,778 (81.79%) | 24,382,712 (80.55%) | 22,514,780 (92.34%) |
| June | Q>30 | 33,605,950 | 30,112,074 (89.60%) | 24,280,948 (80.64%) | 22,352,163 (92.06%) |
| September | Q>30 | 38,272,220 | 32,450,593 (84.79%) | 25,307,393 (77.99%) | 23,083,567 (91.21%) |
| December | Q>30 | 31,993,598 | 28,435,878 (88.88%) | 23,184,085 (81.53%) | 21,398,690 (92.30%) |
QC, quality control.
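The percentages in the table appear to be computed stage by stage: clean reads relative to raw data, total mapped reads relative to clean reads, and unique mapped reads relative to total mapped reads. The following minimal Python sketch reproduces them from the counts above; the names `READ_COUNTS` and `stage_percentages` are illustrative and do not come from the paper.

```python
# Read counts copied from the quality-control table above.
# Tuple order: raw, clean, total mapped, unique mapped.
READ_COUNTS = {
    "March": (37_009_564, 30_271_778, 24_382_712, 22_514_780),
    "June": (33_605_950, 30_112_074, 24_280_948, 22_352_163),
    "September": (38_272_220, 32_450_593, 25_307_393, 23_083_567),
    "December": (31_993_598, 28_435_878, 23_184_085, 21_398_690),
}


def stage_percentages(raw, clean, total_mapped, unique_mapped):
    """Return each stage as a percentage of the preceding stage."""
    return (
        round(100 * clean / raw, 2),
        round(100 * total_mapped / clean, 2),
        round(100 * unique_mapped / total_mapped, 2),
    )


for sample, counts in READ_COUNTS.items():
    clean_pct, mapped_pct, unique_pct = stage_percentages(*counts)
    print(f"{sample}: clean {clean_pct}%, mapped {mapped_pct}%, unique {unique_pct}%")
```

Reading each percentage against the preceding column, rather than against raw data, is an inference from the numbers themselves; the table does not state the denominators explicitly.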
Figure 1. Quality score across all bases of March skin sample.
Figure 2. Single nucleotide polymorphism occurrence over coding sequence, intron, and intergenic genomic regions.
Classification of putative SNPs
| SNP classification | Num_putative SNPs |
|---|---|
| Inter-genic | 9,543 |
| Down_stream | 2,904 |
| Exon | 35,763 |
| Intron | 4,611 |
| Up_stream | 3,410 |
| Total | 56,231 |
SNPs, single nucleotide polymorphisms.
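As a quick consistency check, the five classes can be summed and expressed as shares of the reported total. This is a small sketch using only the counts in the table above; the variable names are illustrative.

```python
# Counts copied from the SNP classification table above.
SNP_CLASSES = {
    "Inter-genic": 9_543,
    "Down_stream": 2_904,
    "Exon": 35_763,
    "Intron": 4_611,
    "Up_stream": 3_410,
}
REPORTED_TOTAL = 56_231

total = sum(SNP_CLASSES.values())
assert total == REPORTED_TOTAL  # the five classes account for every putative SNP

for name, count in SNP_CLASSES.items():
    print(f"{name}: {count} ({100 * count / total:.1f}%)")
```

The classes sum exactly to 56,231, and exonic SNPs make up about 63.6% of the total, as would be expected for variants called from transcriptome data.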
Figure 3. Single nucleotide polymorphism (SNP) distribution among genes. The horizontal axis represents number of SNPs per gene.
The top twenty annotated genes with the highest SNP frequency
| Gene | Num_SNP | Num_per_kilo | Symbol | Product |
|---|---|---|---|---|
| goat_ENSP00000334922-D9_gene | 8 | 16.632 | LOC101113420 | Keratin-associated protein 4-7-like, transcript variant 2 |
| goat_ENSP00000396652_gene | 5 | 16.556 | MTPN | Myotrophin |
| goat_GLEAN_10007307_gene | 3 | 15.625 | - | - |
| goat_ENSBTAP00000024766-D2_gene | 4 | 14.76 | LOC101104203 | Keratin-associated protein 12-2-like |
| goat_GLEAN_10016337_gene | 6 | 12.987 | RPS9 | Ribosomal protein S9, transcript variant 2 |
| goat_ENSBTAP00000027844_gene | 3 | 12.048 | LOC101104441 | Interferon-induced transmembrane protein 3-like |
| goat_GLEAN_10010294_gene | 4 | 11.204 | B2M | Beta-2-microglobulin |
| goat_GLEAN_10008503_gene | 4 | 9.804 | LOC101108538 | Heterogeneous nuclear ribonucleoprotein A1-like |
| goat_GLEAN_10009934_gene | 6 | 9.662 | LOC101113430 | Major allergen I polypeptide chain 1-like |
| goat_GLEAN_10011623_gene | 2 | 9.662 | LOC101102454 | ATP synthase subunit, mitochondrial-like |
| goat_GLEAN_10009932_gene | 7 | 8.872 | LOC101120105 | Uncharacterized LOC101120105 |
| goat_GLEAN_10017954_gene | 4 | 8.772 | RPS13 | Ribosomal protein S13 |
| goat_GLEAN_10009936_gene | 5 | 8.696 | LOC101120105 | - |
| goat_GLEAN_10013078_gene | 5 | 8.562 | LOC101114018 | Elongation factor 1-alpha 1-like |
| goat_ENSBTAP00000001484-D8_gene | 4 | 8.421 | LOC101118260 | Olfactory receptor 6C3-like |
| goat_GLEAN_10009821_gene | 3 | 8.403 | RPS15 | Uncharacterized LOC101120105 |
| goat_ENSP00000360682_gene | 6 | 8.368 | - | - |
| goat_GLEAN_10020768_gene | 4 | 7.937 | WBP11 | WW domain binding protein 11 |
| goat_ENSBTAP00000052749-D6_gene | 3 | 7.874 | LOC101111915 | Histone H2B type 1-like, transcript variant 1 |
| goat_GLEAN_10021032_gene | 2 | 7.752 | AP4E1 | Adaptor-related protein complex 4, epsilon 1 subunit |
SNP, single nucleotide polymorphism; ATP, adenosine triphosphate; -, no result.
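The per-kilo column is not defined in this excerpt. Its values are consistent with the number of SNPs per 1,000 bp of sequence, and the sketch below works under that assumption. The sequence lengths it derives are back-calculated from the table and are not reported by the authors; the function names are hypothetical.

```python
def snps_per_kb(num_snps: int, length_bp: int) -> float:
    """SNP density expressed as SNPs per 1,000 bp of sequence."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return 1000 * num_snps / length_bp


def implied_length_bp(num_snps: int, density_per_kb: float) -> int:
    """Back-calculate the sequence length implied by a reported density."""
    return round(1000 * num_snps / density_per_kb)


# First three rows of the table: (Num_SNP, reported per-kilo value).
ROWS = [(8, 16.632), (5, 16.556), (3, 15.625)]
for num_snps, reported in ROWS:
    length = implied_length_bp(num_snps, reported)  # inferred, not from the paper
    print(length, round(snps_per_kb(num_snps, length), 3), reported)
```

Under this reading, the highest densities belong to short sequences carrying only a handful of SNPs, so the ranking reflects sequence length as much as SNP count.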
Figure 4. Statistics of single nucleotide polymorphism (SNP) read depth in Cashmere goat skin transcriptome. The horizontal axis represents the read depth of SNPs. The vertical axis represents the number of SNPs with the corresponding read depth.
Figure 5. Statistics of minor allele frequency (MAF) of discovered single nucleotide polymorphisms (SNPs) in Cashmere goat skin transcriptome. The horizontal axis represents the SNP MAF in percentage, while the vertical axis represents the number of SNPs with a given MAF. The average MAF is 0.18.
Figure 6. Gene ontology of all annotated genes in Cashmere goat skin transcriptome and the expressed single nucleotide polymorphism-containing genes.
Figure 7. The top 10 KEGG pathway classifications of the 200 most highly expressed genes containing single nucleotide polymorphisms.
The result of SNP validation
| Chr | Position | Ensembl transcript ID | Reference allele | Alternative allele | Validation result |
|---|---|---|---|---|---|
| 1 | 3118518 | goat_ENSBTAP00000025104_gene | G | A | NA |
| 1 | 3138042 | goat_ENSBTAP00000025104_gene | T | G | T |
| 1 | 3152902 | goat_ENSBTAP00000025104_gene | A | G | T |
| 1 | 3153099 | goat_ENSBTAP00000025104_gene | C | G | F |
| 1 | 3153136 | goat_ENSBTAP00000025104_gene | A | G | T |
| 1 | 1250786 | goat_ENSBTAP00000046531_gene | G | A | F |
| 1 | 1250824 | goat_ENSBTAP00000046531_gene | T | C | T |
| 1 | 1081115 | goat_GLEAN_10004966_gene | T | C | T |
| 1 | 1081142 | goat_GLEAN_10004966_gene | T | C | T |
| 1 | 1081174 | goat_GLEAN_10004966_gene | A | G | F |
| 1 | 1256778 | goat_ENSBTAP00000011175_gene | C | G | F |
| 1 | 1256789 | goat_ENSBTAP00000011175_gene | A | C | F |
| 1 | 1276221 | goat_ENSBTAP00000011175_gene | G | A | T |
| 1 | 1276265 | goat_ENSBTAP00000011175_gene | T | G | T |
| 7 | 8917558 | goat_ENSBTAP00000003278_gene | A | C | T |
| 7 | 8929918 | goat_ENSBTAP00000003278_gene | T | C | NA |
| 7 | 8932917 | goat_ENSBTAP00000003278_gene | A | G | T |
| 7 | 8934111 | goat_ENSBTAP00000003278_gene | A | G | F |
| 10 | 83061080 | goat_ENSBTAP00000035548_gene | G | A | F |
| 10 | 83123999 | goat_ENSBTAP00000035548_gene | G | A | F |
| 10 | 83129984 | goat_ENSBTAP00000035548_gene | C | T | NA |
| 10 | 83130002 | goat_ENSBTAP00000035548_gene | A | C | NA |
| 10 | 83134259 | goat_ENSBTAP00000035548_gene | C | T | T |
| 10 | 83224601 | goat_ENSBTAP00000035548_gene | C | G | T |
| 14 | 80780872 | goat_ENSP00000389998_gene | C | T | T |
| 14 | 80783883 | goat_ENSP00000389998_gene | C | A | NA |
| 14 | 80784499 | goat_ENSP00000389998_gene | A | T | NA |
| 14 | 80794826 | goat_ENSP00000389998_gene | A | G | T |
| 14 | 80800830 | goat_ENSP00000389998_gene | A | G | NA |
| 14 | 80809897 | goat_ENSP00000389998_gene | C | G | NA |
NA, no Sanger sequencing results are obtained; T, the result of Sanger sequencing is consistent with RNA-sequencing; F, the result of Sanger sequencing is inconsistent with RNA-sequencing.
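The 64% validation rate quoted in the abstract can be recovered from this table if loci without a Sanger result (NA) are left out of the denominator. That exclusion is an assumption inferred from the numbers, not a rule stated in this excerpt. A minimal Python tally of the validation column:

```python
from collections import Counter

# Validation results copied row by row from the table above.
RESULTS = (
    "NA T T F T F T T T F "
    "F F T T T NA T F F F "
    "NA NA T T T NA NA T NA NA"
).split()

counts = Counter(RESULTS)
evaluated = counts["T"] + counts["F"]  # loci with a usable Sanger result
validation_rate = counts["T"] / evaluated

print(counts)
print(f"validated: {counts['T']}/{evaluated} = {validation_rate:.0%}")
```

Of the 30 candidate SNPs, 14 were confirmed, 8 were not, and 8 gave no Sanger result, so 14 of the 22 evaluated loci (about 64%) were validated.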