Xiufeng Yang1, Guolei Sun1, Tian Xia1, Muha Cha2, Lei Zhang1, Bo Pang2, Qingming Tang3, Huashan Dou2, Honghai Zhang1.
Abstract
Vulpes are widely distributed throughout the world and have undergone drastic physiological and phenotypic changes in response to their environment. However, little is known about the underlying genetic causes of these traits, especially in Vulpes corsac. In this study, RNA-Seq was used to obtain a comprehensive dataset for multiple pooled tissues of corsac fox, and selection analysis of orthologous genes was performed to identify the genes that may be influenced by the low-temperature environment. More than 6.32 Gb of clean reads were obtained and assembled into a total of 173,353 unigenes with an average length of 557 bp for corsac fox. Selective pressure analysis identified 16 positively selected genes (PSGs) in corsac fox, red fox, and arctic fox. Enrichment analysis of the PSGs showed that the LRP11 gene was enriched in several pathways related to the low-temperature response and might play a key role in the response of foxes to environmental stimuli. In addition, several PSGs were related to DNA damage repair (ELP2 and CHAF1A), innate immunity (ARRDC4 and S100A12), and the respiratory chain (NDUFA5), and these genes might play a role in the adaptation of foxes to harsh wild environments. The results of common orthologous gene analysis showed that gene flow or convergent evolution might be an important factor in promoting regional differentiation of foxes. Our study provides a valuable transcriptomic resource for studying the evolutionary history of the corsac fox and its adaptations to extreme environments.
Keywords: Vulpes corsac; cold adaptation; gene tree discordance; selective pressure analysis; transcriptome
Year: 2022 PMID: 35462974 PMCID: PMC9019142 DOI: 10.1002/ece3.8866
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 3.167
FIGURE 1 Corsac fox in HulunBuir grassland, Inner Mongolia
Summary of sequencing results
| Sample | Raw reads | Clean reads | Clean bases (Gb) | Error (%) | Q20 (%) | Q30 (%) | GC (%) |
|---|---|---|---|---|---|---|---|
| CF | 76,627,518 | 73,370,220 | 11.01 | 0.01 | 97.69 | 94.13 | 50.99 |
| RF | 87,641,852 | 81,623,324 | 7.67 | 0.08 | 93.82 | 84.17 | 50.21 |
| AF1 | 72,243,730 | 67,227,516 | 6.32 | 0.09 | 93.45 | 83.75 | 50.39 |
| AF2 | 92,135,172 | 86,128,572 | 8.1 | 0.08 | 93.95 | 84.33 | 49.84 |
Abbreviations: AF, arctic fox; CF, corsac fox; RF, red fox.
Data downloaded from the NCBI database. Error: sequencing error rate. Q20/Q30: percentage of bases with a Phred quality score of at least 20/30. GC: percentage of G and C bases.
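The Q20/Q30 columns can be related to error rates through the standard Phred definition, Q = -10·log10(P). The following is a minimal sketch of that relationship, not the pipeline used in the study; the function names are hypothetical, and the Phred+33 ASCII encoding of FASTQ quality strings is an assumption.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string: str) -> list[int]:
    """Decode a FASTQ quality string, assuming Phred+33 ASCII encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def percent_at_least(quals: list[int], threshold: int) -> float:
    """Percentage of bases whose Phred score is >= threshold (Q20, Q30, ...)."""
    if not quals:
        raise ValueError("quals must be non-empty")
    return 100.0 * sum(1 for q in quals if q >= threshold) / len(quals)


if __name__ == "__main__":
    quals = decode_phred33("I5#")  # illustrative quality string, not real data
    print(quals, percent_at_least(quals, 20))
    print(phred_to_error_prob(20), phred_to_error_prob(30))
```

Under this definition, Q20 corresponds to an error probability of 1 in 100 and Q30 to 1 in 1,000.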
Length interval and distribution of transcripts and unigenes
| Length interval & distribution | Transcripts (CF) | Transcripts (RF) | Transcripts (AF1) | Transcripts (AF2) | Unigenes (CF) | Unigenes (RF) | Unigenes (AF1) | Unigenes (AF2) |
|---|---|---|---|---|---|---|---|---|
| 200–500 bp | 184,053 | 59,822 | 52,954 | 64,584 | 173,353 | 53,894 | 47,236 | 58,544 |
| 500 bp–1 kbp | 37,021 | 16,303 | 15,419 | 16,436 | 29,632 | 12,367 | 11,671 | 12,416 |
| 1–2 kbp | 21,320 | 12,073 | 11,054 | 12,436 | 12,876 | 8,022 | 7,296 | 8,120 |
| >2 kbp | 23,138 | 9,391 | 8,347 | 11,598 | 10,548 | 5,617 | 4,984 | 6,759 |
| Total Number | 265,532 | 97,589 | 87,774 | 105,054 | 226,409 | 79,900 | 71,187 | 85,839 |
| Mean length (bp) | 738 | 786 | 836 | 787 | 557 | 667 | 672 | 690 |
| N50 (bp) | 1,596 | 1,532 | 1,765 | 1,497 | 758 | 1,196 | 1,185 | 1,352 |
| N90 (bp) | 262 | 277 | 280 | 281 | 243 | 252 | 255 | 251 |
| Total Nucleotides | 195,990,946 | 76,748,930 | 87,804,175 | 69,105,503 | 126,079,486 | 53,321,702 | 47,842,732 | 59,207,490 |
N50/N90: The assembled transcripts are sorted from longest to shortest and their lengths are accumulated. The length of the transcript at which the cumulative length reaches at least 50%/90% of the total length is the N50/N90.
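The N50/N90 definition above can be sketched in a few lines. This is an illustrative implementation of the stated definition, not the code used in the study; the function name `nx` and the example lengths are hypothetical.

```python
def nx(lengths: list[int], x: float = 50) -> int:
    """Return the Nx statistic (N50 for x=50, N90 for x=90).

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; the length at which the running total first reaches
    x% of the total assembly length is returned.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy values, not study data
    print(nx(toy_lengths, 50), nx(toy_lengths, 90))
```

Because N90 accumulates further down the sorted list than N50, it is never larger than N50, which matches the pattern in the table.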
Statistics of BLAST results for unigenes against databases
| Database | CF | RF | AF1 | AF2 |
|---|---|---|---|---|
| Annotated in Nr | 30,901 (13.65%) | 27,843 (34.85%) | 28,043 (39.39%) | 27,298 (31.80%) |
| Annotated in Nt | 85,294 (37.67%) | 53,036 (66.38%) | 51,205 (71.93%) | 53,220 (62.00%) |
| Annotated in KO | 16,636 (7.35%) | 15,288 (19.13%) | 15,348 (21.56%) | 16,116 (18.77%) |
| Annotated in Swiss‐Prot | 26,122 (11.54%) | 25,661 (32.12%) | 26,000 (36.52%) | 24,866 (28.97%) |
| Annotated in PFAM | 27,231 (12.03%) | 18,866 (23.61%) | 18,210 (25.58%) | 19,069 (22.21%) |
| Annotated in GO | 27,450 (12.12%) | 19,047 (23.84%) | 18,377 (25.82%) | 19,246 (22.42%) |
| Annotated in KOG | 9,040 (3.99%) | 9,303 (11.64%) | 9,330 (13.11%) | 9,014 (10.50%) |
| Annotated in all Databases | 5,926 (2.62%) | 5,365 (6.71%) | 5,328 (7.48%) | 5,803 (6.76%) |
| Annotated in at least one Database | 92,697 (40.94%) | 54,455 (68.15%) | 52,204 (73.33%) | 54,888 (63.94%) |
| Total Unigenes | 226,409 | 79,900 | 71,187 | 85,839 |
FIGURE 2 Venn diagram of annotation results from the top five databases. The number of annotated unigenes is marked in each circle
Statistics of CDS
| Sample | BLAST to protein database: total (percent) | BLAST to protein database: >300 | ESTScan prediction: total (percent) | ESTScan prediction: >300 |
|---|---|---|---|---|
| CF | 18,476 (16.4%) | 13,800 | 94,005 (83.6%) | 21,074 |
| RF | 13,333 (25.8%) | 10,256 | 38,328 (74.2%) | 13,056 |
| AF1 | 12,967 (27.4%) | 10,013 | 34,356 (72.6%) | 12,473 |
| AF2 | 14,066 (26.1%) | 11,000 | 39,824 (73.9%) | 12,490 |
>300: Number of CDS longer than 300 nucleotides.
FIGURE 3 Scatter diagram of KEGG functional classifications for orthologous genes using the KOBAS (K), DAVID (D), and clusterProfiler (C) methods. The x‐axis shows the percentage of gene number/background of orthologous genes, and the y‐axis shows the enriched KEGG pathway
FIGURE 4 Ka/Ks distribution diagram of genes under positive and purifying selection. The x‐axis shows the value of Ka/Ks. We set the y‐axis to ‐log10(Ks) to highlight the positively selected genes
FIGURE 5 Homologous gene analysis of three species of fox. (a) Phylogenetic tree based on orthologous genes; (b) phylogenetic tree based on mitochondrial DNA. The values at each branch are the posterior probabilities of PAUP, Bayes, and RAxML. (c) and (d) Schematic diagram of the differences between species trees and gene trees. The purple dotted line represents the results of the species tree, and the red dotted line represents the results of the gene tree
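The axes described for Figure 4 can be illustrated with a short sketch. It assumes the conventional reading of Ka/Ks (a ratio above 1 suggests positive selection, below 1 purifying selection); the cutoff actually used in the study is not stated here, and the function names and input values are hypothetical.

```python
import math


def classify_selection(ka: float, ks: float, cutoff: float = 1.0) -> tuple[float, str]:
    """Return (Ka/Ks, label) using a conventional cutoff of 1 (an assumption)."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if omega > cutoff:
        label = "positive"
    elif omega < cutoff:
        label = "purifying"
    else:
        label = "neutral"
    return omega, label


def plot_coordinates(ka: float, ks: float) -> tuple[float, float]:
    """Return (x, y) as described in the caption: x = Ka/Ks, y = -log10(Ks)."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks, -math.log10(ks)


if __name__ == "__main__":
    # Made-up Ka and Ks values, for illustration only.
    print(classify_selection(0.02, 0.01))
    print(plot_coordinates(0.05, 0.1))
```

Plotting -log10(Ks) on the y-axis spreads out genes with small Ks values, which is where high Ka/Ks ratios tend to appear.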