Wen Wang, Fang Wang, Rongkai Hao, Aizhen Wang, Kirill Sharshov, Alexey Druzyaka, Zhuoma Lancuo, Yuetong Shi, Shuo Feng.
Abstract
BACKGROUND: The bar-headed goose (Anser indicus) mainly inhabits the plateau wetlands of Asia. As a specialized high-altitude species, bar-headed geese migrate between South and Central Asia and fly over the Himalayan mountains twice a year along the Central Asian Flyway. The physiological, biochemical and behavioral adaptations of bar-headed geese to high-altitude living and flying have attracted considerable interest. However, to date, no genome assembly is publicly available for the bar-headed goose.
Keywords: 10X Genomics Chromium; Anser indicus; Avian genomes; Bar-headed goose; Comparative genomics; Conservation genomics; High-altitude adaptation; Hypoxia; Positive selection; Qinghai-Tibetan Plateau
Year: 2020 PMID: 32292659 PMCID: PMC7144584 DOI: 10.7717/peerj.8914
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
The de novo assembled genome of Bar-headed goose.
| Statistic | Contig length (bp) | Scaffold length (bp) | Contig number | Scaffold number |
|---|---|---|---|---|
| Total | 1,114,495,510 | 1,143,097,520 | 30,886 | 10,528 |
| Max | 1,384,698 | 33,819,004 | – | – |
| Number ≥ 2000 | – | – | 22,660 | 5,540 |
| N50 | 120,377 | 10,094,206 | 2,510 | 35 |
| N60 | 91,372 | 8,026,496 | 3,576 | 48 |
| N70 | 65,644 | 5,883,601 | 5,019 | 64 |
| N80 | 44,047 | 3,435,270 | 7,086 | 90 |
| N90 | 23,142 | 1,267,474 | 10,506 | 144 |
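For readers unfamiliar with the Nx rows, the following minimal sketch shows the standard definition: sort sequences from longest to shortest and accumulate lengths until x% of the total is reached; the length at that point is Nx, and the number of sequences used is the value reported in the "number" columns. The function name `nx_stats` and the toy lengths are illustrative assumptions; the record does not state which tool produced the statistics.

```python
from typing import List, Tuple


def nx_stats(lengths: List[int], x: int = 50) -> Tuple[int, int]:
    """Return (Nx length, number of sequences needed to reach x% of the total).

    Hypothetical helper illustrating the standard Nx definition.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")

    ordered = sorted(lengths, reverse=True)  # longest first
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length, count
    raise AssertionError("unreachable: the full sum always reaches x%")


# Toy data (not from the assembly): total length is 300.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50_length, n50_count = nx_stats(toy_lengths, 50)
n90_length, n90_count = nx_stats(toy_lengths, 90)
print(n50_length, n50_count)
print(n90_length, n90_count)
```

Read against the table, the scaffold N50 row means that the 35 longest scaffolds together cover at least half of the assembly, and the shortest of those 35 is 10,094,206 bp long.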
The BUSCO assessment results of the completeness of genome assembly.
| Species | BUSCO assessment results |
|---|---|
| Bar-headed goose | C: 97.5%, [D: 0.4%], F: 1.7%, M: 0.8%, n: 2586 |
Notes: C, Complete Single-Copy BUSCOs; D, Complete Duplicated BUSCOs; F, Fragmented BUSCOs; M, Missing BUSCOs; n, Total BUSCO groups searched.
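As a rough illustration of how the BUSCO summary string can be read, the sketch below parses the string from the table and checks that the categories add up. The function name `parse_busco` is a hypothetical helper, not part of BUSCO itself. In this record C, F and M sum to 100%, which suggests that the bracketed D value is reported within C rather than in addition to it.

```python
import re
from typing import Dict, Union

Number = Union[int, float]


def parse_busco(summary: str) -> Dict[str, Number]:
    """Parse a BUSCO-style summary such as 'C: 97.5%, [D: 0.4%], ..., n: 2586'.

    Hypothetical helper; percentages become floats, n becomes an int.
    """
    values: Dict[str, Number] = {}
    for key, number in re.findall(r"([A-Za-z]):\s*([\d.]+)", summary):
        values[key] = int(number) if key == "n" else float(number)
    return values


summary = "C: 97.5%, [D: 0.4%], F: 1.7%, M: 0.8%, n: 2586"
busco = parse_busco(summary)

# Complete + fragmented + missing should account for every searched group.
total_percent = busco["C"] + busco["F"] + busco["M"]
print(f"C + F + M = {total_percent:.1f}%")

# Approximate group counts; the percentages are rounded, so these are estimates.
for key in ("C", "D", "F", "M"):
    print(key, round(busco[key] / 100 * busco["n"]))
```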
Annotation of non-coding RNA genes.
| Type | Copy | Average length (bp) | Total length (bp) | % of genome |
|---|---|---|---|---|
| miRNA | 342 | 85.98 | 29,406 | 0.003 |
| tRNA | 282 | 75.09 | 21,175 | 0.002 |
| rRNA | 49 | 234.67 | 11,499 | 0.001 |
| rRNA: 18S | 9 | 426.89 | 3,842 | 0.000 |
| rRNA: 28S | 33 | 211.39 | 6,976 | 0.001 |
| rRNA: 5.8S | 2 | 156.00 | 312 | 0.000 |
| rRNA: 5S | 5 | 73.80 | 369 | 0.000 |
| snRNA | 288 | 118.40 | 34,099 | 0.003 |
| snRNA: CD-box | 106 | 86.92 | 9,214 | 0.001 |
| snRNA: HACA-box | 80 | 139.41 | 11,153 | 0.001 |
| snRNA: splicing | 84 | 128.12 | 10,762 | 0.001 |
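The "% of genome" column can be reproduced with simple arithmetic, as the sketch below shows. It assumes the denominator is the total scaffold length from the assembly table (1,143,097,520 bp); the record does not state which genome size was used, so treat that constant as an assumption.

```python
# Assumption: the denominator is the total scaffold length of the assembly.
ASSEMBLY_BP = 1_143_097_520


def percent_of_genome(total_length_bp: int, genome_bp: int = ASSEMBLY_BP) -> float:
    """Share of the genome covered by a feature class, in percent."""
    return 100.0 * total_length_bp / genome_bp


# Total lengths (bp) of the four top-level ncRNA classes in the table.
totals = {"miRNA": 29_406, "tRNA": 21_175, "rRNA": 11_499, "snRNA": 34_099}

for name, length_bp in totals.items():
    print(f"{name}: {percent_of_genome(length_bp):.3f}")
```

Under that assumption the four top-level classes round to the values shown in the table, so annotated non-coding RNA genes cover well under 0.01% of the assembly.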
Prediction of protein-coding genes.
| Category | Method | Gene number | Exons per gene | Average gene length (bp) | Average CDS length (bp) | Average exon length (bp) | Average intron length (bp) |
|---|---|---|---|---|---|---|---|
| De novo | Augustus | 18,318 | 8.14 | 16,402.87 | 1,402.91 | 172.29 | 2,099.99 |
| | GlimmerHMM | 172,168 | 2.90 | 5,695.00 | 474.06 | 163.45 | 2,747.47 |
| | SNAP | 61,271 | 6.31 | 29,996.61 | 809.23 | 128.21 | 5,495.13 |
| | Geneid | 38,214 | 5.48 | 20,392.89 | 1,006.18 | 183.58 | 4,326.65 |
| | Genscan | 43,335 | 7.61 | 19,762.90 | 1,294.69 | 170.05 | 2,792.52 |
| Homologous comparison | | 26,141 | 5.64 | 13,827.51 | 1,049.69 | 186.24 | 2,756.00 |
| | | 36,335 | 4.65 | 9,142.51 | 930.19 | 199.91 | 2,248.05 |
| | | 25,943 | 6.03 | 13,297.66 | 1,171.56 | 194.16 | 2,408.92 |
| | | 36,315 | 4.71 | 9,177.85 | 913.73 | 194.05 | 2,228.25 |
| | | 34,171 | 5.14 | 10,493.05 | 1,004.18 | 195.43 | 2,292.94 |
| | | 15,498 | 9.06 | 21,225.26 | 1,648.10 | 181.90 | 2,428.77 |
| RNA-seq | Cufflinks | 48,064 | 7.93 | 24,121.46 | 3,494.75 | 440.7 | 2,976.44 |
| | PASA | 65,640 | 6.78 | 15,650.31 | 1,168.16 | 172.35 | 2,506.53 |
| EVM | | 24,169 | 7.12 | 16,340.12 | 1,225.82 | 172.14 | 2,469.26 |
| PASA-update | | 24,010 | 7.20 | 17,772.08 | 1,249.46 | 173.43 | 2,663.08 |
| Final set | | 16,428 | 9.88 | 25,274.02 | 1,641.91 | 166.26 | 2,662.69 |
Functional annotation of the predicted protein-coding genes.
| Database | Number | Percent (%) |
|---|---|---|
| RefSeq | 15,780 | 96.1 |
| Swiss-Prot | 15,287 | 93.1 |
| KEGG | 13,802 | 84.0 |
| InterPro: All | 15,106 | 92.0 |
| InterPro: Pfam | 13,875 | 84.5 |
| InterPro: GO | 11,272 | 68.6 |
| Annotated | 15,790 | 96.1 |
| Total | 16,428 | – |
Figure 1. Orthologous genes in bar-headed goose and other birds.
The numbers of unique and shared orthologous genes are listed in each diagram component. Ain, bar-headed goose; Acy, swan goose; Apl, mallard; Gga, red junglefowl.
Figure 2. Gene family expansion and contraction in the bar-headed goose genome.
The numbers of expanded (blue) and contracted (red) gene families are shown along branches and nodes. MRCA, most recent common ancestor; Ain, bar-headed goose; Acy, swan goose; Nni, crested ibis; Apl, mallard; Gga, red junglefowl; Mga, turkey; Cca, common cuckoo; Cli, rock pigeon; Phu, ground tit; Cfl, bananaquit; Pma, great tit; Tgu, zebra finch; Ppu, ruff; Fpe, peregrine falcon; Bmu, yak; Pho, Tibetan antelope.
Figure 3. Functional distribution of positively selected genes (PSGs) according to the Gene Ontology (GO) database.
The y axis shows the GO functional categories, grouped as (A) biological process, (B) cellular component, and (C) molecular function; the x axis shows the number of genes in each category.