Matthew Hobbs, Ana Pavasovic, Andrew G King, Peter J Prentis, Mark D B Eldridge, Zhiliang Chen, Donald J Colgan, Adam Polkinghorne, Marc R Wilkins, Cheyne Flanagan, Amber Gillett, Jon Hanger, Rebecca N Johnson, Peter Timms.
Abstract
BACKGROUND: The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus.
Year: 2014 PMID: 25214207 PMCID: PMC4247155 DOI: 10.1186/1471-2164-15-786
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Characteristics of marsupial species commonly referenced in this study
| Common name | Binomial name | Order | Time since divergence from koala (mya) 1 | Occurrence | Genome assembly | Assembly N50 2 | NCBI assembly ID | No. Ensembl r.72 coding genes |
|---|---|---|---|---|---|---|---|---|
| koala | Phascolarctos cinereus | Diprotodontia | - | Australia | - | - | - | - |
| tammar wallaby | Macropus eugenii | Diprotodontia | 55 | Australia | Meug_1.0 3 | 14.5 kb | GCA_000004035.1 | 15290 |
| Tasmanian devil | Sarcophilus harrisii | Dasyuromorphia | 60 | Australia | DEVIL7.0 | 1.8 Mb | GCA_000189315.1 | 18788 |
| gray short-tailed opossum | Monodelphis domestica | Didelphimorphia | 75 | South America | MonDom5 | 59 Mb | GCF_000002295.2 | 21327 |
1 According to [16, 17].
2 For supercontigs/scaffolds.
3 The Meug_2.0 assembly is available but at the time of writing has not been annotated by either Ensembl or NCBI.
Sequence reads
| Animal | Tissue | Sampling comment | Library | No. raw read pairs | No. trimmed read pairs | No. trimmed unpaired reads | Total trimmed sequence (Gb) |
|---|---|---|---|---|---|---|---|
| PC | Spleen | Cross section of organ | PC001 | 58334597 | 55363111 | 2335398 | 10.9 |
| PC | Liver | Cross section of organ | PC004 | 113366107 | 96407712 | 12468062 | 18.8 |
| PC | Uterus | Cross section of organ | PC005 | 50835874 | 48175221 | 2068163 | 9.5 |
| PC | Kidney | Mainly renal cortex | PC006 | 52894485 | 50152880 | 2190771 | 9.9 |
| PC | Lung | Cross section of organ | PC008 | 57722900 | 54667996 | 2369470 | 10.8 |
| PC | Heart | Heart muscle limited penetration | PC009 | 108773370 | 103882594 | 13127007 | 20.4 |
| PC | Brain | Neocortex frontal lobe (olfactory lobe not available) | PC010 | 60678718 | 57283405 | 2660545 | 11.3 |
| PC | Adrenal gland | Cross section of organ | PC011 | 54127792 | 51165338 | 2338177 | 10.1 |
| PC | Total | - | - | 556733843 | 517098257 | 39557593 | 101.8 |
| Bi | Bone marrow | - | - | 78316465 | 78163966 | 152426 | 14.1 |
| Bi | Kidney | - | - | 82942780 | 82852879 | 89882 | 14.9 |
| Bi | Liver | - | - | 76971566 | 76062051 | 908596 | 13.8 |
| Bi | Lymph node | - | - | 74560669 | 74384819 | 175789 | 13.4 |
| Bi | Salivary gland | - | - | 78748770 | 78670628 | 78101 | 14.2 |
| Bi | Spleen | - | - | 83176067 | 83083468 | 92385 | 15.0 |
| Bi | Testes | - | - | 81482406 | 81302672 | 179622 | 14.6 |
| Bi | Total | - | - | 556198723 | 554520483 | 1676801 | 99.9 |
Koala transcriptome assemblies
| Input sequences | No. transcripts | Max. length | Mean length | N50 | No. proteins hit 1 | No. genes 2 |
|---|---|---|---|---|---|---|
| All PC libraries | 370030 | 82521 | 1688 | 3381 | 17357 | 15490 |
| All Bi libraries | 381958 | 22721 | 1124 | 2531 | 17078 | 15328 |
1 Total number of distinct protein sequences in best-hit translated BLAST alignments of koala transcripts with Ensembl Tasmanian devil proteins.
2 Total number of distinct Tasmanian devil genes.
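The N50 column above summarises how contiguous each assembly is. The record does not say which N50 convention was used, so the following is only a minimal sketch of the common definition: the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases. The function name `n50` and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Common definition (an assumption here; the source does not state
    its convention): walk the lengths from longest to shortest and
    return the length at which the running total first reaches half
    of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one value")


# Toy transcript lengths (hypothetical, not koala data).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
```

With real data, `lengths` would be the per-transcript lengths read from the assembly FASTA file; a mean length well below the N50, as in both rows of the table, indicates many short transcripts alongside a smaller number of long ones.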
Alignment of koala transcripts to two marsupial genomes
| Animal | Opossum genome: no. mapped transcripts | Opossum genome: no. unmapped transcripts | Tasmanian devil genome: no. mapped transcripts | Tasmanian devil genome: no. unmapped transcripts |
|---|---|---|---|---|
| PC | 204041 (55%) | 165989 (45%) | 267830 (72%) | 102200 (28%) |
| Bi | 172910 (45%) | 209048 (55%) | 195308 (51%) | 186650 (49%) |
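The percentages in the table above follow from the mapped and unmapped counts, which in each case sum to the number of transcripts in the corresponding assembly (370030 for PC, 381958 for Bi). A short sketch of that arithmetic, using the counts from the table; the names `mapped_share` and `alignment_counts` are illustrative only.

```python
def mapped_share(mapped, unmapped):
    """Percentage of transcripts that mapped, rounded to a whole number."""
    total = mapped + unmapped
    return round(100 * mapped / total)


# (mapped, unmapped) transcript counts copied from the table.
alignment_counts = {
    ("PC", "opossum"): (204041, 165989),
    ("PC", "Tasmanian devil"): (267830, 102200),
    ("Bi", "opossum"): (172910, 209048),
    ("Bi", "Tasmanian devil"): (195308, 186650),
}

shares = {key: mapped_share(*pair) for key, pair in alignment_counts.items()}
for (animal, genome), percent in shares.items():
    print(f"{animal} vs {genome}: {percent}% mapped")
```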
Ensembl genes which overlap with aligned koala transcripts
| Gene category | Total | Overlapping PC transcript(s) | Overlapping Bi transcript(s) | Overlapping either PC or Bi transcript(s) |
|---|---|---|---|---|
| Opossum | | | | |
| protein_coding | 21327 | 14624 | 14170 | 15584 |
| pseudogene | 722 | 214 | 211 | 288 |
| snRNA | 843 | 118 | 98 | 138 |
| snoRNA | 319 | 206 | 184 | 227 |
| miRNA | 412 | 107 | 82 | 127 |
| rRNA | 176 | 22 | 19 | 25 |
| misc_RNA | 77 | 10 | 8 | 10 |
| Mt_tRNA | 21 | 3 | 0 | 3 |
| Mt_rRNA | 2 | 1 | 0 | 1 |
| total | 23899 | 15305 | 14772 | 16403 |
| Tasmanian devil | | | | |
| protein_coding | 18788 | 14971 | 14480 | 15773 |
| pseudogene | 178 | 114 | 115 | 128 |
| snRNA | 503 | 94 | 57 | 102 |
| snoRNA | 277 | 202 | 185 | 213 |
| miRNA | 486 | 166 | 113 | 182 |
| rRNA | 87 | 14 | 10 | 14 |
| misc_RNA | 113 | 20 | 20 | 21 |
| Mt_tRNA | 22 | 0 | 0 | 0 |
| Mt_rRNA | 2 | 0 | 0 | 0 |
| total | 20456 | 15581 | 14980 | 16433 |
Similarity searching and Blast2GO annotation of koala NR proteins
| Animal | No. proteins | With nr database protein BLAST hit | With GO term(s) assigned |
|---|---|---|---|
| PC | 78208 | 56872 | 23658 |
| Bi | 63554 | 47161 | 34524 |
Number of sequences (and genes, in parentheses) with which koala protein sequences were aligned in protein BLAST searches of three marsupial proteomes
| Animal | | | | | | |
|---|---|---|---|---|---|---|
| | BH | RBH | BH | RBH | BH | RBH |
| | 12842 (12820) | 12206 (12198) | 15882 (14548) | 14065 (13761) | 15391 (15045) | 14203 (14101) |
| | 12554 (12538) | 11770 (11763) | 15379 (14259) | 13494 (13269) | 14981 (1473) | 13589 (13507) |
| | 13344 (13318) | 12673 (12663) | 16957 (15285) | 14931 (14457) | 16233 (15799) | 14922 (14774) |
BH: best hit. RBH: reciprocal BH.
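A reciprocal best hit is a pair of sequences in which each is the other's best hit when the search is run in both directions. The record does not describe how the hits were scored or filtered, so the following is only a minimal sketch that assumes the best hit for every query has already been chosen in each direction. All identifiers (`koala_1`, `devil_A`, and so on) are hypothetical.

```python
def reciprocal_best_hits(forward_best, reverse_best):
    """Keep query -> subject pairs whose subject's best hit is the query.

    forward_best: best subject for each query (e.g. koala -> reference).
    reverse_best: best subject for each reference sequence searched back
                  against the query set (e.g. reference -> koala).
    """
    return {
        query: subject
        for query, subject in forward_best.items()
        if reverse_best.get(subject) == query
    }


# Toy best-hit tables (hypothetical identifiers, not study data).
forward_best = {
    "koala_1": "devil_A",
    "koala_2": "devil_A",  # shares a best hit with koala_1
    "koala_3": "devil_B",
}
reverse_best = {
    "devil_A": "koala_1",
    "devil_B": "koala_4",  # best hit points elsewhere, so no RBH
}

rbh_pairs = reciprocal_best_hits(forward_best, reverse_best)
```

Because several query sequences can share one best hit while only one of them can be reciprocal, RBH counts cannot exceed BH counts, which matches the pattern in the table.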
Figure 1. KoRV expression profile in PC and Bi.
Figure 2. Transcription of KoRV. A. Organisation of the KoRV genome [GenBank:AF151794]. B. Structure of transcripts predicted by Cufflinks. C. Two transcripts from the reassembled KoRV transcriptome (PC spleen). Note that LTR sequences were excluded from this assembly. D. Depth of coverage of reads aligned with bwa (PC spleen).
Figure 3. KoRV splice sites. Alignment of env transcript splice donor (A) and acceptor (B) regions from Moloney murine leukemia virus (MMLV; database accession [GenBank:AF462057]) and koala retrovirus (KoRV; database accession [GenBank:AF151794]). Positions with identical nucleotides are shaded blue. Intronic sequences are in lower case. Red triangles indicate splice sites supported by experimental evidence.
Figure 4. Alignment of three PC KoRV protein sequences. The red line marks variable region A.