Mark F Richardson, Fernando Sequeira, Daniel Selechnik, Miguel Carneiro, Marcelo Vallinoto, Jack G Reid, Andrea J West, Michael R Crossland, Richard Shine, Lee A Rollins.
Abstract
Background: Cane toads (Rhinella marina) are an iconic invasive species introduced to 4 continents and widely used for studies of rapid evolution in introduced environments. Despite the long introduction history of this species, its profound ecological impacts, and its utility for demonstrating evolutionary principles, genetic information is sparse. Here we produce a de novo transcriptome spanning multiple tissues and life stages to enable investigation of the genetic basis of previously identified rapid phenotypic change over the introduced range.

Findings: Using approximately 1.9 billion reads from developing tadpoles and 6 adult tissue-specific cDNA libraries, as well as a transcriptome assembly pipeline encompassing 100 separate de novo assemblies, we constructed 62 202 transcripts, of which we functionally annotated ∼50%. Our transcriptome assembly exhibits 90% full-length completeness of the Benchmarking Universal Single-Copy Orthologs (BUSCO) data set. Robust assembly metrics and comparisons with several available anuran transcriptomes and genomes indicate that our cane toad assembly is one of the most complete anuran genomic resources available.

Conclusions: This comprehensive anuran transcriptome will provide a valuable resource for investigating genes under selection during invasion in cane toads, and will also greatly expand general knowledge of anuran genomes, which are underrepresented in the literature. The data set is publicly available in NCBI and GigaDB to serve as a resource for other researchers.
Keywords: Bufo marinus; RNA-Seq; Rhinella marina; amphibian; anuran; cane toad; de novo assembly; invasive species; transcriptome
Year: 2018 PMID: 29186423 PMCID: PMC5765561 DOI: 10.1093/gigascience/gix114
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: The cane toad, Rhinella marina. NCBI Taxonomy ID: 8386. Photographer credit: Matt Greenlees. Source: Matt Greenlees.
Cane toad samples used to generate the de novo reference transcriptome
| Tissue | Origin | Platform (read length) | Sample ID (library size) | Sampling location | Sex | SRA |
|---|---|---|---|---|---|---|
| Brain | Australia | HiSeq 2500 (2 × 125 bp) | B19 (23.9 M) | Durack | F | SRR5446736 |
| Brain | Australia | HiSeq 2500 (2 × 125 bp) | B20 (27.7 M) | Durack | F | SRR5446735 |
| Brain | Australia | HiSeq 2500 (2 × 125 bp) | B31 (24.8 M) | Gordonvale | F | SRR5446734 |
| Brain | Australia | HiSeq 2500 (2 × 125 bp) | B32 (22.3 M) | Gordonvale | F | SRR5446733 |
| Spleen | Australia | HiSeq 2500 (2 × 125 bp) | S1 (23.8 M) | Gordonvale | F | SRR5446732 |
| Spleen | Australia | HiSeq 2500 (2 × 125 bp) | S2 (25.0 M) | Gordonvale | F | SRR5446732 |
| Spleen | Australia | HiSeq 2500 (2 × 125 bp) | S18 (24.7 M) | Durack | F | SRR5446732 |
| Spleen | Australia | HiSeq 2500 (2 × 125 bp) | S19 (23.6 M) | Durack | F | SRR5446732 |
| Muscle | Australia | HiSeq 2000 (2 × 100 bp) | RM0021M (93.8 M) | El Questro | F | SRR1910534, SRR1910535 |
| Muscle | Australia | HiSeq 2000 (2 × 100 bp) | RM0094M (88.2 M) | Purnululu | F | SRR1910543 |
| Muscle | Australia | HiSeq 2000 (2 × 100 bp) | RM0108M (97.6 M) | Innisfail | F | SRR1910545 |
| Muscle | Australia | HiSeq 2000 (2 × 100 bp) | RM0169M (80.0 M) | Rossville | F | SRR1910549 |
| Tadpole | Australia | HiSeq 2500 (2 × 125 bp) | T1 (26.4 M) | Oombulgurri | Both | SRR5446728 |
| Tadpole | Australia | HiSeq 2500 (2 × 125 bp) | T4 (24.5 M) | Oombulgurri | Both | SRR5446727 |
| Tadpole | Australia | HiSeq 2500 (2 × 125 bp) | T7 (23.2 M) | Innisfail | Both | SRR5446726 |
| Tadpole | Australia | HiSeq 2500 (2 × 125 bp) | T10 (25.7 M) | Innisfail | Both | SRR5446725 |
| Liver | Brazil | HiSeq 2000 (2 × 75 bp) | RMTP (536.8 M) | Macapá | NA | SRR1514601 |
| Ovary | Brazil | HiSeq 1500 (2 × 125 bp) | AR19 (434.1 M) | Macapá | F | SRR5446724 |
| Testes | Brazil | HiSeq 1500 (2 × 125 bp) | AR05 (410.5 M) | Macapá | M | SRR5446723 |
Library size is given as raw sequenced reads in millions (M). Sex is denoted as female (F) or male (M). Both: sample contains mixed individuals of both sexes; NA: information unknown.
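As a quick cross-check of the sample table, the library sizes can be tallied per tissue. The sketch below simply copies the tabulated values (millions of raw reads); the dictionary and function names are illustrative and not part of the original study. The raw total comes to roughly 2.04 billion reads. The approximately 1.9 billion reads quoted in the abstract is closer to twice the 945 348 780 filtered read pairs reported in the assembly statistics, which may be what that figure refers to.

```python
# Library sizes in millions (M) of raw reads, copied from the sample table.
# The names below are illustrative, not taken from the study's own pipeline.
LIBRARY_SIZES_M = {
    "Brain": [23.9, 27.7, 24.8, 22.3],
    "Spleen": [23.8, 25.0, 24.7, 23.6],
    "Muscle": [93.8, 88.2, 97.6, 80.0],
    "Tadpole": [26.4, 24.5, 23.2, 25.7],
    "Liver": [536.8],
    "Ovary": [434.1],
    "Testes": [410.5],
}

# Filtered read pairs for this study, from the assembly statistics table.
FILTERED_READ_PAIRS = 945_348_780


def tissue_totals(sizes):
    """Sum the library sizes (in M reads) for each tissue."""
    return {tissue: round(sum(values), 1) for tissue, values in sizes.items()}


def grand_total(sizes):
    """Sum the library sizes (in M reads) over all tissues."""
    return round(sum(sum(values) for values in sizes.values()), 1)


if __name__ == "__main__":
    for tissue, total in tissue_totals(LIBRARY_SIZES_M).items():
        print(f"{tissue:8s} {total:7.1f} M")
    print(f"{'Total':8s} {grand_total(LIBRARY_SIZES_M):7.1f} M raw reads")
    # Two reads per pair after filtering.
    print(f"Filtered reads: {2 * FILTERED_READ_PAIRS / 1e6:.1f} M")
```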
De novo assembler parameters used to produce the “over-assembly”
| Assembler | k-mer size | Parameter combinations | No. of assemblies |
|---|---|---|---|
| Trinity | 25 | Default | Aus 1, Brazil 1 |
| SOAPdenovo-Trans | 21, 25, 29, 33, 37, 41, 45, 49, 59, 69, 79, 89, 99 (k = 99 not used for the Brazil input set) | | Aus 13, Brazil 12; Aus 13, Brazil 12 |
| Velvet/Oases | 21, 25, 29, 33, 37, 41, 45, 49, 59, 69, 79, 89 | | Aus 12, Brazil 12; Aus 12, Brazil 12 |
| Total | | | 100 |
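The total of 100 assemblies follows from the per-assembler counts in the table. The short sketch below reproduces that arithmetic, assuming each semicolon-separated group in the last column corresponds to one parameter combination run on both the Australian (Aus) and Brazilian input sets. The tuple layout and function name are illustrative only.

```python
# (assembler, parameter combinations, Aus assemblies per combination,
#  Brazil assemblies per combination), read off the assembler table.
# Assumption: each "Aus n, Brazil m" group is one parameter combination.
ASSEMBLY_PLAN = [
    ("Trinity", 1, 1, 1),
    ("SOAPdenovo-Trans", 2, 13, 12),
    ("Velvet/Oases", 2, 12, 12),
]


def count_assemblies(plan):
    """Total assemblies = sum over assemblers of combinations * (Aus + Brazil)."""
    return sum(combos * (aus + brazil) for _, combos, aus, brazil in plan)


if __name__ == "__main__":
    for name, combos, aus, brazil in ASSEMBLY_PLAN:
        print(f"{name}: {combos * (aus + brazil)}")
    print(f"Total: {count_assemblies(ASSEMBLY_PLAN)}")
```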
Summary of transcriptome assembly and annotation statistics compared with previous cane toad transcriptomes
| | This study | Muscle^a | Liver^b |
|---|---|---|---|
| Assembly | | | |
| Filtered read pairs | 945 348 780 | 99 462 214 | 265 684 605 |
| | 129 051 008 | 18 713 526 | – |
| Assembly size, bp | 83 724 193 | 60 388 685 | 80 251 892 |
| Number of transcripts | 62 202 | 57 580 | 131 020 |
| N50, bp | 2377 | 1871 | 916 |
| Average length, bp | 1346 | 1049 | 613 |
| Minimum length, bp | 297 | 201 | 201 |
| Maximum length, bp | 99 438 | 40 546 | 17 369 |
| Median length, bp | 698 | 577 | 331 |
| GC, % | 46.05 | 45.06 | 44.32 |
| Transcripts with CDS | 62 202 | 19 751 | – |
| Annotation | | | |
| Transcripts with BLASTx hit | 31 103 | 21 533 | – |
| Transcripts with BLASTp hit | 28 560 | 16 754 | – |
| Transcripts with GO terms | 28 399 | 19 500 | – |

^a Rollins, Richardson, and Shine [9].
^b Arthofer et al. [11].
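For readers unfamiliar with the length metrics in this table, the following is a minimal sketch of how N50, average, and median transcript length are conventionally computed from a list of transcript lengths. It uses the standard definition of N50 (the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembly size). It is not the authors' code, and the toy lengths are made up for illustration.

```python
from statistics import mean, median


def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest and accumulated until the
    running sum reaches half of the total; the length at that point is N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def length_summary(lengths):
    """Collect the length statistics reported in the assembly table."""
    return {
        "assembly_size": sum(lengths),
        "n_transcripts": len(lengths),
        "n50": n50(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
        "min": min(lengths),
        "max": max(lengths),
    }


if __name__ == "__main__":
    toy_lengths = [100, 200, 300, 400, 500]  # hypothetical transcript lengths
    print(length_summary(toy_lengths))
```

Because N50 weights long transcripts more heavily than the mean or median, it is typically the largest of the three, as it is for every assembly in the table.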
BUSCO analysis of transcriptome completeness
| Assembly | Complete BUSCOs, % | Complete and duplicated BUSCOs, % | Fragmented BUSCOs, % | Missing BUSCOs, % |
|---|---|---|---|---|
| This study | 90 | 4.7 | 1.7 | 7.8 |
| Muscle^a | 60 | 4.6 | 5.7 | 33 |
| Liver^b | 69 | 0.6 | 4.1 | 26 |
| Select anuran transcriptomes | | | | |
| ^c | 26 | 0.3 | 15 | 57 |
| ^d | 79 | 42 | 2.8 | 17 |
| ^e | 50 | 0.4 | 7.8 | 41 |
| ^f | 73 | 1.2 | 4.7 | 21 |
| Select anuran genomes | | | | |
| ^g | 97 | 51 | 1.4 | 1.4 |
| ^h | 91 | 4.1 | 3.7 | 4.9 |
| ^i | 76 | 2.8 | 9.0 | 14 |
“Complete BUSCOs” refers to those with a full-length match in the assembly. “Complete and duplicated” refers to those BUSCOs that are complete within an assembly but have multiple matches present. “Fragmented” are those BUSCOs that only have a partial match in the assembly, and “Missing” refers to those BUSCOs that do not have a corresponding match in the assembly.
^a Rollins, Richardson, and Shine [9].
^b Arthofer et al. [11].
^c Gerchen et al. [7].
^d Birol et al. [28].
^e Huang et al. [29].
^f Zhao et al. [30].
^g Session et al. [5].
^h Hellsten et al. [4].
^i Sun et al. [6].
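The BUSCO categories defined above relate to each other in a simple way: duplicated BUSCOs are a subset of complete BUSCOs, so the complete, fragmented, and missing percentages should together account for roughly 100% of the benchmark set (small deviations are expected from rounding in the table). The sketch below checks this for the three cane toad assemblies using the tabulated values. The data structure and function names are illustrative and are not part of the BUSCO software.

```python
# (complete, complete and duplicated, fragmented, missing), in percent,
# copied from the BUSCO table for the three cane toad assemblies.
BUSCO_PERCENT = {
    "This study": (90, 4.7, 1.7, 7.8),
    "Muscle": (60, 4.6, 5.7, 33),
    "Liver": (69, 0.6, 4.1, 26),
}


def single_copy(complete, duplicated):
    """Complete BUSCOs found exactly once (complete minus duplicated)."""
    return round(complete - duplicated, 1)


def category_sum(complete, fragmented, missing):
    """Complete + fragmented + missing; should be close to 100%."""
    return round(complete + fragmented + missing, 1)


if __name__ == "__main__":
    for name, (comp, dup, frag, miss) in BUSCO_PERCENT.items():
        print(
            f"{name}: single-copy {single_copy(comp, dup)}%, "
            f"categories sum to {category_sum(comp, frag, miss)}%"
        )
```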