Jack C H Ip, Huawei Mu, Qian Chen, Jin Sun, Santiago Ituarte, Horacio Heras, Bert Van Bocxlaer, Monthon Ganmanee, Xin Huang, Jian-Wen Qiu.
Abstract
BACKGROUND: Gastropoda, with approximately 80,000 living species, is the largest class of Mollusca. Among gastropods, apple snails (family Ampullariidae) are globally distributed in tropical and subtropical freshwater ecosystems, and many species are ecologically and economically important. Ampullariids exhibit various morphological and physiological adaptations to their respective habitats, which make them ideal candidates for studying adaptation, population divergence, speciation, and larger-scale patterns of diversity, including the biogeography of native and invasive populations. The limited availability of genomic data, however, hinders in-depth ecological and evolutionary studies of these non-model organisms.
Keywords: biological invasion; Asolene; Caenogastropoda; Genomic database; Lanistes; Marisa; Pila; Pomacea; RNA-Seq
Year: 2018 PMID: 29506476 PMCID: PMC5839033 DOI: 10.1186/s12864-018-4553-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Geographical distribution and phylogeny of apple snails used in the present study. (a) Approximate native distribution ranges of the Old World (Lanistes and Pila) and New World (Asolene, Marisa and Pomacea) genera/species [7, 56]. (b) A maximum likelihood tree showing the phylogenetic relationships among the eight species of ampullariids, based on sequences of three genes used in previous phylogenetic studies of ampullariids [6, 52]. Methodological details for the phylogenetic analysis can be found in Additional file 1. Bootstrap support values are shown, as is a scale bar of 0.05 substitutions per site. Photo credit: L. nyassanus, Pila ampullacea and M. cornuarietis (JCHI); A. platae and P. scalaris (SI); P. canaliculata, P. maculata and P. diffusa (HM)
A summary of transcriptome data from eight apple snails used for database construction. Tissues: albumen gland (AG), digestive gland (DG), foot (F), gill (G), lung (L), mantle (M), kidney (K), stomach (S), testis (T) and other tissues (OT; including DG, F, M and T)
| Species (SRA accession No.) | Sampling location | Tissue | Platform | Length (bp) | Clean read (bp) | Q20 (%) | GC (%) |
|---|---|---|---|---|---|---|---|
| Old World | | | | | | | |
| | F4 or F5 offspring from a lab inbred population; originally collected from Lake Malawi, Africa | AG | Hiseq2000 | 100 | 36,892,514 | 97.89 | 47.34 |
| | | OT without T | Hiseq2000 | 100 | 39,555,832 | 98.04 | 45.12 |
| | Wild-caught from Nong Phok District, Roi Et Province, Thailand | AG | Hiseq4000 | 100 | 78,216,048 | 98.66 | 46.44 |
| | | OT | Hiseq4000 | 100 | 82,268,586 | 98.76 | 44.34 |
| New World | | | | | | | |
| | Wild-caught from Lago de Regatas, Buenos Aires, Argentina | AG | Hiseq2000 | 90 | 47,404,352 | 96.8 | 46.08 |
| | | AG | Hiseq4000 | 100 | 69,830,648 | 98.89 | 45.95 |
| | | OT without T | Hiseq4000 | 100 | 97,420,524 | 99.18 | 45.42 |
| | Aquarium shop, Mong Kok, Hong Kong | AG | Hiseq2000 | 90 | 51,889,926 | 97.55 | 46.11 |
| | | OT | Hiseq2000 | 90 | 53,590,040 | 96.62 | 45.24 |
| | Aquarium shop, Mong Kok, Hong Kong | AG | Hiseq2000 | 90 | 54,266,010 | 97.71 | 44.11 |
| | | OT | Hiseq2000 | 90 | 54,579,594 | 96.91 | 44.91 |
| | Wild-caught from Lago de Regatas, Buenos Aires, Argentina | AG | Hiseq2000 | 90 | 72,341,892 | 98.43 | 43.05 |
| | Wild-caught from Sheung Shui, Hong Kong | AG | Hiseq2500 | 125 | 50,399,554 | 97.90 | 45.04 |
| | | DG | Hiseq2500 | 125 | 45,063,414 | 97.78 | 49.34 |
| | | F | Hiseq2500 | 125 | 54,307,040 | 98.17 | 43.78 |
| | | G | Hiseq2500 | 125 | 49,217,508 | 98.01 | 45.20 |
| | | K | Hiseq2500 | 125 | 50,518,406 | 98.04 | 45.33 |
| | | L | Hiseq2500 | 125 | 40,886,322 | 97.97 | 45.30 |
| | | M | Hiseq2500 | 125 | 48,951,426 | 98.09 | 46.47 |
| | | S | Hiseq2500 | 125 | 44,860,264 | 97.65 | 45.28 |
| | | T | Hiseq2500 | 125 | 52,304,178 | 97.70 | 45.71 |
| (SRA030614.2) | Wild-caught from Yuen Long, Hong Kong | OT without T | Hiseq2000 | 90 | 25,723,522 | 95.65 | 46.83 |
| | Wild-caught from Paraná River, Argentina | AG | Hiseq2000 | 100 | 52,732,156 | 98.20 | 44.94 |
| | | OT | Hiseq2000 | 100 | 54,961,478 | 98.26 | 45.05 |
Transcriptome assembly and annotation statistics. To avoid confusion between Pomacea and Pila, the latter taxon is not abbreviated as “P.”
| Items | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| De novo assembly | ||||||||
| Assembled bases | 164,160,894 | 238,879,002 | 214,102,711 | 159,734,791 | 168,090,829 | 141,684,727 | 536,808,768 | 145,979,415 |
| Assembled transcripts | 152,931 | 277,864 | 203,935 | 187,959 | 204,576 | 126,582 | 499,932 | 200,397 |
| Assembled unigenes | 122,779 | 212,935 | 156,912 | 161,143 | 171,676 | 98,100 | 215,456 | 154,712 |
| Clustered transcripts | 129,455 | 221,653 | 165,023 | 161,069 | 173,606 | 105,046 | 355,408 | 154,700 |
| Clustered unigenes | 114,869 | 192,301 | 142,773 | 147,375 | 157,064 | 89,910 | 211,621 | 136,742 |
| Unigenes (transcripts) | 22,204 (29,317) | 35,828 (46,232) | 20,730 (28,927) | 29,400 (35,994) | 28,408 (36,112) | 20,829 (28,847) | 28,755 (57,048) | 28,782 (35,063) |
| Unigene N50 (bp) | 1740 | 1683 | 1803 | 1440 | 1485 | 1629 | 1509 | 1320 |
| Unigene length (bp) - average (min - max) | 1222 (300–31,476) | 1182 (300–19,023) | 1281 (300–15,984) | 1054 (300–23,508) | 1076 (300–25,756) | 1163 (300–13,624) | 1074 (300–40,192) | 974 (300–17,707) |
| BUSCO | ||||||||
| Complete (%) | 86.83 | 92.41 | 82.09 | 77.82 | 79.95 | 80.43 | 80.07 | 77.46 |
| Fragmented (%) | 4.15 | 3.68 | 4.74 | 13.52 | 11.63 | 8.78 | 7.47 | 12.81 |
| Annotation (unigenes) | ||||||||
| NCBI nr | 17,065 (76.86%) | 27,254 (76.07%) | 16,051 (77.43%) | 22,579 (76.80%) | 21,405 (75.35%) | 16,705 (80.20%) | 20,051 (69.73%) | 21,625 (75.13%) |
| GO | 10,697 (48.18%) | 18,717 (52.24%) | 9852 (47.53%) | 14,274 (48.55%) | 13,519 (47.59%) | 10,394 (49.90%) | 12,216 (42.48%) | 13,671 (47.50%) |
| KEGG | 3783 (17.04%) | 5467 (15.26%) | 3546 (17.11%) | 4215 (14.34%) | 4061 (14.30%) | 3801 (18.25%) | 3693 (12.84%) | 4059 (14.10%) |
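
Two of the metrics in the table above can be reproduced with simple arithmetic. N50 is the length at which unigenes of that length or longer account for at least half of all assembled bases, and each annotation percentage is the annotated unigene count divided by the total unigene count. The following is a minimal Python sketch of both calculations, not the authors' pipeline. The function names and the toy length list are hypothetical; only the counts 17,065 and 22,204 are taken from the first data column of the table.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def annotated_share(annotated, total):
    """Percentage of unigenes with a database hit, rounded to two decimals."""
    return round(100 * annotated / total, 2)


# Hypothetical toy lengths, for illustration only (not data from the study).
toy_lengths = [300, 450, 800, 1200, 2500]
print(n50(toy_lengths))               # 1200

# NCBI nr hits and total unigenes from the first data column of the table.
print(annotated_share(17065, 22204))  # 76.86
```

The second call reproduces the 76.86% reported in the NCBI nr row, which suggests the percentages are computed against the final unigene set rather than the clustered or assembled counts.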
Comparison of transcriptome assembly metrics between this study and some other studies of mollusks
| Items | This study (mean) | |||||||
|---|---|---|---|---|---|---|---|---|
| De novo assembly | ||||||||
| Transcripts | 37,193 | 128,436 | 105,349 | 38,466 | 62,862 | 34,794 | 273,272 | 49,501 |
| Unigenes | 26,867 | – | – | 32,798 | – | – | – | – |
| N50 (bp) | 1576 | 283 | 1332 | 2236 | 690 | 817 | 2100 | 1046 |
| Mean length (bp) | 1128 | 420 | 878 | 1709 | 999 | – | – | 679 |
| BUSCO | ||||||||
| Complete genes | 82.13% | 40.21% | 71.89% | 93.00% | 89.09% | 33.93% | 83.63% | 66% |
| Fragmented | 8.35% | 39.38% | 18.86% | 3.56% | 6.80% | 34.48% | 11.39% | 10% |
| Annotation | ||||||||
| Protein database | 75.95% | 24.04% | 33.79% | 74.40% | 25.13% | 48.23% | 14.11% | 25% |
| GO | 48.00% | 6.83% | 15.30% | 45.42% | (overall) | 25.22% | 8.75% | (overall) |
| KEGG | 15.41% | 10.07% | 23.61% | 15.66% | | 27.04% | 6.78% | |
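
The "This study (mean)" column appears to be the arithmetic mean of the eight per-species values in the assembly and annotation statistics table. The short Python check below illustrates this for three rows, using only values copied from that table. The variable names are hypothetical, and the rounding convention is an assumption.

```python
from statistics import mean

# Per-species values copied from the assembly and annotation statistics table.
unigenes = [22204, 35828, 20730, 29400, 28408, 20829, 28755, 28782]
n50_bp = [1740, 1683, 1803, 1440, 1485, 1629, 1509, 1320]
busco_complete_pct = [86.83, 92.41, 82.09, 77.82, 79.95, 80.43, 80.07, 77.46]

summary = {
    "unigenes": round(mean(unigenes)),
    "n50_bp": round(mean(n50_bp)),
    "busco_complete_pct": round(mean(busco_complete_pct), 2),
}
print(summary)
```

Under these assumptions the three means come out as 26,867 unigenes, an N50 of 1576 bp and 82.13% complete BUSCO genes, matching the corresponding cells in the comparison table.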
Fig. 2 The web interface of AmpuBase. (a) Illustration of the Basic and Advanced BLAST search options. (b) An example BLAST search result, showing matched sequences, each with its BLAST statistics. (c) Illustration of the annotation-based search functions in AmpuBase