Chih-Hung Chou, Hsi-Yuan Huang, Wei-Chih Huang, Sheng-Da Hsu, Chung-Der Hsiao, Chia-Yu Liu, Yu-Hung Chen, Yu-Chen Liu, Wei-Yun Huang, Meng-Lin Lee, Yi-Chang Chen, Hsien-Da Huang.
Abstract
BACKGROUND: Aquatic animals have great economic and ecological importance. Among them, non-model organisms have been studied with regard to eco-toxicity, stress biology, and environmental adaptation. Owing to recent advances in next-generation sequencing techniques, large amounts of RNA-seq data for aquatic animals are publicly available. However, no comprehensive resource currently exists for the analysis, unification, and integration of these datasets. This study uses computational approaches to build a new resource of transcriptomic maps for aquatic animals. The resulting aquatic animal transcriptome map database, dbATM, provides de novo transcriptome assembly, gene annotation, and comparative analysis for more than twenty aquatic organisms that lack a draft genome.
Year: 2018 PMID: 29764375 PMCID: PMC5954267 DOI: 10.1186/s12864-018-4463-x
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Aquatic animal transcriptome map database. a The database collects RNA-seq data for more than 20 different aquatic animals. The database consists of three parts: b Detailed information for individual genes, such as gene name and description, length, expression (FPKM), and sequence. c Functional annotation of all gene groups in individual species. d Evolutionary studies of the transcriptomic data of all species, supported by a comparative analysis system for homologous genes
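The expression values mentioned in panel b are reported as FPKM (fragments per kilobase of transcript per million mapped fragments). As a minimal sketch of the standard FPKM definition, and not necessarily the exact procedure used to populate dbATM, the function name `fpkm` and the example numbers below are hypothetical:

```python
def fpkm(fragments: int, length_bp: int, total_mapped_fragments: int) -> float:
    """Standard FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # fragments / (length in kb) / (library size in millions)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
value = fpkm(500, 2000, 10_000_000)
print(value)
```

Normalizing by both transcript length and library size is what makes values comparable across genes and across samples of different sequencing depth.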
Fig. 2 dbATM database system and analysis. First, publicly available RNA-seq data were collected and combined with additional data produced for this study. Next, the raw NGS data were trimmed and filtered to remove low-quality reads. Third, the NGS reads were assembled using de Bruijn graphs and clustered into transcripts or unigenes. Finally, the assembled sequences were assigned to genes and gene functions by BLAST searches against external databases
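To illustrate the de Bruijn graph step in the workflow above, the following toy sketch builds a graph from error-free reads and follows its single unambiguous path. It is only an illustration of the idea: the function names and reads are hypothetical, and production assemblers additionally handle sequencing errors, coverage, reverse complements, and branching.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes seen in the reads."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while the current node has exactly one successor."""
    contig, node, visited = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:  # stop on cycles
            break
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig


# Hypothetical overlapping reads sampled from the sequence ATGGCGTACC.
reads = ["ATGGCG", "GGCGTA", "CGTACC"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")
print(contig)
```

Because every (k-1)-mer in this toy input occurs only once, the walk recovers the full source sequence; repeated k-mers would create branches where a real assembler must choose or split contigs.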
Gene annotation statistics in dbATM
| Species | No. of transcripts | No. of proteins | No. of UniGene | No. of homologous genes |
|---|---|---|---|---|
| | 99,696 | 38,815 | 13,600 | 9893 |
| | 59,723 | 23,085 | 10,443 | 7838 |
| | 28,360 | 15,975 | 7713 | 5773 |
| | 47,472 | 33,491 | 1149 | 377 |
| | 32,405 | 17,620 | 7388 | 4707 |
| | 77,510 | 43,170 | 14,035 | 6394 |
| | 57,435 | 28,850 | 13,400 | 10,452 |
| | 27,075 | 15,130 | 6502 | 4227 |
| | 50,827 | 31,923 | 14,839 | 11,295 |
| | 10,738 | 5944 | 815 | 141 |
| | 59,513 | 23,033 | 6593 | 1406 |
| | 39,062 | 11,894 | 4127 | 1065 |
| | 36,979 | 14,740 | 4796 | 1162 |
| | 61,364 | 20,951 | 3802 | 1004 |
| | 43,524 | 22,836 | 9115 | 6544 |
| | 47,579 | 22,728 | 8271 | 3836 |
| | 39,598 | 27,485 | 10,322 | 6423 |
| | 26,187 | 14,498 | 6050 | 2463 |
| | 19,205 | 13,762 | 5287 | 2749 |
| | 51,104 | 41,391 | 11,505 | 8168 |
| | 81,704 | 48,526 | 13,583 | 10,452 |
| | 53,394 | 32,410 | 11,781 | 4238 |
| Total | 1,023,379 | 533,127 | 185,166 | 19,363^a |
^a The total number of homologous gene groups shown here is the actual number of distinct homologous gene groups in the database, not the sum of the per-species counts
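The distinction in this footnote can be sketched with sets: a homologous gene group shared by several species is counted once in the database total but once per species in the column above. The species labels and group identifiers below are hypothetical placeholders, not dbATM data.

```python
# Hypothetical homologous gene group IDs found in each species.
groups_by_species = {
    "species_A": {"HG1", "HG2", "HG3"},
    "species_B": {"HG2", "HG3", "HG4"},
    "species_C": {"HG3", "HG5"},
}

# Summing per-species counts counts shared groups more than once.
per_species_sum = sum(len(groups) for groups in groups_by_species.values())

# The database total counts each distinct group only once.
distinct_groups = set().union(*groups_by_species.values())

print(per_species_sum, len(distinct_groups))
```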
Fig. 3 Species in dbATM and their biological classification. Organisms are categorized into sub-groups, and their approximate evolutionary distances are shown to support further evolutionary and phylogenetic investigation
Fig. 4 dbATM web interface. dbATM provides various query interfaces and graphical visualizations to facilitate access to aquatic transcriptomic data. a Summary table of the reads, assembly quality, and annotation statistics. b A web interface for unigene mining by sequence or keyword. c The functional annotation interface, including annotated species distribution, KEGG pathways, Gene Ontology, and Clusters of Orthologous Groups. d The homologous gene search and browsing interface for comparative analysis
Homologous gene statistics in each clade
| | Mollusca^a | Arthropoda^b | Ostariophysi^c | Euteleostei^d | Actinopterygii^e | Chordate^f |
|---|---|---|---|---|---|---|
| Numbers | 27 | 582 | 3027 | 85 | 19 | 8 |
^a Mollusca clade: Planorbarius corneus, Crassostrea gigas, Mytilus galloprovincialis
^b Arthropoda clade: Penaeus monodon, Pandalus latirostris, Neocaridina denticulata
^c Ostariophysi clade: Ictalurus punctatus, Astyanax mexicanus, Sinocyclocheilus angustiporus, Sinocyclocheilus anophthalmus, Microphysogobio brevirostris
^d Euteleostei clade: Plecoglossus altivelis, Gasterosteus aculeatus, Tetraodon nigroviridis, Lateolabrax japonicus, Pundamilia nyererei, Fundulus grandis, Poecilia formosa, Poecilia mexicana
^e Actinopterygii clade: Ictalurus punctatus, Astyanax mexicanus, Sinocyclocheilus angustiporus, Sinocyclocheilus anophthalmus, Microphysogobio brevirostris, Plecoglossus altivelis, Gasterosteus aculeatus, Tetraodon nigroviridis, Lateolabrax japonicus, Pundamilia nyererei, Fundulus grandis, Poecilia formosa, Poecilia mexicana, Anguilla japonica, Clupea harengus
^f Chordate clade: Ictalurus punctatus, Astyanax mexicanus, Sinocyclocheilus angustiporus, Sinocyclocheilus anophthalmus, Microphysogobio brevirostris, Plecoglossus altivelis, Gasterosteus aculeatus, Tetraodon nigroviridis, Lateolabrax japonicus, Pundamilia nyererei, Fundulus grandis, Poecilia formosa, Poecilia mexicana, Anguilla japonica, Clupea harengus, Protopterus annectens
Fig. 5 Gene expression profiles panel. dbATM provides a function for comparative analysis of different tissues, conditions, or species. a Samples from all three species were taken from the brain and show similar expression profile trends. b Samples from the four species were taken from different tissues and show distinct expression profiles. Species abbreviations: cha, Clupea harengus; gac, Gasterosteus aculeatus; pan, Protopterus annectens; sang, Sinocyclocheilus angustiporus; sano, Sinocyclocheilus anophthalmus; tni, Tetraodon nigroviridis