Thomas T Schulze, Jonathan M Ali, Maggie L Bartlett, Madalyn M McFarland, Emalie J Clement, Harim I Won, Austin G Sanford, Elyssa B Monzingo, Matthew C Martens, Ryan M Hemsley, Sidharta Kumar, Nicolas Gouin, Alan S Kolok, Paul H Davis.
Abstract
Trichomycterus areolatus is an endemic species of pencil catfish that inhabits the riffles and rapids of many freshwater ecosystems of Chile. Despite its unique adaptation to Chile's high-gradient watersheds, and therefore its potential application in the investigation of ecosystem integrity and environmental contamination, relatively little is known regarding the molecular biology of this environmental sentinel. Here, we detail the assembly of the Trichomycterus areolatus transcriptome, a molecular resource for the study of this organism and its molecular response to the environment. RNA-Seq reads were obtained by next-generation sequencing on an Illumina® platform and processed using PRINSEQ. The transcriptome was assembled using the Trinity assembler. The assembly was validated by functional characterization with KOG, KEGG, and GO analyses. Additionally, differential expression analysis highlights sex-specific expression patterns, and a list of endocrine and oxidative stress related transcripts is included.
Keywords: Trichomycterus areolatus; assembly; catfish; de novo transcriptome
Year: 2016 PMID: 27672404 PMCID: PMC5033730 DOI: 10.7150/jgen.16885
Source DB: PubMed Journal: J Genomics
Transcriptome Tissue Sequencing Details. Sequencing used 4.4 µg and 5.4 µg of RNA from the male and female fish, respectively, and was performed on an Illumina® HiSeq 2500. bp: base pairs; GC: guanine-cytosine content.
| Tissue | Total Reads | Total Output (bp) | GC Content (%) |
|---|---|---|---|
| Whole Female | 328,721,780 | 32,872,178,000 | 48% |
| Whole Male | 88,794,542 | 8,879,454,200 | 47% |
The representative Trichomycterus areolatus transcriptome assembly was analyzed for the general characteristics listed in the table below. Putative protein-coding transcripts were identified by TransDecoder and included. Redundant transcripts were removed with CD-HIT, which clusters redundant and highly similar sequences and retains one representative sequence per cluster.
| Total Transcripts | Mean Length (bp) | Median Length (bp) | N50 | GC Content |
|---|---|---|---|---|
| 64889 | 1484.85 | 857 | 2671 | 47.5% |
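The summary statistics in the table above (mean length, median length, N50) can be computed from a list of transcript lengths. The following is a minimal sketch, not the authors' pipeline; the function names and the toy lengths are hypothetical and are not taken from the study.

```python
from statistics import mean, median


def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def assembly_summary(lengths):
    """Collect the basic statistics reported for an assembly."""
    return {
        "total_transcripts": len(lengths),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
        "n50": n50(lengths),
    }


# Toy lengths for illustration only (hypothetical, not study data).
toy_lengths = [100, 200, 300, 400, 1000]
print(assembly_summary(toy_lengths))
```

Because N50 weights each transcript by its length, it is typically larger than the median, as in the table above (N50 of 2671 bp versus a median of 857 bp).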
Phylogenetic Comparison to Other Fish. Transcripts produced in this study were concatenated and aligned to published sequences from other fish species, and a percent identity matrix was computed. This analysis used Trichomycterus areolatus transcripts Ta_155828, Ta_53325, Ta_192266, Ta_56196, and Ta_56194 for cytochrome c oxidase subunit III, HSP70, NADH dehydrogenase subunit 5, estrogen receptor, and glutathione S-transferase kappa 1, respectively.
| Species | Percent Identity |
|---|---|
| | 87.6 |
| | 86.5 |
| | 84.5 |
| | 82.5 |
| | 82.5 |
| | 76.4 |
| | 67.1 |
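A percent identity matrix of the kind summarized above can be derived from a multiple sequence alignment. The sketch below assumes one common definition (identical residues divided by aligned columns in which neither sequence has a gap); the source does not state which definition or tool was used, and the sequence names and toy alignment are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two rows of the same alignment.

    Assumption: columns where either sequence has a gap are skipped.
    Other tools may count gaps differently.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Build a nested dict of pairwise percent identities."""
    names = list(aligned)
    return {
        p: {q: percent_identity(aligned[p], aligned[q]) for q in names}
        for p in names
    }


# Toy alignment (hypothetical sequences, not study data).
toy_alignment = {
    "species_1": "ACGTACGT",
    "species_2": "ACGTACGA",
    "species_3": "ACG-ACGT",
}
matrix = identity_matrix(toy_alignment)
print(matrix["species_1"]["species_2"])
```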
Figure 3. Transcript Coverage of Two Model Organisms. Coverage of Salmo salar and Danio rerio predicted proteins by Trichomycterus areolatus predicted proteins. Predicted polypeptide sequences produced in this study were BLASTed against publicly available non-redundant Salmo salar proteins (count = 112,089) and Danio rerio proteins (count = 81,931). The length of the local alignment region reported by the BLASTp algorithm was then divided by the length of the query sequence. Compilation of these results indicated that a majority of Trichomycterus areolatus predicted protein sequences exhibited greater than 90% coverage against both Danio rerio (64.7%) and Salmo salar (68.9%) protein sequences, suggesting that the assembly produced a high proportion of full-length transcripts.
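The coverage calculation described in the caption (alignment length divided by query length, then the share of queries above 90%) can be sketched as follows. This is an illustration under stated assumptions: the input tuples, the choice to keep the best hit per query, and the function name are hypothetical, and the toy values are not study data.

```python
def fraction_above_coverage(hits, threshold=0.90):
    """Fraction of queries whose best hit covers more than `threshold`.

    `hits` is an iterable of (query_id, query_length, alignment_length).
    Coverage is alignment_length / query_length. Assumption: when a query
    has several hits, the highest coverage is kept.
    """
    best = {}
    for query_id, query_length, alignment_length in hits:
        coverage = alignment_length / query_length
        # Alignment length can include gap columns, so cap coverage at 1.0.
        coverage = min(coverage, 1.0)
        best[query_id] = max(coverage, best.get(query_id, 0.0))
    if not best:
        return 0.0
    above = sum(1 for c in best.values() if c > threshold)
    return above / len(best)


# Toy hits (hypothetical): q1 has two hits, the better one covers 95%.
toy_hits = [
    ("q1", 100, 95),
    ("q1", 100, 50),
    ("q2", 200, 100),
    ("q3", 300, 290),
]
print(fraction_above_coverage(toy_hits))
```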
Figure 4. Gene Ontology (GO) Analysis. GO functional analysis was performed on assigned proteins to evaluate transcript function and the overall completeness of the assembled transcriptome. GO terms were assigned to each of the T. areolatus predicted proteins as well as to the proteomes of Salmo salar, Cyprinus carpio, and Danio rerio (retrieved from NCBI). The distributions of protein functions closely match one another, suggesting that the assembled transcriptome is complete.
Figure 5. Eukaryotic Orthologous Groups (KOG) Characterization. Putative transcript functions were assessed and transcriptome completeness was evaluated using KOG analysis. The Trichomycterus areolatus transcriptome and mRNA nucleotide entries from NCBI for Cyprinus carpio, Salmo salar, and Danio rerio were assigned KOG terms. The transcriptomes have similar distributions, supporting the completeness of the Trichomycterus areolatus transcriptome.
Figure 6. Kyoto Encyclopedia of Genes and Genomes (KEGG) Transcriptomic Analysis. KEGG analysis was performed to describe transcript functions and evaluate transcriptome completeness. For comparison, mRNA sequences for Danio rerio, Salmo salar, and Cyprinus carpio were retrieved from NCBI and assigned to KEGG pathways. The percent distributions are similar across the compared species, indicating a complete transcriptome for Trichomycterus areolatus.
Differentially Expressed Transcripts. The resultant male and female Trichomycterus areolatus sequence files were interrogated to assess relative differential transcript expression. Unique transcripts, demonstrating changes greater than or equal to 10-fold, are identified by homology (if available) to known proteins, and only the most differentially expressed isoform is presented. Non-inherited transcripts that are unique to individual organisms (e.g. MHC molecules via rearrangement or similar immune transcripts) were excluded. Notably, many transcriptional differences are related to sex-specific expression.
| Transcriptomic ID | Fold Change | Name |
|---|---|---|
| TRICH01_163992 | 15 | parvalbumin beta-1 |
| TRICH01_133698 | 12 | hemoglobin subunit beta-2 |
| TRICH01_83415 | 11 | sperm acrosome membrane-associated protein 4 |
| TRICH01_57563 | 10 | endonuclease domain-containing 1 protein |
| TRICH01_101761 | 38604 | microsomal triglyceride transfer protein large subunit |
| TRICH01_222696 | 96 | complement C4 |
| TRICH01_180471 | 96 | alpha-2-macroglobulin |
| TRICH01_143497 | 56 | 3-hydroxyacyl-CoA dehydrogenase type-2 |
| TRICH01_54871 | 42 | vitellogenin 4 |
| TRICH01_100604 | 38 | pyruvate dehydrogenase phosphatase regulatory subunit |
| TRICH01_142598 | 35 | CD59 |
| TRICH01_51308 | 29 | ribonucleoside-diphosphate reductase subunit M2 |
| TRICH01_208172 | 22 | coiled-coil domain-containing protein 36 |
| TRICH01_51310 | 21 | Jouberin |
| TRICH01_100599 | 19 | PEX5 |
| TRICH01_100601 | 17 | histone-lysine N-methyltransferase ASH1L |
| TRICH01_100600 | 16 | A-kinase anchor protein |
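The filtering logic described for this table (keep transcripts with at least a 10-fold difference, and report only the most differentially expressed isoform) might look like the sketch below. The source does not state how expression was normalized or how fold change was computed, so the pseudocount, the gene-to-isoform grouping, the function names, and all toy values are assumptions made for illustration.

```python
def fold_change(expr_a, expr_b, pseudocount=1.0):
    """Ratio of the larger to the smaller expression value.

    Assumption: a pseudocount is added to both values so that
    transcripts with zero expression in one sample stay finite.
    """
    high, low = max(expr_a, expr_b), min(expr_a, expr_b)
    return (high + pseudocount) / (low + pseudocount)


def top_isoforms(records, min_fold=10.0):
    """Keep the most differentially expressed isoform per gene.

    `records` is an iterable of
    (gene, transcript_id, male_expression, female_expression).
    Returns {gene: (transcript_id, fold_change, higher_in)}.
    """
    best = {}
    for gene, transcript_id, male, female in records:
        fc = fold_change(male, female)
        if fc < min_fold:
            continue  # below the 10-fold cutoff
        higher_in = "male" if male > female else "female"
        if gene not in best or fc > best[gene][1]:
            best[gene] = (transcript_id, fc, higher_in)
    return best


# Toy records (hypothetical identifiers and values, not study data).
toy_records = [
    ("geneA", "t1", 149, 9),   # 15-fold, higher in male
    ("geneA", "t2", 119, 9),   # 12-fold, weaker isoform of geneA
    ("geneB", "t3", 5, 4),     # below cutoff
    ("geneC", "t4", 0, 41),    # 42-fold, higher in female
]
print(top_isoforms(toy_records))
```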
Genes linked to endocrine disruption and/or oxidative stress were identified within the transcriptome assembly to support the development of Trichomycterus areolatus as an environmental sentinel organism. Danio rerio sequences were used as BLAST queries against the full assembly to identify putative homologs. Protein isoforms were differentiated based on query sequence annotation and bit score.
| Gene Name | Gene Symbol | Transcriptomic ID | Transcript Length (bp) | Query ID | Bit Score |
|---|---|---|---|---|---|
| Androgen Receptor | AR | TRICH01_58265 | 4455 | NP_001076592.1 | 788 |
| Aromatase | CYP19a1 | TRICH01_14384 | 284 | AAB65788.1 | 759 |
| Aryl Hydrocarbon Receptor | AHR | TRICH01_166983 | 3366 | NP_001019987.1 | 593 |
| Aryl Hydrocarbon Receptor 2 | AHR2 | TRICH01_225654 | 2827 | NP_571339.1 | 905 |
| Cytochrome P450 1A1 | CYP1a1 | TRICH01_117246 | 2028 | NP_571954.1 | 820 |
| Estrogen Receptor Alpha | ESRa | TRICH01_56196 | 4380 | AAK16740.1 | 729 |
| Estrogen Receptor Beta 1 | ESRb1 | TRICH01_211240 | 4700 | CAC93848.1 | 673 |
| Estrogen Receptor Beta 2 | ESRb2 | TRICH01_95151 | 3578 | CAC93849.1 | 796 |
| Follicle Stimulating Hormone Receptor | FSHR | TRICH01_111037 | 3596 | AAP33512.1 | 996 |
| Forkhead Box L2 | FOXL2 | TRICH01_143206 | 1866 | AAI16586.1 | 370 |
| Heat Shock Protein 70 | HSP70 | TRICH01_53325 | 2625 | AAF70445.1 | 1216 |
| Heat Shock Protein 90 Alpha 1 | HSP90a1 | TRICH01_121146 | 2863 | NP_571403.1 | 1292 |
| Heat Shock Protein 90 Alpha 2 | HSP90a2 | TRICH01_121144 | 2926 | AAI63166.1 | 1278 |
| Metallothionein | MT | TRICH01_130427 | 554 | AAS00513.1 | 53 |
| Superoxide Dismutase | SOD | TRICH01_28552 | 2500 | NP_571369.1 | 261 |
| Thyroid Hormone Receptor Alpha | THRa | TRICH01_196629 | 2500 | AAA99811.1 | 760 |
| Thyroid Hormone Receptor Beta | THRb | TRICH01_20904 | 2406 | AF109732_1 | 732 |
| Vitellogenin 1 | VTG1 | TRICH01_101739 | 2851 | AF406784_1 | 1224 |
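Selecting one putative homolog per query by bit score, as described for the table above, can be sketched by parsing BLAST tabular output. The sketch assumes the standard 12-column tabular format (`-outfmt 6`); the source does not say which output format was used, and the query and transcript identifiers in the toy input are hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle):
    """Return {query_id: (subject_id, bit_score)} with the top hit per query."""
    best = {}
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        score = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (row["sseqid"], score)
    return best


# Toy tabular output (hypothetical identifiers and values, not study data).
toy_output = (
    "queryA\ttx_1\t80.0\t500\t100\t0\t1\t500\t1\t500\t1e-100\t700\n"
    "queryA\ttx_2\t85.0\t500\t75\t0\t1\t500\t1\t500\t1e-120\t820\n"
    "queryB\ttx_3\t60.0\t60\t24\t0\t1\t60\t1\t60\t1e-10\t53\n"
)
print(best_hits(io.StringIO(toy_output)))
```

Ranking by bit score rather than percent identity favors long, high-quality alignments, which suits short queries less well; a low score such as the one reported for metallothionein is expected for a very short protein.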
Figure 2. Choapa River Basin Tissue Sampling Sites. The male and female fish samples were collected from the downstream site “A” (altitude 243 m; latitude, longitude: -31.749639, -71.160722) and the upstream site “B” (altitude 792 m; latitude, longitude: -31.89675, -70.783056), respectively. This river basin is close to intensive agriculture and downstream of heavy metal mining (e.g., copper). The stream itself is predominantly supplied by glacial melt from the Andes. Historically, this system drained into the Pacific Ocean, but it is becoming an isolated system due to the deleterious effects of climate change.