Joanna L Kelley, Courtney N Passow, Martin Plath, Lenin Arias Rodriguez, Muh-Ching Yee, Michael Tobler.
Abstract
BACKGROUND: Elucidating the genomic basis of adaptation and speciation is a major challenge, even in natural systems with large quantities of environmental and phenotypic data, mostly because genomic resources for non-model organisms are scarce. The Atlantic molly (Poecilia mexicana, Poeciliidae) is a small livebearing fish that has been studied extensively in evolutionary ecology, particularly because this species has repeatedly colonized extreme environments in the form of caves and toxic, hydrogen sulfide-containing springs. In such extreme environments, populations show strong patterns of adaptive trait divergence and the emergence of reproductive isolation. Here, we used RNA-sequencing to assemble and annotate the first transcriptome of P. mexicana to facilitate future ecological genomics studies and aid the identification of genes underlying adaptation and speciation in the system.

DESCRIPTION: We provide the first annotated reference transcriptome of P. mexicana. Our transcriptome shows high congruence with other published fish transcriptomes, including those of the guppy, medaka, zebrafish, and stickleback. Transcriptome annotation uncovered candidate genes relevant to the study of adaptation to extreme environments. We describe general and oxidative stress response genes as well as genes involved in pathways induced by hypoxia or involved in sulfide metabolism. To facilitate future comparative analyses, we also conducted quantitative comparisons between P. mexicana from different river drainages. We detected 106,524 single nucleotide polymorphisms in our dataset, including potential markers that are putatively fixed across drainages. Furthermore, specimens from different drainages exhibited some consistent differences in gene regulation.
Year: 2012 PMID: 23170846 PMCID: PMC3585874 DOI: 10.1186/1471-2164-13-652
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Sequencing and assembly statistics for Illumina sequencing used for the assembly of the transcriptome
| Statistic | Value |
|---|---|
| Average number of reads (± SD) | 11,698,857 (2,363,343) |
| Total number of reads | 70,193,146 |
| Average number of base pairs (± SD) | 1,181,584,624 (238,697,721) |
| Total number of base pairs | 7,089,507,746 |
| Average coverage (± SD) | 23.8 (4.4) |
| Total number of assembled contigs | 80,111 |
| Total number of unique loci | 53,245 |
| Total number of unique loci with predicted ORF | 52,914 |
| Mean contig length in bp (± SD) | 932 (1,004) |
| Maximum contig length (bp) | 15,623 |
Data represent reads from N = 6 barcoded individuals.
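The per-individual averages in the table follow from the totals and the sample size. As a quick consistency check, the sketch below recomputes them using only the numbers reported above; the variable names are illustrative, and the mean read length is a derived quantity that the table does not report directly.

```python
# Totals and sample size as reported in the table (N = 6 individuals).
total_reads = 70_193_146
total_bp = 7_089_507_746
n_individuals = 6

# Per-individual averages implied by the totals.
mean_reads = total_reads / n_individuals
mean_bp = total_bp / n_individuals

# Derived (not reported in the table): average bases per read.
mean_read_length = total_bp / total_reads

print(f"mean reads per individual: {mean_reads:,.1f}")
print(f"mean bp per individual:    {mean_bp:,.1f}")
print(f"mean read length (bp):     {mean_read_length:.1f}")
```

Both averages agree with the tabulated values to within one unit of rounding.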
Figure 1. Frequency distribution of assembled contigs by size.
Results of reciprocal BLAST searches of the P. mexicana transcriptome against the database of the guppy (Poecilia reticulata) [54] and the UniProt databases of medaka (Oryzias latipes), zebrafish (Danio rerio), and stickleback (Gasterosteus aculeatus) [68]
| Reference species | | | | | | |
|---|---|---|---|---|---|---|
| Guppy | 29,671 | 46,634 | 65.6 | 46,953 | 29,337 | 55.2 |
| Medaka | 19,447 | 15,337 | 70.3 | 14,958 | 18,277 | 34.4 |
| Zebrafish | 25,013 | 26,871 | 51.0 | 26,870 | 24,370 | 45.8 |
| Stickleback | 18,037 | 13,807 | 73.9 | 13,554 | 17,024 | 28.8 |
The table lists the number of unique transcripts that map to the reference species' transcriptome, the number of unique transcripts that received hits in the reference species, and the percent coverage of the reference species' database.
Figure 2. Blast2GO assignment for 17,286 and 9,721 sequences. Significant differences in the frequency of different categories are highlighted with an asterisk. The numbers next to each colored slice of the pie chart represent the number of genes in the respective category.
Candidate genes with annotations from the SwissProt (SP) database that are involved in environmental stress tolerance, hypoxia, and sulfide metabolism
| Functional category | Protein | Contig | SwissProt accession | Similarity (%) | E-value | Number of contigs |
|---|---|---|---|---|---|---|
| General stress responses | | | | | | |
| Stress-activated protein kinase signaling pathways | Stress-activated protein kinase (JNK3) | comp49081_c0_seq1 | P53779 | 97 | 2.71E-63 | 1 |
| | Stress-activated protein kinase kinase (SAPKK4) | comp30986_c0_seq1 | O14733 | 96 | 2.43E-153 | 1 |
| Highly inducible heat shock proteins | Hsp 70 | comp24087_c0_seq1 | P27541 | 92 | 3.65E-103 | 1 |
| | | comp34171_c0_seq1 | Q91233 | 98 | 1.48E-25 | 2 |
| | Hsp A2 | comp24673_c0_seq1 | P54652 | 88 | 1.47E-54 | 1 |
| | Hsp A4 | comp12212_c0_seq1 | Q61316 | 91 | 3.38E-158 | 1 |
| | | comp7589_c0_seq1 | P34932 | 79 | <1.00E-180 | 2 |
| | Hsp 90 | comp48547_c0_seq1 | P07900 | 77 | 7.45E-61 | 2 |
| | | comp5586_c0_seq1 | Q4R4P1 | 76 | <1.00E-180 | 1 |
| | | comp32336_c0_seq1 | O61998 | 77 | 8.80E-33 | 3 |
| Constitutive heat shock proteins | Hsp83 | comp28290_c0_seq1 | P12861 | 95 | 1.42E-43 | 3 |
| | Hsc20 | comp17434_c0_seq1 | Q8K3A0 | 64 | 3.26E-60 | 1 |
| | Hsc70 | comp29081_c0_seq1 | Q9U639 | 80 | 3.69E-88 | 1 |
| | Hsc71 | comp7828_c0_seq1 | P08108 | 98 | <1.00E-180 | 4 |
| Small heat shock proteins | Hsp β8 | comp23233_c0_seq1 | Q9UJY1 | 67 | 6.80E-66 | 1 |
| Oxidative stress responses | | | | | | |
| Antioxidant systems | Catalase | comp3837_c0_seq1 | Q9PT92 | 93 | <1.00E-180 | 1 |
| | (Cu & Zn) superoxide dismutase | comp21322_c0_seq1 | Q751L8 | 61 | 2.29E-30 | 1 |
| | | comp496_c0_seq1 | O73872 | 87 | 3.87E-88 | 2 |
| | | comp17555_c0_seq1 | P82205 | 64 | 2.05E-33 | 1 |
| | | comp28438_c0_seq1 | Q9C0N4 | 46 | 2.83E-09 | 1 |
| | (Mn) superoxide dismutase | comp29573_c0_seq1 | P41978 | 57 | 6.78E-51 | 1 |
| | | comp1897_c0_seq1 | P07895 | 86 | 5.13E-131 | 1 |
| | Glutathione peroxidases | comp37533_c0_seq1 | Q4AEH3 | 61 | 6.22E-27 | 2 |
| | | comp18221_c0_seq1 | Q4RSM6 | 90 | 5.39E-121 | 1 |
| | | comp559_c0_seq1 | Q4AEI2 | 80 | 9.44E-87 | 1 |
| | | comp841_c0_seq1 | P00435 | 81 | 8.33E-86 | 1 |
| | Thioredoxin and glutathione reductase | comp2236_c0_seq1 | Q86VQ6 | 85 | <1.00E-180 | 1 |
| | Thioredoxin | comp26993_c0_seq1 | Q9BDJ3 | 72 | 4.09E-16 | 1 |
| | | comp572_c0_seq1 | Q9DGI3 | 90 | 7.86E-20 | 1 |
| Metallothioneins | Metal-responsive element-binding transcription factor 2 | comp17857_c0_seq1 | Q02395 | 70 | 3.28E-164 | 1 |
| Hypoxia-induced responses | | | | | | |
| Transcription factors | Hypoxia-inducible factor 1α | comp436_c1_seq1 | Q9YIB9 | 76 | 1.65E-110 | 1 |
| | | comp2126_c0_seq1 | Q98SW2 | 78 | <1.00E-180 | 3 |
| Oxygen transport | Erythropoietin | comp50800_c0_seq1 | Q5IGQ0 | 90 | 7.05E-47 | 1 |
| | Hemoglobin β chain | comp12875_c0_seq1 | P84652 | 87 | 1.01E-78 | 1 |
| | Myoglobin | comp390_c1_seq1 | Q9DGJ1 | 91 | 5.58E-72 | 1 |
| Aerobic/anaerobic metabolism | Malate dehydrogenase | comp11493_c0_seq1 | P11708 | 83 | 3.41E-37 | 1 |
| | Succinate dehydrogenase | comp898_c0_seq1 | Q7ZVF3 | 94 | <1.00E-180 | 2 |
| | Citrate synthase | comp44107_c0_seq1 | Q91V92 | 77 | 6.21E-39 | 1 |
| | Phosphoglycerate mutase | comp703_c0_seq1 | P18669 | 94 | 6.82E-163 | 1 |
| | Phosphoglycerate kinase | comp21645_c0_seq1 | Q60HD8 | 76 | 2.87E-163 | 1 |
| | α-enolase | comp324_c0_seq1 | Q9PVK2 | 96 | <1.00E-180 | 1 |
| | Lactate dehydrogenase (A chain, B chain, C chain) | comp4088_c0_seq1 | Q92055 | 99 | <1.00E-180 | 1 |
| | | comp192_c0_seq1 | P20373 | 98 | <1.00E-180 | 1 |
| | | comp17481_c0_seq1 | Q06176 | 97 | 7.12E-128 | 2 |
| | Glycogen phosphorylase | comp28770_c0_seq1 | Q9XTL9 | 79 | 2.43E-178 | 3 |
| Metabolic rate suppression | α-tropomyosin | comp1635_c0_seq2 | P84335 | 99 | 4.56E-56 | 2 |
| | | comp1488_c0_seq1 | P13105 | 96 | 1.61E-37 | 1 |
| | Myosin heavy chain | comp37734_c0_seq1 | Q63357 | 87 | 2.22E-41 | 1 |
| | Insulin-like growth factor binding protein 1 | comp12811_c0_seq1 | P24591 | 55 | 5.49E-43 | 1 |
| Sulfide detoxification | | | | | | |
| Sulfide metabolism and toxicity | Sulfide:quinone oxidoreductase | comp1919_c0_seq1 | Q9Y6N5 | 85 | <1.00E-180 | 1 |
| | Sulfite oxidase | comp25579_c0_seq1 | P07850 | 78 | 2.59E-93 | 2 |
| | Sulfur dioxygenase (ETHE1) | comp1681_c0_seq1 | Q9DCM0 | 78 | 3.54E-104 | 2 |
| | Thiosulfate sulfurtransferase (Rhodanese) | comp2624_c0_seq1 | Q8NFU3 | 62 | 2.19E-28 | 2 |
| | | comp12736_c0_seq1 | Q3U269 | 71 | 1.22E-176 | 1 |
| | Mercaptopyruvate sulfurtransferase | comp1051_c1_seq7 | P97532 | 74 | 1.81E-15 | 1 |
| | Cytochrome c oxidase complex (complex III subunit 6, subunit 3) | comp513_c0_seq3 | P07919 | 84 | 1.75E-18 | 1 |
| | | comp4_c0_seq1 | Q96133 | 93 | 3.16E-141 | 1 |
For each candidate gene, we report the SwissProt accession number, percent similarity, and E-value, as well as the total number of contigs that matched a specific database record.
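Some E-values in the table are reported as bounds ("<1.00E-180") rather than exact numbers, which matters when the hits are ranked or filtered programmatically. The sketch below shows one possible way to handle this, using three rows copied from the table. The helper name `parse_evalue` and the 1e-10 cutoff are illustrative assumptions, not part of the published analysis.

```python
def parse_evalue(text: str) -> float:
    """Convert a tabulated E-value to a float.

    A leading '<' (as in '<1.00E-180') is dropped, so the result is an
    upper bound on the true E-value rather than an exact value.
    """
    return float(text.strip().lstrip("<"))

# (contig, SwissProt accession, percent similarity, E-value) from the table.
hits = [
    ("comp49081_c0_seq1", "P53779", 97, "2.71E-63"),
    ("comp1919_c0_seq1", "Q9Y6N5", 85, "<1.00E-180"),
    ("comp28438_c0_seq1", "Q9C0N4", 46, "2.83E-09"),
]

# Rank from strongest (smallest E-value) to weakest match.
ranked = sorted(hits, key=lambda hit: parse_evalue(hit[3]))

# Hypothetical cutoff, chosen here only for illustration.
CUTOFF = 1e-10
weak = [hit[0] for hit in hits if parse_evalue(hit[3]) > CUTOFF]

for contig, accession, similarity, evalue in ranked:
    print(contig, accession, similarity, evalue)
print("above cutoff:", weak)
```

Treating the bounded values as upper bounds keeps the ordering conservative: such hits still rank ahead of any row with an exact E-value in the table.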
Figure 3. Plot of the log-fold change between the two drainages versus the log-concentration for each transcript. The most differentially expressed transcripts (P ≤ 0.01) are colored in red. The blue lines mark a log-fold change of 2 (base-2 logarithm), which corresponds to a fourfold change in expression.
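The relationship between the log-fold change threshold and the fold change in the Figure 3 caption can be made explicit. The sketch below assumes a base-2 logarithm, which is what a log-fold change of 2 equating to a fold change of 4 implies. The function names are illustrative, and applying the threshold symmetrically to up- and down-regulation is an assumption.

```python
def fold_change(log2_fc: float) -> float:
    """Convert a base-2 log-fold change to a plain fold change."""
    return 2.0 ** log2_fc

def exceeds_threshold(log2_fc: float, threshold: float = 2.0) -> bool:
    """True if the change is at least `threshold` log2 units in either
    direction (symmetric treatment is an assumption of this sketch)."""
    return abs(log2_fc) >= threshold

print(fold_change(2.0))         # up-regulated by a factor of four
print(fold_change(-2.0))        # down-regulated to one quarter
print(exceeds_threshold(1.5))   # inside the threshold lines
print(exceeds_threshold(-2.3))  # beyond the lower threshold line
```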