Abderrahmane Tagmount, Mei Wang, Erika Lindquist, Yoshihiro Tanaka, Kristen S Teranishi, Shinichi Sunagawa, Mike Wong, Jonathon H Stillman.
Abstract
BACKGROUND: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustaceans is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. METHODOLOGY/PRINCIPAL
Year: 2010 PMID: 20174471 PMCID: PMC2824831 DOI: 10.1371/journal.pone.0009327
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Library generation information.
| Library Identifier | Sub-Library identifier | Sub-Library Tissue Type | Pooled tissue types | Sub-Library # clones | Sub-Library experimental treatment conditions |
| CAYC | 1 | Heart | | 768 | 1 |
| | 2 | | | 1,152 | |
| | 3 | Nerve | | 768 | |
| | 4 | Whole Crab | | 1,152 | |
| | 5 | | | 1,152 | |
| CAYF | 1 | Heart | | 384 | 1 |
| | 2 | | | 768 | |
| | 3 | | | 1,920 | |
| | 4 | Hepatopancreas | | 768 | |
| | 5 | | | 1,152 | |
| | 6 | | | 384 | |
| | 7 | Gill | | 768 | |
| | 8 | | | 1,536 | |
| | 9 | Claw muscle | | 384 | |
| | 10 | | | 768 | |
| CCAG | 1 | Pooled from multiple tissues and conditions | Heart, Gill, Whole crab “remains” after heart, gill, and hepatopancreas removed. | 47,616 | 2 |
| | | | Larvae, freshly molted whole crabs | | 3 |
| | | | Whole crabs | | 4 |
| | | | Heart | | 5 |
| | | | Heart, gill, muscle | | 6 |
| | | | Heart, gill, nerve, claw muscle | | 7 |
In CAYC + CAYF, many different cDNA libraries were constructed from seven tissue-specific pooled RNA isolates resulting from an array of experiments [19]. These libraries were non-normalized.
In CCAG there was one library made from an RNA sample that was pooled with equal quantity of RNA per treatment for each tissue. This library was normalized to increase EST diversity.
Treatment Conditions Descriptions:
1. Field collected crabs that were acclimatized across latitudinal (north-south) and seasonal (winter-summer) gradients, heat shock to 30°C, cold shock to 0°C, acclimated to 8°, 12°, 15°, 18°, 22° and 25°C for 2 to 60 days.
2. Heat 30°C, 4 h (2 h, 4 h, and 6 h 15°C recovery), Cold 2°C, 4 h (2 h, 4 h, and 6 h 15°C recovery), H2O2, 0.5 mM (18 h), CdCl2, 50 µM (24 h), Selenate, 50 µM (24 h), Selenite, 50 µM (24 h), Hypersalinity, 54‰ (18 h), Hyposalinity, 13‰ (18 h), Desiccation (24 h), Hypoxia, 2 h (20 min normoxia recovery), Starvation, 15 d (2 h postprandial), Insecticide, 1 spray Pyrethrin/200 ml, 5 min (4 h recovery).
3. Acclimated for 1–7 days in San Francisco Bay water (salinity 25–32‰)
4. Acclimated for 1 month to 8°, 15°, 18°, & 25°C
5. Field acclimatized, north-south, winter-summer
6. Acclimated for 1 month to 7°, 19°C
7. Acclimated for 1 month to a thermally fluctuating condition (8:18°C, 12 h:12 h)
Equal amounts of RNA were mixed from each Pooled tissue/treatment to make the pooled RNA sample used to make the CCAG normalized library.
EST sequencing statistics.
| Library | Attempted ESTs | Attempted Clones | Insertless Clones (%) | Contaminated Clones (%) | Clones Passing to Clustering (%) | ESTs Passing to Clustering (%) | Mean trimmed EST length (± 1 SD) |
| CAYC | 10,752 | 5,376 | 20 (0.4) | 421 (7.8) | 4,508 (83.8) | 7,727 (71.9) | 656.8±165.8 |
| CAYF | 16,896 | 8,448 | 180 (2.1) | 1,139 (13.5) | 6,702 (79.3) | 12,214 (72.3) | 512.5±170.2 |
| CCAG | 94,847 | 47,616 | 419 (0.9) | 209 (0.4) | 43,548 (91.5) | 77,865 (82.1) | 614.8±174 |
| CAYC+CAYF+CCAG | 122,495 | 61,440 | 619 (1.1) | 1,769 (7.3) | 54,758 (84.9) | 97,806 (79.8) | |
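The count columns of the combined row can be checked by summing the per-library rows. The Python sketch below does this for three of the columns and recomputes the percentage of attempted ESTs that passed to clustering. The dictionary layout and the function names are illustrative assumptions and are not part of the JGI pipeline.

```python
# Per-library counts copied from the EST sequencing statistics table.
LIBRARIES = {
    "CAYC": {"attempted_ests": 10_752, "attempted_clones": 5_376, "ests_passing": 7_727},
    "CAYF": {"attempted_ests": 16_896, "attempted_clones": 8_448, "ests_passing": 12_214},
    "CCAG": {"attempted_ests": 94_847, "attempted_clones": 47_616, "ests_passing": 77_865},
}


def pass_rate(stats):
    """Percentage of attempted ESTs that passed to clustering."""
    return 100.0 * stats["ests_passing"] / stats["attempted_ests"]


def combine(libraries):
    """Sum every count across libraries to rebuild the combined row."""
    totals = {}
    for stats in libraries.values():
        for key, value in stats.items():
            totals[key] = totals.get(key, 0) + value
    return totals


combined = combine(LIBRARIES)
for name, stats in {**LIBRARIES, "CAYC+CAYF+CCAG": combined}.items():
    print(f"{name}: {pass_rate(stats):.1f}% of attempted ESTs passed to clustering")
```

Only the count columns are summed here; the percentages in the other columns of the combined row are left as reported.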
Joint Genome Institute clustering statistics.
| Library | Number of clusters | Number of contigs | Clusters with one EST (%) | Clusters with one clone (%) | Clusters with > one clone (%) |
| CAYC | 2,583 | 3,461 | 689 (26.7) | 1,307 (50.6) | 587 (22.7) |
| CAYF | 2,629 | 3,058 | 553 (21.0) | 1,449 (55.1) | 627 (23.8) |
| CCAG | 16,309 | 26,625 | 3,703 (22.7) | 7,360 (45.1) | 5,246 (32.3) |
| CAYC+CAYF+CCAG | 19,312 | 30,764 | 4,504 (23.3) | 9,709 (50.3) | 6,001 (31.1) |
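One way to read the three cluster categories is as a partition: a cluster holds a single EST, several ESTs that all come from one clone, or ESTs from more than one clone. This reading is an assumption; it is consistent with the per-library rows, where the three counts sum to the number of clusters. The Python sketch below classifies clusters under that assumption, using hypothetical EST and clone identifiers.

```python
from collections import Counter


def classify_cluster(est_ids, clone_of):
    """Return the assumed table category for one cluster of EST identifiers."""
    if len(est_ids) == 1:
        return "one EST"
    clones = {clone_of[est] for est in est_ids}
    return "one clone" if len(clones) == 1 else "more than one clone"


# Hypothetical toy data: EST identifier -> clone identifier.
clone_of = {"e1": "cA", "e2": "cA", "e3": "cB", "e4": "cC", "e5": "cD"}
clusters = [["e1", "e2"], ["e3", "e4"], ["e5"]]

counts = Counter(classify_cluster(cluster, clone_of) for cluster in clusters)
for category, n in counts.items():
    print(f"{category}: {n} ({100.0 * n / len(clusters):.1f}%)")
```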
EST2uni assembly and clustering statistics.
| Library | ESTs passing quality (%) | UniSeq Type | Number (%) | Cluster of UniSeqs | Number (%) |
| CAYC | 7,710 (99.8) | Contig | 14,694 (52) | = 1 UniSeq | 20,554 (89) |
| CAYF | 12,197 (99.9) | Singleton | 13,693 (48) | >1 UniSeq | 2,458 (11) |
| CCAG | 77,741 (99.8) | Total | 28,333 | Total | 23,012 |
| CAYC+CAYF+CCAG | 97,648 (99.8) | | | | |
The starting set of ESTs was the ESTs Passing to Clustering from the JGI (Table 2).
UniSeq length distribution, based on EST2uni assembly.
| Sequence length from (bp) | Sequence length to (bp) | Number of UniSeqs |
| 101 | 548.8 | 5713 |
| 548.8 | 996.6 | 15830 |
| 996.6 | 1444.4 | 4604 |
| 1444.4 | 1892.2 | 1350 |
| 1892.2 | 2340 | 554 |
| 2340 | 2787.8 | 180 |
| 2787.8 | 3235.6 | 60 |
| 3235.6 | 3683.4 | 29 |
| 3683.4 | 4131.2 | 10 |
| 4131.2 | 4579 | 3 |
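The bin boundaries in this table are what ten equal-width bins between the shortest (101 bp) and longest (4579 bp) UniSeq would give. The Python sketch below rebuilds the boundaries and pairs them with the tabulated counts. The function name is hypothetical, and the equal-width scheme is inferred from the table rather than stated by the authors.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced bin boundaries from lo to hi."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 1) for i in range(n_bins + 1)]


edges = equal_width_edges(101, 4579, 10)

# Counts copied from the UniSeq length distribution table.
counts = [5713, 15830, 4604, 1350, 554, 180, 60, 29, 10, 3]

total = sum(counts)
for (start, end), n in zip(zip(edges, edges[1:]), counts):
    print(f"{start:>7} to {end:>7} bp: {n:>6} ({100.0 * n / total:.1f}%)")
```

The counts sum to 28,333, the total number of UniSeqs reported for the EST2uni assembly.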
Homology search summary.
| Homology Search Algorithm | Assembly (query) | Database (date of database searched) | # queries | # Strong Matches (≤1e-5) | Percentage strong match |
| BLASTx | Assembly #1,2 | GenBank_v159_aa.fasta (July, 2007) | 33,114 | 12,242 | 36.97% |
| BLASTx | Assembly #1,2 | uniprot_sprot_fasta (September, 2007) | 33,114 | 10,101 | 30.50% |
| BLASTx | Assembly #4 | uniprot_sprot_fasta (September, 2008) | 28,333 | 7,973 | 28.14% |
| InterProScan | Assembly #1,2 | blastprodom, fprintscan, pfam, pir, panther, tigr, smart, superfamily, gene3d, scanregexp, profilescan, seg, coils, tm, signalp, GO (October, 2007) | 33,114 | 8,641 | 26.10% |
| BLASTx | Assembly #3 | daphnia.filtered_models_v1.1.aa.fasta (January, 2008) | 30,764 | 2,997 | 9.74% |
| tBLASTx | Assembly #3 | FrozenGeneCatalog_2007_07_03.na.fasta (September, 2008) | 30,764 | 8,939 | 29.06% |
| Total | BLASTx, tBLASTx, and InterProScan | | 55,641 | 36,544 | 65.68% |
Clones with homology search match in at least one of the algorithms and databases used.
The number of hits with bitscores >40 was 7,737 (27.31%)
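The "strong match" columns amount to counting queries that have at least one hit at or below the 1e-5 cutoff and dividing by the number of queries. The Python sketch below illustrates that tally. It assumes the cutoff is a BLAST E-value and that hits are available in the 12-column tab-separated layout produced by BLAST+ `-outfmt 6`, where the E-value is the 11th column. Neither the file format nor the function names come from the source, and the sample hits are invented for illustration.

```python
import csv
import io


def strong_match_queries(tabular_text, max_evalue=1e-5):
    """Return the set of query ids with at least one hit at or below max_evalue."""
    strong = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            strong.add(query_id)
    return strong


def percent_strong(n_strong, n_queries):
    """Percentage of queries with a strong match."""
    return 100.0 * n_strong / n_queries


# Invented sample hits: qseqid, sseqid, pident, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore.
SAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ["q1", "sp|X1", "55.0", "120", "54", "0", "1", "360", "5", "124", "2e-30", "118"],
        ["q1", "sp|X2", "30.0", "80", "56", "1", "10", "250", "40", "119", "0.003", "38.1"],
        ["q2", "sp|X3", "28.0", "60", "43", "0", "4", "183", "77", "136", "1.2", "29.6"],
    ]
)

print(sorted(strong_match_queries(SAMPLE)))
print(f"{percent_strong(12_242, 33_114):.2f}%")  # first BLASTx row of the table
```

Counting distinct query ids rather than hit lines keeps a query with several strong hits from being counted more than once.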
Figure 1. Analysis of BLASTx hit start position for contigs of assembly #4 with unique strong hits.
Contigs (n = 3,993) that matched unique genes in the SwissProt database with bitscores >40 are plotted as the number of hits per hit amino acid start site (open symbols) and fractional total of hits (solid symbols).
Figure 2. Paired BLASTx homology search results between Petrolisthes cinctipes and either the SwissProt uniprot database or the unigene set of a range of species.
(A me = Apis mellifera, N ve = Nematostella vectensis, D me = Drosophila melanogaster, S pu = Strongylocentrotus purpuratus, D pu = Daphnia pulex, A ga = Anopheles gambiae, T ca = Tribolium castaneum, H sa = Homo sapiens).
Figure 3. Frequency distributions of condensed Gene Ontology terms.
Terms described for A) molecular function, B) biological process, and C) cellular compartment returned in our analyses of the Petrolisthes cinctipes transcriptome. A complete list and abundance of GO terms for each ontology is given in Table S1.
Selected subset of KEGG KAAS Pathway search results for pathways with high coverage.
| Pathway ID | Pathway Name | Number of KEGG enzymes in pathway | Number of pathway enzymes in our library | Percentage of pathway reactions covered by our library |
| 00010 | Glycolysis/Gluconeogenesis (and pyruvate fermentation) | 64 | 23 | 100% |
| 00020 | Citric acid cycle | 50 | 21 | 100% |
| 00030 | Pentose phosphate pathway | 42 | 12 | 93% |
| 00190 | Oxidative phosphorylation | 200 | 88 | 100% |
| 00071 | Fatty acid metabolism | 34 | 18 | 100% |
| 00230 | Purine metabolism | 203 | 46 | 70% |
| 00240 | Pyrimidine metabolism | 150 | 37 | 80% |
| 00280 | Valine, leucine and isoleucine degradation | 52 | 24 | 76% |
| 03010 | Ribosome | 147 | 78 | 75% |
| 03022 | Basal transcription factors | 26 | 16 | n/a |
| 03030 | DNA replication | 50 | 23 | 68% |
| 03420 | Nucleotide excision repair | 48 | 22 | 100% |
| 04010 | MAPK signaling pathway | 177 | 33 | 51% |
| 04020 | Calcium signaling pathway | 130 | 21 | 100% |
| 04120 | Ubiquitin mediated proteolysis | 116 | 45 | 100% |
| 04310 | Wnt signaling pathway | 100 | 18 | 28% |
| 04810 | Regulation of actin cytoskeleton | 133 | 35 | 41% |
| 04110 | Cell cycle | 91 | 29 | n/a |
| 04510 | Focal adhesion | 125 | 39 | 59% |
| 04720 | Long term potentiation | 42 | 14 | 83% |
Data on number of KEGG Enzymes by Gene in each pathway from http://www.genome.jp/dbget-bin/get_linkdb?pathwaymap00010
This reflects the percent coverage of the core pathway reactions, but not of all the side-entry or alternative metabolite routes into the pathway. Data are presented for pathways where counting the number of core reactions was possible. In the case of signaling pathways or multicomponent enzyme systems (e.g., E3 ubiquitin), the number of categories, but not of individual components, that our library covers is given. When prokaryotic and eukaryotic pathways are both presented in the KEGG pathway map, we have included only the eukaryotic set here. Due to the complexity of pathways 03022 and 04110, the percentage coverage of these KEGG maps has not been calculated.
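The reaction-coverage percentages in the last column are not the same quantity as the ratio of the two enzyme-count columns. The Python sketch below computes the enzyme-level ratio for three rows of the table, assuming the second count column gives the number of pathway enzymes recovered in the library. The names used are illustrative only.

```python
# (pathway name, KEGG enzymes in pathway, pathway enzymes counted for the library)
# Values copied from the table; the meaning of the second count is an assumption.
PATHWAYS = {
    "00010": ("Glycolysis/Gluconeogenesis", 64, 23),
    "00020": ("Citric acid cycle", 50, 21),
    "04310": ("Wnt signaling pathway", 100, 18),
}


def enzyme_fraction(total_enzymes, found_enzymes):
    """Percentage of a pathway's KEGG enzyme entries that were found."""
    return 100.0 * found_enzymes / total_enzymes


for pathway_id, (name, total, found) in PATHWAYS.items():
    print(f"{pathway_id} {name}: {enzyme_fraction(total, found):.1f}% of enzyme entries")
```

For glycolysis, for example, the enzyme-level ratio is well below the 100% reported for core reactions, because the KEGG enzyme count also includes side-entry and alternative routes that the core-reaction figure leaves out.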
Figure 4. Overlay of metabolic pathway map generated by the KEGG KAAS for both the Petrolisthes cinctipes and Daphnia pulex analyses.
Enzymes are represented by lines, and compounds are represented by dots. Coloration indicates that genes were found in both P. cinctipes and D. pulex (purple), only P. cinctipes (blue), only D. pulex (red), or in neither of the data sets (white).
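The color scheme of this overlay is a two-set membership test per enzyme. A minimal Python sketch of that rule follows; the enzyme identifiers are placeholders, not results from the KAAS analysis.

```python
def overlay_color(in_pcin, in_dpul):
    """Map presence in the two data sets to the color used in the overlay."""
    if in_pcin and in_dpul:
        return "purple"
    if in_pcin:
        return "blue"
    if in_dpul:
        return "red"
    return "white"


# Placeholder enzyme identifiers for illustration only.
pathway_enzymes = ["enzyme_A", "enzyme_B", "enzyme_C", "enzyme_D"]
pcin_enzymes = {"enzyme_A", "enzyme_B"}
dpul_enzymes = {"enzyme_B", "enzyme_C"}

colors = {
    enzyme: overlay_color(enzyme in pcin_enzymes, enzyme in dpul_enzymes)
    for enzyme in pathway_enzymes
}
print(colors)
```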
Figure 5. Venn diagram representations of 3-way reciprocal tBLASTx results between porcelain crabs, Daphnia and insects.
Comparisons were between unigene sets of Petrolisthes cinctipes (P. cin, blue), Daphnia pulex (D. pul, pink) and either (A) Drosophila melanogaster (D. mel, yellow) or (B) Apis mellifera (A. mel, yellow).
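The seven regions of a three-way Venn diagram can be derived with set operations once each unigene has been assigned a label that is shared across species. The Python sketch below assumes such labels already exist. How the reciprocal tBLASTx hits were grouped into shared genes is not described here, and the labels shown are hypothetical.

```python
def venn3_regions(a, b, c):
    """Sizes of the seven regions of a three-set Venn diagram."""
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_and_b_only": len((a & b) - c),
        "a_and_c_only": len((a & c) - b),
        "b_and_c_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


# Hypothetical shared-gene labels for each unigene set.
p_cin = {"g1", "g2", "g3", "g4"}
d_pul = {"g2", "g3", "g5"}
d_mel = {"g3", "g4", "g6"}

regions = venn3_regions(p_cin, d_pul, d_mel)
for region, size in regions.items():
    print(f"{region}: {size}")
```

The region sizes always sum to the size of the union of the three sets, which gives a quick consistency check on any diagram built this way.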