Eli Meyer, Galina V Aglyamova, Shi Wang, Jade Buchanan-Carter, David Abrego, John K Colbourne, Bette L Willis, Mikhail V Matz.
Abstract
BACKGROUND: New methods are needed for genomic-scale analysis of emerging model organisms that exemplify important biological questions but lack fully sequenced genomes. For example, there is an urgent need to understand the potential for corals to adapt to climate change, but few molecular resources are available for studying these processes in reef-building corals. To facilitate genomics studies in corals and other non-model systems, we describe methods for transcriptome sequencing using 454, as well as strategies for assembling a useful catalog of genes from the output. We have applied these methods to sequence the transcriptome of planulae larvae from the coral Acropora millepora.
Year: 2009 PMID: 19435504 PMCID: PMC2689275 DOI: 10.1186/1471-2164-10-219
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Diagram of cDNA synthesis and 454 library preparation procedures. Three fragment types are produced by sonication: 5', internal, and 3'. Ligation with the partially double-stranded adaptors A and B produces, for each fragment type, certain adaptor configurations that will be amplified (above the '+'), and others that will be suppressed (below the '+') during the subsequent amplification. The procedure preferentially amplifies constructs that are appropriate for 454 sequencing (shown inside box).
Summary of sequencing, assembly, and analysis
| | Sequences (n) | Bases (Mb) |
| Raw sequencing reads | 628,649 | 146.3 |
| Trimmed & size-selected | 599,248 | 133.6 |
| Contigs | 44,444 | 19.6 |
| Singletons | 62,657 | 13.6 |
| Total | 107,101 | 33.2 |
| Scaffolds | 104,005 | |
| Sequence clusters | 93,466 | |
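The arithmetic behind the summary table can be checked with a short sketch. The numbers below are copied from the table; the variable and function names are illustrative assumptions, and the mean lengths are only as precise as the rounded megabase totals allow.

```python
# Values copied from the summary table: (sequence count, total megabases).
summary = {
    "raw_reads": (628_649, 146.3),
    "trimmed_reads": (599_248, 133.6),
    "contigs": (44_444, 19.6),
    "singletons": (62_657, 13.6),
}


def mean_length_bp(count, megabases):
    """Average sequence length in bp from a count and a total in Mb."""
    return megabases * 1_000_000 / count


# The "Total" row should be the sum of contigs and singletons.
total_sequences = summary["contigs"][0] + summary["singletons"][0]
total_megabases = round(summary["contigs"][1] + summary["singletons"][1], 1)

# Approximate mean length for each row (limited by the rounded Mb values).
mean_lengths = {
    name: round(mean_length_bp(count, mb))
    for name, (count, mb) in summary.items()
}

# Fraction of raw reads that survived trimming and size selection.
retained_fraction = summary["trimmed_reads"][0] / summary["raw_reads"][0]

print(total_sequences, total_megabases)
print(mean_lengths)
print(f"{retained_fraction:.1%} of raw reads retained")
```

The contig and singleton rows add up to the reported total of 107,101 sequences and 33.2 Mb, which is a quick way to confirm that the table was transcribed consistently.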
Figure 2. Overview of . (A) Size distribution of 454 sequencing reads after removal of adaptor sequences and outliers. (B) Size distribution of assembled sequences after assembly and contig joining. Note the logarithmic y-axis. (C) Log-log plot showing the dependence of assembled sequence length on the number of reads assembled into each sequence. (D) Assembled sequences ranked from largest to smallest, with the cumulative percent of assembled bases (dashed line) and the total assembly length (solid line) calculated from those rankings. Sequence rank is shown in units of 10,000.
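The ranking described for panel D can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `cumulative_by_rank` and the toy lengths are hypothetical, and real input would be the lengths of the assembled sequences.

```python
from itertools import accumulate


def cumulative_by_rank(lengths):
    """Rank sequences from largest to smallest and accumulate their bases.

    Returns the ranked lengths, the running total of bases at each rank,
    and that running total expressed as a percent of all assembled bases.
    """
    if not lengths:
        return [], [], []
    ranked = sorted(lengths, reverse=True)
    running_total = list(accumulate(ranked))
    total_bases = running_total[-1]
    percent = [100.0 * bases / total_bases for bases in running_total]
    return ranked, running_total, percent


# Hypothetical sequence lengths in bp, for illustration only.
toy_lengths = [1200, 300, 800, 150, 550]
ranked, running_total, percent = cumulative_by_rank(toy_lengths)

for rank, (length, bases, pct) in enumerate(zip(ranked, running_total, percent), 1):
    print(rank, length, bases, f"{pct:.1f}%")
```

Plotting `running_total` and `percent` against rank would give curves of the kind the caption describes as the solid and dashed lines.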
Validation of singleton sequences and the contig joining procedure by PCR amplification and Sanger sequencing.
| Singleton ID | PCR | Aligned region (bp) | % identity |
| E60BDOM01CLWZX | + | 127 | 95% |
| E60BDOM01ALGIK | + | | |
| E60BDOM01BPLRZ | + | 154 | 94% |
| E60BDOM02F1SCS | + | 148 | 99% |
| E60BDOM01CV0LE | + | | |
| E60BDOM01ETTGL | + | 157 | 99% |
| E60BDOM01BOMBA | + | | |
| E60BDOM01AM229 | | | |
| E60BDOM01CMRH7 | | | |
| E60BDOM01E1RA8 | + | 127 | 98% |
| Scaffold ID | PCR | 5' match, bp | 3' match, bp |
| EZ002257 | + | 193 (98%) | |
| EZ000984 | + | 248 (99%) | 123 (99%) |
| EZ001217 | + | 722 (96%) | 374 (98%) |
| EZ001302 | + | 174 (97%) | 180 (99%) |
| EZ001324 | + | 361 (97%) | 183 (99%) |
| EZ002268 | + | 249 (99%) | 186 (99%) |
| EZ001475 | + | 323 (99%) | |
| EZ002219 | + | 715 (98%) | 218 (98%) |
| EZ000750 | + | 768 (96%) | 212 (99%) |
| EZ001662 | + | 133 (99%) | 55 (98%) |
Singleton sequences are identified by sequence identifiers for SRA003728, with the size (in bp) and quality (% identity) of the alignment between the 454 sequence and the Sanger sequence. Scaffolds are identified by accession numbers, with sequence identity shown for the 5'-most contig and the 3'-most contig contained in that scaffold.
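The validation results can be tallied with a small sketch. The values are copied from the table above; treating a blank cell as "no value reported" is an assumption about the table, and all variable names are illustrative.

```python
# Percent identity for singletons marked "+" in the PCR column.
# None marks rows whose alignment cells are blank in the table
# (assumed here to mean that no alignment was reported).
singleton_identity = {
    "E60BDOM01CLWZX": 95,
    "E60BDOM01ALGIK": None,
    "E60BDOM01BPLRZ": 94,
    "E60BDOM02F1SCS": 99,
    "E60BDOM01CV0LE": None,
    "E60BDOM01ETTGL": 99,
    "E60BDOM01BOMBA": None,
    "E60BDOM01E1RA8": 98,
}
# Singletons with no "+" in the PCR column.
singletons_without_product = ["E60BDOM01AM229", "E60BDOM01CMRH7"]

# Scaffolds: (5' match in bp, 3' match in bp); None marks a blank cell.
scaffold_matches = {
    "EZ002257": (193, None),
    "EZ000984": (248, 123),
    "EZ001217": (722, 374),
    "EZ001302": (174, 180),
    "EZ001324": (361, 183),
    "EZ002268": (249, 186),
    "EZ001475": (323, None),
    "EZ002219": (715, 218),
    "EZ000750": (768, 212),
    "EZ001662": (133, 55),
}

singletons_tested = len(singleton_identity) + len(singletons_without_product)
singletons_amplified = len(singleton_identity)
aligned = [pct for pct in singleton_identity.values() if pct is not None]
mean_identity = sum(aligned) / len(aligned)

scaffolds_both_ends = sum(
    1 for five, three in scaffold_matches.values()
    if five is not None and three is not None
)

print(f"{singletons_amplified}/{singletons_tested} singletons amplified")
print(f"{len(aligned)} aligned, mean identity {mean_identity:.1f}%")
print(f"{scaffolds_both_ends}/{len(scaffold_matches)} scaffolds matched at both ends")
```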
Summary of annotation of the A. millepora larval transcriptome.
| | All sequences | ≥ 300 bp | ≥ 1000 bp |
| Total number of sequences | 104,005 | 19,210 | 5,039 |
| Sequences with BLAST matches | 24,850 | 11,901 | 4,474 |
| Sequences matching known genes | 15,860 | 9,464 | 3,889 |
| Sequences assigned GO terms | 17,902 | 8,915 | 3,436 |
| Sequences with conserved domains | 12,785 | 8,144 | 3,573 |
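The annotation table is easier to compare across size classes when the counts are expressed as percentages of each class. The sketch below uses the numbers from the table; the dictionary keys and the function name `percent_annotated` are illustrative assumptions.

```python
# Counts copied from the annotation summary table.
annotation = {
    "all": {"total": 104_005, "blast": 24_850, "known_genes": 15_860,
            "go_terms": 17_902, "domains": 12_785},
    ">=300bp": {"total": 19_210, "blast": 11_901, "known_genes": 9_464,
                "go_terms": 8_915, "domains": 8_144},
    ">=1000bp": {"total": 5_039, "blast": 4_474, "known_genes": 3_889,
                 "go_terms": 3_436, "domains": 3_573},
}


def percent_annotated(table):
    """Convert each annotation count to a percent of its size class."""
    result = {}
    for size_class, counts in table.items():
        total = counts["total"]
        result[size_class] = {
            category: round(100.0 * n / total, 1)
            for category, n in counts.items()
            if category != "total"
        }
    return result


percentages = percent_annotated(annotation)
for size_class, row in percentages.items():
    print(size_class, row)
```

Expressed this way, the share of sequences with BLAST matches rises from roughly a quarter of all sequences to nearly nine in ten of the sequences of at least 1000 bp.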
Candidate genes identified based on GO annotation of A. millepora larval transcriptome.
| Process | GO Term | Sequences | Example gene (match accession) |
| Response to stress | 0006950 | 320 | Heat shock factor-binding protein (Q5RDI2) |
| Response to heat | 0009408 | 2 | 70 kDa Heat shock protein (P17879) |
| Response to oxidative stress | 0006979 | 42 | Catalase (Q9PWF7) |
| Response to wounding | 0009611 | 4 | Phospholipase A2-activating protein (P27612) |
| Apoptosis | 0006915 | 38 | Caspase (P70677) |
| Exocytosis | 0006887 | 7 | Exocyst complex component 5 (P97878) |
| Immune response | 0006955 | 58 | H-2 class II histocompatibility antigen (P04441) |
| Nitric oxide metabolism | 0046209 | 3 | Nitric oxide synthase (O19132) |
| Protein folding | 0006457 | 123 | 60 kDa heat shock protein (P18687) |
| Vacuolar transport and organization | 0007034, 0007033 | 5 | Vacuolar protein sorting-associated protein 26B-B (Q6DH23) |
Genes from essential metabolic pathways and macromolecular complexes annotated in larval transcriptome.
| Target | Genes found (n) | Known genes (n) |
| Pathways | | |
| Glycolysis | 10 | 10 |
| Gluconeogenesis | 10 | 10 |
| Pentose phosphate | 5 | 5 |
| Citrate cycle | 9 | 9 |
| Urea cycle | 5 | 5 |
| Complexes | | |
| 26S proteasome | 22 | 22 |
| Chaperonin (TCP1) | 8 | 8 |
| Spliceosome | 130 | 143 |
| Ribosome | 76 | 79 |
| Nuclear pore complex | 26 | 28 |
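Completeness for each pathway or complex is simply genes found divided by known genes. The sketch below applies that to the table; the names are illustrative assumptions, and the counts are taken as reported without checking for genes shared between pathways.

```python
# (genes found, known genes) copied from the table.
targets = {
    "Glycolysis": (10, 10),
    "Gluconeogenesis": (10, 10),
    "Pentose phosphate": (5, 5),
    "Citrate cycle": (9, 9),
    "Urea cycle": (5, 5),
    "26S proteasome": (22, 22),
    "Chaperonin (TCP1)": (8, 8),
    "Spliceosome": (130, 143),
    "Ribosome": (76, 79),
    "Nuclear pore complex": (26, 28),
}


def completeness(found, known):
    """Percent of the known genes that were found in the transcriptome."""
    return 100.0 * found / known


# Targets where every known gene was found.
complete = [name for name, (found, known) in targets.items() if found == known]

# Targets with missing genes, with their percent completeness.
incomplete = {
    name: round(completeness(found, known), 1)
    for name, (found, known) in targets.items()
    if found < known
}

print(f"{len(complete)} of {len(targets)} targets complete")
print(incomplete)
```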
Intercellular signaling pathway genes annotated in larval transcriptome.
| Pathway | Gene name | Sequences (n) |
| Hedgehog | Patched | 15 |
| | Fused | 1 |
| | Hedgehog | 6 |
| | Smoothened | 2 |
| JAK/STAT | Janus kinase | 4 |
| | Signal transducer and activator of transcription | 1 |
| NFKB/Toll | Nuclear factor NF-kappa-B | 3 |
| | Toll-interacting protein | 1 |
| | Toll-like receptor | 9 |
| NHR | Estrogen-related receptor | 1 |
| | Hepatocyte nuclear factor 4 | 1 |
| | Retinoid-related orphan receptors | 6 |
| Notch | Furin | 8 |
| | Delta | 12 |
| | Notch | 23 |
| | Presenilin | 4 |
| | TACE | 1 |
| RTK | Receptor tyrosine kinase | 5 |
| TGF-beta | Activin-like kinase | 0 |
| | SMAD | 15 |
| | TGF-beta-receptor | 3 |
| WNT | Disheveled | 2 |
| | Frizzled | 22 |
| | Wnt | 25 |
Major transcription factor families identified by conserved domain annotation of larval transcriptome
| Sequences (n) | Domain ID | Conserved domain description |
| 1 | pfam01722 | BolA-like protein |
| 0 | pfam00313 | Cold-shock DNA-binding domain |
| 1 | pfam01381 | Helix-turn-helix |
| 2 | pfam02229 | Transcriptional Coactivator p15 |
| 0 | pfam02671 | Paired amphipathic helix repeat |
| 1 | pfam02864 | Signal transducer and activator of transcription |
| 1 | pfam01167 | Tub family |
| 4 | pfam00046 | Homeobox domain |
| 1 | pfam00447 | HSF-type DNA-binding |
| 2 | pfam00870 | P53 DNA-binding domain |
| 6 | pfam02257 | RFX DNA-binding domain |
| 2 | pfam01422 | NF-X1-type zinc finger protein |
| 3 | pfam02319 | E2F/DP winged-helix DNA-binding domain |
| 1 | pfam00319 | SRF-type transcription factor |
| 1 | pfam00250 | Fork head |
| 12 | pfam07716, pfam00170 | Basic region leucine zipper & bZIP |
| 9 | pfam00010 | Helix-loop-helix DNA-binding domain |
| 3 | pfam00249 | Myb-like DNA-binding domain |
| 12 | pfam00642 | Zinc finger, CCCH type |
| 5 | pfam00096 | Zinc finger, C2H2 type |
Figure 3. Classification of single nucleotide polymorphisms (SNPs) identified from 454 sequences. Overall frequency of these SNP types in the larval transcriptome is one per 207 bp.
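As a rough worked example of what a frequency of one SNP per 207 bp implies for a sequence of a given length, the sketch below computes the expected SNP count and, under a Poisson model, the chance of seeing at least one. The Poisson model (SNPs independent and uniformly distributed) is an assumption added here for illustration and is not stated in the source; the function names are hypothetical.

```python
import math

# From the Figure 3 caption: one SNP per 207 bp on average.
SNP_SPACING_BP = 207


def expected_snps(length_bp):
    """Expected number of SNPs in a sequence of the given length."""
    return length_bp / SNP_SPACING_BP


def prob_at_least_one_snp(length_bp):
    """Chance of at least one SNP, ASSUMING a Poisson model in which
    SNPs occur independently at a uniform rate along the sequence."""
    return 1.0 - math.exp(-expected_snps(length_bp))


for length in (207, 300, 1000):
    print(length, round(expected_snps(length), 2),
          round(prob_at_least_one_snp(length), 3))
```

Real SNP density varies between genes and between coding and untranslated regions, so these numbers are best read as order-of-magnitude guides.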