Till Bayer, Manuel Aranda, Shinichi Sunagawa, Lauren K Yum, Michael K Desalvo, Erika Lindquist, Mary Alice Coffroth, Christian R Voolstra, Mónica Medina.
Abstract
Dinoflagellates are unicellular algae that are ubiquitously abundant in aquatic environments. Species of the genus Symbiodinium form symbiotic relationships with reef-building corals and other marine invertebrates. Despite their ecological importance, little is known about the genetics of dinoflagellates in general and Symbiodinium in particular. Here, we used 454 sequencing to generate transcriptome data from two Symbiodinium species from different clades (clade A and clade B). With more than 56,000 assembled sequences per species, these data represent the largest transcriptomic resource for dinoflagellates to date. Our results corroborate previous observations that dinoflagellates possess the complete nucleosome machinery. We found a complete set of core histones as well as several H3 variants and H2A.Z in one species. Furthermore, transcriptome analysis points toward a low number of transcription factors in Symbiodinium spp. that also differ in the distribution of DNA-binding domains relative to other eukaryotes. In particular, the cold shock domain was predominant among transcription factors. Additionally, we found a high number of antioxidative genes in comparison to non-symbiotic but evolutionarily related organisms. These findings might be of relevance in the context of the role that Symbiodinium spp. play as coral symbionts. Our data represent the most comprehensive dinoflagellate EST data set to date. This study provides a comprehensive resource to further analyze the genetic makeup, metabolic capacities, and gene repertoire of Symbiodinium and dinoflagellates. Overall, our findings indicate that Symbiodinium possesses some unique characteristics; in particular, transcriptional regulation in Symbiodinium may differ from the currently known mechanisms of eukaryotic gene regulation.
Year: 2012 PMID: 22529998 PMCID: PMC3329448 DOI: 10.1371/journal.pone.0035269
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Overview of the sequencing data, assembly, clustering, and annotation statistics.
| | CassKB8 | Mf1.05b |
| No. of useable reads | 1,103,642 | 940,418 |
| Average read length | 401 | 365 |
| Total no. of bases | 443,465,967 | 343,473,807 |
| No. of contigs | 53,374 | 48,942 |
| No. of singlets | 18,778 | 27,342 |
| Total bases | 61,920,532 | 45,335,163 |
| Average contig length | 1,029 | 769 |
| Clusters (no. contigs and singlets) | 8,483 (22,959) | 11,407 (31,493) |
| Unclustered contigs and singlets | 49,193 | 44,791 |
| Total genes estimate | 57,676 | 56,198 |
| BLASTX (swissprot, trembl, nr) | 41.38% | 31.17% |
| KEGG/KAAS | 15.51% | 11.10% |
| InterProScan | 34.18% | 25.19% |
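The clustering rows of this table can be cross-checked against the assembly rows. The sketch below assumes that the total gene estimate is the number of clusters plus the number of unclustered sequences, and that the figure in parentheses is the number of contigs and singlets placed in clusters; the dictionary layout and key names are illustrative and not part of the original analysis.

```python
# Values copied from the table above. Key names are illustrative assumptions.
stats = {
    "CassKB8": {"contigs": 53_374, "singlets": 18_778, "clusters": 8_483,
                "clustered_members": 22_959, "unclustered": 49_193},
    "Mf1.05b": {"contigs": 48_942, "singlets": 27_342, "clusters": 11_407,
                "clustered_members": 31_493, "unclustered": 44_791},
}


def gene_estimate(s):
    """Count each cluster once, plus every unclustered contig or singlet."""
    return s["clusters"] + s["unclustered"]


def sequences_accounted_for(s):
    """Check that clustered plus unclustered sequences equal contigs plus singlets."""
    return s["clustered_members"] + s["unclustered"] == s["contigs"] + s["singlets"]


for strain, s in stats.items():
    print(strain, gene_estimate(s), sequences_accounted_for(s))
```

Under these assumptions the computed estimates match the "Total genes estimate" row for both strains, and every contig and singlet is accounted for.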
Annotation of pathways and complexes in the transcriptome data (values are numbers of genes, i.e. contigs and singlets clustered at 90% similarity).
| Pathway/complex | Known genes | Identified genes (CassKB8) | Identified genes (Mf1.05b) |
| Glycolysis | 10 | 10 | 10 |
| Pentose phosphate pathway | 7 | 7 | 6 |
| TCA cycle | 11 | 10 | 9 |
| Calvin Cycle | 11 | 11 | 11 |
| Proteasome | 33 | 31 | 25 |
| Spliceosome | 72 | 66 | 63 |
| Universal single copy genes | 40 | 38 | 38 |
COG0096 and COG0552 were not identified.
GC content in predicted coding regions of genes with BLASTX e-values < 1e−10.
| | CassKB8 | Mf1.05b |
| coding %GC | 56.41 | 50.57 |
| 3rd position %GC | 68.90 | 54.96 |
| Nc | 51.36 | 55.56 |
| No. of codons | 4,224,266 | 2,525,073 |
Nc = number of effectively used codons.
Figure 1. Nc and correspondence analysis of codon usage plots.
(A, B) Effective number of codons (Nc) plotted against third codon position GC content (GC3) in CassKB8 and Mf1.05b, respectively. The red points are the same genes as in C and D, respectively. The yellow line represents the neutral expectation for Nc. (C, D) Correspondence analysis of codon usage. The genes separated from the main cloud are marked red.
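The caption does not say how the neutral expectation curve was computed. A common choice, assumed here, is Wright's (1990) approximation, which gives the expected Nc when GC3 is the only source of codon bias: Nc = 2 + s + 29 / (s² + (1 − s)²), where s is GC3 as a fraction. The function names below are illustrative.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at the third position of each complete codon.

    Assumes `cds` is an in-frame coding sequence starting at codon position 1.
    """
    thirds = cds.upper()[2::3]
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def expected_nc(s: float) -> float:
    """Wright's approximation of Nc under GC3-driven bias only (assumed curve)."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("GC3 must be given as a fraction between 0 and 1")
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


# Mean 3rd-position %GC values from the GC content table, as fractions.
for strain, s in {"CassKB8": 0.6890, "Mf1.05b": 0.5496}.items():
    print(strain, round(expected_nc(s), 1))
```

Applied to the mean GC3 values, this gives roughly 53.4 for CassKB8 and 60.0 for Mf1.05b, compared with reported mean Nc values of 51.36 and 55.56. This is only a rough comparison, because the curve is meant to be evaluated per gene rather than on averages.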
Genes that are outliers in the correspondence analysis of codon usage (red points in Fig. 1).
| Gene | Location | CassKB8 | Mf1.05b |
| photosystem II protein D1 (psbA) | C | 12 | 20 |
| photosystem II CP47 protein (psbB) | C | 25 | 17 |
| cytochrome b6 (petB) | C | 3 | 10 |
| ATP synthase subunit alpha (atpA) | C | 1 | 9 |
| photosystem II CP43 protein (psbC) | C | 7 | 7 |
| ATP synthase subunit beta (atpB) | C | 5 | 6 |
| photosystem II protein D2 (psbD) | C | 1 | 5 |
| cytochrome b6/f complex subunit 4 (petD) | C | 5 | 3 |
| cytochrome oxidase subunit I (COX1) | M | 3 | 2 |
| Peptide-N(4)-(N-acetyl-beta-glucosaminyl)asparagine amidase | N | 0 | 1 |
| Histidinol-phosphate aminotransferase | N | 0 | 1 |
| Probable cysteine desulfurase | N | 0 | 1 |
| Ribosomal RNA small subunit methyltransferase B | N | 0 | 1 |
| Type I iodothyronine deiodinase | N | 0 | 1 |
| Ureide permease 1 | N | 0 | 1 |
| Ankyrin repeat and SAM domain-containing protein 6 | N | 1 | 0 |
| Collagen alpha-1(I) chain | N | 1 | 0 |
| Sensor protein degS | N | 1 | 0 |
The assumed cellular location is noted as follows: C, chloroplast minicircles; M, mitochondrion; N, nucleus. All genes were grouped according to their BLASTX annotation and the number of genes for each annotation is shown for both species. Genes with fewer than 100 analyzed codons were not included.
Comparison of histones and nucleosome-associated proteins from this and previous studies (DinoEST).
| | CassKB8 | Mf1.05b | DinoEST | Study |
| H2A | 3 | 3 | 2 | |
| H2A.X | 2 | 0 | 2 | Lin 2010 |
| H2A.Z | 1 | 3 | 0 | this study |
| H2B | 0 | 2 | Lin 2010 |
| H2B.2 | 1 | 0 | na | this study |
| H2B.4 | 1 | 0 | na | this study |
| H3 | 3 | 3 | Okamoto 2003 |
| H3.3 | 2 | 2 | na | this study |
| H3.4 | 1 | 0 | na | this study |
| H3-like CSE4 | 0 | 1 | na | this study |
| H4 | 3 | 1 | 1 | Lin 2010 |
| Histone acetyltransferases | 2 | 4 | 0 | this study |
| Histone deacetylation | 5 | 8 | 2 | Lin 2010 |
| Histone methylation | 9 | 15 | 1 | Lin 2008 |
| Histone demethylation | 5 | 5 | 0 | this study |
| Histone associated | 3 | 2 | 0 | this study |
| Nucleosome assembly | 2 | 3 | 1 | Lin 2010 |
| Chromatin remodeling | 11 | 9 | 0 | this study |
Subtype not specified.
Figure 2. Phylogenetic analysis of histone sequences.
H2A- and H3-like sequences from Symbiodinium sp. CassKB8, Mf1.05b, and other organisms were used to calculate phylogenetic trees. The trees were inferred from contigs and singlets with full-length amino acid sequences of (A) H2A and (B) H3-like genes using Maximum-Likelihood and Bayesian analysis. Bootstrap values and posterior probabilities are provided as ML/MB for nodes with support above 50% or 0.5. The singlet and contig names are provided for Symbiodinium sp. CassKB8 and Mf1.05b sequences (in bold); other taxa are shown as species name followed by GenBank accession number. The H2A tree was rooted on H2A.Bbd sequences, whereas the H3 tree was rooted on Homo sapiens H3.
Number of transcription factor domains found in Symbiodinium genes (based on 90% similarity clustering of contigs and singlets) and in all dinoflagellate ESTs available in GenBank dbEST.
| | CassKB8 | Mf1.05b | All dino ESTs from dbEST |
| No. of genes with transcription factor domain | 156 | 87 | 272 |
| Total no. of genes with Pfam annotation | 18,564 | 13,495 | 24,098 |
| % contigs with transcription factor domains of all Pfam annotated | 0.84 | 0.64 | 1.13 |
| Total no. of clusters | 57,676 | 56,198 | 92,308 |
| % contigs with transcription factor domains of all clusters/genes | 0.27 | 0.15 | 0.29 |
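The two percentage rows follow directly from the counts in the table. A minimal sketch of that arithmetic, with an illustrative function name and tuple layout, is:

```python
# (genes with a TF domain, genes with Pfam annotation, total clusters),
# copied from the table above.
rows = {
    "CassKB8": (156, 18_564, 57_676),
    "Mf1.05b": (87, 13_495, 56_198),
    "dbEST": (272, 24_098, 92_308),
}


def tf_percentages(tf_genes, pfam_genes, total_genes):
    """Return TF-domain genes as a percentage of Pfam-annotated and of all genes."""
    return 100.0 * tf_genes / pfam_genes, 100.0 * tf_genes / total_genes


for name, counts in rows.items():
    of_pfam, of_all = tf_percentages(*counts)
    print(f"{name}: {of_pfam:.2f}% of Pfam-annotated, {of_all:.2f}% of all")
```

Rounded to two decimals, these reproduce the percentages reported in the table.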
Figure 3. Transcription factor domain composition.
The relative fraction of the most abundant transcription factor domains in the Symbiodinium transcriptomes, all dinoflagellate ESTs from the NCBI dbEST database, and other eukaryotes. Searches were performed by using HMMER to search domain models for DNA binding domains, with an e-value cutoff of ≤ 1e−6. Domains which make up less than 5% were grouped in the ‘others’ category.
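The grouping rule in this caption can be sketched as follows. The function name and the example counts are hypothetical placeholders; they are not values from the study, and the HMMER search itself is not reproduced here.

```python
def domain_fractions(counts, threshold=0.05):
    """Convert per-domain hit counts to fractions, pooling rare domains.

    Domains whose share of all hits is below `threshold` are summed
    into a single 'others' entry, as described in the figure caption.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no domain hits to summarise")
    fractions = {}
    others = 0.0
    for name, n in counts.items():
        share = n / total
        if share < threshold:
            others += share
        else:
            fractions[name] = share
    if others > 0:
        fractions["others"] = others
    return fractions


# Hypothetical counts for illustration only.
example = {"domain_A": 60, "domain_B": 20, "domain_C": 15,
           "domain_D": 3, "domain_E": 2}
print(domain_fractions(example))
```

Here `domain_D` and `domain_E` each fall below 5% and are pooled, so the plotted categories would be the three larger domains plus 'others'.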
Comparison of the antioxidant gene repertoire between Arabidopsis thaliana, Physcomitrella patens, Symbiodinium sp. CassKB8, Symbiodinium sp. Mf1.05b, Thalassiosira pseudonana, and Phaeodactylum tricornutum based on Pfam domains associated with antioxidant function.
| Function | Type | PFAM | A. thaliana | P. patens | CassKB8 | Mf1.05b | T. pseudonana | P. tricornutum |
| Sod_Cu | Superoxide dismutase | PF00080.14 | 4 | 7 | 3 | 0 | 0 | 1 |
| Sod_Fe_N | Superoxide dismutase | PF00081.16 | 5 | 4 | 4 | 2 | 4 | 2 |
| Sod_Fe_C | Superoxide dismutase | PF02777.12 | 5 | 3 | 5 | 6 | 3 | 2 |
| Sod_Ni | Superoxide dismutase | PF09055.5 | 0 | 0 | 5 | 10 | 0 | 1 |
| Catalase | Catalase | PF00199.13 | 3 | 7 | 0 | 0 | 0 | 1 |
| Peroxidase | Peroxidase | PF00141.17 | 82 | 65 | 27 | 24 | 16 | 10 |
| GSHPx | Glutathione peroxidase | PF00255.13 | 9 | 4 | 5 | 1 | 2 | 3 |
| Thioredoxin | Thioredoxin | PF00085.14 | 79 | 70 | 106 | 73 | 55 | 41 |
| Glutaredoxin | Glutathione reductase | PF00462.18 | 52 | 28 | 29 | 17 | 13 | 11 |
| Ferritin | Ferritin | PF00210.18 | 6 | 4 | 2 | 0 | 0 | 1 |
| 1-cysPrx_C | Peroxiredoxin | PF10417.3 | 3 | 4 | 4 | 2 | 1 | 2 |
| Glutaredoxin2_C | Glutaredoxin2_C | PF04399.7 | 0 | 0 | 2 | 5 | 1 | 1 |
| AhpC-TSA | Alkylhydroperoxide reductase | PF00578.15 | 45 | 43 | 28 | 18 | 19 | 22 |
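One rough way to compare the repertoires is to sum the domain counts per organism. The sketch below assumes that the six value columns follow the order in which the organisms are listed in the table caption; that column order is an inference, not something the table states. The sums are counts of Pfam domain hits, not genes, since a single gene can carry more than one of these domains.

```python
# Assumed column order, taken from the order of organisms in the caption.
organisms = ["A. thaliana", "P. patens", "CassKB8", "Mf1.05b",
             "T. pseudonana", "P. tricornutum"]

# Pfam accession -> counts per organism, copied from the table above.
table = {
    "PF00080.14": [4, 7, 3, 0, 0, 1],
    "PF00081.16": [5, 4, 4, 2, 4, 2],
    "PF02777.12": [5, 3, 5, 6, 3, 2],
    "PF09055.5": [0, 0, 5, 10, 0, 1],
    "PF00199.13": [3, 7, 0, 0, 0, 1],
    "PF00141.17": [82, 65, 27, 24, 16, 10],
    "PF00255.13": [9, 4, 5, 1, 2, 3],
    "PF00085.14": [79, 70, 106, 73, 55, 41],
    "PF00462.18": [52, 28, 29, 17, 13, 11],
    "PF00210.18": [6, 4, 2, 0, 0, 1],
    "PF10417.3": [3, 4, 4, 2, 1, 2],
    "PF04399.7": [0, 0, 2, 5, 1, 1],
    "PF00578.15": [45, 43, 28, 18, 19, 22],
}

# Sum each column to get the total number of domain hits per organism.
totals = {org: sum(row[i] for row in table.values())
          for i, org in enumerate(organisms)}

for org, total in totals.items():
    print(org, total)
```

Under the assumed column order, both Symbiodinium columns sum higher than the two diatom columns, which is in line with the abstract's statement about antioxidative genes in non-symbiotic relatives.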