Jeremiah D Hackett, Todd E Scheetz, Hwan Su Yoon, Marcelo B Soares, Maria F Bonaldo, Thomas L Casavant, Debashish Bhattacharya.
Abstract
BACKGROUND: Dinoflagellates are important marine primary producers and grazers and cause toxic "red tides". These taxa are characterized by many unique features such as immense genomes, the absence of nucleosomes, and photosynthetic organelles (plastids) that have been gained and lost multiple times. We generated EST sequences from non-normalized and normalized cDNA libraries from a culture of the toxic species Alexandrium tamarense to elucidate dinoflagellate evolution. Previous analyses of these data have clarified plastid origin and here we study the gene content, annotate the ESTs, and analyze the genes that are putatively involved in DNA packaging.
Year: 2005 PMID: 15921535 PMCID: PMC1173104 DOI: 10.1186/1471-2164-6-80
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Cluster size and frequency of the A. tamarense ESTs.
| Cluster Size | Frequency | Cluster Size | Frequency | Best BLAST hit(s) |
| 1 | 4618 | 14 | 7 | |
| 2 | 1249 | 15 | 1 | unknown |
| 3 | 427 | 16 | 1 | HSP90 |
| 4 | 176 | 17 | 4 | peridinin-chl a protein, Cytochrome C6, EF1-alpha, unknown |
| 5 | 81 | 18 | 1 | ATP synthase C chain |
| 6 | 44 | 19 | 2 | Form II Rubisco, unknown putative dino. specific protein |
| 7 | 32 | 21 | 1 | fucoxanthin chlorophyll a/c binding protein like |
| 8 | 15 | 22 | 1 | Unknown putative plastid protein |
| 9 | 21 | 23 | 1 | Unknown |
| 10 | 13 | 24 | 3 | peridinin-chlorophyll a protein, ATP synthase C chain, unknown |
| 11 | 10 | 29 | 1 | luciferin-binding protein |
| 12 | 7 | 46 | 1 | histone-like protein/basic nuclear protein |
| 13 | 6 | | | |
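The cluster table can be summarized with a little arithmetic. The sketch below assumes that each frequency is the number of clusters of that size, so that summing frequencies gives the number of clusters and summing size × frequency gives the number of ESTs. The names `CLUSTERS` and `summarize` are illustrative and do not come from the source.

```python
# (cluster size, frequency) pairs copied from both halves of the table above.
CLUSTERS = [
    (1, 4618), (2, 1249), (3, 427), (4, 176), (5, 81), (6, 44), (7, 32),
    (8, 15), (9, 21), (10, 13), (11, 10), (12, 7), (13, 6),
    (14, 7), (15, 1), (16, 1), (17, 4), (18, 1), (19, 2), (21, 1),
    (22, 1), (23, 1), (24, 3), (29, 1), (46, 1),
]


def summarize(clusters):
    """Return (number of clusters, number of ESTs) for a size/frequency table."""
    n_clusters = sum(freq for _, freq in clusters)
    n_ests = sum(size * freq for size, freq in clusters)
    return n_clusters, n_ests


N_CLUSTERS, N_ESTS = summarize(CLUSTERS)
# Singletons are the clusters of size 1 (first row of the table).
SINGLETON_FRACTION = CLUSTERS[0][1] / N_CLUSTERS

print(N_CLUSTERS, N_ESTS, round(SINGLETON_FRACTION, 3))
```

Under that assumption the table sums to 6,723 clusters covering 11,171 ESTs, with singletons making up roughly two thirds of the clusters.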
Figure 1. Putative dinoflagellate-specific proteins. Amino acid sequence alignments of putative dinoflagellate-specific proteins. A) Putative plastid protein that was highly represented in the A. tamarense cDNA library (cluster size = 22). A. tamarense sequences 1, 2, and 3 correspond to clones GC1-aba-e-13, GC1-abh-e14, and GC1-abd-o-22, respectively, and are aligned with highly similar ESTs from the dinoflagellates L. polyedrum (CD809498) and P. lunula (BU582532). The boxed region indicates a possible plastid-targeting sequence. B) Putative dinoflagellate-specific protein with significant BLAST hits only to other dinoflagellate ESTs. The Alexandrium sequence corresponds to clone UI-D-GC1-abh-f-23-0-UI.
Codon Usage in the A. tamarense ESTs.
| Codon (amino acid) | Count | % of synonymous codons | Codon (amino acid) | Count | % of synonymous codons | Codon (amino acid) | Count | % of synonymous codons | Codon (amino acid) | Count | % of synonymous codons |
| TTT F | 703 | 23.1% | TCT S | 482 | 10.0% | TAT Y | 372 | 18.8% | TGT C | 251 | 15.6% |
| TTC F | 2335 | 76.9% | TCC S | 1348 | 27.9% | TAC Y | 1612 | 81.2% | TGC C | 1356 | 84.4% |
| TTA L | 61 | 0.9% | TCA S | 413 | 8.6% | TAA * | 29 | 5.6% | TGA * | 411 | 79.8% |
| TTG L | 1118 | 15.7% | TCG S | 926 | 19.2% | TAG * | 75 | 14.6% | TGG W | 1051 | 100.0% |
| CTT L | 902 | 12.7% | CCT P | 751 | 17.6% | CAT H | 464 | 25.7% | CGT R | 475 | 9.6% |
| CTC L | 2296 | 32.3% | CCC P | 1382 | 32.4% | CAC H | 1340 | 74.1% | CGC R | 1779 | 35.8% |
| CTA L | 139 | 2.0% | CCA P | 829 | 19.4% | CAA Q | 433 | 14.6% | CGA R | 426 | 8.6% |
| CTG L | 2596 | 36.5% | CCG P | 1307 | 30.6% | CAG Q | 2535 | 85.4% | CGG R | 1128 | 22.7% |
| ATT I | 715 | 19.1% | ACT T | 542 | 13.1% | AAT N | 508 | 21.0% | AGT S | 344 | 7.1% |
| ATC I | 2770 | 74.1% | ACC T | 1442 | 34.9% | AAC N | 1915 | 79.0% | AGC S | 1310 | 27.2% |
| ATA I | 253 | 6.8% | ACA T | 638 | 15.4% | AAA K | 415 | 8.5% | AGA R | 253 | 5.1% |
| ATG M | 2096 | 100.0% | ACG T | 1510 | 36.5% | AAG K | 4485 | 91.5% | AGG R | 910 | 18.3% |
| GTT V | 686 | 11.2% | GCT A | 1195 | 15.2% | GAT D | 1117 | 24.9% | GGT G | 943 | 13.8% |
| GTC V | 2214 | 37.8% | GCC A | 2899 | 36.8% | GAC D | 3371 | 75.1% | GGC G | 3957 | 58.1% |
| GTA V | 268 | 4.6% | GCA A | 1559 | 19.8% | GAA E | 750 | 13.8% | GGA G | 767 | 11.3% |
| GTG V | 2694 | 46.0% | GCG A | 2218 | 28.2% | GAG E | 4682 | 86.2% | GGG G | 1142 | 16.8% |
Analysis is of 515 proteins (81,893 codons). Third position nucleotide usage was T = 12.8%, A = 9.3%, C = 40.7%, G = 37.2%. The asterisk (*) indicates a stop codon.
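The percentages in the codon usage table appear to be each codon's share among the codons for the same amino acid (or among stop codons), since they sum to about 100% within each group. The minimal sketch below checks this for two groups, phenylalanine and the stop codons, using counts copied from the table. The names `COUNTS` and `synonymous_percentages` are hypothetical, and this is an illustration of the arithmetic rather than the authors' method.

```python
from collections import defaultdict

# codon -> (amino acid or "*" for stop, count), copied from the table above.
COUNTS = {
    "TTT": ("F", 703),
    "TTC": ("F", 2335),
    "TAA": ("*", 29),
    "TAG": ("*", 75),
    "TGA": ("*", 411),
}


def synonymous_percentages(counts):
    """Return each codon's percentage within its synonymous group."""
    totals = defaultdict(int)
    for aa, n in counts.values():
        totals[aa] += n
    return {
        codon: round(100.0 * n / totals[aa], 1)
        for codon, (aa, n) in counts.items()
    }


PERCENTAGES = synonymous_percentages(COUNTS)
STOP_TOTAL = sum(n for aa, n in COUNTS.values() if aa == "*")

for codon, pct in PERCENTAGES.items():
    print(codon, pct)
```

The three stop codon counts sum to 515, which agrees with the 515 proteins stated in the table note (one stop codon per protein), and the computed shares reproduce the tabulated 23.1%/76.9% for phenylalanine and 5.6%/14.6%/79.8% for the stop codons.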
Figure 2. GO category assignment of A. tamarense ESTs. Classification of 1,203 A. tamarense ESTs into the GO categories.
Top 20 A. tamarense EST BLAST hits against the genome of the apicomplexan P. falciparum.
| EST | E-value | GI number | Best hit |
| UI-D-GC1-aao-m-13-0-UI | 1.00E-112 | 23613558 | α-tubulin |
| UI-D-GC1-aav-f-09-0-UI | 6.00E-86 | 23508137 | flavoprotein subunit of succinate dehydrogenase |
| UI-D-GC1-aad-d-15-0-UI | 9.00E-86 | 23509363 | serine/threonine protein phosphatase |
| UI-D-GC0-aae-b-08-0-UI | 2.00E-85 | 23509135 | actin |
| UI-D-GC1-aaz-h-12-0-UI | 3.00E-85 | 23507885 | 26S proteasome regulatory subunit 4 |
| UI-D-GC1-abe-o-23-0-UI | 8.00E-85 | 23510155 | bifunctional dihydrofolate reductase-thymidylate synthase |
| UI-D-GC1-abh-e-16-0-UI | 1.00E-84 | 23612827 | hsp70 |
| UI-D-GC0-aae-p-02-0-UI | 2.00E-84 | 23613232 | adenosylhomocysteinase |
| UI-D-GC1-aay-i-10-0-UI | 3.00E-82 | 16804988 | helicase |
| UI-D-GC1-aau-b-16-0-UI | 1.00E-80 | 23509325 | eukaryotic translation initiation factor 2 gamma subunit |
| UI-D-GC0-aae-h-03-0-UI | 8.00E-78 | 23509820 | glyceraldehyde-3-phosphate dehydrogenase |
| UI-D-GC1-aao-o-20-0-UI | 4.00E-77 | 23508006 | ADP ribosylation factor 1 |
| UI-D-GC0-aae-f-01-0-UI | 1.00E-76 | 23509545 | calmodulin |
| UI-D-GC1-abb-n-18-0-UI | 2.00E-76 | 23510206 | eukaryotic initiation factor |
| UI-D-GC1-abf-g-07-0-UI | 4.00E-75 | 23612467 | HSP86 |
| UI-D-GC1-abd-m-07-0-UI | 2.00E-74 | 23612587 | 40S ribosomal protein S5 |
| UI-D-GC0-aae-b-08-0-UI | 3.00E-74 | 23509345 | actin II |
| UI-D-GC1-aab-m-24-0-UI | 4.00E-74 | 23509670 | ribosomal protein S2 |
| UI-D-GC1-aar-f-11-0-UI | 3.00E-72 | 23509852 | protein serine/threonine phosphatase |
| UI-D-GC1-aao-b-16-0-UI | 1.00E-69 | 23509877 | RNA helicase 1 |
Top 20 hits of the A. tamarense ESTs to the GenBank nr database.
| EST | E-value | GI number | Best hit |
| UI-D-GC1-abg-i-22-0-UI | 1.00E-110 | 845405 | ribulose 1,5-bisphosphate carboxylase |
| UI-D-GC1-aao-m-13-0-UI | 1.00E-109 | 135433 | alpha tubulin |
| UI-D-GC1-abh-e-16-0-UI | 2.00E-98 | 20143982 | hsp70 |
| UI-D-GC1-abe-o-23-0-UI | 1.00E-96 | 1169423 | bifunctional dihydrofolate reductase-thymidylate synthase |
| UI-D-GC0-aae-p-02-0-UI | 1.00E-91 | 4416330 | S-adenosyl-homocysteine hydrolase like protein |
| UI-D-GC0-aae-h-11-0-UI | 2.00E-91 | 21913167 | oxygen evolving enhancer 1 precursor |
| UI-D-GC1-abh-d-23-0-UI | 4.00E-91 | 32307578 | glutamate 1-semialdehyde 2,1-aminomutase |
| UI-D-GC1-abe-e-15-0-UI | 1.00E-89 | 27450753 | proliferating cell nuclear antigen |
| UI-D-GC1-abb-n-18-0-UI | 3.00E-88 | 28277876 | Similar to DEAD box polypeptide 48 |
| UI-D-GC1-aav-f-09-0-UI | 3.00E-87 | 15240075 | succinate dehydrogenase flavoprotein subunit, mitochondrial |
| UI-D-GC1-abc-o-16-0-UI | 8.00E-85 | 13560096 | ALA dehydratase |
| UI-D-GC1-aao-o-20-0-UI | 1.00E-83 | 7025460 | ADP ribosylation factor 1 |
| UI-D-GC0-aae-b-23-0-UI | 5.00E-83 | 1076185 | luciferin-binding protein |
| UI-D-GC1-aay-i-10-0-UI | 9.00E-83 | 18416493 | DEAD/DEAH box helicase, putative |
| UI-D-GC1-aau-b-16-0-UI | 1.00E-82 | 4503507 | eukaryotic translation initiation factor 2, subunit 3 gamma |
| UI-D-GC1-aad-d-15-0-UI | 5.00E-81 | 1346753 | Serine/threonine protein phosphatase PP1 isozyme 2 |
| UI-D-GC1-aaz-h-12-0-UI | 1.00E-77 | 23507885 | 26S proteasome regulatory subunit 4, putative |
| UI-D-GC1-abc-m-19-0-UI | 1.00E-77 | 32307576 | geranyl-geranyl reductase |
| UI-D-GC1-abj-e-13-0-UI | 2.00E-76 | 4033509 | Calmodulin |
| UI-D-GC1-abd-m-07-0-UI | 3.00E-75 | 6831665 | 40S ribosomal protein S5 |
Figure 3. Analyses of A. tamarense H2A.X. A) Alignment of A. tamarense H2A.X with eukaryotic homologs. The alignment is shaded according to the level of conservation. The symbols above the alignment indicate the location of functional residues (T = trypsin cleavage site, ^ = arginines that contact the DNA helix, * = H2A-H2B interaction sites, U = ubiquitination site). The annotation below the alignment indicates conserved structural features including the α-helices, loops, and the SQ(E/D)Φ motif. B) An ML tree of H2A and H2A.X. The numbers above and below the branches are the results of ML and NJ bootstrap analyses, respectively. The thick branches indicate > 0.95 posterior probability from Bayesian inference. Only bootstrap values ≥ 50% are shown. Branch lengths are proportional to the number of substitutions per site (see scale bar).
Figure 4. Analysis of dinoflagellate HLPs. A) HLPs from dinoflagellates (red taxon names) and bacteria (blue) and HU proteins from bacteria (black). B) The predicted secondary structure of HLPs from A. tamarense and B. pertussis aligned with the known secondary structure of E. coli HU. Curled lines indicate α-helices and jagged lines indicate β-strands. The arrow indicates the position of a conserved lysine. The asterisk indicates the proline that intercalates into the DNA in HU proteins. C) An ML tree of HU and HLP proteins from bacteria and eukaryotes. The numbers above and below the branches result from ML and NJ bootstrap analyses, respectively. The thick branches indicate > 0.95 posterior probability from Bayesian inference. Only bootstrap values ≥ 50% are shown. Branch lengths are proportional to the number of substitutions per site (see scale bar).