Stanislas Thiriet-Rupert, Grégory Carrier, Benoît Chénais, Camille Trottier, Gaël Bougaran, Jean-Paul Cadoret, Benoît Schoefs, Bruno Saint-Jean.
Abstract
BACKGROUND: Transcription factors (TFs) are key players in gene expression, and studying them is of great interest for investigating the evolutionary history of organisms through lineage-specific features. In this study we performed the first genome-wide TF identification and comparison between haptophytes and other algal lineages.
Keywords: Algae; Endosymbiotic gene transfer; Haptophytes; Prediction pipeline; Stramenopiles; Tisochrysis lutea; Transcription factors
Year: 2016 PMID: 27067009 PMCID: PMC4827209 DOI: 10.1186/s12864-016-2610-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Identification pipeline. The pipeline is divided into three steps. Step One uses two strategies: i) a similarity search with BLAST against a self-built, algae-based database of known TFs; ii) functional domain annotation with InterProScan and HMMER. The resulting protein list is the input to Step Two: the filtering of false positives according to specific parameters (see Methods). Step Three classifies the putative TFs retained in Step Two using a homemade Perl script, followed by manual curation for specific cases (see Methods)
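The three steps in Fig. 1 can be sketched in a few lines of Python. This is only an illustrative outline, not the authors' Perl implementation: the e-value cutoff, the domain-to-family map, the filtering rule and all function names are assumptions made for the example, since the actual parameters are given in the Methods and not here.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Candidate:
    protein_id: str
    blast_evalue: Optional[float]  # best BLAST hit vs. known TFs, None if no hit
    domains: FrozenSet[str]        # domain names from InterProScan/HMMER


# Hypothetical values for illustration only; not taken from the paper.
MAX_EVALUE = 1e-5
FAMILY_BY_DOMAIN: Dict[str, str] = {
    "bZIP_1": "bZIP",
    "HSF_DNA-bind": "HSF",
    "Myb_DNA-binding": "MYB",
}


def has_tf_domain(p: Candidate) -> bool:
    return any(d in FAMILY_BY_DOMAIN for d in p.domains)


def step_one(proteins: List[Candidate]) -> List[Candidate]:
    """Keep proteins flagged by either strategy (BLAST hit or TF domain)."""
    return [p for p in proteins if p.blast_evalue is not None or has_tf_domain(p)]


def step_two(candidates: List[Candidate]) -> List[Candidate]:
    """Filter likely false positives (assumed rule: domain or strong BLAST hit)."""
    return [
        p for p in candidates
        if has_tf_domain(p)
        or (p.blast_evalue is not None and p.blast_evalue <= MAX_EVALUE)
    ]


def step_three(tfs: List[Candidate]) -> Dict[str, str]:
    """Assign a family from the domains; ambiguous cases go to manual curation."""
    result = {}
    for p in tfs:
        families = sorted({FAMILY_BY_DOMAIN[d] for d in p.domains if d in FAMILY_BY_DOMAIN})
        result[p.protein_id] = families[0] if len(families) == 1 else "manual curation"
    return result


def run_pipeline(proteins: List[Candidate]) -> Dict[str, str]:
    return step_three(step_two(step_one(proteins)))
```

For example, a protein with a weak BLAST hit and no recognised domain passes Step One but is dropped in Step Two, while a protein with a single mapped domain is classified directly in Step Three.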
Evaluation of the pipeline accuracy for each TF family for plant TFs. A sensitivity value less than one means inclusion of false negatives, and a PPV value less than one means inclusion of false positives
| TF family | Sensitivity (this study) | PPV (this study) | Sensitivity (Riaño-Pachón et al., 2007) | PPV (Riaño-Pachón et al., 2007) |
|---|---|---|---|---|
| AP2/ERF | 169/169 = 1 | 169/169 = 1 | 0.99 | 1 |
| ARF | 37/37 = 1 | 37/37 = 1 | 0.91 | 0.95 |
| bZIP | 127/127 = 1 | 127/127 = 1 | 0.92 | 0.97 |
| C2C2-Dof | 47/47 = 1 | 47/47 = 1 | 0.97 | 0.97 |
| C2C2-GATA | 41/41 = 1 | 41/41 = 1 | 1 | 1 |
| GARP | 85/85 = 1 | 85/85 = 1 | NA | NA |
| GRAS | 37/37 = 1 | 37/37 = 1 | 0.97 | 0.97 |
| MADS | 145/146 = 0.99 | 145/145 = 1 | 0.92 | 0.95 |
| NAC | 138/138 = 1 | 138/138 = 1 | 1 | 0.99 |
| WRKY | 90/90 = 1 | 90/90 = 1 | 0.99 | 0.99 |
| bHLH | 225/225 = 1 | 224/225 = 0.99 | 0.80 | 0.92 |
Evaluation of the pipeline accuracy for each TF family for cyanobacterial TFs. A sensitivity value less than one means inclusion of false negatives, and a PPV value less than one means inclusion of false positives
| TF family | Sensitivity | PPV |
|---|---|---|
| arsR | 12/12 = 1 | 12/12 = 1 |
| Bac_DNA_binding | 6/6 = 1 | 6/6 = 1 |
| BolA | 3/3 = 1 | 3/3 = 1 |
| Crp | 15/15 = 1 | 15/17 = 0.88 |
| FUR | 9/9 = 1 | 9/9 = 1 |
| GerE | 34/34 = 1 | 34/34 = 1 |
| GntR | 5/5 = 1 | 5/6 = 0.83 |
| LysR | 15/15 = 1 | 15/15 = 1 |
| SfsA | 3/3 = 1 | 3/3 = 1 |
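The fractions in the two accuracy tables follow the standard definitions: sensitivity is TP / (TP + FN) and positive predictive value (PPV) is TP / (TP + FP). The short sketch below recomputes three table entries under that reading; the function names are illustrative, and the false-negative and false-positive counts are inferred from the fractions shown (for example, 145/146 implies one false negative).

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of known TFs that the pipeline recovers."""
    return true_pos / (true_pos + false_neg)


def ppv(true_pos: int, false_pos: int) -> float:
    """Fraction of predicted TFs that are genuine members of the family."""
    return true_pos / (true_pos + false_pos)


# MADS (plant set): 145 of 146 known members recovered, no false positives.
mads_sensitivity = round(sensitivity(145, 1), 2)  # 0.99
mads_ppv = round(ppv(145, 0), 2)                  # 1.0

# Crp (cyanobacterial set): all 15 recovered, 2 extra predictions.
crp_ppv = round(ppv(15, 2), 2)                    # 0.88
```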
Transcription factor families identified and their proportions in seven microalgae
| TF family | Subfamily | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| B3 | ABI3/VP1 | 1 (0.65) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| AP2/ERF | AP2 | 1 (0.65) | 1 (0.78) | 58 (12.13) | 0 (0) | 2 (2.15) | 0 (0) | 6 (2.83) |
| | ERF | 1 (0.65) | 6 (4.69) | 99 (20.71) | 2 (1.02) | 2 (2.15) | 0 (0) | 9 (4.25) |
| bHLH | | 0 (0) | 0 (0) | 0 (0) | 8 (4.08) | 3 (3.23) | 3 (1.51) | 8 (3.77) |
| bZIP | | 3 (1.94) | 3 (2.34) | 6 (1.26) | 25 (12.76) | 11 (11.83) | 21 (10.55) | 20 (9.43) |
| C2C2 | CO-like | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| | Dof | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| | GATA | 5 (3.23) | 1 (0.78) | 4 (0.84) | 0 (0) | 0 (0) | 2 (1.01) | 12 (5.66) |
| | LSD | 1 (0.65) | 1 (0.78) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| C2H2 | | 8 (5.16) | 8 (6.25) | 37 (7.74) | 4 (2.04) | 5 (5.38) | 60 (30.15) | 5 (2.36) |
| C3H | | 13 (8.39) | 7 (5.47) | 47 (9.83) | 11 (5.61) | 5 (5.38) | 8 (4.02) | 22 (10.38) |
| CCAAT | | 3 (1.94) | 0 (0) | 2 (0.42) | 3 (1.53) | 3 (3.23) | 3 (1.51) | 1 (0.47) |
| CPP | | 1 (0.65) | 0 (0) | 4 (0.84) | 5 (2.55) | 1 (1.08) | 2 (1.01) | 3 (1.42) |
| CSD | | 3 (1.94) | 4 (3.13) | 25 (5.23) | 5 (2.55) | 1 (1.08) | 3 (1.51) | 2 (0.94) |
| DBB | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.50) | 0 (0) |
| E2F/DP | | 2 (1.29) | 3 (2.34) | 3 (0.63) | 5 (2.55) | 1 (1.08) | 3 (1.51) | 3 (1.42) |
| Fungal TRF | | 14 (9.03) | 8 (6.25) | 27 (5.65) | 1 (0.51) | 10 (10.75) | 0 (0) | 0 (0) |
| GARP | G2-like | 4 (2.58) | 4 (3.13) | 5 (1.05) | 2 (1.02) | 0 (0) | 2 (1.01) | 4 (1.89) |
| | ARR-B | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| Homeobox | HB-other | 16 (10.32) | 14 (10.94) | 28 (5.86) | 0 (0) | 0 (0) | 2 (1.01) | 1 (0.47) |
| | TALE | 1 (0.65) | 1 (0.78) | 0 (0) | 4 (2.04) | 0 (0) | 9 (4.52) | 3 (1.42) |
| HSF | | 9 (5.81) | 8 (6.25) | 8 (1.67) | 67 (34.18) | 4 (4.30) | 1 (0.50) | 2 (0.94) |
| LIM | | 2 (1.29) | 3 (2.34) | 11 (2.30) | 0 (0) | 0 (0) | 2 (1.01) | 1 (0.47) |
| MADS-box | M-type | 3 (1.94) | 1 (0.78) | 1 (0.21) | 0 (0) | 0 (0) | 2 (1.01) | 2 (0.94) |
| mTERF | | 5 (3.23) | 0 (0) | 6 (1.26) | 5 (2.55) | 2 (2.15) | 5 (2.51) | 4 (1.89) |
| MYB | MYB (3R) | 1 (0.65) | 0 (0) | 3 (0.63) | 2 (1.02) | 5 (5.38) | 1 (0.50) | 1 (0.47) |
| | MYB (2R) | 25 (16.13) | 20 (15.63) | 39 (8.16) | 11 (5.61) | 8 (8.60) | 23 (11.56) | 10 (4.72) |
| | MYB-rel | 21 (13.55) | 15 (11.90) | 51 (10.69) | 7 (3.57) | 7 (7.53) | 7 (3.52) | 18 (8.65) |
| | MYB-SHAQKYF | 1 (0.65) | 2 (1.56) | 1 (0.21) | 7 (3.57) | 8 (8.60) | 16 (8.04) | 4 (1.89) |
| NF-X1 | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| NF-Y | NF-YA | 0 (0) | 1 (0.78) | 1 (0.21) | 1 (0.51) | 1 (1.08) | 1 (0.50) | 0 (0) |
| | NF-YB | 1 (0.65) | 1 (0.78) | 4 (0.84) | 2 (1.02) | 2 (2.15) | 3 (1.51) | 3 (1.42) |
| | NF-YC | 3 (1.94) | 4 (3.13) | 1 (0.21) | 8 (4.08) | 6 (6.45) | 6 (3.02) | 2 (0.94) |
| Nin-like | | 0 (0) | 1 (0.78) | 0 (0) | 0 (0) | 1 (1.08) | 4 (2.01) | 15 (7.08) |
| S1Fa-like | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| SBP | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 23 (10.85) |
| Sigma-70 | | 4 (2.58) | 4 (3.13) | 2 (0.42) | 8 (4.08) | 4 (4.30) | 8 (4.02) | 1 (0.47) |
| TUB | | 3 (1.94) | 7 (5.47) | 5 (1.05) | 3 (1.53) | 1 (1.08) | 0 (0) | 6 (2.83) |
| VARL | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 12 (5.66) |
| Whirly | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| WRKY | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.47) |
| Total | | 155 | 128 | 478 | 196 | 93 | 199 | 212 |
ERF Ethylene Response Factor, bHLH basic helix-loop-helix, bZIP basic leucine zipper, CSD Cold Shock Domain, DBB Double B-box, TRF Transcriptional Regulatory Factor, HSF Heat Shock Factor, mTERF mitochondrial transcription termination factor, SBP SQUAMOSA promoter binding protein, VARL Volvocine Algal RegA Like. Numbers in parentheses are the percentage of each family among the TFs of each species. For the total number of TFs, the number in parentheses is the percentage of the predicted proteome dedicated to TFs
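The per-family percentages in the table can be reproduced by dividing a family count by the species total in the last row. A minimal sketch, assuming the values are rounded to two decimals; the function name is illustrative:

```python
def family_percentage(count: int, species_total: int) -> float:
    """Share of one TF family among all TFs of a species, in percent."""
    return round(100 * count / species_total, 2)


# SBP in the last species column: 23 TFs out of a total of 212.
sbp_pct = family_percentage(23, 212)   # 10.85

# HSF in the fourth species column: 67 TFs out of a total of 196.
hsf_pct = family_percentage(67, 196)   # 34.18
```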
Fig. 2 Percentages of the predicted proteomes dedicated to transcription factors in the 7 algae
Fig. 3 Dendrogram representing the distribution of the four lineages according to the presence/absence of TF families. The green lineage is colored in green, stramenopiles in orange, red lineage in red and haptophytes in purple. The scale indicates distance measurement
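A dendrogram of this kind starts from pairwise distances between presence/absence profiles. The caption does not say which distance was used, so the sketch below assumes the Jaccard distance purely for illustration. The counts are five rows of the TF family table (first and last species columns); the variable and function names are hypothetical.

```python
from typing import Dict, Set


def present_families(counts: Dict[str, int]) -> Set[str]:
    """Reduce TF counts to a presence/absence profile."""
    return {family for family, n in counts.items() if n > 0}


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |intersection| / |union|; 0 means identical profiles."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


# Counts copied from the TF family table (first and last species columns).
first_column = {"B3": 1, "bHLH": 0, "Fungal TRF": 14, "HSF": 9, "SBP": 0}
last_column = {"B3": 1, "bHLH": 8, "Fungal TRF": 0, "HSF": 2, "SBP": 23}

distance = jaccard_distance(present_families(first_column),
                            present_families(last_column))
```

With these five families, two are shared out of five present in at least one species, giving a distance of 0.6. A full analysis would compute this for every species pair and pass the resulting matrix to a hierarchical clustering routine.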
Fig. 4 Heatmap showing the clustering of TF families according to their proportion in the algal genomes. Cluster 1 comprises TF families described as specific to the green lineage. Cluster 2 is composed of families with equivalent proportions across algal genomes. Cluster 3 is composed of families present in the 7 algae but in different proportions. Cluster 4 is composed of 3 families that are absent in stramenopiles
Fig. 5 Percentages of MYB-SHAQKYF among MYB-related TFs in algae
Number of cyanobacterial transcription factors (TFs) identified in the seven algae for each TF family
| TF family | | | | | | | |
|---|---|---|---|---|---|---|---|
| arsR | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| Bac_DNA_binding | 1 | 1 | 1 | 1 | 1 | 0 | 1 |
| BolA | 2 | 2 | 7 | 4 | 4 | 3 | 5 |
| GerE | 1 | 0 | 2 | 2 | 1 | 0 | 0 |
| LysR | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| SfsA | 1 | 3 | 2 | 0 | 1 | 2 | 2 |
BolA TFs were previously identified in the chlorophyte C. reinhardtii, the rhodophyte Cyanidioschyzon merolae, the diatom Thalassiosira pseudonana and the cryptophyte Guillardia theta [24].
Fig. 6 Expansion, gain and loss of TF families during the evolutionary history of microalgae