Georgii A Bazykin, Alexey S Kondrashov.
Abstract
BACKGROUND: Phylogenetic conservation at the DNA level is routinely used as evidence of molecular function, under the assumption that locations and sequences of functional DNA segments remain invariant in evolution. In particular, short DNA segments participating in initiation and regulation of transcription are often conserved between related species. However, transcription of a gene can evolve, and this evolution may involve changes of even such conservative DNA segments. Genes of yeast Saccharomyces have promoters of two classes, class 1 (TATA-containing) and class 2 (non-TATA-containing).
Year: 2006 PMID: 16472383 PMCID: PMC1457003 DOI: 10.1186/1471-2148-6-14
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Average per-nucleotide conservation of the TATA box and of the 10 nucleotides to its left and right. Conservation across all four non-cerevisiae species is pooled. Grey shading, TATA box; blue solid line, genes sensitive to mutations in DNA binding surface of TBP (N = 213); red dashed line, genes insensitive to mutations in DNA binding surface of TBP (N = 34).
Genes with class 1 (functional TATA box-containing) promoters in S. cerevisiae having orthologs which lack a TATA box in one or more other species of sensu stricto group.
| | | Presence of TATA box (a) | | | | | |
| ORF | Description | S. paradoxus | S. mikatae | S. kudriavtsevii | S. bayanus | Switch events (b) | Ancestral state (c) |
| YLR109W | Thiol-specific peroxiredoxin | + | + | + | - (2) | 1 | ? |
| YPL221W | Unknown function | + | + | + | - (2) | 1 | ? |
| YBR298C | Maltose permease | + | + | + | - (5) | 1 | ? |
| YBR147W | Hypothetical ORF | + | + | - (2) | + | 1 | TATA |
| YCL035C | Oxidoreductase | + | + | - (2) | + | 1 | TATA |
| YDR005C | Mod5 protein sorting, negative effector of Pol III synthesis. | + | GA | - (2) | + | 1 | TATA |
| YPR193C | Tetrameric histone acetyltransferase | + | + | - (2) | + | 1 | TATA |
| YDR533C | Possible chaperone and cysteine protease | + | + | - (3) | + | 1 | TATA |
| YMR315W | Hypothetical ORF | + | + | - (3) | GA | 1 | ? |
| YDR282C | Hypothetical ORF | + | - (2) | - (4) | + | 2 | ? |
| YKL216W | Catalyzes the conversion of dihydroorotic acid to orotic acid | + | - (2) | + | + | 1 | TATA |
| YNR033W | Para-aminobenzoate (PABA) synthase | + | - (2) | + | + | 1 | TATA |
| YOR186W | Hypothetical ORF | + | - (2) | + | + | 1 | TATA |
| YOL143C | Catalyzes synthesis of riboflavin | GA | - (3) | + | + | 1 | TATA |
| YPR119W | Involved in mitotic induction | + | - (2) | GA | + | ? | ? |
| YLR346C | Unknown function | - (gap) | GA | + | + | 1 | TATA |
| YPL269W | Karyogamy protein | - (gap) | - (gap) | - (gap) | - (gap) | 1 | non-TATA |
| Total | | 2 | 7 | 8 | 4 | | |
a For non-conserved TATA boxes, the minimal number of nucleotides different from the consensus sequence of TATA box (TATA(A/T)A(A/T)(A/G)) is shown in parentheses. "Gap" indicates an alignment gap at the site of the TATA box; "GA" means that ORF was not present in the given species.
b Parsimonious number of switch events in sensu stricto group; i.e., the minimum number of mutations within sensu stricto group necessary to produce this pattern. The number of events is unknown for ORF YPR119W due to lack of data for S. kudriavtsevii.
c Inferred state in last common ancestor. In cases when more than one equally parsimonious ancestral state was possible, it could not be inferred reliably (marked as '?').
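The parenthesized numbers defined in footnote a can be illustrated with a short sketch. It counts the positions at which an octanucleotide departs from the consensus TATA(A/T)A(A/T)(A/G). The function name `consensus_distance` and the example sequences are hypothetical and are not taken from the paper; the authors' actual scanning procedure is not described in this text.

```python
# Allowed nucleotides at each of the 8 positions of the consensus
# TATA(A/T)A(A/T)(A/G) given in footnote a.
CONSENSUS = ["T", "A", "T", "A", "AT", "A", "AT", "AG"]


def consensus_distance(octamer: str) -> int:
    """Count positions where the octamer differs from the TATA consensus.

    Hypothetical helper for illustration; 0 means a perfect consensus match.
    """
    octamer = octamer.upper()
    if len(octamer) != len(CONSENSUS):
        raise ValueError("expected an 8-nucleotide sequence")
    # A position is a mismatch when its base is not among the allowed ones.
    return sum(base not in allowed for base, allowed in zip(octamer, CONSENSUS))


# Made-up example sequences (not from the paper's alignments).
print(consensus_distance("TATAAAAA"))  # perfect match
print(consensus_distance("TAGACAAA"))  # mismatches at positions 3 and 5
```

Under this reading, an entry such as "- (2)" would correspond to an orthologous octanucleotide for which the count is 2.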
Figure 2. Switch of promoter type by [...]. Red, TATA consensus sequence; green, ATG start codon. S. cerevisiae carries the consensus TATA(A/T)A(A/T)(A/G) sequence at position -73 relative to the ATG start codon. The consensus is also conserved in S. paradoxus, S. kudriavtsevii and S. bayanus. In S. mikatae, at least two nucleotides are substituted, eliminating the TATA box.
Evolution of class 1 (TATA box-containing) promoters between S. cerevisiae and other species of sensu stricto group
| Species | Conserved ORFs | Switches of promoter type | Switches of promoter type per gene of this class per Ks (a) | TATA boxes shifted | TATA box shift events per TATA-containing gene per Ks (a) | Average conservation of upstream intergenic region |
| 1. TBP-sensitive, TATA-containing genes (N = 212) | | | | | | |
| S. paradoxus | 200 (94.3%) | 2 | 0.05 | 0 | 0.00 | 0.85 |
| S. mikatae | 180 (84.9%) | 7 | 0.13 | 0 | 0.00 | 0.74 |
| S. kudriavtsevii | 179 (84.4%) | 8 | 0.13 | 0 | 0.00 | 0.71 |
| S. bayanus | 178 (84.0%) | 4 | 0.06 | 2 | 0.03 | 0.66 |
| 2. Non-TBP-sensitive, TATA-containing genes (N = 34) | | | | | | |
| S. paradoxus | 30 (88.2%) | 8 | 1.40 | 0 | 0.00 | 0.79 |
| S. mikatae | 31 (91.2%) | 10 | 1.08 | 0 | 0.00 | 0.73 |
| S. kudriavtsevii | 26 (76.5%) | 14 | 1.58 | 0 | 0.00 | 0.65 |
| S. bayanus | 23 (67.6%) | 9 | 1.09 | 2 | 0.24 | 0.63 |
| 3. Non-TBP-sensitive, non-TATA-containing genes (N = 322) | | | | | | |
| S. paradoxus | 278 (86.3%) | 14 | 0.27 | - | - | 0.82 |
| S. mikatae | 241 (74.8%) | 22 | 0.30 | - | - | 0.71 |
| S. kudriavtsevii | 218 (67.7%) | 11 | 0.15 | - | - | 0.67 |
| S. bayanus | 238 (73.9%) | 9 | 0.11 | - | - | 0.64 |
a Number of events per time required for the accumulation of one nucleotide substitution at a non-coding site. The average number of substitutions per nucleotide site in intergenic regions, relative to S. cerevisiae, is 0.19 in S. paradoxus, 0.30 in S. mikatae, 0.34 in S. kudriavtsevii, and 0.36 in S. bayanus.
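The "per gene per Ks" columns appear to follow from dividing the number of events by the number of conserved ORFs and by the intergenic divergence given in footnote a. The sketch below works through that arithmetic. The formula is an inference from the tabulated values, not something stated in the text, and the function name is hypothetical. It also assumes that the four rows of each group follow the species order of the footnote (S. paradoxus first).

```python
# Substitutions per intergenic site relative to S. cerevisiae (footnote a).
KS = {
    "S. paradoxus": 0.19,
    "S. mikatae": 0.30,
    "S. kudriavtsevii": 0.34,
    "S. bayanus": 0.36,
}


def events_per_gene_per_ks(events: int, conserved_orfs: int, ks: float) -> float:
    """Assumed normalization: events / (conserved ORFs * Ks)."""
    return events / (conserved_orfs * ks)


# Group 1, first row: 2 switches among 200 conserved ORFs.
rate_group1 = events_per_gene_per_ks(2, 200, KS["S. paradoxus"])
# Group 2, first row: 8 switches among 30 conserved ORFs.
rate_group2 = events_per_gene_per_ks(8, 30, KS["S. paradoxus"])

print(f"{rate_group1:.2f}")  # 0.05
print(f"{rate_group2:.2f}")  # 1.40
```

Both results agree with the tabulated values (0.05 and 1.40), which supports the assumed normalization without confirming it.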
Figure 3. Genes with functional class 1 (TATA box-containing) promoters in [...]. Red, TATA consensus sequence; green, ATG start codon. In ORF YFR055W (hypothetical ORF), the distance in alignment between starts of TATA consensus sequences in S. cerevisiae and S. bayanus is 17 nucleotides. In ORF YBR145W (ADH5, alcohol dehydrogenase isoenzyme), the distance in alignment between starts of TATA consensus sequences in S. cerevisiae and S. bayanus is 19 nucleotides.
Genes with class 2 (non-TATA box-containing) promoters in S. cerevisiae having orthologs which have TATA box in one or more other species of sensu stricto group.
| | | | Presence of TATA box | | | | | |
| ORF | Description | Minimal distance from consensus (a) | S. paradoxus | S. mikatae | S. kudriavtsevii | S. bayanus | Switch events (b) | Ancestral state (c) |
| YDL139C | Suppressor of chromosome missegregation | 1 | + | - | - | - | | |
| YDR159W | Component of nuclear pore | 2 | + | - | - | - | 1 | non-TATA |
| YGL091C | | 1 | + | - | - | - | 1 | non-TATA |
| YIR002C | Helicase | 1 | + | - | - | - | 1 | non-TATA |
| YOL149W | Decapping enzyme | 1 | + | - | - | - | 1 | non-TATA |
| YOR125C | | 1 | + | - | - | - | 1 | non-TATA |
| YLR011W | | 1 | + | GA | - | - | 1 | non-TATA |
| YOR154W | Hypothetical ORF | 2 | + | GA | - | - | 1 | non-TATA |
| YKL207W | Hypothetical ORF | 1 | + | GA | GA | GA | 1 | non-TATA |
| YDL005C | RNA Polymerase II transcriptional regulation mediator | 4 | - | + | - | - | 1 | non-TATA |
| YDL207W | Polyadenylated-RNA-export factor | 1 | - | + | - | - | 1 | non-TATA |
| YDR459C | | 2 | - | + | - | - | 1 | non-TATA |
| YER099C | 5-phospho-ribosyl-1(alpha)-pyrophosphate synthetase | 2 | - | + | - | - | 1 | non-TATA |
| YIL002C | Phosphatidylinositol 4,5-bisphosphate 5-phosphatase, synaptojanin-like protein | 3 | - | + | - | - | 1 | non-TATA |
| YNL125C | | 1 | - | + | - | - | 1 | non-TATA |
| YOR201C | Ribose methyltransferase | 1 | - | + | - | - | 1 | non-TATA |
| YOR238W | Hypothetical ORF | 5 | - | + | - | - | 1 | non-TATA |
| YOR280C | Serine hydrolase | 1 | - | + | - | - | 1 | non-TATA |
| YPL034W | Hypothetical ORF | 2 | - | + | - | - | 1 | non-TATA |
| YPL047W | | 1 | - | + | - | - | 1 | non-TATA |
| YPL096W | De-N-glycosylation enzyme | 2 | - | + | - | - | 1 | non-TATA |
| YKL038W | Transcriptional activator | 4 | - | + | GA | - | 1 | non-TATA |
| YDR422C | Protein kinase complex component | 1 | - | + | - | GA | 2 | ? |
| YOR211C | | 3 | - | + | - | GA | 1 | non-TATA |
| YPL112C | | 2 | - | + | GA | GA | 1 | non-TATA |
| YKL012W | U1 snRNP protein involved in splicing | 1 | + | + | - | - | 2 | non-TATA |
| YGR134W | CCR4 Associated Factor | 1 | + | + | - | GA | 2 | non-TATA |
| YBL074C | Component of the U5 snRNP | 1 | - | - | + | - | 1 | non-TATA |
| YFR042W | | 2 | - | - | + | - | 1 | non-TATA |
| YML065W | Largest subunit of the origin recognition complex | 4 | - | - | + | - | 1 | non-TATA |
| YOR160W | | 1 | - | - | + | - | 1 | non-TATA |
| YOR228C | Hypothetical ORF | 6 | - | - | + | - | 1 | non-TATA |
| YPL091W | | 1 | - | - | + | - | 1 | non-TATA |
| YLR165C | | 2 | - | GA | + | - | 1 | non-TATA |
| YDR160W | Component of the SPS plasma membrane amino acid sensor system (Ssy1p-Ptr3p-Ssy5p) | 3 | GA | - | + | GA | 1 | non-TATA |
| YBR108W | Hypothetical ORF | 1 | - | - | - | + | 2 | ? |
| YHR105W | Hypothetical ORF | 2 | - | - | - | + | 1 | ? |
| YNL119W | | 4 | - | - | - | + | 1 | ? |
| YKR053C | Dihydrosphingosine 1-phosphate phosphatase | 1 | - | GA | GA | + | 1 | ? |
| YOL020W | Tryptophan permease | 1 | + | + | - | + | 2 | TATA |
| YMR169C | | 1 | - | + | GA | + | 2 | ? |
| YMR170C | | 1 | - | - | + | + | 1 | TATA |
| YPR073C | | 2 | + | + | + | + | 1 | TATA |
| YDL054C | | 5 | + | + | + | + | 1 | TATA |
| Total | | | 14 | 22 | 11 | 9 | | |
a The minimal distance from TATA consensus is the minimal number of nucleotides different from the consensus sequence (TATA(A/T)A(A/T)(A/G)) in the octanucleotide of S. cerevisiae which is orthologous to the TATA box in the non-cerevisiae species. "Gap" indicates an alignment gap at the site of the TATA box; "GA" means that ORF was not conserved in the given species.
b Parsimonious number of switch events in sensu stricto group; i.e., the minimum number of mutations within sensu stricto group necessary to produce this pattern.
c Inferred state in last common ancestor. In cases when more than one equally parsimonious ancestral state was possible, it could not be inferred reliably (marked as '?').
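Footnotes b and c describe a parsimony count of switch events and an inferred ancestral state. One common way to obtain both is Fitch parsimony on a fixed species tree, sketched below. This is an illustration under stated assumptions rather than the authors' procedure, which is not spelled out here. The tree topology ((((S. cerevisiae, S. paradoxus), S. mikatae), S. kudriavtsevii), S. bayanus) is an assumption not given in this text, species with no data ("GA") are simply skipped, and the names `TREE`, `fitch` and `infer` are hypothetical. A simple sketch like this need not reproduce every table entry.

```python
# Assumed sensu stricto topology (not stated in the text), as nested pairs.
TREE = (((("cer", "par"), "mik"), "kud"), "bay")


def fitch(node, states):
    """Return (possible states, minimum changes) for the subtree at node.

    A state set of None means that no species in the subtree has data.
    """
    if isinstance(node, str):
        state = states.get(node)
        return (None if state is None else {state}), 0
    left_set, left_cost = fitch(node[0], states)
    right_set, right_cost = fitch(node[1], states)
    cost = left_cost + right_cost
    if left_set is None:
        return right_set, cost
    if right_set is None:
        return left_set, cost
    common = left_set & right_set
    if common:
        return common, cost
    # Disjoint sets: one switch event is needed somewhere below this node.
    return left_set | right_set, cost + 1


def infer(states):
    """Return (switch events, ancestral state or '?') for one gene."""
    root_set, changes = fitch(TREE, states)
    if root_set is None or len(root_set) != 1:
        return changes, "?"
    return changes, next(iter(root_set))


# Presence pattern of YKL012W: non-TATA in S. cerevisiae, S. kudriavtsevii
# and S. bayanus; TATA in S. paradoxus and S. mikatae.
ykl012w = {"cer": "non-TATA", "par": "TATA", "mik": "TATA",
           "kud": "non-TATA", "bay": "non-TATA"}
print(infer(ykl012w))  # (2, 'non-TATA')
```

For this gene the sketch gives two switch events and a non-TATA ancestor, in agreement with the table. An ambiguous root, such as a gene with a TATA box in every species except S. bayanus, is reported as '?'.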