Vinsensius B Vega, Chin-Yo Lin, Koon Siew Lai, Say Li Kong, Min Xie, Xiaodi Su, Huey Fang Teh, Jane S Thomsen, Ai Li Yeo, Wing Kin Sung, Guillaume Bourque, Edison T Liu.
Abstract
BACKGROUND: Transcription factor binding sites (TFBSs) impart specificity to cellular transcriptional responses and have largely been defined by consensus motifs derived from a handful of validated sites. The low specificity of computational TFBS predictions has been attributed to the ubiquity of the motifs and the relaxed sequence requirements for binding. We posited that the inadequacy is due to the limited input of empirically verified sites, and demonstrated a multiplatform approach to constructing a robust model.
Year: 2006 PMID: 16961928 PMCID: PMC1794554 DOI: 10.1186/gb-2006-7-9-r82
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Schematics of ERE discovery and validation for model training and testing. ERE, estrogen response element; ChIP, chromatin immunoprecipitation; qPCR, quantitative polymerase chain reaction.
Table 1. Genomic coordinates of ERE-like sequences that have been experimentally validated or rejected as ER-binding
| Name | Genomic location | Pattern | Validation | Reference |
| --- | --- | --- | --- | --- |
| | chr1:143,215,756-143,215,768 | GGTCAccc | Binding | This study |
| | chr1:199,790,269-199,790,281 | GGT | Binding | [10] and this study |
| | chr1:199,790,414-199,790,426 | GGT | Binding | This study |
| | chr1:227,156,613-227,156,625 | GG | Binding | [4] |
| | chr2:11,603,634-11,603,646 | GGTCAaaaTGACC | Binding | [10] |
| | chr2:11,615,324-11,615,336 | GGTCAtcaTGACC | Binding | [10] |
| | chr2:11,621,861-11,621,873 | | Binding | This study |
| | chr2:11,623,258-11,623,270 | GGTCAttcTGACC | Binding | [8,10] |
| | chr2:38,214,993-38,215,005 | GGTC | Binding | This study |
| | chr2:38,215,049-38,215,061 | GGTCAaag | Binding | This study |
| | chr3:46,481,739-46,481,751 | GGTCAagg | Binding | [10] |
| | chr4:75,676,340-75,676,352 | GG | Binding | This study |
| | chr6:11,154,748-11,154,760 | GGTCAtctTGA | Binding | This study |
| | chr6:43,844,381-43,844,393 | | Binding | [4] |
| | chr8:144,170,802-144,170,814 | GG | Binding | [10] |
| | chr9:129,597,654-129,597,666 | GG | Binding | This study |
| | chr10:115,428,398-115,428,410 | GGTCAgggTGA | Binding | [10] |
| | chr10:115,428,492-115,428,504 | GGTC | Binding | [10] |
| | chr10:115,428,572-115,428,584 | GGTCAgggTGA | Binding | [10] |
| | chr10:115,428,612-115,428,624 | GGTCAgggTGA | Binding | [10] |
| | chr10:115,428,652-115,428,664 | GGTCAgggTGA | Binding | [10] |
| | chr10:115,428,689-115,428,701 | GGTCAgggTGA | Binding | [10] |
| | chr10:115,428,743-115,428,755 | GGTCAgggTGA | Binding | [10] |
| | chr11:1,741,924-1,741,936 | GG | Binding | [4] |
| | chr11:100,504,595-100,504,607 | GGTCAcca | Binding | [4] |
| | chr11:100,505,180-100,505,192 | G | Binding | [4] |
| | chr12:6,355,536-6,355,548 | GGTCAgccT | Binding | [10] |
| | chr12:6,513,208-6,513,220 | GG | Binding | [10] |
| | chr14:63,879,248-63,879,260 | GGTCAggcTG | Binding | [4] |
| | chr15:55,670,850-55,670,862 | GG | Binding | This study |
| | chr15:55,671,545-55,671,557 | GGTCAcccTG | Binding | This study |
| | chr16:2,319,793-2,319,805 | GGTCAcggTG | Binding | [8] |
| | chr17:35,849,113-35,849,125 | GGTCAttgTGAC | Binding | [10] |
| | chr17:52,323,321-52,323,333 | GGTCAtggTGACC | Binding | [4,10] |
| | chr18:59,136,673-59,136,685 | GGTC | Binding | [4] |
| | chr19:19,035,118-19,035,130 | G | Binding | This study |
| | chr19:40,182,519-40,182,531 | GG | Binding | This study |
| | chr19:43,897,093-43,897,105 | GGTCActgTGAC | Binding | This study |
| | chr19:52,532,131-52,532,143 | GGTCActcTGAC | Binding | This study |
| | chr19:6,671,884-6,671,902 | GGT | Binding | [4] |
| | chr21:15,359,833-15,359,845 | GGTCAaagTGACC | Binding | [8] |
| | chr21:42,659,626-42,659,638 | GGTC | Binding | This study |
| | chr21:42,659,906-42,659,918 | | Binding | This study |
| | chr21:42,660,106-42,660,118 | GGTCAcggTG | Binding | [4] |
| | chr22:19,595,695-19,595,707 | | Binding | This study |
| | chr1:115,283,928-115,283,940 | GGTCAgctTGAC | Nonbinding | [10] |
| | chr1:142,927,222-142,927,234 | GGTCAgtg | Nonbinding | This study |
| | chr1:150,045,850-150,045,862 | GGTC | Nonbinding | This study |
| | chr2:11,622,443-11,622,455 | | Nonbinding | This study |
| | chr2:11,625,143-11,625,155 | | Nonbinding | This study |
| | chr2:119,322,563-119,322,575 | GGT | Nonbinding | This study |
| | chr2:128,563,200-128,563,212 | | Nonbinding | This study |
| | chr2:128,565,292-128,565,304 | | Nonbinding | This study |
| | chr2:87,884,778-87,884,790 | GGTCAgtgTG | Nonbinding | This study |
| | chr3:151,966,545-151,966,557 | G | Nonbinding | This study |
| | chr3:195,656,453-195,656,465 | GGTCAtta | Nonbinding | This study |
| | chr3:50,626,609-50,626,621 | GG | Nonbinding | This study |
| | chr3:8,517,591-8,517,603 | GG | Nonbinding | This study |
| | chr4:673,249-673,261 | GG | Nonbinding | This study |
| | chr4:78,433,176-78,433,188 | GG | Nonbinding | This study |
| | chr5:172,689,912-172,689,924 | GG | Nonbinding | This study |
| | chr5:55,327,909-55,327,921 | GGT | Nonbinding | This study |
| | chr5:57,792,972-57,792,984 | GGT | Nonbinding | This study |
| | chr6:137,857,308-137,857,320 | | Nonbinding | This study |
| | chr6:32,206,228-32,206,240 | GG | Nonbinding | This study |
| | chr6:32,206,311-32,206,323 | | Nonbinding | This study |
| | chr7:100,361,980-100,361,992 | G | Nonbinding | This study |
| | chr7:100,362,938-100,362,950 | GG | Nonbinding | This study |
| | chr7:100,363,852-100,363,864 | | Nonbinding | This study |
| | chr7:16,566,080-16,566,092 | G | Nonbinding | This study |
| | chr7:43,570,289-43,570,301 | GGTCActcTG | Nonbinding | This study |
| | chr7:43,570,774-43,570,786 | | Nonbinding | This study |
| | chr9:33,157,593-33,157,605 | G | Nonbinding | This study |
| | chr9:33,158,622-33,158,634 | G | Nonbinding | This study |
| | chr10:22,333,030-22,333,042 | G | Nonbinding | This study |
| | chr10:26,545,037-26,545,049 | GGTC | Nonbinding | [10] |
| | chr10:44,202,437-44,202,449 | GGTC | Nonbinding | This study |
| | chr10:44,203,283-44,203,295 | | Nonbinding | This study |
| | chr11:100,509,203-100,509,215 | | Nonbinding | This study |
| | chr11:46,321,832-46,321,844 | GG | Nonbinding | This study |
| | chr11:65,403,499-65,403,511 | G | Nonbinding | This study |
| | chr14:101,872,078-101,872,090 | GG | Nonbinding | This study |
| | chr14:54,727,987-54,727,999 | GGTC | Nonbinding | This study |
| | chr14:63,876,354-63,876,366 | G | Nonbinding | This study |
| | chr15:37,657,943-37,657,955 | GGTCAatc | Nonbinding | This study |
| | chr15:69,737,514-69,737,526 | | Nonbinding | This study |
| | chr15:69,738,257-69,738,269 | GGTCAatgTG | Nonbinding | This study |
| | chr15:69,738,459-69,738,471 | G | Nonbinding | This study |
| | chr15:82,077,053-82,077,065 | G | Nonbinding | This study |
| | chr15:89,278,745-89,278,757 | | Nonbinding | This study |
| | chr16:2,321,166-2,321,178 | GGTC | Nonbinding | This study |
| | chr16:3,015,149-3,015,161 | G | Nonbinding | This study |
| | chr16:4,107,737-4,107,749 | GGTCAggcTG | Nonbinding | This study |
| | chr16:4,108,935-4,108,947 | GGT | Nonbinding | This study |
| | chr16:54,100,244-54,100,256 | GGTC | Nonbinding | This study |
| | chr17:2,441,502-2,441,514 | | Nonbinding | This study |
| | chr17:35,851,519-35,851,531 | G | Nonbinding | This study |
| | chr17:35,853,510-35,853,522 | GGTCAtgcTG | Nonbinding | This study |
| | chr18:18,766,140-18,766,152 | GGTCAttcTG | Nonbinding | This study |
| | chr19:2,382,491-2,382,503 | GG | Nonbinding | This study |
| | chr19:52,426,840-52,426,852 | | Nonbinding | This study |
| | chr19:52,427,249-52,427,261 | GGTCAggcTG | Nonbinding | This study |
| | chrY:169,893-169,905 | G | Nonbinding | This study |
Shown in bold and underlined are nucleotides that deviate from the consensus core ERE. ER, estrogen receptor; ERE, estrogen response element.
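The genomic locations in the table use a `chrN:start-end` notation with thousands separators. A minimal sketch of how such a string might be parsed is given below; the names `Locus` and `parse_locus` are hypothetical, and the span length assumes that both endpoints are inclusive, an assumption that is consistent with most rows covering 13 positions, the length of intact patterns such as GGTCAaaaTGACC.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """A genomic interval; both endpoints are assumed to be inclusive."""
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive endpoints: a 13-bp site spans end - start + 1 positions.
        return self.end - self.start + 1


# Matches strings such as "chr2:11,603,634-11,603,646".
_LOCUS_RE = re.compile(r"^(chr[0-9A-Za-z]+):([\d,]+)-([\d,]+)$")


def parse_locus(text: str) -> Locus:
    """Parse 'chrN:start-end' (commas allowed in the numbers) into a Locus."""
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a genomic location: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if end < start:
        raise ValueError(f"end precedes start in {text!r}")
    return Locus(chrom, start, end)
```

For example, `parse_locus("chr2:11,603,634-11,603,646").length` evaluates to 13 under the inclusive-endpoint assumption.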
Figure 2. Sequence logos. Shown are sequence logos for (a) the 45 ER-binding loci with 10 bp flanking sequences and (b) the 58 ER nonbinding loci with 10 bp flanking sequences. The logo for the binders exhibited additional signal at the third bases upstream and downstream of the core palindromic ERE. bp, base pairs; ER, estrogen receptor; ERE, estrogen response element.
Figure 3. Substitution of the conserved guanine outside of the canonical ERE disrupts ER binding. (a) Interactions between ER and wild-type and mutant EREs were measured by SPR. The canonical ERE is underlined, and the conserved guanine is indicated by an arrow. Base substitutions are indicated in bold. (b) Binding of ER to ERE is indicated as a percentage of binding relative to the wild-type sequence. ER, estrogen receptor; ERE, estrogen response element; SPR, surface plasmon resonance.
Figure 4. Decision tree for ERE prediction. Group 3 EREs would be predicted to be the highest likelihood binders of ER. ER, estrogen receptor; ERE, estrogen response element; SB, binding score; SNB, nonbinding score.
Table 2. Validation results on genomic loci containing ERE-like sequences identified by sequencing random ChIP fragments from an ER ChIP library
| Genomic location | Pattern | Validation |
| --- | --- | --- |
| chr1:108,492,542-108,492,560 | ttaGGTCAgctTG | Binding |
| chr1:181,327,606-181,327,624 | ctgGGTCAgcaTGACCttc | Binding |
| chr11:64,942,548-64,942,566 | ctgGG | Binding |
| chr2:72,713,948-72,713,966 | ggaGGTC | Binding |
| chr3:132,571,914-132,571,932 | aggGGTCAtggTGAC | Binding |
| chr3:14,429,604-14,429,622 | ctgGGTCActgTG | Binding |
| chr3:151,957,126-151,957,144 | acaGGTCAccaTGACCtgg | Binding |
| chr5:122,216,372-122,216,390 | cagGGT | Binding |
| chr6:122,985,938-122,985,956 | tttGGTCAtgt | Binding |
| chr6:23,720,183-23,720,201 | tcgGGTCAtgcTG | Binding |
| chr6:38,337,561-38,337,579 | tggGGTCAtggTGAC | Binding |
| chr9:37,943,504-37,943,522 | gcaGGT | Binding |
| chr12:44,881,783-44,881,801 | cag | Binding |
| chr12:44,881,800-44,881,818 | gagGGTCAtcc | Binding |
| chr16:2,781,142-2,781,160 | ccaGGTC | Binding |
| chr16:743,678-743,696 | atgGGTCActgTGACCcag | Binding |
| chr17:46,382,536-46,382,554 | cccGG | Binding |
| chr17:54,072,183-54,072,201 | cacGGTCAtggTGACCtga | Binding |
| chr20:54,945,262-54,945,280 | ggg | Binding |
| chr2:222,089,422-222,089,440 | cagG | Nonbinding |
| chr5:171,535,283-171,535,301 | tgtGGTC | Nonbinding |
| chr5:175,712,328-175,712,346 | agaGG | Nonbinding |
| chr5:179,478,929-179,478,947 | gtgGG | Nonbinding |
| chr10:108,692,194-108,692,212 | cac | Nonbinding |
| chr14:38,648,346-38,648,364 | attGGTCAgagTGAC | Nonbinding |
| chr14:79,636,926-79,636,944 | acc | Nonbinding |
| chr16:20,819,825-20,819,843 | tggGGTCAcac | Nonbinding |
| chr16:25,535,373-25,535,391 | ttaG | Nonbinding |
| chr19:46,954,305-46,954,323 | cagG | Nonbinding |
Shown in bold and underlined are nucleotides that deviate from the consensus core ERE. ChIP, chromatin immunoprecipitation; ER, estrogen receptor; ERE, estrogen response element.
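The intact patterns in the table above are 19 bases long: a 13-bp core in the form GGTCAnnnTGACC with three lowercase flanking bases on each side. As a rough illustration of how deviation from the consensus core might be quantified, the sketch below splits such a window and counts mismatches at the ten non-spacer positions. The consensus string and the helper names `split_window` and `count_mismatches` are assumptions made for this example and are not taken from the authors' software.

```python
from typing import Tuple

# Assumed consensus core ERE; N marks the three unconstrained spacer bases.
CONSENSUS = "GGTCANNNTGACC"


def split_window(window: str, flank: int = 3) -> Tuple[str, str, str]:
    """Split a window into (upstream flank, core, downstream flank)."""
    if len(window) <= 2 * flank:
        raise ValueError("window is too short for the requested flank size")
    return window[:flank], window[flank:len(window) - flank], window[len(window) - flank:]


def count_mismatches(core: str, consensus: str = CONSENSUS) -> int:
    """Count positions where the core differs from the consensus, ignoring N."""
    if len(core) != len(consensus):
        raise ValueError("core and consensus must have the same length")
    return sum(
        1
        for base, ref in zip(core.upper(), consensus)
        if ref != "N" and base != ref
    )


upstream, core, downstream = split_window("ctgGGTCAgcaTGACCttc")
mismatches = count_mismatches(core)
```

Because the assumed consensus is palindromic, scanning the reverse complement of a site yields the same mismatch count, so a single-strand comparison suffices in this sketch.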
Table 3. Performance comparison of various prediction algorithms for ER binding using the independent dataset shown in Table 2
| Algorithm | Sensitivity | Specificity | Harmonic mean | P value |
| --- | --- | --- | --- | --- |
| Consensus ERE with ≤2 mismatches | 94.74% | 30% | 45.57% | 0.104838 |
| Consensus ERE with ≤3 mismatches | 94.74% | 0% | 0.00% | 1 |
| Dragon ERE finder v2.0 | 68.42% | 40% | 50.49% | 0.477589 |
| TFAC 8.1 (min FP) | 31.57% | 100% | 47.99% | 0.057117 |
| TFAC 8.1 (min FN) | 78.94% | 40% | 53.10% | 0.255439 |
| h-ERE (stringent) | 42.10% | 90% | 57.37% | 0.084693 |
| h-ERE (medium) | 68.42% | 70% | 69.20% | 0.056272 |
| h-ERE (relaxed) | 73.68% | 70% | 71.79% | 0.03043 |
| h-ERE (loose) | 84.21% | 70% | 76.45% | 0.006199 |
h-ERE outperformed the other algorithms. ERE, estrogen response element.
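The four numeric columns of the performance table are arithmetically consistent with sensitivity, specificity, their harmonic mean, and a one-sided Fisher's exact P value computed over the 19 binding and 10 nonbinding loci of the independent dataset. The sketch below reproduces those quantities from a confusion matrix. The function names are hypothetical, and the counts used in the example (16 true positives, 3 false negatives, 3 false positives, 7 true negatives for h-ERE (loose)) are inferred from the reported percentages rather than stated in the source.

```python
from math import comb


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true binders that were predicted to bind."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true nonbinders that were predicted not to bind."""
    return tn / (tn + fp)


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two rates; defined as 0 when both are 0."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)


def fisher_one_sided(tp: int, fn: int, fp: int, tn: int) -> float:
    """One-sided Fisher's exact P value (upper hypergeometric tail).

    Probability of observing at least `tp` binders among the predicted
    positives if predictions were unrelated to binding status.
    """
    binders = tp + fn
    nonbinders = fp + tn
    predicted = tp + fp
    total = binders + nonbinders
    upper = min(binders, predicted)
    tail = sum(
        comb(binders, k) * comb(nonbinders, predicted - k)
        for k in range(tp, upper + 1)
    )
    return tail / comb(total, predicted)


# Inferred counts for h-ERE (loose): 16 of 19 binders and 7 of 10 nonbinders correct.
sens = sensitivity(16, 3)
spec = specificity(7, 3)
hmean = harmonic_mean(sens, spec)
p_value = fisher_one_sided(16, 3, 3, 7)
```

Under the same assumptions, the counts implied for the consensus ERE with ≤2 mismatches (18, 1, 7, 3) give a P value that agrees with the tabulated 0.104838 to within rounding.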