Yubing Liu, Soumyadeep Nandi, André Martel, Alen Antoun, Ilya Ioshikhes, Alexandre Blais.
Abstract
The Six1 transcription factor is a homeodomain protein involved in controlling gene expression during embryonic development. Six1 establishes gene expression profiles that enable skeletal myogenesis and nephrogenesis, among others. While several homeodomain factors have been extensively characterized with regard to their DNA-binding properties, relatively little is known of the properties of Six1. We have used the genomic binding profile of Six1 during the myogenic differentiation of myoblasts to obtain a better understanding of its preferences for recognizing certain DNA sequences. DNA sequence analyses on our genomic binding dataset, combined with biochemical characterization using binding assays, reveal that Six1 has a much broader DNA-binding sequence spectrum than had been previously determined. Moreover, using a position weight matrix optimization algorithm, we generated a highly sensitive and specific matrix that can be used to predict novel Six1-binding sites with high accuracy. Furthermore, our results support the idea of a mode of DNA recognition by this factor where Six1 itself is sufficient for sequence discrimination, and where Six1 domains outside of its homeodomain contribute to binding site selection. Together, our results shed new light on the properties of this important transcription factor, and will enable more accurate modeling of Six1 function in bioinformatic studies.
Year: 2012 PMID: 22730291 PMCID: PMC3458543 DOI: 10.1093/nar/gks587
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 3. EMSA gels using recombinant Six1 on selected DNA sequences. (A) SDS-PAGE gel stained with Coomassie blue, showing 200 ng (lane 1) or 2.0 μg (lane 2) of Six1 protein. (B) Increasing amounts of Six1 were incubated with fixed amounts of fluorescently labelled double-stranded DNA probes and electrophoresed on non-denaturing polyacrylamide gels. The gel shown is from an experiment performed with the wild-type Myog MEF3 site. (C) The fluorescent signals corresponding to the free and shifted probes were measured in each lane, and the proportion of shifted probe over total (shifted plus free) probe was plotted as a function of the concentration of Six1 present in each lane. The concentration of Six1 required to reach half-maximal binding represents the Kd value for that probe sequence. The complete set of results is reported in Table 2.
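The quantification described in panel (C) can be sketched in a few lines. This is a minimal illustration, assuming a simple one-site binding model in which the shifted fraction equals [Six1] / (Kd + [Six1]); the source does not state which fitting procedure was used, and the function names and the ternary-search fit below are hypothetical choices made for the example.

```python
import math

def shifted_fraction(shifted_signal, free_signal):
    """Proportion of probe that is shifted: shifted / (shifted + free)."""
    return shifted_signal / (shifted_signal + free_signal)

def fraction_bound(protein_nM, kd_nM):
    """One-site binding model (assumed here): half of the probe is bound at [protein] == Kd."""
    return protein_nM / (kd_nM + protein_nM)

def sum_squared_error(kd_nM, concentrations_nM, fractions):
    """Squared deviation between the model curve and the measured fractions."""
    return sum((fraction_bound(c, kd_nM) - f) ** 2
               for c, f in zip(concentrations_nM, fractions))

def fit_kd(concentrations_nM, fractions, lo_nM=1e-3, hi_nM=1e4, iterations=100):
    """Least-squares Kd estimate by ternary search on log(Kd).

    Assumes the error curve has a single minimum between lo_nM and hi_nM.
    """
    a, b = math.log(lo_nM), math.log(hi_nM)
    for _ in range(iterations):
        m1 = a + (b - a) / 3
        m2 = b - (b - a) / 3
        if (sum_squared_error(math.exp(m1), concentrations_nM, fractions)
                < sum_squared_error(math.exp(m2), concentrations_nM, fractions)):
            b = m2  # the minimum lies to the left of m2
        else:
            a = m1  # the minimum lies to the right of m1
    return math.exp((a + b) / 2)
```

Given per-lane Six1 concentrations and the corresponding shifted fractions, `fit_kd` returns the concentration at which the fitted curve reaches half-maximal binding. Real gel data would normally also call for replicate handling and an error estimate, which this sketch omits.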
Figure 1. Position weight matrices for Six1 and other homeodomain TFs. (A) PWMs generated in this study. (B) Previously reported PWMs for Six1. (C) The PWMs of other well-characterized homeodomain TFs. The PWMs were represented graphically using sequence logos.
Figure 2. Enrichment of the Six1_MB + MT matrix hits within the loci bound by Six1 in myotubes, as a function of their score in the ChIP-on-chip experiment. The 1853 genomic loci bound by Six1 in myotubes were ranked in decreasing order of ChIP enrichment and subdivided into bins of 50 regions. Matches to the Six1_MB + MT matrix were identified, and the numbers of PWM hits per bin (left-hand y axis) or per base pair (right-hand y axis) were calculated. As controls, five sets of randomly selected genomic regions were also scanned for PWM hits. Additionally, two sets of sequences totaling 108 Mb and originally surveyed in the ChIP-on-chip experiments were also scanned here.
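The ranking and binning procedure in this caption can be illustrated as follows. This is only a sketch: the tuple layout `(chip_score, length_bp, n_hits)` and the function name are assumptions made for the example, not the authors' data format.

```python
def hits_per_bin(regions, bin_size=50):
    """Rank regions by decreasing ChIP score and summarise PWM hits per bin.

    regions: iterable of (chip_score, length_bp, n_hits) tuples (hypothetical layout).
    Returns one (hits_in_bin, hits_per_bp) tuple per bin, best-scoring bin first.
    """
    ranked = sorted(regions, key=lambda r: r[0], reverse=True)
    summary = []
    for start in range(0, len(ranked), bin_size):
        chunk = ranked[start:start + bin_size]
        total_hits = sum(r[2] for r in chunk)
        total_bp = sum(r[1] for r in chunk)
        summary.append((total_hits, total_hits / total_bp))
    return summary
```

With 1853 regions and bins of 50, this yields 38 bins, the last of which holds only 3 regions, so the per-base-pair value is the fairer one to compare across bins.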
Results of the sensitivity and specificity searches for new and existing Six1 PWMs
| Name | Target list (a) | No. of sites (all) (b) | Enrichment (all sites) (c) | P-value (d) | No. of sites (conserved only) (e) | Enrichment (conserved sites only) | P-value |
|---|---|---|---|---|---|---|---|
| Six1_MB-A_m01.mat | MB_BC | 485 | 4.73 | <1E-16 | 139 | 6.49 | <1E-16 |
| Six1_MB-B_m01.mat | MB_AC | 449 | 4.45 | <1E-16 | 141 | 7.21 | <1E-16 |
| Six1_MB-C_m01.mat | MB_AB | 471 | 5.23 | <1E-16 | 144 | 7.31 | <1E-16 |
| Six1_MT-A_m01.mat | MT_BCDE | 977 | 2.58 | <1E-16 | 246 | 4.60 | <1E-16 |
| Six1_MT-B_m01.mat | MT_ACDE | 1043 | 2.96 | <1E-16 | 260 | 4.52 | <1E-16 |
| Six1_MT-C_m01.mat | MT_ABDE | 888 | 2.73 | 2.2E-16 | 240 | 4.70 | <1E-16 |
| Six1_MT-D_m01.mat | MT_ABCE | 891 | 3.08 | <1E-16 | 248 | 4.92 | 1.3E-15 |
| Six1_MT-E_m01.mat | MT_ABCD | 895 | 2.36 | <1E-16 | 242 | 3.93 | <1E-16 |
| Six1_MB + MT.mat | MB_ABC | 1144 | 3.50 | <1E-16 | 321 | 4.99 | <1E-16 |
| | MT_ABCDE | 1873 | 2.93 | 3.3E-16 | 489 | 4.67 | 7.8E-16 |
| Berger_Six1_0935.mat | MB_ABC | 308 | 1.38 | 2.8E-08 | 51 | 1.52 | 2.6E-03 |
| | MT_ABCDE | 544 | 1.25 | 2.5E-07 | 84 | 1.54 | 9.5E-05 |
| M00319-V$MEF3_B02.mat | MB_ABC | 26 | 2.29 | 1.2E-04 | 9 | 5.02 | 8.6E-05 |
| | MT_ABCDE | 51 | 2.29 | 8.5E-08 | 11 | 3.77 | 1.7E-04 |
| M00510-V$LHX3_01-Lhx3a.mat | MB_ABC | 350 | 0.67 | 1.0E+00 | 116 | 0.67 | 1.0E+00 |
| | MT_ABCDE | 782 | 0.77 | 1.0E+00 | 203 | 0.72 | 1.0E+00 |
| M00640-V$HOXA4_Q2-HOXA4.mat | MB_ABC | 697 | 0.89 | 1.0E+00 | 185 | 0.99 | 5.9E-01 |
| | MT_ABCDE | 1429 | 0.93 | 1.0E+00 | 317 | 1.04 | 2.6E-01 |
| M00241-V$NKX25_02-Nkx2-5.mat | MB_ABC | 472 | 0.77 | 1.0E+00 | 116 | 0.69 | 1.0E+00 |
| | MT_ABCDE | 866 | 0.72 | 1.0E+00 | 188 | 0.68 | 1.0E+00 |
| M00360-V$PAX3_01-Pax-3.mat | MB_ABC | 27 | 1.13 | 2.9E-01 | 2 | 0.39 | 9.7E-01 |
| | MT_ABCDE | 63 | 1.35 | 1.3E-02 | 14 | 1.67 | 4.5E-02 |
| M00394-V$MSX1_01-Msx-1.mat | MB_ABC | 241 | 1.00 | 5.2E-01 | 52 | 0.80 | 9.5E-01 |
| | MT_ABCDE | 528 | 1.12 | 6.4E-03 | 105 | 1.00 | 5.2E-01 |
| M00096-V$PBX1_01-Pbx1a.mat | MB_ABC | 541 | 0.80 | 1.0E+00 | 147 | 0.94 | 7.7E-01 |
| | MT_ABCDE | 997 | 0.76 | 1.0E+00 | 211 | 0.83 | 1.0E+00 |
The table reports the enrichment of binding sites predicted by the PWMs discovered for Six1, by existing Six1 PWMs and by the PWMs of other homeodomain transcription factors.
(a) List of target genomic regions scanned with a given PWM. MB indicates Six1-bound targets in myoblasts, and MT those bound in myotubes. Subgroups of targets are given as letters (e.g. MB_AB refers to the combination of myoblast target subgroups A and B).
(b) Number of sites corresponding to ‘hits’ to the PWM, irrespective of their phylogenetic conservation.
(c) The enrichment is given as the ratio of hits found in the indicated target set over those found in a fraction of the ChIP-surveyed sequence space, pro-rated by the length of each group of sequences in base pairs.
(d) The P-value represents the cumulative hypergeometric probability subtracted from 1.
(e) Same as for b, but limited to genomic regions among the top 5% most phylogenetically conserved among 45 vertebrate species.
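The enrichment ratio and P-value described in the footnotes above can be sketched as follows. This is a minimal illustration under stated assumptions: the footnotes do not specify what counts as the population, the draws or the successes in the hypergeometric test, so the parameter names are hypothetical, and the tail is assumed to include the observed count (that is, 1 minus the cumulative probability up to k − 1).

```python
from math import comb

def enrichment(hits_target, bp_target, hits_background, bp_background):
    """Ratio of hit densities (hits per base pair) in target vs. background sequence."""
    return (hits_target / bp_target) / (hits_background / bp_background)

def hypergeom_pvalue(k, n_draws, n_successes, population):
    """P(X >= k) for a hypergeometric variable X.

    Equal to 1 minus the cumulative probability up to k - 1, but summed over the
    upper tail directly so that very small P-values are not lost to rounding.
    """
    upper = min(n_draws, n_successes)
    tail = sum(comb(n_successes, i) * comb(population - n_successes, n_draws - i)
               for i in range(k, upper + 1))
    return tail / comb(population, n_draws)
```

For example, 10 hits in 1000 bp of target sequence against 5 hits in 1000 bp of background gives an enrichment of 2. Pro-rating by length matters because the target and background sets differ greatly in size.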
Summary of EMSA experiments
(a) Myog_WT is the Myog probe with the wild-type MEF3 consensus in the center. Myog_mut01 to Myog_mut30 are probes with various mutations in the MEF3 consensus. Myog_mut31 is the same probe as Myog_mut03, listed under a different rationale.
(b) Mutated nucleotides in the MEF3 consensus are highlighted in black. The lowercase ‘g’ nucleotide was added for fluorescent labelling purposes. The natural sequence would be a ‘C’ at that position.
(c) Rationales for choosing the corresponding sequences are listed. Mut01 is the most frequent MEF3 sequence found in the Six1_MB and Six1_MT binding data. Mut02 contains TA at positions 2 and 3, which is found in the Berger et al. study. Mut03 has the least frequent dinucleotide (AT) at positions 2 and 3. The MEF3 in Mut05 is found in the Myod core enhancer region. Mut07 to 10 were selected with different dinucleotide combinations at positions 7 and 8. Mut04, 06, and 11 to 16 were chosen based on the frequency of the nucleotide at a certain position. Mut17 to 26 are MEF3 sequences found only using the Six1-opti MEF3 motif. Mut27 to 31 are MEF3 sequences found only using the Six1_MB + MT MEF3 motif. Of note, Mut03 and Mut31 contain the same MEF3 sequence.
(d) The dissociation constant (Kd) and standard error of the mean are calculated for each probe based on at least three independent experiments. >350 nM, not accurately determined due to very weak binding.
Figure 4. Substantial differences in DNA sequence selectivity between Six1-HD and Six1. (A) EMSA experiments were performed with increasing amounts of Six1 homeodomain (Six1-HD) using the Myogenin WT probe (left), or the mut02 probe (right), which conforms to the consensus reported by Berger et al. using protein-binding microarrays. (B) EMSA experiments performed with the Six1-HD (top row) or Six1 (bottom row) proteins, on three derivatives of the mut02 probe (mutated positions are underlined, compared with the WT probe). Note that because the Six1-HD has a relatively low affinity for DNA in these assays, the amounts of protein used are higher than those used for the full-length Six1 protein. The Kd values (all in nanomolar) for each protein on each probe are given underneath the respective gel images.
Figure 5. Performance comparison of the Six1_MB + MT and Six1-opti PWMs. (A) ROC curves. The y-axis is the sensitivity and the x-axis is the (1 − specificity) value. Performances of the original TRANSFAC MEF3 and Berger et al. PWMs for Six1 are also given for comparison. (B) Venn diagram indicating the number of hits to each PWM, and their overlap, at their respective optimal thresholds, among loci targeted by Six1 in myotubes. Numbers in parentheses are the number of unique sequences (one sequence can occur more than once). (C) Results for the comparison between predictions made with the Six1-opti and Six1_MB + MT PWMs, on sequences bound by Six1 only at 24 hours post-differentiation.
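For readers unfamiliar with the ROC axes in panel (A), the following sketch shows how sensitivity and (1 − specificity) can be computed across score thresholds. It assumes each sequence has a single best PWM score and a known bound or unbound label; the function names are hypothetical, and the source does not describe how its curves were actually generated.

```python
def roc_points(bound_scores, unbound_scores):
    """Return (1 - specificity, sensitivity) pairs, one per score threshold.

    bound_scores: PWM scores of true binding sequences (positives).
    unbound_scores: PWM scores of control sequences (negatives).
    """
    thresholds = sorted(set(bound_scores) | set(unbound_scores), reverse=True)
    points = [(0.0, 0.0)]  # strictest threshold: nothing is called a hit
    for t in thresholds:
        true_pos = sum(s >= t for s in bound_scores)
        false_pos = sum(s >= t for s in unbound_scores)
        points.append((false_pos / len(unbound_scores),
                       true_pos / len(bound_scores)))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC curve; 0.5 is chance level, 1.0 is perfect."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
```

A PWM whose curve lies closer to the top-left corner separates bound from unbound sequences better, which is the basis on which the matrices in this figure are compared.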