Metewo Selase Enuameh, Yuna Asriyan, Adam Richards, Ryan G Christensen, Victoria L Hall, Majid Kazemian, Cong Zhu, Hannah Pham, Qiong Cheng, Charles Blatti, Jessie A Brasefield, Matthew D Basciotta, Jianhong Ou, Joseph C McNulty, Lihua J Zhu, Susan E Celniker, Saurabh Sinha, Gary D Stormo, Michael H Brodsky, Scot A Wolfe.
Abstract
Cys2-His2 zinc finger proteins (ZFPs) are the largest group of transcription factors in higher metazoans. A complete characterization of these ZFPs and their associated target sequences is pivotal to fully annotate transcriptional regulatory networks in metazoan genomes. As a first step in this process, we have characterized the DNA-binding specificities of 129 zinc finger sets from Drosophila using a bacterial one-hybrid system. This data set contains the DNA-binding specificities for at least one encoded ZFP from 70 unique genes and 23 alternate splice isoforms representing the largest set of characterized ZFPs from any organism described to date. These recognition motifs can be used to predict genomic binding sites for these factors within the fruit fly genome. Subsets of fingers from these ZFPs were characterized to define their orientation and register on their recognition sequences, thereby allowing us to define the recognition diversity within this finger set. We find that the characterized fingers can specify 47 of the 64 possible DNA triplets. To confirm the utility of our finger recognition models, we employed subsets of Drosophila fingers in combination with an existing archive of artificial zinc finger modules to create ZFPs with novel DNA-binding specificity. These hybrids of natural and artificial fingers can be used to create functional zinc finger nucleases for editing vertebrate genomes.
Year: 2013 PMID: 23471540 PMCID: PMC3668361 DOI: 10.1101/gr.151472.112
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. Distribution of Cys2-His2 zinc fingers in genes within the D. melanogaster genome. (A) Distribution of the number of fingers identified within each zinc-finger-containing gene in the fruit fly genome. (B) A schematic depicting canonical DNA recognition by a Cys2-His2 zinc finger. The numbered spheres on the α-helix represent the residues that are anticipated to contact DNA in the canonical recognition mode. These residues are numbered relative to the start of the α-helix and make contact (arrows) with their respective color-coded DNA bases (boxes). Each finger (in an N-terminal to C-terminal orientation) binds its DNA subsite (labeled 5′ to 3′) in an anti-parallel arrangement. (C) Number of ZFPs attempted and the success rate of these B1H selections. (D) Comparative MatAlign analysis of ZFP motifs determined by B1H and other methods (Hallikas et al. 2006; Robasky and Bulyk 2011). B1H motifs are designated by red ovals.
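The anti-parallel arrangement described for panel B can be illustrated with a minimal sketch. It assumes, as a simplification, that each finger occupies exactly one non-overlapping 3-bp subsite; the function name `assign_subsites` and the example sequence are hypothetical and are not taken from the paper.

```python
def assign_subsites(site, n_fingers):
    """Pair fingers (numbered from the N-terminus) with triplets of a site.

    `site` is written 5' to 3'. Assumption: one non-overlapping 3-bp
    subsite per finger, a simplified view of canonical recognition.
    """
    if len(site) != 3 * n_fingers:
        raise ValueError("site length must be 3 x number of fingers")
    # Split the site into consecutive triplets, still ordered 5' to 3'.
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    # Anti-parallel: finger 1 (N-terminal) sits on the 3'-most triplet.
    fingers = range(1, n_fingers + 1)
    return dict(zip(fingers, reversed(triplets)))


# Made-up 9-bp site for a hypothetical three-finger protein.
pairs = assign_subsites("AAACCCGGG", 3)
for finger, triplet in pairs.items():
    print(f"finger {finger} -> 5'-{triplet}-3'")
```

In this toy case the N-terminal finger is paired with the 3′-most triplet and the C-terminal finger with the 5′-most one.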
Predictive value of B1H determined motifs
Figure 2. Comparison of isoform specificities. DNA-binding specificities of 17 Lola isoforms generated through alternate splicing. MatAlign clustergram emphasizing the diversity within the recognition motifs of the various Lola isoforms. All of the characterized ZFPs utilize a pair of zinc fingers to recognize DNA. Identical fingers are present in the lola-PN and -PY isoforms and the lola-PT and -PU isoforms, and both pairs have identical specificity.
Figure 3. Phylogenetic comparison of the B1H-determined recognition motifs for 94 Drosophila ZFPs based on the primary recognition strand. ZFPs conserved across the Drosophila and human genomes are specified with their family labels.
Figure 4. Diversity of triplet recognition sequences. Coverage of the 64 possible triplet sequences based on the specificity of the extracted single finger–DNA subsite combinations. Each panel represents 16 different triplets, where the 5′ base is fixed (e.g., upper left is the ANN triplets). The height of the buttons at each position reflects the number of fingers that prefer this triplet within the data set, where those triplets without complementary fingers are white.
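The panel layout in this caption (four groups of 16 triplets, one per fixed 5′ base, with a count of fingers per triplet) can be sketched as a simple tally. The input list below is made up for illustration and is not the paper's data; the function names `triplet_coverage` and `covered` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def triplet_coverage(preferred_triplets):
    """Group per-triplet finger counts into four panels keyed by the 5' base.

    `preferred_triplets` holds one preferred triplet (5' to 3') per finger.
    Triplets with a count of 0 correspond to the white buttons.
    """
    counts = Counter(preferred_triplets)
    panels = {}
    for five_prime in BASES:
        panels[five_prime] = {
            five_prime + mid + three_prime: counts.get(five_prime + mid + three_prime, 0)
            for mid, three_prime in product(BASES, repeat=2)
        }
    return panels


def covered(panels):
    """Number of the 64 triplets preferred by at least one finger."""
    return sum(1 for panel in panels.values() for n in panel.values() if n > 0)


# Hypothetical input: four fingers, two of which prefer the same triplet.
example = ["GCG", "GCG", "TGG", "AAT"]
panels = triplet_coverage(example)
print(covered(panels), "of 64 triplets covered in this toy example")
```

Applied to the real finger set, the same tally would yield the coverage figure quoted in the abstract (47 of 64); the toy input here covers only three triplets.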
Figure 5. Amino acid–base correlations. Frequency logo displaying the average base preference for each amino acid at each recognition position on the recognition helix (RH) assuming canonical recognition. The total number of recognition helices and the number of unique recognition helices (having a unique set of residues at positions −1, 2, 3, and 6) that contain the amino acid at that position are indicated above each logo. Base position nomenclature is defined in Figure 1B.
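The averaging behind such a logo can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the position-to-base pairing in `CONTACT` follows the commonly described canonical model (position 6 with the 5′ base, 3 with the middle base, −1 with the 3′ base) and is an assumption here, position 2 is omitted because its canonical contact lies outside the finger's own triplet, and the input data are made up.

```python
from collections import defaultdict

BASES = "ACGT"

# Assumption: canonical pairing of helix position with triplet base index
# (0 = 5' base, 1 = middle base, 2 = 3' base). Position 2 is left out.
CONTACT = {6: 0, 3: 1, -1: 2}


def average_preferences(fingers):
    """Average base frequencies per (helix position, amino acid).

    `fingers` is a list of (helix, columns) pairs. `helix` maps a helix
    position to its amino acid; `columns` is a list of three dicts
    (base -> frequency) for the triplet, ordered 5' to 3'.
    """
    sums = defaultdict(lambda: [0.0] * len(BASES))
    counts = defaultdict(int)
    for helix, columns in fingers:
        for position, base_index in CONTACT.items():
            key = (position, helix[position])
            for i, base in enumerate(BASES):
                sums[key][i] += columns[base_index].get(base, 0.0)
            counts[key] += 1
    return {
        key: {base: sums[key][i] / counts[key] for i, base in enumerate(BASES)}
        for key in sums
    }


# Two made-up fingers that share arginine (R) at position 6.
toy_fingers = [
    ({6: "R", 3: "E", -1: "R"}, [{"G": 1.0}, {"C": 1.0}, {"G": 1.0}]),
    ({6: "R", 3: "H", -1: "Q"}, [{"G": 0.5, "A": 0.5}, {"G": 1.0}, {"A": 1.0}]),
]
averages = average_preferences(toy_fingers)
print(averages[(6, "R")])
```

Each entry of `averages` corresponds to one logo column: the mean base distribution seen for a given amino acid at a given helix position.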
Figure 6. Drosophila finger sets maintain their specificity when incorporated into artificial arrays. The left column displays the B1H-determined recognition motif for each assembled ZFA. For each motif, the subsite recognized by the utilized fingers in the ZFA and Drosophila ZFP (middle column) is boxed, and where these are similar, the assembly was deemed a success (check; right column). In some cases fingers from more than one Drosophila ZFP were used in the artificial finger assembly. In the case of 3p_nr3c1, due to the initial failure (X), two additional variants were constructed (3p_nr3c1_n and 3p_nr3c1_nn) to achieve the desired DNA-binding specificity. The complements for some of these ZFN pairs are entirely artificial in construction and are thus shown in Supplemental Figure 15.