Gurmukh Sahota (1), Gary D Stormo. (1) Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63108, USA.
Abstract
MOTIVATION: Computational techniques for microbial genomic sequence analysis are becoming increasingly important. With next-generation sequencing technology and the Human Microbiome Project underway, current sequencing capacity far outpaces the speed at which organisms of interest can be studied experimentally. Most related computational work has focused on sequence assembly, gene annotation and metabolic network reconstruction. We have developed a method that relies primarily on available sequence data to determine prokaryotic transcription factor (TF) binding specificities.

RESULTS: Specificity-determining residues (critical residues) were identified from crystal structures of DNA-protein complexes, and TFs with the same critical residues were grouped into specificity classes. The putative binding regions for each class were defined as the set of promoters for each TF itself (autoregulatory) and for the immediately upstream and downstream operons. MEME was used to find putative motifs within each class separately. Tests on the LacI and TetR TF families, using RegulonDB annotated sites, showed prediction sensitivities of 86% and 80%, respectively.

AVAILABILITY: http://ural.wustl.edu/~gsahota/HTHmotif/
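The grouping step described under RESULTS can be illustrated with a minimal sketch. It assumes the TF sequences are already aligned so that critical residues fall in the same columns; the sequences, TF names, column indices and function names below are hypothetical placeholders, not data or code from the study. The sensitivity helper uses the standard definition TP / (TP + FN).

```python
from collections import defaultdict


def critical_signature(aligned_seq, positions):
    """Return the residues found at the given alignment columns."""
    return "".join(aligned_seq[p] for p in positions)


def group_by_specificity(aligned_tfs, positions):
    """Group TF names whose critical residues are identical."""
    classes = defaultdict(list)
    for name, seq in aligned_tfs.items():
        classes[critical_signature(seq, positions)].append(name)
    return dict(classes)


def sensitivity(true_positives, false_negatives):
    """Fraction of annotated sites that were recovered: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical toy alignment; real input would be aligned DNA-binding domains.
toy_tfs = {
    "tfA": "MKRSTLYDVA",
    "tfB": "MKRATLYDVA",
    "tfC": "MKQSTLHDVA",
}
# Hypothetical zero-based columns standing in for the critical residues.
toy_positions = [2, 6]

classes = group_by_specificity(toy_tfs, toy_positions)
for signature, members in classes.items():
    print(signature, members)
```

In this sketch, each resulting class would then supply the promoter regions passed to a motif finder such as MEME; that step is omitted here because it depends on genome annotation not shown in the abstract.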