Hisanori Kiryu1, Taku Oshima, Kiyoshi Asai. 1. Graduate School of Information Sciences, Nara Institute of Science and Technology 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan. hisano-k@is.aist-nara.ac.jp
Abstract
MOTIVATION: The relations between the promoter sequences and their strengths were extensively studied in the 1980s. Although these studies uncovered strong sequence-strength correlations, the cost of their elaborate experimental methods have been too high to be applied to a large number of promoters. On the contrary, a recent increase in the microarray data allows us to compare thousands of gene expressions with their DNA sequences. RESULTS: We studied the relations between the promoter sequences and their strengths using the Escherichia coli microarray data. We modeled those relations using a simple weight matrix, which was optimized with a novel support vector regression method. It was observed that several non-consensus bases in the '-35' and '-10' regions of promoter sequences act positively on the promoter strength and that certain consensus bases have a minor effect on the strength. We analyzed outliers for which the observed gene expressions deviate from the promoter strength predictions, and identified several genes with enhanced expressions due to multiple promoters and genes under strong regulation by transcription factors. Our method is applicable to other procaryotes for which both the promoter sequences and the microarray data are available.
MOTIVATION: The relations between the promoter sequences and their strengths were extensively studied in the 1980s. Although these studies uncovered strong sequence-strength correlations, the cost of their elaborate experimental methods have been too high to be applied to a large number of promoters. On the contrary, a recent increase in the microarray data allows us to compare thousands of gene expressions with their DNA sequences. RESULTS: We studied the relations between the promoter sequences and their strengths using the Escherichia coli microarray data. We modeled those relations using a simple weight matrix, which was optimized with a novel support vector regression method. It was observed that several non-consensus bases in the '-35' and '-10' regions of promoter sequences act positively on the promoter strength and that certain consensus bases have a minor effect on the strength. We analyzed outliers for which the observed gene expressions deviate from the promoter strength predictions, and identified several genes with enhanced expressions due to multiple promoters and genes under strong regulation by transcription factors. Our method is applicable to other procaryotes for which both the promoter sequences and the microarray data are available.
Authors: Friederike Mey; Jim Clauwaert; Maarten Van Brempt; Michiel Stock; Jo Maertens; Willem Waegeman; Marjan De Mey Journal: Methods Mol Biol Date: 2022
Authors: André Fujita; João Ricardo Sato; Leonardo de Oliveira Rodrigues; Carlos Eduardo Ferreira; Mari Cleide Sogayar Journal: BMC Bioinformatics Date: 2006-10-23 Impact factor: 3.169
Authors: Johanna Weindl; Pavol Hanus; Zaher Dawy; Juergen Zech; Joachim Hagenauer; Jakob C Mueller Journal: Nucleic Acids Res Date: 2007-10-16 Impact factor: 16.971