Michiel Wels, Christof Francke, Robert Kerkhoven, Michiel Kleerebezem, Roland J Siezen.
Abstract
Cis-acting elements in Lactobacillus plantarum were predicted by comparative analysis of the upstream regions of conserved genes and predicted transcriptional units (TUs) in different bacterial genomes. TUs were predicted for two species sets, with different evolutionary distances to L.plantarum. TUs were designated 'cluster of orthologous transcriptional units' (COT) when >50% of the genes were orthologous in different species. Conserved DNA sequences were detected in the upstream regions of different COTs. Subsequently, conserved motifs were used to scan upstream regions of all TUs. This method revealed 18 regulatory motifs only present in lactic acid bacteria (LAB). The 18 LAB-specific candidate regulatory motifs included 13 that were not described previously. These LAB-specific different motifs were found in front of genes encoding functions varying from cold shock proteins to RNA and DNA polymerases, and many unknown functions. The best-described LAB-specific motif found was the CopR-binding site, regulating expression of copper transport ATPases. Finally, all detected motifs were used to predict co-regulated TUs (regulons) for L.plantarum, and transcriptome profiling data were analyzed to provide regulon prediction validation. It is demonstrated that phylogenetic footprinting using different species sets can identify and distinguish between general regulatory motifs and LAB-specific regulatory motifs.
Year: 2006 PMID: 16614445 PMCID: PMC1435977 DOI: 10.1093/nar/gkl138
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Phylogenetic relation of species gathered from the TaxBrowser at NCBI. Relations are based on 16S rRNA sequence. Different species sets (BAC set and LAB set) were chosen on the basis of the phylogenetic distance to L.plantarum. All members of the LAB set are more closely related to L.plantarum than to members of the BAC set.
Figure 2. Schematic representation of the motif prediction procedure employed in this study. For each species, TUs were predicted and used for a COT prediction. The upstream regions of the COTs were analyzed using MEME. Following MEME analysis, predicted regulatory motifs were compared using COMPASS. The upstream regions containing significantly similar motifs were re-analyzed by MEME. This procedure was iterated until all identified motifs could be considered unique. The unique motifs of both sets were compared and, on the basis of this comparison, the motifs were divided into three different classes. All motifs were validated using MAST against other genomic sequences. Finally, regulons were predicted by scanning the genome with the identified motifs.
Figure 3. Prediction of the COTs. The TUs of the different species were compared. If >50% of the gene content of the smallest TU was shared, the TUs were considered to be orthologous and combined into one cluster. Gene order was allowed to vary in the analysis.
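The orthology criterion in the Figure 3 caption can be illustrated with a minimal sketch. It assumes, purely for illustration, that each gene is represented by a hypothetical ortholog-group label (such as `og1`), so that orthologous genes from different species carry the same label. The function names and the set representation are assumptions and are not taken from the study.

```python
def shared_fraction(tu_a, tu_b):
    """Fraction of the smaller TU's gene content that is shared with the other TU.

    Each TU is an iterable of ortholog-group labels (a hypothetical
    representation). Sets are used, so gene order does not matter.
    """
    a, b = set(tu_a), set(tu_b)
    smaller = min(len(a), len(b))
    if smaller == 0:
        return 0.0
    return len(a & b) / smaller


def is_orthologous_pair(tu_a, tu_b, threshold=0.5):
    """True when strictly more than `threshold` of the smallest TU is shared."""
    return shared_fraction(tu_a, tu_b) > threshold
```

With this representation, a two-gene TU whose genes both occur in a three-gene TU of another species passes the test, whereas two two-gene TUs that share only one gene sit exactly at 50% and do not.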
Characteristics for each species set
| | BAC | LAB |
|---|---|---|
| Number of species | 6 | 6 |
| Number of genes | 17 922 | 11 436 |
| Genes/species (mean) | 2987 | 1906 |
| Number of TUs | 9618 | 6464 |
| Genes/TU (mean) | 1.86 | 1.77 |
| TUs/species (mean) | 1603 | 1077 |
| Number of COTs | 775 | 527 |
| TUs/COT (mean) | 5.2 | 5.0 |
| Number of unique motifs ( | 195 | 195 |
(a) Since a TU could be present in several COTs, the mean number of TUs per COT is not equivalent to the total number of TUs divided by the total number of COTs.
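The point of this footnote can be checked with a toy calculation. The sketch below uses invented COT and TU names (not data from the study) in which one TU belongs to two COTs, and compares the mean COT size with the number of distinct TUs divided by the number of COTs.

```python
def mean_tus_per_cot(cots):
    """Mean COT size, counting a TU once for every COT it belongs to."""
    return sum(len(members) for members in cots.values()) / len(cots)


def distinct_tus_per_cot(cots):
    """Number of distinct TUs divided by the number of COTs."""
    distinct = {tu for members in cots.values() for tu in members}
    return len(distinct) / len(cots)


# Hypothetical example: tuC is a member of both COTs.
toy_cots = {
    "cot1": ["tuA", "tuB", "tuC"],
    "cot2": ["tuC", "tuD"],
}
```

For `toy_cots`, the mean COT size is (3 + 2) / 2 = 2.5, while the four distinct TUs divided over two COTs give 2.0, so the two quantities diverge as soon as any TU is shared between COTs.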
Figure 4. Distribution of expression correlations for gene pairs of L.plantarum. The correlation of gene pairs within a TU (black) shows a clearly different distribution from that of all gene pairs (gray).
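The comparison in Figure 4 can be sketched as follows. The sketch assumes Pearson correlation over per-gene expression profiles; the caption does not state which correlation measure was used, so that choice, the function names and the dictionary layout are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumed measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined for a constant profile
    return cov / sqrt(vx * vy)


def pair_correlations(expression, tu_of):
    """Return (within_tu, all_pairs) lists of pairwise correlations.

    expression: gene -> list of expression values across experiments
    tu_of:      gene -> identifier of the TU the gene belongs to
    """
    within_tu, all_pairs = [], []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        all_pairs.append(r)
        if tu_of[g1] == tu_of[g2]:
            within_tu.append(r)
    return within_tu, all_pairs
```

Plotting histograms of the two returned lists would give the kind of within-TU versus all-pairs comparison the caption describes.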
Selected regulatory motifs predicted by both species sets
For a complete list, see the Supplementary Data. The regulated genes shown are those found in L.plantarum; those in other species can be found in Supplementary Data. A dash between genes signifies the same TU, while a comma separates different TUs. Some of the motifs (10, 12 and 13) were not found for COTs using the LAB set. Nevertheless, occurrences of these motifs could still be found in LAB set species using MAST. n/a, not applicable.
(a) Since T-boxes have different specificities, depending on their specifier codons, genes regulated by a T-box are not co-regulated.
(b) Genes not in the dataset.
(c) If only one TU is found to have the conserved cis-acting element, no correlations for the regulon can be calculated.
Predicted LAB-specific motifs