Xizeng Mao, Qin Ma, Bingqiang Liu, Xin Chen, Hanyuan Zhang, Ying Xu.
Abstract
BACKGROUND: Bacterial operons are considerably more complex than previously thought. At the very least, their components are defined dynamically rather than statically, as had been assumed. Here we present a computational study of the landscape of the transcriptional units (TUs) of E. coli K12, as revealed by the available genomic and transcriptomic data, providing a new understanding of the complexity of the TUs encoded in the E. coli K12 genome as a whole. RESULTS AND
Year: 2015 PMID: 26538447 PMCID: PMC4634151 DOI: 10.1186/s12859-015-0805-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 A diagram of a TUC and the different TU types: (a) TUs that span the entire DNA sequence covered by a TUC, referred to as full TUs; (b) starting TUs, which begin with the first gene of their parent TUC, excluding (a); (c) terminal TUs, which end with the last gene of their parent TUC, excluding (a); and (d) internal TUs, which contain neither the first nor the last gene of their parent TUC. TUs of types (b) and (d) are called non-terminal TUs, and TUs of types (c) and (d) are called non-starting TUs. Blue bars represent genes, each solid orange line represents a TU, and the dashed orange line at the bottom is a TUC
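The four TU types defined in the Fig. 1 caption can be expressed as a small classification routine. The following is a minimal sketch, not the authors' implementation: it assumes that a TU and its parent TUC are each given as a list of unique gene names ordered along the direction of transcription, and that a TU is a contiguous run of genes within its TUC. The function name `classify_tu` and the gene names `g1` to `g4` are hypothetical.

```python
from typing import Sequence


def classify_tu(tu: Sequence[str], tuc: Sequence[str]) -> str:
    """Return 'full', 'starting', 'terminal' or 'internal' for a TU in its TUC.

    Assumption: both arguments are gene lists ordered along the direction
    of transcription, with unique gene names.
    """
    tu, tuc = list(tu), list(tuc)
    if not tu:
        raise ValueError("a TU must contain at least one gene")
    start = tuc.index(tu[0])          # raises ValueError if the gene is absent
    end = start + len(tu) - 1
    if tuc[start:end + 1] != tu:
        raise ValueError("TU is not a contiguous run of genes in the TUC")

    has_first = start == 0            # TU begins with the first gene of the TUC
    has_last = end == len(tuc) - 1    # TU ends with the last gene of the TUC
    if has_first and has_last:
        return "full"                 # type (a)
    if has_first:
        return "starting"             # type (b)
    if has_last:
        return "terminal"             # type (c)
    return "internal"                 # type (d)


# Derived groupings from the caption.
NON_TERMINAL = {"starting", "internal"}   # types (b) and (d)
NON_STARTING = {"terminal", "internal"}   # types (c) and (d)

if __name__ == "__main__":
    tuc = ["g1", "g2", "g3", "g4"]        # hypothetical four-gene TUC
    for tu in (["g1", "g2", "g3", "g4"], ["g1", "g2"], ["g3", "g4"], ["g2", "g3"]):
        print(tu, "->", classify_tu(tu, tuc))
```

Checking the first and last gene separately keeps the exclusion of type (a) from types (b) and (c) explicit, which mirrors the wording of the caption.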
Fig. 2 Distributions of the number of genes and the number of TUs per multi-gene TUC across all 885 TUCs
Fig. 3 Relations between TUCs and directons in the E. coli K12 genome. (a) A represents the TUCs that match their parent directons perfectly; B, the TUCs sharing exactly the 5’ boundary genes of their parent directons; C, the TUCs sharing exactly the 3’ boundary genes of their parent directons; and D, the TUCs located properly inside their parent directons. (b) The x-axis represents the number of TUCs per directon, and the y-axis represents the number of directons containing k TUCs for k = 1, …, 8
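The two panels of Fig. 3 correspond to two simple tallies, sketched below under stated assumptions rather than as the authors' procedure. The sketch assumes that each directon and each TUC is a list of unique gene names ordered 5’ to 3’, that every TUC lies entirely within one directon, and that categories B and C mean sharing only the 5’ or only the 3’ boundary gene, respectively. The function name `relate_tucs_to_directons` and the single-letter gene names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def relate_tucs_to_directons(
    tucs: List[List[str]], directons: List[List[str]]
) -> Tuple[List[str], Dict[int, int]]:
    """Return (category per TUC, {k: number of directons containing k TUCs})."""
    # Map every gene to the index of the directon that contains it.
    gene_to_directon = {g: i for i, d in enumerate(directons) for g in d}

    categories: List[str] = []
    tucs_per_directon: Counter = Counter()
    for tuc in tucs:
        parent = gene_to_directon[tuc[0]]
        directon = directons[parent]
        tucs_per_directon[parent] += 1

        same_5 = tuc[0] == directon[0]     # shares the 5' boundary gene
        same_3 = tuc[-1] == directon[-1]   # shares the 3' boundary gene
        if same_5 and same_3:
            categories.append("A")         # perfect match
        elif same_5:
            categories.append("B")         # 5' boundary only (assumed reading)
        elif same_3:
            categories.append("C")         # 3' boundary only (assumed reading)
        else:
            categories.append("D")         # properly inside the directon

    # Panel (b): how many directons contain exactly k TUCs, for k >= 1.
    histogram = dict(Counter(tucs_per_directon.values()))
    return categories, histogram


if __name__ == "__main__":
    directons = [["a", "b", "c", "d"], ["e", "f"]]   # hypothetical directons
    tucs = [["a", "b"], ["c", "d"], ["e", "f"]]      # hypothetical TUCs
    print(relate_tucs_to_directons(tucs, directons))
```

Directons that contain no TUC never enter the histogram, which is consistent with the k = 1, …, 8 range on the x-axis of panel (b).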
Statistics of 5,430 conserved sequence motifs, 3,307 known plus 2,123 predicted TFBSs, and 3,754 predicted promoters for genes in A, B and C, respectively, with these sets defined above
| Category | Palsson A | Palsson B | Palsson C | RegulonDB A | RegulonDB B | RegulonDB C |
|---|---|---|---|---|---|---|
| Genes with TFBSs in RegulonDB | 233 (39%) | 67 (15%) | 80 (11%) | 178 (41%) | 77 (17%) | 55 (8%) |
| Genes with known promoters in RegulonDB | 229 (40%) | 59 (13%) | 47 (6%) | 173 (40%) | 66 (15%) | 29 (4%) |
Rho-independent terminators for D, E and F genes, as defined above
| Category | Palsson D | Palsson E | Palsson F | RegulonDB D | RegulonDB E | RegulonDB F |
|---|---|---|---|---|---|---|
| Genes | 271 (47%) | 46 (14%) | 102 (12%) | 229 (59%) | 70 (19%) | 65 (8%) |
Fig. 4 Histograms of TU sizes in the Palsson and RegulonDB datasets
Fig. 5 The percentage of genes having Rho-independent terminators with confidence score lower than 76, as predicted by the TransTermHP program. X represents the 1,149 3’-end genes of TUCs having Rho-independent terminators with confidence score no less than 76; Y represents the 1,919 non-3’-end genes of TUCs
Fig. 6 Boxplots of gene functional relatedness and conservation levels for the three types of gene pairs. (a) Gene functional relatedness for the Palsson and RegulonDB data sets, respectively; (b) gene location conservation levels for the Palsson and RegulonDB data sets, respectively