Abstract
Escherichia coli (E. coli) bacteria can damage the DNA of gut lining cells and, according to recent reports, may encourage the development of colon cancer. Genetic switches are specific sequence motifs, and many of them are drug targets, so it is of interest to identify motifs and their locations in sequences. In the present study, the Gibbs sampler algorithm was used to predict functional motifs in E. coli NC101 contig 1. The whole genomic sequence of E. coli NC101 contig 1 was retrieved from http://www.ncbi.nlm.nih.gov (NCBI Reference Sequence: NZ_AEFA01000001.1) and analyzed with DAMBE software and BLAST. The results showed that the 6-mer motif is CUGGAA in most sequences (genes 1-3, 8, 9, 12, 14-18, 20-23, 25, 27, 29, 31-34), CUUGUA for gene 4, CUGUAA for gene 5, CUGAUG for gene 6, CUGAUA for gene 7, CUGAAA for genes 10, 11, 13, 26 and 28, CUGGAG for gene 19, and CUGGUA for genes 24 and 30. It is concluded that the 6-mer motif is CUGGAA in most sequences (22 of 34) of E. coli NC101 contig 1. The present study may help experimental studies to elucidate the pharmacological and phylogenetic functions of the motifs in E. coli.
Keywords: Escherichia coli; functional motifs; gene expression
Year: 2013 PMID: 24551810 PMCID: PMC3927380
Source DB: PubMed Journal: Int J Mol Cell Med ISSN: 2251-9637
Fig. 1. The sequences of E. coli NC101 contig 1. The upper panel (a) shows the data input to the Gibbs sampler. The lower panel (b) shows the output, with the motifs (e.g., CUGGAA) highlighted in red within the sequences. S1-S34 correspond to sequence 1 to sequence 34.
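To illustrate the kind of procedure behind this input and output, the following is a minimal sketch of a Gibbs site sampler in Python. It is not DAMBE's implementation: the function name `gibbs_site_sampler`, the toy sequences, the uniform background, the pseudocount and the iteration count are all assumptions chosen for illustration.

```python
import random

BASES = "ACGU"


def gibbs_site_sampler(seqs, width=6, iterations=200, pseudo=1.0, seed=0):
    """Return one sampled motif start (0-based) per sequence.

    Assumes a uniform background, so the usual motif/background ratio
    reduces to the motif probability alone.
    """
    rng = random.Random(seed)
    # Start from one random motif position per sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            # Site-specific counts from every sequence except the held-out one,
            # initialised with a small pseudocount so no probability is zero.
            counts = [{b: pseudo / len(BASES) for b in BASES} for _ in range(width)]
            for j, other in enumerate(seqs):
                if j != i:
                    for k in range(width):
                        counts[k][other[starts[j] + k]] += 1
            total = len(seqs) - 1 + pseudo
            # Weight every candidate window of the held-out sequence
            # by its probability under the current motif model.
            weights = []
            for start in range(len(seq) - width + 1):
                w = 1.0
                for k in range(width):
                    w *= counts[k][seq[start + k]] / total
                weights.append(w)
            # Sample the new start position in proportion to those weights.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts


# Hypothetical toy sequences, each containing CUGGAA somewhere.
toy = ["AUGCUGGAAUCG", "GGACUGGAACCA", "CUGGAAGUACGU", "UACGACUGGAAG"]
starts = gibbs_site_sampler(toy)
for seq, start in zip(toy, starts):
    print(start + 1, seq[start:start + 6])  # 1-based start and sampled 6-mer
```

Because the sampler is stochastic, the reported sites depend on the seed and the number of iterations; practical tools typically run several restarts and keep the best-scoring alignment.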
Gibbs sampler output
Nucleotide counts and frequencies

| Code | Count | Freq |
|---|---|---|
| A | 7622 | 0.2419 |
| C | 7977 | 0.2532 |
| G | 8879 | 0.2818 |
| U | 7031 | 0.2231 |

Site-specific counts (motif positions 1-6)

| Position | A | C | G | U |
|---|---|---|---|---|
| 1 | 0 | 34 | 0 | 0 |
| 2 | 0 | 0 | 0 | 34 |
| 3 | 0 | 0 | 33 | 1 |
| 4 | 7 | 0 | 26 | 1 |
| 5 | 29 | 0 | 0 | 5 |
| 6 | 32 | 0 | 2 | 0 |

Site-specific frequencies

| Position | A | C | G | U |
|---|---|---|---|---|
| 1 | 0.00691 | 0.97866 | 0.00805 | 0.00638 |
| 2 | 0.00691 | 0.00723 | 0.00805 | 0.97780 |
| 3 | 0.00691 | 0.00723 | 0.95091 | 0.03495 |
| 4 | 0.20691 | 0.00723 | 0.75091 | 0.03495 |
| 5 | 0.83548 | 0.00723 | 0.00805 | 0.14923 |
| 6 | 0.92120 | 0.00723 | 0.06519 | 0.00638 |

Position weight matrix (log-odds scores)

| Position | A | C | G | U |
|---|---|---|---|---|
| 1 | -3.55288 | 1.34992 | -3.55495 | -3.55600 |
| 2 | -3.55288 | -3.55757 | -3.55495 | 1.47685 |
| 3 | -3.55288 | -3.55757 | 1.21665 | -1.85463 |
| 4 | -0.15376 | -3.55757 | 0.98051 | -1.85463 |
| 5 | 1.24196 | -3.55757 | -3.55495 | -0.40295 |
| 6 | 1.33962 | -3.55757 | -1.46340 | -3.55600 |
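
The frequency and weight values above can be reproduced from the count tables under two assumptions, both inferred from the numbers and not stated in the source. First, each position receives a total pseudocount of 1, distributed according to the overall nucleotide frequencies. Second, each weight is the natural log of the site frequency divided by a background frequency that excludes the 204 nucleotides inside the 34 motif sites. A minimal Python sketch under those assumptions:

```python
import math

BASES = "ACGU"

# Nucleotide counts over all input sequences (first table).
total_counts = {"A": 7622, "C": 7977, "G": 8879, "U": 7031}

# Site-specific counts for the 34 aligned 6-mers (second table).
site_counts = [
    {"A": 0, "C": 34, "G": 0, "U": 0},
    {"A": 0, "C": 0, "G": 0, "U": 34},
    {"A": 0, "C": 0, "G": 33, "U": 1},
    {"A": 7, "C": 0, "G": 26, "U": 1},
    {"A": 29, "C": 0, "G": 0, "U": 5},
    {"A": 32, "C": 0, "G": 2, "U": 0},
]

n_total = sum(total_counts.values())
overall = {b: total_counts[b] / n_total for b in BASES}

# Assumption: the background excludes nucleotides that lie inside motif sites.
n_sites = sum(sum(row.values()) for row in site_counts)
background = {
    b: (total_counts[b] - sum(row[b] for row in site_counts)) / (n_total - n_sites)
    for b in BASES
}


def site_frequencies(row, pseudo=1.0):
    """Frequencies with a pseudocount spread by overall base composition (assumed)."""
    n = sum(row.values())
    return {b: (row[b] + pseudo * overall[b]) / (n + pseudo) for b in BASES}


freqs = [site_frequencies(row) for row in site_counts]
# Natural-log odds of site frequency versus background (assumed form of the weights).
pwm = [{b: math.log(f[b] / background[b]) for b in BASES} for f in freqs]

for pos, (f, w) in enumerate(zip(freqs, pwm), start=1):
    print(pos,
          {b: round(f[b], 5) for b in BASES},
          {b: round(w[b], 5) for b in BASES})
```

Under these assumptions the computed values agree closely with the tabulated ones, which suggests, but does not prove, that the software used a similar scheme.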
Gibbs sampler results for E. coli NC101 contig 1 sequences: motif, start location and position weight matrix score (PWMS)
| SeqName | Motif | Start | PWMS |
|---|---|---|---|
| lcl|NZ_AEFA01000001.1_gene_1 | CUGGAA | 20 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_2 | CUGGAA | 414 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_3 | CUGGAA | 225 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_4 | CUUGUA | 31 | 17.9811 |
| lcl|NZ_AEFA01000001.1_gene_5 | CUGUAA | 32 | 117.9619 |
| lcl|NZ_AEFA01000001.1_gene_6 | CUGAUG | 39 | 7.5632 |
| lcl|NZ_AEFA01000001.1_gene_7 | CUGAUA | 1 | 124.7508 |
| lcl|NZ_AEFA01000001.1_gene_8 | CUGGAA | 438 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_9 | CUGGAA | 39 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_10 | CUGAAA | 471 | 646.2746 |
| lcl|NZ_AEFA01000001.1_gene_11 | CUGAAA | 237 | 646.2746 |
| lcl|NZ_AEFA01000001.1_gene_12 | CUGGAA | 564 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_13 | CUGAAA | 213 | 646.2746 |
| lcl|NZ_AEFA01000001.1_gene_14 | CUGGAA | 330 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_15 | CUGGAA | 144 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_16 | CUGGAA | 504 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_17 | CUGGAA | 159 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_18 | CUGGAA | 417 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_19 | CUGGAG | 63 | 121.8117 |
| lcl|NZ_AEFA01000001.1_gene_20 | CUGGAA | 606 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_21 | CUGGAA | 435 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_22 | CUGGAA | 63 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_23 | CUGGAA | 237 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_24 | CUGGUA | 671 | 387.8401 |
| lcl|NZ_AEFA01000001.1_gene_25 | CUGGAA | 765 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_26 | CUGAAA | 153 | 646.2746 |
| lcl|NZ_AEFA01000001.1_gene_27 | CUGGAA | 387 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_28 | CUGAAA | 105 | 646.2746 |
| lcl|NZ_AEFA01000001.1_gene_29 | CUGGAA | 552 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_30 | CUGGUA | 24 | 387.8401 |
| lcl|NZ_AEFA01000001.1_gene_31 | CUGGAA | 147 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_32 | CUGGAA | 348 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_33 | CUGGAA | 47 | 2009.2156 |
| lcl|NZ_AEFA01000001.1_gene_34 | CUGGAA | 354 | 2009.2156 |
Mean 1429.4078
Standard deviation 812.3610
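
The PWMS values appear to equal the exponential of the summed position weights of each motif, that is, the product of its per-position odds ratios. This relationship, and the use of the sample (n - 1) standard deviation, are inferred from the tabulated numbers and not stated in the source. The sketch below recomputes the scores from the log-odds position weight matrix and the motif assignments in the table; the names `PWM`, `pwms` and `motif_counts` are illustrative.

```python
import math
import statistics

# Log-odds position weight matrix as tabulated (rows are motif positions 1-6).
PWM = [
    {"A": -3.55288, "C": 1.34992, "G": -3.55495, "U": -3.55600},
    {"A": -3.55288, "C": -3.55757, "G": -3.55495, "U": 1.47685},
    {"A": -3.55288, "C": -3.55757, "G": 1.21665, "U": -1.85463},
    {"A": -0.15376, "C": -3.55757, "G": 0.98051, "U": -1.85463},
    {"A": 1.24196, "C": -3.55757, "G": -3.55495, "U": -0.40295},
    {"A": 1.33962, "C": -3.55757, "G": -1.46340, "U": -3.55600},
]


def pwms(motif):
    """Exponential of the summed log-odds weights (assumed definition of PWMS)."""
    return math.exp(sum(PWM[i][base] for i, base in enumerate(motif)))


# Number of genes assigned to each motif in the results table.
motif_counts = {
    "CUGGAA": 22, "CUGAAA": 5, "CUGGUA": 2, "CUUGUA": 1,
    "CUGUAA": 1, "CUGAUG": 1, "CUGAUA": 1, "CUGGAG": 1,
}

# One score per gene, 34 in total.
scores = [pwms(m) for m, k in motif_counts.items() for _ in range(k)]

for motif in motif_counts:
    print(motif, round(pwms(motif), 2))
print("mean", round(statistics.mean(scores), 2))
print("sample standard deviation", round(statistics.stdev(scores), 2))
```

Small differences from the tabulated scores are expected because the weights are rounded to five decimal places before exponentiation.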
Fig. 2. Scatter diagram of S1D and S2D in E. coli NC101 contig 1 sequences