Chengwei Luo, Gang-Qing Hu, Huaiqiu Zhu.
Abstract
BACKGROUND: Uropathogenic Escherichia coli strain CFT073 is a human pathogen whose genome was sequenced and published in 2002, a significant step in pathogenic bacterial genomics research. However, the current RefSeq annotation of this pathogen is partly outdated, because some essential genes associated with its virulence are missing or misannotated. We carried out a systematic reannotation, combining automated annotation tools with manual effort, to provide a comprehensive understanding of virulence in the CFT073 genome.
Year: 2009 PMID: 19930606 PMCID: PMC2785843 DOI: 10.1186/1471-2164-10-552
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Overview of the differences between the original RefSeq annotation and the reannotation
| Feature | Original annotation | Reannotation |
|---|---|---|
| Genome length | 5,231,426 bp | |
| Plasmids | None | |
| G+C% | 50.47% | |
| Protein-coding genes | 5,339 | 5,030 |
| tRNAs | 89 | |
| rRNAs | 21 | |
| Miscellaneous RNAs | 6 | 46 |
| Backbone genes | 4,550 (4,440 protein-coding genes, 85 tRNA genes, 21 rRNA genes, and 4 miscellaneous RNA genes) | 4,328 (4,178 protein-coding genes, 85 tRNA genes, 21 rRNA genes, and 44 miscellaneous RNA genes) |
| Genomic island genes | 905 (899 protein-coding genes, 4 tRNA genes, and 2 miscellaneous RNA genes) | 851 (845 protein-coding genes, 4 tRNA genes, and 2 miscellaneous RNA genes) |
| Cryptic prophages | 5 | |
The concept of miscellaneous RNA here includes tRNAs, rRNAs and all other RNAs;
In this comparison, we define the genes other than those in the genomic island regions as backbone genes;
Genomic island genes are the genes located in genomic island regions (data from Lloyd et al. [47]).
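The category totals in the overview table can be checked against their listed components with simple arithmetic. The following is a minimal sketch of that check; the counts are transcribed from the table, while the dictionary layout and the name `check_totals` are illustrative assumptions rather than anything from the source.

```python
# Counts transcribed from the overview table. The dictionary layout and the
# function name are illustrative assumptions, not part of the source.
COUNTS = {
    # parts: protein-coding, tRNA, rRNA, miscellaneous RNA genes
    ("original", "backbone"): {"total": 4550, "parts": [4440, 85, 21, 4]},
    ("reannotation", "backbone"): {"total": 4328, "parts": [4178, 85, 21, 44]},
    # parts: protein-coding, tRNA, miscellaneous RNA genes
    ("original", "island"): {"total": 905, "parts": [899, 4, 2]},
    ("reannotation", "island"): {"total": 851, "parts": [845, 4, 2]},
}


def check_totals(counts):
    """Map each (annotation, region) key to True when its parts sum to the total."""
    return {key: sum(row["parts"]) == row["total"] for key, row in counts.items()}


if __name__ == "__main__":
    for key, ok in check_totals(COUNTS).items():
        print(key, "consistent" if ok else "inconsistent")
```

Each backbone and genomic island total equals the sum of the gene counts given in its parentheses.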
List of newly added mobile genetic element-related genes
| ID | Start site | Stop site | Strand | Comments |
|---|---|---|---|---|
| c0012r | 131965 | 132090 | Forward | Transposase for insertion sequence |
| c0024r | 255849 | 256166 | Forward | Transposase protein |
| c0027r | 270919 | 270674 | Reverse | Phage integrase family protein |
| c0039r | 331709 | 331584 | Reverse | Predicted integrase protein |
| c0042r | 376998 | 377138 | Forward | Homologue to Iso-IS1-insB protein |
| c0053r | 627463 | 627155 | Reverse | Putative prophage integrase protein, IntD |
| c0054r | 627782 | 627483 | Reverse | Putative integrase |
| c0100r | 1234499 | 1235998 | Forward | R6-like transposase protein |
| c0101r | 1235995 | 1236750 | Forward | Insertion sequence ATP-binding protein |
| c0175r | 2349098 | 2349277 | Forward | Putative transposase |
| c0215r | 3452165 | 3451905 | Reverse | Insertion sequence protein |
| c0250r | 4284055 | 4283780 | Reverse | IS element |
| c0251r | 4291696 | 4292046 | Forward | Transposase |
| c0252r | 4293121 | 4293074 | Reverse | Transposase IS3/IS911 family protein |
The ID code follows the order of gene loci on the chromosome in the reannotation.
The comments on function are derived from CDD rps-blast results [16] and Swiss-Prot psi-search results [15].
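In the table above, reverse-strand genes are listed with a start site larger than the stop site. As a minimal sketch, gene length and strand can be derived from the coordinates alone, assuming the coordinates are 1-based and inclusive (an assumption; the source does not state the convention). The function names are hypothetical, and only a few rows are transcribed.

```python
# A few rows transcribed from the table above: (ID, start, stop, strand).
ROWS = [
    ("c0012r", 131965, 132090, "Forward"),
    ("c0027r", 270919, 270674, "Reverse"),
    ("c0100r", 1234499, 1235998, "Forward"),
    ("c0252r", 4293121, 4293074, "Reverse"),
]


def gene_length(start, stop):
    """Length in bp, assuming 1-based inclusive coordinates (an assumption)."""
    return abs(stop - start) + 1


def inferred_strand(start, stop):
    """Strand implied by coordinate order: start > stop means reverse."""
    return "Forward" if start <= stop else "Reverse"


if __name__ == "__main__":
    for gene_id, start, stop, strand in ROWS:
        length = gene_length(start, stop)
        agrees = inferred_strand(start, stop) == strand
        print(gene_id, length, "bp, strand column agrees:", agrees)
```

Under the inclusive-coordinate assumption, the four transcribed genes have lengths of 126, 246, 1500 and 48 bp, each a multiple of three.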
Figure 1. Differences between the RefSeq annotation and the reannotation in the cdiAB region. In the reannotation, three pseudogenes (c0342, c0343 and c0344) are merged into two genes, which are found to be homologues of the contact-dependent growth inhibitor-encoding genes cdiAB.
Figure 2. Conserved structure of the mch and mcm operons in different E. coli strains. The Fur boxes are marked by orange lines; the text in brackets under a gene name gives the ID of the novel gene. The line at the bottom shows the partial structure of PAI-CFT073-serX; the numbers on the genes denote their positions in the genome.
Figure 3. Structural domains of the c0139 product in the reannotation and of the autotransporter virulence factor antigen 43. The domain in blue, also termed the autotransporter domain, can form a β-barrel and is a key component in self-export; the domain in green is the passenger domain, which varies widely among autotransporters (ATs); the segment in red is the signal peptide, which guides the whole protein during translocation; and the dashed line indicates the cleavage site at which a protease cuts the AT and releases the passenger segment.
The newly added colicin and colicin-related genes
| ID | Start site | Stop site | Strand | Comments |
|---|---|---|---|---|
| c0009r | 127216 | 128997 | Forward | Uropathogenic specific S-type colicin |
| c0010r | 129401 | 129688 | Forward | Putative colicin |
| c0011r | 129691 | 129984 | Forward | Putative colicin immunity protein |
| c0094r | 1176363 | 1176527 | Forward | Protein MchX |
| c0095r | 1176596 | 1176805 | Forward | Microcin immunity protein, MchI |
| c0096r | 1182901 | 1183122 | Forward | Microcin immunity protein, McmI |
| c0097r | 1183119 | 1183397 | Forward | McmA protein |