Mark J Mandel, Eric V Stabb, Edward G Ruby.
Abstract
BACKGROUND: Sequence closure often represents the end-point of a genome project, without a system in place for subsequent improvement and refinement. Building on the genome project of Vibrio fischeri ES114, we used a comparative approach to identify and investigate genes that had a high likelihood of sequence error.
Year: 2008 PMID: 18366731 PMCID: PMC2330054 DOI: 10.1186/1471-2164-9-138
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
V. fischeri ES114 loci modified due to sequence changes.
| Locus tag | Product | Correction type | s/m | Result | v1.0 ORF(s) | | |
| VF_0040 | transcriptional regulator, LysR family | fs | s | fusion | VF0039 | 101 | |
| VF_0044 | predicted recombination limiting protein | fs | s | fusion | VF0045 | 102 | |
| VF_0056 | ATP-dependent RNA helicase | fs | s | fusion | VF0055 | 103 | |
| VF_0093 | adenosine deaminase | dl | m | fusion | VF0092 | 104 | |
| VF_0124 | division inhibitor | fs | s | fusion | VF0123 | 105 | |
| VF_0157 | WbfB protein | fs, ms, n | m | fusion | VF0156 | 106 | |
| VF_0160 | WbfD protein | fs | s | fusion | VF0159 | 107 | |
| VF_0214 | phosphoribulokinase | fs | s | fusion | VF0213 | 109 | |
| VF_0220 | potassium:proton antiporter | fs, ns | m | fusion | VF0221 | 110 | |
| VF_0235 | 50S ribosomal subunit protein L3 | fs | s | fusion | VF0236 | 111 | |
| VF_0246 | 50S ribosomal subunit protein L14 | fs | s | fusion | VF0247 | 112 | |
| VF_0256 | 50S ribosomal subunit protein L15 | fs | s | 3' extension | | 168 | |
| VF_0281 | predicted inner membrane protein | fs | s | fusion | VF0282 | 113 | |
| VF_0300 | putative salt-induced outer membrane protein | fs | s | fusion | VF0299 | 114 | |
| VF_0397 | predicted ABC-type organic solvent transporter | fs | s | fusion | VF0398 | 116 | |
| VF_0418 | diacylglycerol kinase | fs | m | 3' extension | | 169 | |
| VF_0420 | membrane-bound lytic murein transglycosylase C | fs, ms | m | fusion | VF0419 | 117 | |
| VF_0481 | phosphoglucosamine mutase | fs | m | fusion | VF0482 | 118 | |
| VF_0651 | amino-acid abc transporter binding protein | fs | s | 3' extension | | 170 | |
| VF_0657 | succinylglutamate desuccinylase/aspartoacylase family protein | n | s | ambiguous residue clarified | | 179 | |
| VF_0729 | sodium-translocating NADH:quinone oxidoreductase, subunit E | fs | s | fusion | VF0730 | 119 | |
| VF_0762 | predicted GTP-binding protein | fs, ms | m | fusion | VF0761 | 120 | |
| VF_0960 | membrane anchored protein in TolA-TolQ-TolR complex | dl | m | fusion | VF0961 | 171 | |
| VF_0993 | secretion protein IcmF | dl | m | fusion | VF0992 | 182 | |
| VF_1031 | anthranilate phosphoribosyltransferase | fs | s | fusion | VF1030 | 122 | |
| VF_1214 | threonyl-tRNA synthetase | fs | s | fusion | VF1215 | 123 | |
| VF_1304 | copper-exporting ATPase | fs | s | fusion | VF1305 | 125 | |
| VF_1308 | transcriptional regulatory protein Fnr, global regulator of anaerobic growth | ms, ns | m | fusion | VF1309 | 183 | |
| VF_1358 | formate dehydrogenase N, gamma subunit | fs | m | fusion | VF1357 | 126 | |
| VF_1515 | GGDEF domain protein | fs | s | 3' extension | | 185 | |
| VF_1669 | dihydroxynaphthoic acid synthetase | fs | s | fusion | VF1668 | 127 | |
| VF_1771 | serine kinase PrkA | dl | m | fusion | VF1772 | 128 | |
| VF_2633 | lipoprotein, putative | dl | m | none | | 172 | |
| VF_1828 | C-terminal CheW domain, putative chemotaxis coupling protein | fs | s | fusion, 3' extension | VF1827 | 129 | |
| None | Intergenic: VF_1856 – VF_1858 | dl | m | deletion | VF1857 | 130 | |
| VF_1895 | PEP-protein phosphotransferase of PTS system (enzyme I) | fs | s | fusion | VF1896 | 184 | |
| VF_1932 | acyl coenzyme A dehydrogenase | fs | s | fusion | VF1933 | 131 | |
| VF_1938 | hydroxyacylglutathione hydrolase | ms, n | m | amino acid substitutions | | 121 | |
| VF_1945 | tRNA(Ile)-lysidine synthetase | dl | m | fusion | VF1944 | 132 | |
| VF_2049 | maltodextrin glucosidase | fs, ns | m | fusion | VF2050 | 133 | |
| VF_2078 | nucleoside triphosphate pyrophosphohydrolase | fs | s | fusion | VF2077 | 134 | |
| VF_2152 | ammonium transporter | fs | s | 3' extension | | 173 | |
| VF_2166 | poly(A) polymerase I | fs, ms, ns | m | fusion | VF2167 | 135 | |
| VF_2181 | pyruvate dehydrogenase, decarboxylase component E1, thiamin-binding | dl | m | fusion | VF2180 | 136 | |
| VF_2199 | cell division protein FtsQ | fs | m | fusion | VF2198 | 137 | |
| VF_2220 | ubiquinol-cytochrome c reductase iron-sulfur subunit | fs | s | fusion | VF2219 | 138 | |
| VF_2252 | DNA primase | fs, ms, ns | m | fusion | VF2253 | 139 | |
| VF_2347 | serine acetyltransferase | fs, ms | m | fusion | VF2346 | 140 | |
| VF_2366 | high-affinity zinc uptake system protein ZnuA2 | fs | s | fusion | VF2365 | 141 | |
| VF_2370 | predicted enzyme | fs | s | fusion | VF2371 | 142 | |
| VF_2377 | hypothetical protein | dl | m | fusion | VF2378 | 143 | |
| VF_2383 | acetyl-CoA synthetase | fs | m | fusion | VF2384 | 144 | |
| VF_2389 | tRNA-dihydrouridine synthase B | fs, ms | m | fusion | VF2390 | 145 | |
| VF_2412 | RNA polymerase, beta prime subunit | fs | m | fusion | VF2411 | 146 | |
| VF_2414 | RNA polymerase, beta subunit | fs | m | fusion, 5' extension | VF2413 | 147–148 | |
| VF_2418 | 50S ribosomal subunit protein L1 | fs | m | fusion | VF2417 | 149 | |
| VF_2421 | transcription termination factor NusG | fs | m | fusion | VF2420 | 150 | |
| VF_2450 | RNA polymerase, sigma-32 (sigma-H) factor | fs, ms, n | m | fusion | VF2449 | 151 | |
| VF_2463 | ADP-ribose diphosphatase | fs | s | 5' extension | | 174 | |
| VF_2528 | ketol-acid reductoisomerase, NAD(P)-binding | dl | m | fusion | VF2526, VF2527 | 152 | |
| VF_A0046 | acriflavin resistance plasma membrane protein | fs | m | fusion | VFA0047 | 153 | |
| VF_A0244 | GGDEF/EAL domains protein | fs, dl | m | fusion | VFA0242, VFA0243 | 154–155 | |
| VF_A0251 | formate dehydrogenase-H | fs | m | fusion | VFA0252 | 156 | |
| VF_A0304 | hypothetical protein | fs | s | 5' extension | | 176 | |
| VF_A0338 | putative glucosyl hydrolase precursor | fs | m | fusion | VFA0337 | 158 | |
| VF_A0353 | galactose-1-phosphate uridylyltransferase | fs | m | fusion | VFA0354 | 159 | |
| VF_A0432 | fused chromosome partitioning protein: predicted nucleotide hydrolase | fs, ms | m | fusion | VFA0433 | 160 | |
| VF_A0460 | transcription-repair coupling factor | fs | s | fusion | VFA0459 | 161 | |
| None | Intergenic: VF_A0655-VF_A0666 | fs, ms, n | m | | | 178 | |
| VF_A0832 | proline dehydrogenase | dl | m | fusion | VFA0831 | 162 | |
| VF_A0856 | hypothetical protein | dl | m | fusion | VFA0855 | 163 | |
| VF_A1008 | hypothetical protein | fs, ms | m | fusion | VFA1009 | 165 | |
| VF_A1152 | multidrug efflux system | fs | m | fusion | VFA1151, VFA1150 | 166 | |
| VF_A1156 | ATP-dependent DEXH-box helicase | dl | m | fusion | VFA1157 | 167 |
Correction types: dl, large deletion; fs, frameshift; ms, missense; ns, nonsense; n, ambiguous nucleotide. s/m indicates whether (s)ingle or (m)ultiple nucleotides were affected by the sequence change.
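The abbreviations in this legend can be expanded mechanically when working with the table rows. The following is a minimal sketch, assuming rows are pipe-delimited as they appear here; the names `CORRECTION_TYPES`, `EXTENT`, `expand_correction_types`, and `parse_row` are illustrative and do not come from the paper.

```python
# Hypothetical helpers for reading the correction table; all names are illustrative.
CORRECTION_TYPES = {
    "dl": "large deletion",
    "fs": "frameshift",
    "ms": "missense",
    "ns": "nonsense",
    "n": "ambiguous nucleotide",
}
EXTENT = {"s": "single nucleotide", "m": "multiple nucleotides"}


def expand_correction_types(codes):
    """Expand a comma-separated code string such as 'fs, ms, n'."""
    return [CORRECTION_TYPES[code.strip()] for code in codes.split(",")]


def parse_row(row):
    """Parse the first four cells of one pipe-delimited table row."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    return {
        "locus": cells[0],
        "product": cells[1],
        "corrections": expand_correction_types(cells[2]),
        "extent": EXTENT[cells[3]],
    }


row = "| VF_0157 | WbfB protein | fs, ms, n | m | fusion | VF0156 | 106 | |"
print(parse_row(row))
```

Only the first four columns are interpreted, since those are the ones the legend defines.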
Figure 1. Types of genomic changes described. Examples of the types of chromosomal corrections (A-C) and annotation corrections (D-F) described throughout the paper. The case in (B) shows the artefactual expansions that were removed in this analysis. v1 refers to the previously published version 1.0 release, and v2 refers to the version 2.0 release reported here.
Figure 2. Evidence of expansions at multiple chromosomal sites. The fourteen resequencing targets examined had extraneous sequence in the published version. In each case, correction of the error required large deletions (over 300 bp). For each of the targets examined, the closed arrowhead indicates the band observed upon amplification with the PCR primers listed, whereas the open arrowhead indicates the size of the product expected by the sequence in the published version 1.0. Marker sizes are indicated in kb.
Pseudogenes described in ES114 version 2.0.
| VF_0198 | VF0198, VF0199 | | +1 frameshift | 108 |
| VF_1268 | VF1267, VF1268 | | amber nonsense codon and 5 bp repeat expansion | 124 |
| VF_A0141 | VFA0141 | putative transporter, NadC family protein | -1 frameshift | 175 |
| VF_A0270 | VFA0270, VFA0271 | transcriptional regulator, LysR family | amber nonsense codon | 157 |
| VF_A0466 | VFA0466 | N-acetylglucosaminyltransferase | -1 frameshift | 177 |
Summary of 113 new gene features in ES114 version 2.0.
| | Regulatory RNAs | Operon leader peptides | Protein-coding genes | |
| Chromosome I | 9 (9) | 6 (6) | 73 (13) | |
| Chromosome II | 1 (1) | 0 (0) | 22 (3) | |
| Plasmid pES100 | 0 (0) | 0 (0) | 2 (2) |
Numbers in parentheses indicate subset of features that have an annotation other than "hypothetical."
Includes csrB1 and csrB2 [36].
Includes two genes predicted from [61].
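The per-replicon counts in the summary table can be cross-checked against the stated total of 113 new features. This is a minimal sketch with the counts transcribed by hand from the table; the dictionary layout and variable names are illustrative assumptions.

```python
# Counts transcribed from the summary table as (total, annotated) pairs,
# where "annotated" means an annotation other than "hypothetical".
new_features = {
    "Chromosome I": {
        "regulatory_rnas": (9, 9),
        "leader_peptides": (6, 6),
        "protein_coding": (73, 13),
    },
    "Chromosome II": {
        "regulatory_rnas": (1, 1),
        "leader_peptides": (0, 0),
        "protein_coding": (22, 3),
    },
    "Plasmid pES100": {
        "regulatory_rnas": (0, 0),
        "leader_peptides": (0, 0),
        "protein_coding": (2, 2),
    },
}

# Sum across all replicons and feature classes.
total = sum(
    count for replicon in new_features.values() for count, _ in replicon.values()
)
annotated = sum(
    named for replicon in new_features.values() for _, named in replicon.values()
)

print(total, annotated)  # prints "113 34"
```

The total matches the 113 features named in the table title; 34 of them carry an annotation other than "hypothetical."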
ES114 genes encoding transcriptional machinery.
| VF_0262 | α subunit | ||
| VF_2414 | β subunit | ||
| VF_2412 | β' subunit | ||
| VF_0105 | ω subunit | ||
| VF_2254 | σD/σ70 | Group 1: σ70-type | |
| VF_2067 | σS | Group 2: σ70-type, σ38-subtype | |
| VF_A1015 | σQ | Group 2: σ70-type, σ38-subtype | |
| VF_2450 | σH | Group 3: σ70-type, σ32-subtype | |
| VF_1834 | σF | Group 3: σ70-type, σ28-subtype | |
| VF_2093 | σE | Group 4: σ70-type, σ24-subtype | |
| VF_0972 | σE2 | Group 4: σ70-type, σ24-subtype | |
| VF_A0820 | σE3 | Group 4: σ70-type, σ24-subtype | |
| VF_A0766 | σE4 | Group 4: σ70-type, σ24-subtype | |
| VF_2498 | σE5 | Group 4: σ70-type, σ24-subtype | |
| VF_0387 | σN | σ54-type | |