Haeyoung Jeong, Young Mi Sim, Hyun Ju Kim, Sang Jun Lee.
Abstract
There have been extensive genome sequencing studies for Escherichia coli strains, particularly for pathogenic isolates, because fast determination of pathogenic potential and/or drug resistance and their propagation routes is crucial. For laboratory E. coli strains, however, genome sequence information is limited except for several well-known strains. We determined the complete genome sequence of laboratory E. coli strain RR1 (HB101 RecA+), which has long been used as a general cloning host. A hybrid genome sequence of K-12 MG1655 and B BL21(DE3) was constructed based on the initial mapping of Illumina HiSeq reads to each reference, and iterative rounds of read mapping, variant detection, and consensus extraction were carried out. Finally, PCR and Sanger sequencing-based finishing were applied to resolve non-single nucleotide variant regions with aberrant read depths and breakpoints, most of them resulting from prophages and insertion sequence transpositions that are not present in the reference genome sequence. We found that 96.9% of the RR1 genome is derived from K-12, and identified exact crossover junctions between K-12 and B genomic fragments. However, because RR1 has experienced a series of genetic manipulations since branching from the common ancestor, it has a set of mutations different from those found in K-12 MG1655. As well as identifying all known genotypes of RR1 on the basis of genomic context, we found novel mutations. Our results extend current knowledge of the genotype of RR1 and its relatives, and provide insights into the pedigree, genomic background, and physiology of common laboratory strains.
Keywords: Illumina HiSeq2000; K-12; evolution; laboratory strain; pedigree
Year: 2017 PMID: 28421066 PMCID: PMC5379014 DOI: 10.3389/fmicb.2017.00585
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
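The abstract describes iterative rounds of read mapping, variant detection, and consensus extraction. The toy sketch below illustrates only the general idea of such a loop and is not the authors' pipeline: the function names (`best_hit`, `consensus_round`, `iterate_consensus`), the brute-force ungapped mapper, the mismatch tolerance, and the majority-vote consensus are all assumptions made for illustration. Real workflows use dedicated aligners and variant callers.

```python
from collections import Counter


def best_hit(ref, read, max_mismatches):
    """Return the ungapped offset with the fewest mismatches, or None if
    no placement is within the tolerance (hypothetical toy mapper)."""
    best = None
    for off in range(len(ref) - len(read) + 1):
        mm = sum(a != b for a, b in zip(ref[off:off + len(read)], read))
        if mm <= max_mismatches and (best is None or mm < best[0]):
            best = (mm, off)
    return None if best is None else best[1]


def consensus_round(ref, reads, max_mismatches=1):
    """Map reads, pile up bases per position, and take the majority base.
    Positions without coverage keep the reference base."""
    piles = [Counter() for _ in ref]
    for read in reads:
        off = best_hit(ref, read, max_mismatches)
        if off is None:
            continue  # unmapped in this round
        for i, base in enumerate(read):
            piles[off + i][base] += 1
    return "".join(
        pile.most_common(1)[0][0] if pile else ref_base
        for pile, ref_base in zip(piles, ref)
    )


def iterate_consensus(ref, reads, max_rounds=10, max_mismatches=1):
    """Repeat mapping and consensus extraction until the sequence stops
    changing; return the final sequence and the number of rounds run."""
    for round_no in range(1, max_rounds + 1):
        new_ref = consensus_round(ref, reads, max_mismatches)
        if new_ref == ref:
            return ref, round_no
        ref = new_ref
    return ref, max_rounds


if __name__ == "__main__":
    reference = "AAAACCCCGGGGTTTT"
    sample = "AAAACCTCGGGGTTTT"  # one substitution relative to the reference
    reads = [sample[0:8], sample[2:10], sample[4:12]]
    print(iterate_consensus(reference, reads))
```

The loop stops when a round leaves the consensus unchanged, which is one simple convergence criterion. The source does not state which criterion was used.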
Primer sequences and their information.
| Target location | Product size (bp) | Primer ID and sequence in 5′ to 3′ direction |
|---|---|---|
| 4,578,424–4,579,228 | 805 | P1 (CAGCGATGGCAGAACA) and P2 (GCTGGCGCACGλT) |
| 4,080,807–4,082,929 | 2,123 | P3 (CCATCAATTTGCTTGGTG) and P4 (GCGCCATTGTTCCTG) |
| 4,323,223–4,325,315 | 2,093 | P5 (TTλATCATCTGCACTTCGTA) and P6 (CCAGCACCTTCλGCAG) |
| 347,834–349,362 | 1,529 | P7 (GCCTGCTCTTATTCTTTCG) and P8 (GGTGCCAACCATTCGG) |
| 2,200,850–2,202,426 | 1,577 | P9 (TCGGTTCATCGAGCATTA) and P10 (CGCGλATTGTGATTATG) |
| 803,901–805,915 | 2,015 | P11 (TGGCGCGTTAACCTTG) and P12 (CCATGCGAGATAATGCCT) |
| 1,547,272–1,549,296 | 2,015 | P13 (CCGCAGCCTCAAGCTC) and P14 (GTCACTCTAATGCGTAATGGA) |
| 1,089,926–1,091,475 | 1,550 | P15 (GCTGCGAATCAGCCAA) and P16 (GCλAGCTGGTCTTCGT) |
| 1,617,797–1,619,902 | 2,016 | P17 (GTλCACGCCCACTCG) and P18 (GCGTTATTGTCGAGTTGATG) |
| 1,942,024–1,942,247 | 224 | P19 (TTTCCTλTCGACGCAAC) and P20 (TGCGCAACATCCCATT) |
| 1,284,742–1,284,976 | 235 | P21 (TTTCCTTAACTGCTTCTCCTC) and P22 (TGCCTTAACλCATCTTTCA) |
| 4,526,800–4,527,718 Δ( | 919 | L-outer (CAACACAGGGAGCGAATA) and R-outer (ACAAGATGATGGCGATGG); inner primers L-inner (TCTGCGTAGTCTTCCTGT) and R-inner (GTTTGCGTTGCGTTTGAG)^a |
| 2,779,472–2,780,734 ( | 1,272 | recA_F (TGTTGATTCTGTCATGGCATATCCTTAC) and recA_R (GCGTATGCATTGCAGACCTTGTGGCAAC) |
| 1,630,858–1,632,327 ( | 1,450 | mlc_F (TCACTAACTCCACCGTTATGCTTC) and mlc_R (GTGCTGTTAATCACATGCCTAAG) |
| 3,718,901–3,721,216 ( | 2,316 | rhsA-L (GGATGAGλTGAGCGGA) and rhsA-R (ATGCTACCAGAGCAGTGCTT); rhsA-I (TGAGCTTCACCGACTGTT)^b |
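The primer table lists a target location and a product size for each primer pair. A quick consistency check is to compare the listed size with the inclusive span of the coordinates (end minus start plus one). The sketch below assumes that the two values are meant to agree, which the source does not state. The helper names `span_bp` and `flag_mismatches` are hypothetical. A flagged row is a prompt to re-check the original table, not a correction.

```python
import re


def span_bp(location):
    """Inclusive span in bp of a 'start–end' location string.
    Takes the first two comma-grouped integers, so trailing text is ignored."""
    start, end = (
        int(n.replace(",", ""))
        for n in re.findall(r"\d[\d,]*", location)[:2]
    )
    return end - start + 1


def flag_mismatches(rows):
    """Return (location, listed_size, computed_span) for rows that disagree."""
    return [
        (loc, listed, span_bp(loc))
        for loc, listed in rows
        if span_bp(loc) != listed
    ]


if __name__ == "__main__":
    # Three rows copied from the table above: (target location, product size)
    rows = [
        ("4,578,424–4,579,228", 805),
        ("803,901–805,915", 2015),
        ("1,547,272–1,549,296", 2015),
    ]
    for loc, listed, computed in flag_mismatches(rows):
        print(f"{loc}: listed {listed} bp, coordinate span {computed} bp")
```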
Large-scale insertions and deletions.
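| Evidence | Category | Locus | Description | PCR primer pair^a |
|---|---|---|---|---|
| Zero-coverage region | Deletion | | CPZ-55 prophage exists only in K-12 MG1655 | None^b |
| Zero-coverage region | Deletion | | CP4–6 prophage (exclusive of IS | |
| Zero-coverage region | Deletion | | Within DLP12 prophage region | |
| Zero-coverage region | Deletion | | Rac prophage exists only in K-12 MG1655; first eight amino acids of TtcA protein are not identical between RR1 and K-12 MG1655 | |
| Breakpoint analysis | Insertion (IS) | | IS | P3-P4 |
| Breakpoint analysis | Insertion (IS) | | IS | P5-P6 |
| Breakpoint analysis | Insertion (IS) | | IS | P7-P8 |
| Breakpoint analysis | Insertion (IS) | | IS | P9-P10 |
| Breakpoint analysis | Insertion (IS) | | IS | P11-P12 |
| Breakpoint analysis | Insertion (IS) | | IS | P13-P14 |
| Breakpoint analysis | Insertion (IS) | | IS | P15-P16 |
| Breakpoint analysis | Insertion (IS) | | IS | P17-P18 |
| Breakpoint analysis | Deletion (IS-mediated) | | IS | P19-P20 |
| Breakpoint analysis | Deletion (IS-mediated) | | IS | P21-P22 |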
Genotypes and characteristics of Escherichia coli HB101 and RR1 strains reported in the literature or referenced on websites.
| Characteristic description^a | Reference or website |
|---|---|
| F- Pro- Gal- StrR Rec- r-B m-B | |
| F- | |
| F- | The Coli Genetics Stock Center^b |
| F- | OpenWetWare^c |
| F- Δ( | NEB^d |
| F- Δ( | Sigma-Aldrich^e |
| F- | |
| F-; the same as HB101 except | |
| HB101 | OpenWetWare |
| HB101 RecA+ | Sigma-Aldrich |
Genotype of the RR1 strain based on its complete genome sequence.
| Genotype | Mutations revealed by comparisons with wild-type gene sequences |
|---|---|
| | Ser262 → Pro (SR35_00345) |
| Δ( | Δ( |
| | Glu134 → STOP (SR35_03810) |
| | The presence of tRNA-Gln(CUG) (SR35_03325) |
| | Incompatible with |
| | IS |
| | Ser286 → Leu (SR35_00395) |
| | Gln369 → STOP (SR35_08090) |
| | Multiple mutations in |
| | Wild-type |
| | IS |
| | Frameshift (SR35_10405) |
| | Gln33 → STOP (SR35_13990) |
| | Lys43 → Thr (SR35_17055) |
| | Asp70 → Ala (SR35_20475) |
| | Trp69 → STOP (SR35_18240) |