Scott A Jackson, Isha R Patel, Tammy Barnaba, Joseph E LeClerc, Thomas A Cebula.
Abstract
BACKGROUND: The gene content of a diverse group of 183 unique Escherichia coli and Shigella isolates was determined using the Affymetrix GeneChip® E. coli Genome 2.0 Array, originally designed for transcriptome analysis, as a genotyping tool. The probe set design utilized by this array provided the opportunity to determine the gene content of each strain very accurately and reliably. This array represents 10,112 independent genes from four individual E. coli genomes, and therefore provides the ability to survey genes of several different pathogen types. The entire ECOR collection, 80 EHEC-like isolates, and a diverse set of isolates from our FDA strain repository were included in our analysis.
Year: 2011 PMID: 21733163 PMCID: PMC3146454 DOI: 10.1186/1471-2164-12-349
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Strains Interrogated in this Study
| Strain | Alternate designation | Serotype | Pathotype | Strain history |
|---|---|---|---|---|
| EC1427 | 493/89 | O157:H- | EHEC1 | Human, Germany, 1989 |
| EC510 | 4936i | O157:H- | STEC | Human |
| EC1231 | G5101 | O157:H7 | EHEC1 | Human, WA, 1995 |
| EC506 | ATCC43888 | O157:H7 | EHEC | Human |
| EC873 | 0015 | O157:H7 | EHEC | Human |
| EC1220 | CAN28 | O157:H7 | EHEC | Human, Canada |
| 93-111 | 93-111 | O157:H7 | EHEC1 | Human, WA, 1993 |
| 95-0001A | 95-0001A | O157:H7 | NA | NA |
| EC867 | 0004 | O157:H7 | EHEC | Salami |
| EC4501 | E2006002641 | O157:H7 | EHEC | Human, Taco John |
| EC1276 | ATCC BAA-460 | O157:H7 | EHEC | ATCC BAA-460 |
| EC866 | 0003 | O157:H7 | EHEC | WA |
| EC874 | 0016 | O157:H7 | EHEC | Apple Cider |
| EC877 | 0019 | O157:H7 | EHEC | Jack-in-the-box, 1993 |
| EC868 | 0005 | O157:H7 | EHEC | NA |
| EC871 | 0012 | O157:H7 | EHEC | Human, AK, 1983 |
| EC876 | 0018 | O157:H7 | EHEC | NA |
| EC533 | 86-24 | O157:H7 | NA | Human, WA, 1986 |
| EC879 | 0023 | O157:H7 | EHEC | NA |
| EC878 | 0022 | O157:H7 | EHEC | derived from 86-24 |
| EC883 | 0027 | O157:H7 | EHEC | NA |
| 86-24 | 86-24 | O157:H7 | EHEC1 | Human, WA, 1986 |
| EC887 | 0032 | O157:H7 | EHEC | NA |
| EC881 | 0025 | O157:H7 | EHEC | mutant of 86-24 |
| EC882 | 0026 | O157:H7 | EHEC | NA |
| EC535 | 86-01 | O157:H7 | EHEC | Human, WA, 1986 |
| EC552 | 491 | O157:H7 | EHEC | Human Sizzler Steak House |
| EC1422 | DEC3A | O157:H7 | EHEC1 | Human, WA, 1985 |
| EC507 | ATCC35150 | O157:H7 | EHEC | Human |
| EC1221 | CAN110 | O157:H7 | EHEC | Human, Canada |
| EC1219 | CAN12 | O157:H7 | EHEC | Human, Canada |
| EC1218 | WETH | O157:H7 | EHEC | Human, 2003 |
| EC1222 | CAN146 | O157:H7 | EHEC | Human, Canada |
| EC1217 | MUS | O157:H7 | EHEC | Human, 2003 |
| EC1425 | DEC3D | O157:H7 | EHEC1 | Human, MI, 1988 |
| EC870 | 0009 | O157:H7 | EHEC | NA |
| EC516 | EC269 | O157:H7 | EHEC | Human |
| EC1215 | DIRKA | O157:H7 | EHEC | Human, 2000 |
| EC1226 | OK-1 | O157:H7 | EHEC1 | Human, Japan, 1996 |
| EC512 | EC262 | O157:H7 | EHEC | Hamburger |
| EC518 | EC267 | O157:H7 | EHEC | Human |
| EC514 | EC260 | O157:H7 | EHEC | PAH, CA Dept Health |
| EC515 | EC261 | O157:H7 | EHEC | PAH, CA Dept Health |
| EC502 | EC121 | O157:H7 | EHEC | PAH, CA Dept Health |
| EC1274 | ATCC 43895 | O157:H7 | EHEC | ATCC 43895 |
| EC1423 | DEC3B | O157:H7 | EHEC1 | Human, WA, 1988 |
| EC423 | #260 | O157:H7 | NA | NA |
| EC503 | EC177 | O157:H7 | EHEC | Human |
| EC885 | 0029 | O157:H7 | EHEC | NA |
| EC872 | 0013 | O157:H7 | EHEC | NA |
| EC504 | ATCC43894 | O157:H7 | EHEC | Human |
| EC509 | ATCC43890 | O157:H7 | EHEC | Human |
| EC875 | 0017 | O157:H7 | EHEC | Human |
| EC1214 | CAI | O157:H7 | EHEC | Human, 2002 |
| EC869 | 0006 | O157:H7 | EHEC | NA |
| EC1212 | EC536-ΔmutS | O157:H7 | EHEC | EC536-ΔmutS |
| EC1242 | 48 | O157:H7 | EHEC | Human, GA 1992 |
| EC536 | 86-17 | O157:H7 | EHEC | |
| EC513 | EC263 | O157:H7 | EHEC | Human |
| EC517 | EC266 | O157:H7 | EHEC | Human |
| EC508 | ATCC43889 | O157:H7 | EHEC | Human |
| EC4401 | 06E02109 | O157:H7 | EHEC | Human, PA, 2006 |
| EC1429 | DEC4B | O157:H7 | EHEC1 | Human, Denmark, 1987 |
| EC4001 | KY 06-830 | O157:H7 | EHEC | Human, 2006 |
| EC4002 | KY 06-831 | O157:H7 | NA | Human, 2006 |
| EC886 | 0031 | O55:H7 | EPEC | Human, WA, 1991 |
| DEC5A | DEC5A | O55:H7 | EPEC | Human, NY |
| ECOR37 | ECOR37 | ON:HN | NA | Marmoset, WA |
| EC1364 | DEC2A | O55:H6 | EPEC1 | Human, Congo, 1962 |
| EC1521 | CFT073 | O6:H1:K2 | UPEC | ATCC 700928 |
| ECOR56 | ECOR56 | O6:H1 | NA | Human, Sweden |
| ECOR55 | ECOR55 | O25:H1 | UPEC | Human, Sweden |
| EC591 | ATCC35376 | ON:NM | NA | Gorilla, WA |
| EC699 | V27 | O2:K5:H1 | ExPEC | Human, WA |
| ECOR51 | ECOR51 | O25:HN | NA | Human, MA |
| ECOR23 | ECOR23 | O86:H43 | NA | Elephant, WA |
| ECOR52 | ECOR52 | O25:H1 | NA | Orangutan, WA |
| ECOR54 | ECOR54 | O25:H1 | NA | Human, IA |
| ECOR32 | ECOR32 | O7:H21 | NA | Giraffe, WA |
| EC678 | H38-2906 | O1:K1:H7 | ExPEC | Human, WA |
| EC669 | H15-2267 | O2:K1:H7 | ExPEC | Human, WA |
| EC674 | H25-2916 | O2:K1:H7 | ExPEC | Human, WA |
| EC715 | PM6 | O2:K1:H7 | ExPEC | Human, WA |
| EC728 | 168-2P6(B) | O2:K1:H7 | ExPEC | Human, WA |
| ECOR61 | ECOR61 | O2:NM | NA | Human, Sweden |
| ECOR62 | ECOR62 | O2:NM | UPEC | Human, Sweden |
| ECOR59 | ECOR59 | O4:H40 | NA | Human, MA, 1979 |
| EC665 | H5-2631 | O18ac:K5:H- | ExPEC | Human, WA |
| ECOR64 | ECOR64 | O75:NM | UPEC | Human, Sweden |
| ECOR65 | ECOR65 | ON:H10 | NA | Celebese ape, WA |
| EC1381 | 536 | O6:H31 | UPEC | Human, Model UTI, PAI |
| ECOR53 | ECOR53 | O4:HN | NA | Human, IA |
| ECOR60 | ECOR60 | O4:HN | UPEC | Human, Sweden |
| ECOR42 | ECOR42 | ON:H26 | NA | Human, MA, 1979 |
| ECOR31 | ECOR31 | O79:H43 | NA | Leopard, WA |
| ECOR43 | ECOR43 | ON:HN | NA | Human, Sweden |
| ECOR35 | ECOR35 | O1:NM | NA | Human, IA |
| ECOR36 | ECOR36 | O79:H25 | NA | Human, IA |
| EC716 | PM7 | O7:H- | ExPEC | Human, WA |
| ECOR40 | ECOR40 | O7:NM | UPEC | Human, Sweden |
| EC590 | ATCC35360 | O7:NM | NA | Human, Tonga, 1982 |
| ECOR38 | ECOR38 | O7:NM | NA | Human, IA |
| ECOR39 | ECOR39 | O7:NM | NA | Human, Sweden |
| EC689 | V14 | O2:K5:H- | ExPEC | Human, WA |
| ECOR49 | ECOR49 | O2:NM | NA | Human, Sweden |
| ECOR50 | ECOR50 | O2:HN | UPEC | Human, Sweden |
| ECOR46 | ECOR46 | O1:H6 | NA | Ape, WA |
| ECOR48 | ECOR48 | ON:HM | UPEC | Human, Sweden |
| EC1522 | NBFAC05.034.01 | O157 | NA | Thailand, 1986 |
| ECOR44 | ECOR44 | ON:HN | NA | Cougar, WA |
| ECOR47 | ECOR47 | OM:H18 | NA | Sheep, New Guinea |
| SH20011 | SH20011 | dysenteriae | dysenteriae | W. Reed |
| SH20008 | ATCC 9207 | boydii | boydii | W. Reed |
| SH20009 | 53G | sonnei | sonnei | W. Reed |
| SH20010 | 2457T | flexneri | flexneri | W. Reed |
| EC1517 | E110019 | O111:H9 | EPEC | Human, Finland |
| EC1410 | MT#80 | O103:H2 | NA | Human; MT |
| EC1375 | DEC12F | O111:NM | EPEC2 | Human, WA, 1983 |
| EC1370 | DEC8B | O111:H8 | EHEC2 | Human, ID, 1986 |
| EC1400 | 3007-85 | O111:NM | EHEC2 | Human, NE, 1985 |
| EC1449 | DEC8A | O111a:NM | EHEC2 | Human, MD, 1977 |
| EC1460 | DEC10B | O26:H11 | EHEC2 | Human, Australia, 1986 |
| EC400 | NA | O26:H11 | EHEC | Human |
| EC1495 | H19 | O26:H11 | EHEC2 | Human |
| EC1497 | VP30 | O26:H- | EHEC2 | Human, Chile, 1989 |
| EC1496 | TB285C | O26:H- | EHEC2 | Human, WA, 1991 |
| EC1395 | TB285A | O26:H2 | EHEC2 | Human, WA, 1991 |
| EC1459 | H30 | O26:H11 | EHEC2 | Human, UK |
| EC1464 | RDEC-1 | O15:NM | EHEC2 | Rabbit, SC, 1970 |
| EC1454 | DEC9A | O26:H11 | EHEC2 | Human, WI, 1961 |
| EC1457 | DEC9D | O26:H11 | EHEC2 | Human, Denmark, 1967 |
| ECOR66 | ECOR66 | O4:H40 | NA | Celebese ape, WA |
| ECOR63 | ECOR63 | ON:NM | NA | Human, Sweden |
| EC592 | ATCC35386 | O4:H43 | NA | Goat, Indonesia |
| ECOR24 | ECOR24 | O15:NM | NA | Human, Sweden |
| ECOR70 | ECOR70 | O78:NM | NA | Gorilla, WA |
| ECOR72 | ECOR72 | O144:H8 | UPEC | Human, Sweden |
| EC718 | PM9 | O9:K34:H- | ExPEC | Human, WA |
| ECOR71 | ECOR71 | O78:NM | NA | Human, Sweden |
| ECOR58 | ECOR58 | O112:H8 | NA | Lion, WA |
| ECOR69 | ECOR69 | ON:NM | NA | Celebese ape, WA |
| ECOR68 | ECOR68 | ON:NM | NA | Giraffe, WA |
| EC319 | B7A | O148:H28 | ETEC | NA |
| ECOR7 | ECOR7 | O85:HN | NA | Orangutan, WA |
| EC1523 | NBFAC05.034.02 | O157 | NA | Thailand, 1986 |
| ECOR34 | ECOR34 | O88:NM | NA | Dog, MA |
| ECOR29 | ECOR29 | O150:H21 | NA | Kangaroo rat, NV |
| ECOR33 | ECOR33 | O7:H21 | NA | Sheep, CA |
| EC589 | ATCC35349 | O113:H21 | NA | Bison, Canada |
| ECOR26 | ECOR26 | O104:H21 | NA | Human, MA |
| ECOR27 | ECOR27 | O104:NM | NA | Giraffe, WA |
| ECOR28 | ECOR28 | O104:NM | NA | Human, IA |
| ECOR45 | ECOR45 | ON:HM | NA | Pig, Indonesia |
| EC1490 | MG1655 | OR:H48:K- | NA | ATCC 47076 |
| MG1655-mutS | MG1655-ΔmutS | OR:H48:K- | NA | MG1655-ΔmutS |
| EC1216 | FULLE | NA | NA | Human, 2003 |
| ECOR6 | ECOR6 | ON:HM | NA | Human, IA |
| ECOR25 | ECOR25 | ON:HN | NA | Dog, NY |
| ECOR10 | ECOR10 | O6:H10 | NA | Human, Sweden |
| ECOR8 | ECOR8 | O86:NM | NA | Human, IA |
| ECOR1 | ECOR1 | ON:HN | NA | Human, IA |
| ECOR3 | ECOR3 | O1:NM | NA | Dog, MA |
| ECOR18 | ECOR18 | O5:NM | NA | Celebese ape, WA |
| EC1223 | CAN9139 | NA | NA | Human, Canada |
| ECOR14 | ECOR14 | OM:HN | UPEC | Human, Sweden |
| ECOR9 | ECOR9 | ON:NM | NA | Human, Sweden |
| ECOR12 | ECOR12 | O7:H32 | NA | Human, Sweden |
| ECOR5 | ECOR5 | O79:NM | NA | Human, IA |
| ECOR11 | ECOR11 | O6:H10 | UPEC | Human, Sweden |
| ECOR2 | ECOR2 | ON:H32 | NA | Human, NY, 1979 |
| ECOR13 | ECOR13 | ON:HN | NA | Human, Sweden |
| ECOR20 | ECOR20 | O89:HN | NA | Steer, Bali |
| ECOR21 | ECOR21 | O121:HN | NA | Steer, Bali |
| EC563 | ATCC43886 | O25:K98:NM | ETEC | Human |
| ECOR19 | ECOR19 | O5:NM | NA | Celebese ape, WA |
| EC164 | 4608-58 | O143 | EIEC | NA |
| EC568 | ATCC43893 | O124:NM | EIEC | ATCC 43893 |
| EC884 | 0028 | O55:H7 | EPEC | Human, WA, 1991 |
| ECOR15 | ECOR15 | O25:NM | NA | Human, Sweden |
| ECOR16 | ECOR16 | ON:H10 | NA | Leopard, WA |
| ECOR22 | ECOR22 | ON:HN | NA | Steer, Bali |
| ECOR17 | ECOR17 | O106:NM | NA | Pig, Indonesia |
| ECOR4 | ECOR4 | ON:HN | NA | Human, IA |
When a strain's history is unknown, it is designated "NA".
Figure 1. Comparing microarray probe set summarization methods: RMA vs. MAS 5.0. (A) Scatter plot showing RMA-summarized probe set intensities from strains EDL933 (y-axis) and Sakai (x-axis). (B) Scatter plot showing MAS 5.0-summarized probe set intensities from strains EDL933 (y-axis) and Sakai (x-axis). In both A and B, data points are color-coded based on their intensities in EDL933. (C) Line plot showing EDL933 RMA intensity relative to Sakai RMA intensity, log2([EDL933]/[Sakai]). (D) Line plot showing EDL933 MAS 5.0 intensity relative to Sakai MAS 5.0 intensity, log2([EDL933]/[Sakai]).
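The relative intensity plotted in panels C and D is a per-probe-set log2 ratio. The following is a minimal sketch of that calculation; the function name, the probe set labels, and the intensity values are hypothetical and are not taken from the study.

```python
import math


def log2_ratio(edl933_intensity, sakai_intensity):
    """Return log2(EDL933 / Sakai) for one probe set.

    Both intensities must be positive; a ratio of 0 means equal signal.
    """
    if edl933_intensity <= 0 or sakai_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(edl933_intensity / sakai_intensity)


# Hypothetical summarized intensities: (EDL933, Sakai) per probe set.
example = {
    "probe_a": (1200.0, 1200.0),  # equal signal in both strains
    "probe_b": (2400.0, 600.0),   # stronger signal in EDL933
    "probe_c": (150.0, 1200.0),   # stronger signal in Sakai
}

ratios = {name: log2_ratio(e, s) for name, (e, s) in example.items()}
for name, value in ratios.items():
    print(name, round(value, 2))
```

Positive values indicate a stronger signal in EDL933, and negative values a stronger signal in Sakai.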
RefMax vs. MAS 5.0: A Validation Study
| Genome | Homology Bin | Genes Present | Present (MAS5) | Present (RefMax) | Absent (MAS5) | Absent (RefMax) |
|---|---|---|---|---|---|---|
| NC_000913.2 | 100% | 5654 | 5651/5651 | 5653/5654 | 3/3 | 1/0 |
| NC_000913.2 | >98% | 391 | 385/385 | 382/382 | 6/6 | 9/9 |
| NC_000913.2 | >96% | 410 | 263/268 | 225/226 | 147/142 | 185/184 |
| NC_000913.2 | >94% | 347 | 92/102 | 49/48 | 255/245 | 298/299 |
| NC_000913.2 | >92% | 202 | 36/35 | 13/13 | 166/167 | 189/189 |
| NC_000913.2 | >90% | 143 | 32/36 | 8/9 | 111/107 | 135/134 |
| NC_000913.2 | <90% | 12 | 3/3 | 1/1 | 9/9 | 11/11 |
| NC_002655.2 | 100% | 3569 | 3565/3564 | 3558/3566 | 4/5 | 11/3 |
| NC_002655.2 | >98% | 2656 | 2653/2653 | 2644/2648 | 3/3 | 12/8 |
| NC_002655.2 | >96% | 1178 | 1053/1054 | 963/963 | 125/124 | 215/215 |
| NC_002655.2 | >94% | 502 | 302/287 | 159/162 | 200/215 | 343/340 |
| NC_002655.2 | >92% | 288 | 143/138 | 59/60 | 145/150 | 229/228 |
| NC_002655.2 | >90% | 231 | 132/132 | 70/69 | 99/99 | 161/162 |
| NC_002655.2 | <90% | 20 | 8/9 | 6/6 | 12/11 | 14/14 |
| NC_002695.1 | 100% | 3473 | 3471/3472 | 3466/3464 | 2/1 | 7/9 |
| NC_002695.1 | >98% | 2655 | 2652/2652 | 2646/2641 | 3/3 | 9/14 |
| NC_002695.1 | >96% | 1164 | 1036/1031 | 945/936 | 128/133 | 219/228 |
| NC_002695.1 | >94% | 511 | 302/291 | 169/167 | 209/220 | 342/344 |
| NC_002695.1 | >92% | 291 | 140/138 | 66/66 | 151/153 | 225/225 |
| NC_002695.1 | >90% | 208 | 108/102 | 52/52 | 100/106 | 156/156 |
| NC_002695.1 | <90% | 18 | 7/6 | 5/5 | 11/12 | 13/13 |
| NC_004431.1 | 100% | 3112 | 3111/- | 3111/- | 1/- | 1/- |
| NC_004431.1 | >98% | 1816 | 1816/- | 1814/- | -/- | 2/- |
| NC_004431.1 | >96% | 1585 | 1494/- | 1450/- | 91/- | 135/- |
| NC_004431.1 | >94% | 509 | 304/- | 247/- | 205/- | 262/- |
| NC_004431.1 | >92% | 261 | 96/- | 55/- | 165/- | 206/- |
| NC_004431.1 | >90% | 198 | 90/- | 48/- | 108/- | 150/- |
| NC_004431.1 | <90% | 16 | 7/- | 6/- | 9/- | 10/- |
Gene present/absent calls were determined for the 4 sequenced reference strains represented on the array using either the RefMax or MAS 5.0 gene detection method. Genome is the accession number of the genome/strain being interrogated. Homology Bin is the percentage by which a probe set consensus sequence matches the target genome sequence. Genes Present is the number of genes on the array that fall into a particular Homology Bin (this is also the maximum number of correct "present" calls). Present and Absent calls were determined with either the MAS 5.0 or the RefMax method and are shown under the "MAS5" and "RefMax" headings. Strains MG1655, EDL933, and Sakai were each assayed in duplicate to show the reproducibility of each method; the two independent measurements are separated by a "/" under the Present and Absent headers.
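To make the trend in the table easier to read, the "present" counts can be expressed as a fraction of the genes in each Homology Bin. The sketch below does this for the first replicate of NC_000913.2, with the counts copied from the table; the function and variable names are illustrative only, and this is not the authors' analysis code.

```python
# (homology bin, genes in bin, MAS5 present calls, RefMax present calls)
# First replicate for NC_000913.2, copied from the table above.
rows = [
    ("100%", 5654, 5651, 5653),
    (">98%", 391, 385, 382),
    (">96%", 410, 263, 225),
    (">94%", 347, 92, 49),
    (">92%", 202, 36, 13),
    (">90%", 143, 32, 8),
    ("<90%", 12, 3, 1),
]


def present_fraction(present_calls, genes_in_bin):
    """Fraction of genes in a homology bin that were called present."""
    return present_calls / genes_in_bin


for bin_label, genes, mas5, refmax in rows:
    print(
        f"{bin_label:>5}  "
        f"MAS5 {present_fraction(mas5, genes):.3f}  "
        f"RefMax {present_fraction(refmax, genes):.3f}"
    )
```

For this replicate, both methods call nearly every gene present in the 100% bin, and in every bin below 100% RefMax makes fewer "present" calls than MAS 5.0.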
Figure 2. Principal Component Analysis (PCA): The MADE4 package of R-Bioconductor was used to perform PCA on RMA-summarized probe set intensities. The first 3 components were plotted using Spotfire. (A) All 207 isolates are shown and color-coded based on their serotype. (B) 128 isolates are plotted and color-coded based on their known pathotype.
O157:H7-Specific Gene Targets
| Probe Set ID | Accession | Genome | Coordinates | Locus Tag | Gene | Product | GeneID |
|---|---|---|---|---|---|---|---|
| 1766456_s_at | NC_002655.2 | Escherichia coli O157:H7 EDL933 | 2849958-2851076 | Z3198-RC | - | EDL933 | 16445223 |
| 1759686_s_at | NC_002655.2 | Escherichia coli O157:H7 EDL933 | 2848990-2849955 | Z3197 | fcl | fucose synthetase | 962092 |
| 1759686_s_at | NC_002695.1 | Escherichia coli O157:H7 Sakai | 2778776-2779741 | ECs2838 | - | fucose synthetase | 912293 |
| 1766456_s_at | NC_002695.1 | Escherichia coli O157:H7 Sakai | 2779744-2780862 | ECs2839 | - | GDP-D-mannose dehydratase | 912548 |
| 1766456_s_at | NC_002655.2 | Escherichia coli O157:H7 EDL933 | 2849958-2851076 | Z3198 | - | GDP-mannose dehydratase | 962093 |
| 1764806_s_at | NC_002655.2 | Escherichia coli O157:H7 EDL933 | 2848478-2848987 | Z3196 | wbdQ | GDP-mannose mannosylhydrolase | 962091 |
| 1762793_s_at | NC_002695.1 | Escherichia coli O157:H7 Sakai | 2780369-2780601 | ECs5479 | - | hypothetical protein | 2693774 |
| 1759443_s_at | NC_002695.1 | Escherichia coli O157:H7 Sakai | 2776834-2778282 | ECs2836 | - | mannose-1-P guanosyltransferase | 912820 |
| 1759443_s_at | NC_002655.2 | Escherichia coli O157:H7 EDL933 | 2847048-2848496 | Z3195 | manC | mannose-1-P guanosyltransferase | 962090 |
| 1762953_s_at | NC_002695.1 | Escherichia coli O157:H7 Sakai | 2783218-2784603 | ECs2842 | - | O antigen flippase | 912601 |
| 1762953_s_at | NC_002655.2 | Escherichia coli O157:H7 EDL933 | 2853432-2854823 | Z3201 | wzx | O antigen flippase Wzx | 962096 |
| 1766849_s_at | NC_002655.2 | Escherichia coli O157:H7 EDL933 | 2855525-2856709 | Z3203 | wzy | O antigen polymerase | 962098 |
| 1766849_s_at | NC_002695.1 | Escherichia coli O157:H7 Sakai | 2785311-2786495 | ECs2844 | - | O antigen polymerase | 912486 |
| 1764806_s_at | NC_002695.1 | Escherichia coli O157:H7 Sakai | 2778264-2778773 | ECs2837 | - | putative GDP-L-fucose pathway enzyme | 912421 |
Using the MAS 5.0 gene detection method, we selected the probe sets that were consistently called "present" in all O157:H7 strains and "absent" in all non-O157:H7 strains. The 14 probe sets shown here correspond to O157:H7-specific genes.
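The selection rule described above (present in every strain of the target group, absent in every other strain) can be sketched as a simple filter over a table of detection calls. The data layout, function name, probe set labels, and the calls themselves are hypothetical; only the strain identifiers are borrowed from the strain table, and this is not the authors' pipeline.

```python
def group_specific_probe_sets(calls, target_strains):
    """Return probe sets called "P" in all target strains and "A" in all others.

    calls: {probe_set_id: {strain_id: "P" or "A"}}
    target_strains: iterable of strain ids forming the group of interest
    """
    target = set(target_strains)
    specific = []
    for probe_set, by_strain in calls.items():
        in_group = [c for s, c in by_strain.items() if s in target]
        out_group = [c for s, c in by_strain.items() if s not in target]
        # Require at least one strain on each side so the rule is meaningful.
        if not in_group or not out_group:
            continue
        if all(c == "P" for c in in_group) and all(c == "A" for c in out_group):
            specific.append(probe_set)
    return sorted(specific)


# Hypothetical calls for two O157:H7 strains and two non-O157:H7 strains.
o157_strains = ["EC1231", "EC533"]
calls = {
    "ps_specific": {"EC1231": "P", "EC533": "P", "ECOR37": "A", "EC1490": "A"},
    "ps_core": {"EC1231": "P", "EC533": "P", "ECOR37": "P", "EC1490": "P"},
    "ps_variable": {"EC1231": "P", "EC533": "A", "ECOR37": "A", "EC1490": "A"},
}

print(group_specific_probe_sets(calls, o157_strains))
```

Requiring unanimity on both sides makes the filter strict: a single discordant call in either group removes the probe set from the result.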
Figure 3. Molecular phylogenetic analysis by the Maximum Likelihood method: The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model [35]. The tree with the highest log likelihood (-282250.6332) is shown. Initial tree(s) for the heuristic search were obtained automatically as follows. When the number of common sites was < 100 or less than one fourth of the total number of sites, the maximum parsimony method was used; otherwise, the BIONJ method with an MCL distance matrix was used. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. There were a total of 10208 positions in the final dataset.