Darren Heavens, Caisey V Pulford, Blanca M Perez-Sepulveda, Alexander V Predeus, Ross Low, Hermione Webster, Gregory F Dykes, Christian Schudoma, Will Rowe, James Lipscombe, Chris Watkins, Benjamin Kumwenda, Neil Shearer, Karl Costigan, Kate S Baker, Nicholas A Feasey, Jay C D Hinton, Neil Hall.
Abstract
We have developed an efficient and inexpensive pipeline for streamlining large-scale collection and genome sequencing of bacterial isolates. Evaluation of this method involved a worldwide research collaboration focused on the model organism Salmonella enterica, the 10KSG consortium. Following the optimization of a logistics pipeline that involved shipping isolates as thermolysates in ambient conditions, the project assembled a diverse collection of 10,419 isolates from low- and middle-income countries. The genomes were sequenced using the LITE pipeline for library construction, with a total reagent cost of less than USD$10 per genome. Our method can be applied to other large bacterial collections to underpin global collaborations.
Keywords: Salmonella; Thermolysates; Whole-genome sequencing; iNTS
Year: 2021 PMID: 34930397 PMCID: PMC8690886 DOI: 10.1186/s13059-021-02536-3
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Summary of the geographical origin, timeline, and body site source of 10,419 bacterial isolates. The 10,419 isolates were collected from 53 countries/territories spanning 5 continents (America, Africa, Asia, Europe, and Oceania), with most isolates originating from Africa (56%) and America (26%). The samples were mostly of human origin (86%), of which 52% were blood isolates, 41% were stool isolates, and 7% were from other body compartments. About 5% of samples originated from environmental sources, 6% were of animal origin, and 3% were of unknown origin. The bacterial pathogens were isolated over a 68-year period, from 1949 to 2017. The majority of samples were isolated after 1990.
Sources of isolates collected by the 10KSG consortium
| Source | Africa | America | Other (a) | Total (b) |
|---|---|---|---|---|
| Animal: farm animal | 166 | 51 | 12 | 229 |
| Animal: other | 128 | 202 | 82 | 412 |
| Environment: food | 0 | 56 | 12 | 68 |
| Environment: other | 6 | 301 | 68 | 375 |
| Human: blood | 3461 | 1238 | 12 | 4711 |
| Human: stool | 1591 | 772 | 1334 | 3697 |
| Human: other | 560 | 97 | 221 | 878 |
(a) Asia, Europe, and Oceania
(b) The source location (continent) was unknown for 49 of the total 10,419 isolates
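The table's arithmetic can be checked with a short script. This is a minimal sketch: the counts are transcribed from the rows above in their original order, and the names `ROWS`, `column_totals`, and `share_of_collection` are illustrative, not part of the study's tooling.

```python
# Rows transcribed from the table above, in order: (label, Africa, America, Other).
# The label "Other" repeats, so rows are kept as an ordered list rather than a dict.
ROWS = [
    ("Farm animal", 166, 51, 12),
    ("Other", 128, 202, 82),
    ("Food", 0, 56, 12),
    ("Other", 6, 301, 68),
    ("Blood", 3461, 1238, 12),
    ("Stool", 1591, 772, 1334),
    ("Other", 560, 97, 221),
]
UNKNOWN_CONTINENT = 49   # isolates with no recorded continent (table footnote)
ALL_ISOLATES = 10_419    # size of the whole collection


def column_totals(rows):
    """Sum each continent column across all source rows."""
    return {
        "Africa": sum(r[1] for r in rows),
        "America": sum(r[2] for r in rows),
        "Other": sum(r[3] for r in rows),
    }


def share_of_collection(count, total=ALL_ISOLATES):
    """Percentage of the full collection represented by `count`."""
    return 100.0 * count / total


totals = column_totals(ROWS)
located = sum(totals.values())
for continent, count in totals.items():
    print(f"{continent}: {count} ({share_of_collection(count):.1f}% of collection)")
print(f"With known continent: {located}; plus unknown: {located + UNKNOWN_CONTINENT}")
```

Under this transcription, the three continent columns plus the 49 isolates of unknown location account for all 10,419 isolates.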
AMR phenotypes of 3463 isolates collected by the 10KSG consortium
| Antimicrobial resistance (b) | Africa | America | Other (a) | Total |
|---|---|---|---|---|
| MDR (c) | 1271 | 122 | 21 | 1414 |
| 1–2 agents (d) | 581 | 69 | 56 | 706 |
| Susceptible (e) | 1139 | 142 | 62 | 1343 |
(a) Asia, Europe, and Oceania
(b) Antimicrobial resistance profiling was performed by the Kirby-Bauer technique. The antimicrobials used for profiling varied depending on the study and included ampicillin, chloramphenicol, streptomycin, tetracycline, gentamicin, kanamycin, nalidixic acid, trimethoprim, ciprofloxacin, ceftriaxone, and cotrimoxazole
(c) Resistance to ampicillin, cotrimoxazole, and chloramphenicol
(d) Antimicrobial resistance to 1–2 tested agents
(e) Susceptible to tested agents
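The three phenotype categories defined in the footnotes can be expressed as a small classifier. This is a hypothetical sketch, not the consortium's code: the function name `classify_amr` and the "Unclassified" fallback are assumptions, because the footnotes do not say how an isolate resistant to three or more agents without the full MDR combination was scored.

```python
# The MDR definition follows the table footnote: resistance to all three agents.
MDR_AGENTS = frozenset({"ampicillin", "cotrimoxazole", "chloramphenicol"})


def classify_amr(resistant_to):
    """Map the set of agents an isolate resists to a phenotype label."""
    resistant = {agent.lower() for agent in resistant_to}
    if MDR_AGENTS <= resistant:
        return "MDR"
    if not resistant:
        return "Susceptible"
    if len(resistant) <= 2:
        return "1-2 agents"
    # Assumption: this case is not covered by the footnotes, so it is flagged.
    return "Unclassified"


print(classify_amr({"Ampicillin", "Cotrimoxazole", "Chloramphenicol", "Tetracycline"}))
print(classify_amr({"tetracycline"}))
print(classify_amr(set()))
```

Because the antimicrobial panel varied between studies, a real implementation would also need to track which agents were actually tested for each isolate; the sketch ignores that.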
Processing time and consumable costs for DNA extraction and sequencing
| Activity | Processing time (h) | Hands-on time (h) | Consumable cost (USD$) (b) |
|---|---|---|---|
| DNA extraction | 1 | 0.5 | 93.88 |
| DNA QC and normalization | 1 | 0.5 | 136.44 |
| Library construction, QC, pooling, and size selection | 6 | 1 | 277.86 |
| Sequencing (c) | 85 | 1 | 459.35 |
(a) Per 96-well plate
(b) Converted from GBP (1 GBP = 1.25 USD)
(c) Based on Illumina HiSeq4000 runs
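The per-plate figures can be rolled up into totals and a per-well cost. This is a minimal sketch under two assumptions that the table does not state: that the four rows are additive, and that all 96 wells of a plate hold a sample. The `Step` class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    processing_h: float
    hands_on_h: float
    cost_usd: float


# Values transcribed from the table above (per 96-well plate).
STEPS = [
    Step("DNA extraction", 1, 0.5, 93.88),
    Step("DNA QC and normalization", 1, 0.5, 136.44),
    Step("Library construction, QC, pooling and size selection", 6, 1, 277.86),
    Step("Sequencing", 85, 1, 459.35),
]
WELLS_PER_PLATE = 96  # assumption: every well is used


def plate_totals(steps):
    """Return (processing hours, hands-on hours, consumable cost) per plate."""
    return (
        sum(s.processing_h for s in steps),
        sum(s.hands_on_h for s in steps),
        round(sum(s.cost_usd for s in steps), 2),
    )


def cost_per_well(steps, wells=WELLS_PER_PLATE):
    """Consumable cost per well for the selected steps."""
    return sum(s.cost_usd for s in steps) / wells


processing_h, hands_on_h, cost_usd = plate_totals(STEPS)
print(f"Per plate: {processing_h} h processing, {hands_on_h} h hands-on, USD {cost_usd}")
print(f"Per well, all steps: USD {cost_per_well(STEPS):.2f}")
print(f"Per well, library construction and sequencing: USD {cost_per_well(STEPS[2:]):.2f}")
```

The per-well figure depends on which steps are counted and on how full the plate is, so `cost_per_well` accepts any subset of steps and a custom well count.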
Fig. 2 LITE (Low Input, Transposase Enabled) pipeline for library construction. The DNA was extracted using a protocol based on the MagAttract HMW DNA isolation kit (Qiagen). Library construction was performed by tagmentation using the Nextera tagmentation kit; libraries were size selected on a BluePippin and quantified using a High Sensitivity BioAnalyzer kit (Agilent) and Qubit dsDNA HS Assay (ThermoFisher). Genome sequencing of “super pools” was performed on a HiSeq™ 4000 (Illumina) system, with re-sequencing on a NovaSeq™ 6000 (Illumina) when needed, both with 2 × 150 bp paired-end reads.
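For 2 × 150 bp paired-end reads, expected sequencing depth follows from the usual estimate of total bases sequenced divided by genome length. The sketch below applies that generic estimate; it is not a step described in the caption, and the genome size used in the example is an illustrative round number rather than a value from the text.

```python
import math

READ_LENGTH_BP = 150  # from the 2 x 150 bp configuration
READS_PER_PAIR = 2

# Assumption: illustrative round genome size, not a figure given in the text.
ASSUMED_GENOME_SIZE_BP = 5_000_000


def mean_depth(read_pairs, genome_size_bp, read_length_bp=READ_LENGTH_BP):
    """Expected mean depth: total sequenced bases divided by genome length."""
    return read_pairs * READS_PER_PAIR * read_length_bp / genome_size_bp


def read_pairs_for_depth(target_depth, genome_size_bp, read_length_bp=READ_LENGTH_BP):
    """Smallest number of read pairs whose expected depth reaches the target."""
    bases_per_pair = READS_PER_PAIR * read_length_bp
    return math.ceil(target_depth * genome_size_bp / bases_per_pair)


pairs = read_pairs_for_depth(10, ASSUMED_GENOME_SIZE_BP)
print(f"Read pairs for 10x on a {ASSUMED_GENOME_SIZE_BP} bp genome: {pairs}")
print(f"Expected depth at that yield: {mean_depth(pairs, ASSUMED_GENOME_SIZE_BP):.2f}x")
```

The estimate ignores duplicate reads, adapter content, and uneven coverage, so observed depth would generally be lower than the value it returns.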
Fig. 3 The sequential quality control process used to select whole-genome sequences for detailed analysis. Of the 10,419 isolates, 443 failed the DNA extraction or quality control prior to genome sequencing. We produced sequencing libraries of 9975 samples, of which 1366 were not bioinformatically identified as Salmonella enterica. These 1366 comprised 1157 that were part of the 25% non-Salmonella component of the project, plus 209 isolates that had been mis-identified as Salmonella before sequencing. Of the 7236 Salmonella genomes, 6248 had sequence coverage over 10×, of which 5833 passed the “stringent criteria.” Of the 415 samples that failed the “stringent criteria,” 284 samples were rescued based on a “clean up” (55) or a “relaxed criteria” (229). Overall, we generated 6117 high-quality Salmonella genomes.
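The final stages of this funnel, from the 10× coverage filter onward, can be traced step by step. This is a minimal sketch using only the counts quoted in the caption; the function and key names are illustrative.

```python
def qc_funnel(above_10x, passed_stringent, rescued_cleanup, rescued_relaxed):
    """Trace genome counts from the coverage filter to the final high-quality set."""
    failed_stringent = above_10x - passed_stringent
    rescued = rescued_cleanup + rescued_relaxed
    return {
        "failed_stringent": failed_stringent,
        "rescued": rescued,
        "not_rescued": failed_stringent - rescued,
        "high_quality": passed_stringent + rescued,
    }


# Counts taken from the caption above.
summary = qc_funnel(
    above_10x=6248,
    passed_stringent=5833,
    rescued_cleanup=55,
    rescued_relaxed=229,
)
for stage, count in summary.items():
    print(f"{stage}: {count}")
```

The derived values match the caption: 415 genomes failed the stringent criteria, 284 were rescued, and 6117 made up the final high-quality set.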
Fig. 4 Genome-based summary of Salmonella enterica from African and American datasets, organized by continent, year of isolation, and serovar. Of the 6117 Salmonella enterica genomes that were successfully sequenced and that passed QC, 3100 (50.7%) were from Africa and 2313 (37.8%) were from America. Bubble size represents the number of genomes isolated between 1959 and 2017. The graphs represent the proportion of the main Salmonella serovars predicted based on genome analysis: 1844 S. Typhimurium and 657 S. Enteritidis from Africa, and 474 S. Typhimurium and 676 S. Enteritidis from America.
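The continental percentages in this caption, and the serovar proportions the graphs represent, follow directly from the quoted counts. A minimal sketch, with illustrative names and counts transcribed from the caption:

```python
TOTAL_GENOMES = 6117  # genomes that passed QC

# Counts transcribed from the caption above.
GENOMES_BY_CONTINENT = {"Africa": 3100, "America": 2313}
SEROVARS = {
    "Africa": {"Typhimurium": 1844, "Enteritidis": 657},
    "America": {"Typhimurium": 474, "Enteritidis": 676},
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


for continent, n_genomes in GENOMES_BY_CONTINENT.items():
    print(f"{continent}: {n_genomes} genomes ({pct(n_genomes, TOTAL_GENOMES)}% of all)")
    for serovar, count in SEROVARS[continent].items():
        print(f"  S. {serovar}: {count} ({pct(count, n_genomes)}% of {continent})")
```

The serovar percentages are computed within each continent's genomes, which is one plausible reading of "proportion" in the caption; the other serovars make up the remainder.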