Theodore R Pak, Andrew Kasarskis.
Abstract
Recent reviews have examined the extent to which routine next-generation sequencing (NGS) on clinical specimens will improve the capabilities of clinical microbiology laboratories in the short term, but do not explore integrating NGS with clinical data from electronic medical records (EMRs), immune profiling data, and other rich datasets to create multiscale predictive models. This review introduces a range of "omics" and patient data sources relevant to managing infections and proposes 3 potentially disruptive applications for these data in the clinical workflow. The combined threats of healthcare-associated infections and multidrug-resistant organisms may be addressed by multiscale analysis of NGS and EMR data that is ideally updated and refined over time within each healthcare organization. Such data and analysis should form the cornerstone of future learning health systems for infectious disease.
Keywords: electronic medical records; healthcare-associated infections; hospital-acquired infections; multiscale analysis; whole genome sequencing
Year: 2015 PMID: 26251049 PMCID: PMC4643486 DOI: 10.1093/cid/civ670
Source DB: PubMed Journal: Clin Infect Dis ISSN: 1058-4838 Impact factor: 9.079
Examples of Public Bioinformatics Databases That May Be Leveraged for Multiscale Analysis of Infectious Diseaseᵃ
| Database Focus | For General Research | For Infectious Disease: Multipathogen | For Infectious Disease: Pathogen-Specific |
|---|---|---|---|
| Genomes | NCBI Nucleotide (GenBank/RefSeq); ENA/EMBL; DDBJ | ViPR; NMPDR; PATRIC; EuPathDB | Influenza Research Database; Tuberculosis Database; LANL: Databases for HIV, HCV, and HFV |
| Gene products and functionality | UniProt; KEGG | Pathogen-Host Interaction Database; Antibiotic Resistance Genes Database; Comprehensive Antibiotic Resistance Database | |
| Expression and immune profiles | GEO; ArrayExpress | ImmPort | |
Citations for individual databases can be found in the Supplementary Data.
Abbreviations: DDBJ, DNA Data Bank of Japan; ENA/EMBL, European Nucleotide Archive/European Molecular Biology Laboratory; EuPathDB, Eukaryotic Pathogen Database; GEO, Gene Expression Omnibus; HCV, hepatitis C virus; HFV, hemorrhagic fever viruses; HIV, human immunodeficiency virus; ImmPort, Immunology Database and Analysis Portal; KEGG, Kyoto Encyclopedia of Genes and Genomes; LANL, Los Alamos National Laboratory; NCBI, National Center for Biotechnology Information; NMPDR, National Microbial Pathogen Data Resource; PATRIC, Pathosystems Resource Integration Center; ViPR, Virus Pathogen Database and Analysis Resource.
a Not an exhaustive list.
Selected Published Bioinformatics Software Packages or Databases That Address Specific Steps of Clinical Microbiology Tasks Using Next-Generation Sequencing Dataᵃ
| Problem Domain | Software or Database |
|---|---|
| Strain typing | Multilocus sequence typing database |
| De novo assembly from long reads | Celera; Hierarchical Genome Assembly Process |
| Species identification | |
| From clonal sample | NCBI BLAST; GenBank; Other genome databases in Table |
| From nonclonal sample | |
| Meta-assembly | AMOS; MIRA; MetaVelvet |
| Clustering and species annotation | MEGAN; MG-RAST |
| Maximum likelihood phylogeny trees | BEAST; RAxML; ClonalFrame; ClonalOrigin |
| Whole-genome alignment | |
| For SNP calling | MUMmer; Mugsy |
| For structural variant calling | Mauve |
| Gene annotation | |
| Bacterial | GLIMMER; RAST |
| Drug resistance in bacteria | ResFinder; ARG-ANNOT |
| Other | Influenza Virus Sequence Annotation Tool |
Citations for individual databases can be found in the Supplementary Data.
Abbreviations: AMOS, A Modular, Open-Source assembler; ARG-ANNOT, Antibiotic Resistance Gene–ANNOTation; BEAST, Bayesian Evolutionary Analysis Sampling Trees; BLAST, Basic Local Alignment Search Tool; GLIMMER, Gene Locator and Interpolated Markov Modeler; MEGAN, MetaGenome Analyzer Database; MG-RAST, Metagenomics Rapid Annotation Using Subsystem Technology; MIRA, Mimicking Intelligent Read Assembly; NCBI, National Center for Biotechnology Information; RAxML, Randomized Axelerated Maximum Likelihood; RAST, Rapid Annotation using Subsystem Technology; SNP, single-nucleotide polymorphism.
a Not an exhaustive list. Well-established tools are available for many specific subtasks.
Figure 1. A learning health system for infectious diseases. Next-generation sequencing (NGS) technologies now permit routine genomic analysis of clinical microbiology specimens. When integrated with pathogen phenotypes derived from clinical metadata in electronic medical records (EMRs) and laboratory metadata, we can generate predictive models for pathogen transmission, outbreaks, drug resistance, virulence, and risk factors for infection or critical outcomes that are specific to the health system and its patient population. If management strategies are formulated from these predictions and sent to infectious disease (ID) physicians and hospital infection control, a continuous loop of data analysis, application, and model refinement is created.
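As a rough illustration of the integration step described in the figure, the following minimal Python sketch joins NGS-derived strain types with EMR ward-stay records to flag patient pairs that carried the same strain while sharing a ward. It is not taken from the review: the record types (`Isolate`, `WardStay`), the function `candidate_links`, the strain labels, and all data values are hypothetical. A real system would use finer genomic distances and richer clinical metadata than a single strain label and ward name.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Isolate:
    """Hypothetical NGS result: one typed isolate per patient."""
    patient_id: str
    sequence_type: str  # e.g. a strain type assigned from sequencing data


@dataclass(frozen=True)
class WardStay:
    """Hypothetical EMR record: one ward stay for a patient."""
    patient_id: str
    ward: str
    start: date
    end: date


def stays_overlap(a: WardStay, b: WardStay) -> bool:
    """True if both stays are on the same ward and their dates intersect."""
    return a.ward == b.ward and a.start <= b.end and b.start <= a.end


def candidate_links(isolates, stays):
    """Return sorted (patient, patient, strain, ward) tuples for patient pairs
    with the same strain type and an overlapping stay on the same ward."""
    stays_by_patient = {}
    for stay in stays:
        stays_by_patient.setdefault(stay.patient_id, []).append(stay)

    links = set()
    for x, y in combinations(isolates, 2):
        if x.patient_id == y.patient_id or x.sequence_type != y.sequence_type:
            continue
        for sx in stays_by_patient.get(x.patient_id, []):
            for sy in stays_by_patient.get(y.patient_id, []):
                if stays_overlap(sx, sy):
                    first, second = sorted((x.patient_id, y.patient_id))
                    links.add((first, second, x.sequence_type, sx.ward))
    return sorted(links)


# Toy data, invented purely for illustration.
isolates = [
    Isolate("P1", "ST-A"),
    Isolate("P2", "ST-A"),
    Isolate("P3", "ST-B"),
    Isolate("P4", "ST-A"),
]
stays = [
    WardStay("P1", "ICU", date(2015, 1, 1), date(2015, 1, 10)),
    WardStay("P2", "ICU", date(2015, 1, 8), date(2015, 1, 15)),
    WardStay("P3", "ICU", date(2015, 1, 5), date(2015, 1, 12)),
    WardStay("P4", "Surgery", date(2015, 1, 1), date(2015, 1, 20)),
]

links = candidate_links(isolates, stays)
print(links)  # → [('P1', 'P2', 'ST-A', 'ICU')]
```

In this toy example only P1 and P2 are flagged: P3 overlaps on the ward but carries a different strain, and P4 carries the same strain but was never on the same ward. Such candidate links would only be a starting point for review by infection control, not a confirmed transmission event.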