Brittany Goldberg, Heike Sichtig, Chelsie Geyer, Nathan Ledeboer, George M Weinstock.
Abstract
Next-generation DNA sequencing (NGS) has progressed enormously over the past decade, transforming genomic analysis and opening up many new opportunities for applications in clinical microbiology laboratories. The impact of NGS on microbiology has been revolutionary, with new microbial genomic sequences being generated daily, leading to the development of large databases of genomes and gene sequences. The ability to analyze microbial communities without culturing organisms has created the ever-growing field of metagenomics and microbiome analysis and has generated significant new insights into the relation between host and microbe. The medical literature contains many examples of how this new technology can be used for infectious disease diagnostics and pathogen analysis. The implementation of NGS in medical practice has been a slow process due to various challenges such as clinical trials, lack of applicable regulatory guidelines, and the adaptation of the technology to the clinical environment. In April 2015, the American Academy of Microbiology (AAM) convened a colloquium to begin to define these issues, and in this document, we present some of the concepts that were generated from these discussions.
Year: 2015; PMID: 26646014; PMCID: PMC4669390; DOI: 10.1128/mBio.01888-15
Source DB: PubMed; Journal: mBio; Impact factor: 7.867
Glossary of terms used in DNA sequence analysis
| Term | Abbreviation | Definition |
|---|---|---|
| 16S rRNA gene | | A slowly evolving gene in bacteria whose sequence |
| Alignment | | The process of comparing the sequence of a single |
| Assembly | | Reconstructing a genome, in whole or in part, |
| Contig | | A contiguous stretch of sequence produced when |
| Dideoxynucleotide sequencing | | A “classical” method of DNA sequencing that preceded |
| Metagenomics | | Analyzing a mixture of microbial genomes, a metagenome, |
| Metagenomic whole-genome | mWGS | The application of WGS to a metagenomics sample. DNA |
| Microbiome | | A community of microbes comprising bacteria, viruses, |
| Next-generation sequencing | NGS | A collection of DNA sequencing methods that each |
| Reference genome | | A genome sequence of a particular organism that can |
| Read | | The basic element produced by DNA sequencing. |
| Sanger sequencing | | A “classical” method of DNA sequencing that preceded |
| Single nucleotide polymorphism | SNP | A difference of a single base compared to a reference |
| Variant | | Any difference in a DNA sequence compared to |
| Whole-genome shotgun sequencing | WGS | Randomly fragmenting an entire genome and obtaining |
Current spectrum of popular, available NGS instruments (as of July 2015)
| Instrument | Read length | No. of reads | Run time | Run cost ($) | No. of genomes/metagenomes per run | Comment |
|---|---|---|---|---|---|---|
| ILMN MiSeq | ≤300 bases | ≤20 million | ≤2.5 days | ≤1,500 | 40/4 | Read pairs |
| ILMN NextSeq | ≤150 bases (high) | ≤400 million | ≤1.2 days | ≤5,000 | 400/60 | Read pairs |
| ILMN | ≤250 bases (high) | ≤300 million | 2.5 days | <8,000 | 300/80 | Read pairs, |
| PB RSII | ~10 kb (low) | 70,000 | ≤4 h | ≤1,200 | 3/TBD | Shorter (e.g., 2-kb) |
| LT Ion PGM | ≤400 bases | ≤5 million | ≤7 h | ≤750 | 7/1 | PGM318 chip; |
| LT Ion Proton | ≤200 bases | ≤80 million | ≤4 h | 1,000 | 30/16 | PI chip; no |
| ONT MinION | 10 kb–100 kb (low) | Variable | Variable | 1,000 | TBD | No paired ends; |
Abbreviations: ILMN, Illumina; LT, Life Technologies; PB, Pacific Biosciences; ONT, Oxford Nanopore Technologies; TBD, to be determined. ILMN data are from Illumina. LT data are from http://allseq.com/knowledgebank/sequencing-platforms/life-technologies-ion-torrent. Genomes/run assumes 100× coverage of a 3-Mb genome. Metagenomes/run assumes 5 million reads (pairs) per metagenome.
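The per-run capacity figures in the table can be approximated from the footnote's definitions. The following is a minimal sketch, not the authors' calculation: the function names are hypothetical, the inputs are the table's upper bounds, and it assumes that an instrument listed with "Read pairs" yields two reads per counted fragment.

```python
def genomes_per_run(reads, read_length, paired=False,
                    coverage=100, genome_size=3_000_000):
    """Estimate genomes per run at a given coverage (defaults: 100x, 3-Mb genome)."""
    # Assumption: for paired reads, each counted read yields two sequences.
    total_bases = reads * read_length * (2 if paired else 1)
    return total_bases / (coverage * genome_size)


def metagenomes_per_run(reads, reads_per_metagenome=5_000_000):
    """Estimate metagenomes per run at 5 million reads (pairs) per metagenome."""
    return reads / reads_per_metagenome


# ILMN MiSeq upper bounds from the table: 20 million reads, 300 bases, read pairs.
miseq_genomes = genomes_per_run(20_000_000, 300, paired=True)
miseq_metagenomes = metagenomes_per_run(20_000_000)
print(miseq_genomes, miseq_metagenomes)  # 40.0 4.0
```

For the MiSeq row this reproduces the listed 40/4. Other rows need not match this simple estimate, since the published figures may rest on further assumptions that the footnote does not state.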
FIG 1 NGS genome analysis. The general process of using NGS for analysis of a single genome is depicted in this figure. Note that there are many variations on this approach. Purified DNA (i.e., input genome) is fragmented and run through the DNA sequencing process. The sequencing instrument produces either short (e.g., for Illumina) or long (e.g., for Pacific Biosciences) sequence reads depending on the platform used. When the goal is to produce a complete genome (e.g., for identifying virulence or antibiotic resistance genes or for comparative genomic studies), the reads are assembled into genomes using specialized software. Some gaps in the assembly may occur, leading to a draft genome sequence composed of many contigs. When the genome is assembled without gaps, it is said to be closed. When the goal is to identify variants (e.g., SNPs) with respect to a reference genome, the reads can be aligned directly with the reference genome and sequence variants can be identified using specialized programs. These variants serve to define the organism at the subspecies and strain level and are useful for epidemiological tracking as well as for identifying mutations that arise.
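The reference-based branch of this process can be illustrated with a toy example. This is a sketch under strong assumptions and not one of the specialized programs the caption refers to: reads are taken as already aligned without gaps, each read is given as a hypothetical `(start, sequence)` pair with a 0-based start position, and a SNP is reported wherever a strict majority of the covering reads disagrees with the reference.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=2):
    """Report (position, reference_base, observed_base) for majority mismatches.

    aligned_reads: list of (start, sequence) tuples, ungapped, 0-based start.
    """
    # Build a per-position pileup of the bases observed in the reads.
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call at this position
        base, support = counts.most_common(1)[0]
        # Call a SNP only if a strict majority of reads carries the variant.
        if base != reference[pos] and support * 2 > depth:
            snps.append((pos, reference[pos], base))
    return snps


reference = "ACGTACGTAC"
reads = [(0, "ACGTTCGT"), (2, "GTTCGTAC"), (3, "TTCGTAC")]
print(call_snps(reference, reads))  # [(4, 'A', 'T')]
```

Real pipelines also handle base qualities, insertions and deletions, and alignment uncertainty, all of which this sketch ignores.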
FIG 2 NGS workflow. A high-level overview of the steps taken in the NGS data production process and some of the equipment and software used in this process. The specific instruments and programs may vary as there are multiple solutions (e.g., Table 2). DB, database; QA, quality assurance; seq’ing, sequencing; conc, concentration; LIMS, laboratory information management system.
FIG 3 Potential clinical applications for metagenomics sequencing. There are numerous potential applications of NGS technology to the clinical microbiology laboratory. Each entry in the chart represents a potential area for the utilization of NGS diagnostics and/or future research.
FIG 4 The FDA is considering the following information for the clearance/approval of an infectious disease NGS-based test/assay. The FDA presubmission process can be utilized for outstanding questions and to request additional information while policy is still being developed (26). IRB, institutional review board.