Abstract
Next generation sequencing (NGS) technologies have greatly accelerated research in the biological sciences in recent years by enabling the production of large volumes of sequence data at a drastically lower price per base than traditional sequencing methods. Recent and ongoing developments in the field make it possible to address research questions in plant-microbe biology that were not conceivable just a few years ago. The present review provides an overview of NGS technologies and their usefulness for the analysis of microorganisms that live in association with plants. Possible limitations of the different sequencing systems, in particular sources of error and bias, are critically discussed, and methods that help to overcome these shortcomings are described. A focus is on the application of NGS methods in metagenomic studies, including the analysis of microbial communities by amplicon sequencing, which can be considered a targeted metagenomic approach. Different applications of NGS technologies are exemplified by selected research articles that address the biology of the plant-associated microbiota to demonstrate the value of the new methods.
Keywords: amplicon sequencing; metagenomics; next generation sequencing; phyllosphere; plant microbiota; rhizosphere
Year: 2014 PMID: 24904612 PMCID: PMC4033234 DOI: 10.3389/fpls.2014.00216
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Schematic presentation of the library preparation and sequencing process of the most commonly used next generation sequencing platforms. All types of starting molecules are converted into double-stranded DNA molecules that are flanked by adapters. Adapters are specific to the sequencing platform and enable the binding of the library molecules to surfaces, either beads or a flow cell, where they are amplified prior to sequencing. Clonal amplicons are spatially separated on glass slides, chips, or a picotiterplate. Sequencing is either a sequencing by ligation process with fluorescently labeled oligonucleotides of known sequence (SOLiD) or a sequencing by synthesis process. During Illumina sequencing, four differently labeled nucleotides are flushed over the flow cell in multiple cycles, the number of which depends on the desired read length. During 454 and Ion PGM sequencing, unlabeled nucleotides are flushed over the flow cell in a sequential order. Incorporation is detected via a coupled light reaction (454) or via the detection of proton release during nucleotide incorporation (Ion PGM).
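The sequential nucleotide flow described for 454 and Ion PGM sequencing can be illustrated with a minimal sketch. It assumes an idealized, noise-free signal that is exactly proportional to the number of bases incorporated per flow, and a cyclic flow order `TACG`. The function names `flowgram` and `call_bases` and the flow order are illustrative assumptions, not taken from the review or from any vendor software.

```python
def flowgram(read_sequence: str, flow_order: str = "TACG", n_flows: int = 16) -> list[int]:
    """Idealized flowgram: number of bases incorporated in each nucleotide flow.

    read_sequence is the strand being synthesized. Each flow offers a single
    nucleotide type; a homopolymer run is incorporated within one flow and
    yields one proportionally stronger signal.
    """
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        while pos < len(read_sequence) and read_sequence[pos] == base:
            run += 1
            pos += 1
        signals.append(run)
    return signals


def call_bases(signals: list[int], flow_order: str = "TACG") -> str:
    """Translate per-flow incorporation counts back into a base sequence."""
    return "".join(flow_order[i % len(flow_order)] * n for i, n in enumerate(signals))


signals = flowgram("TTACGGG", n_flows=4)   # one flow each of T, A, C, G
print(signals)                             # [2, 1, 1, 3]
print(call_bases(signals))                 # TTACGGG

# Underestimating the homopolymer signal in the G flow (3 read as 2)
# produces a deletion, not a substitution.
print(call_bases([2, 1, 1, 2]))            # TTACGG
```

Because a whole homopolymer is read as a single signal intensity, a misjudged intensity changes the number of called bases, which is consistent with indels being the typical error type of these platforms.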
Technological specifications of currently commercially available next generation sequencing platforms.
| Company | Platform | Template amplification | Solid support | Sequencing chemistry | Nucleotide modification | Signal detection | Dominant error type |
|---|---|---|---|---|---|---|---|
| Roche (454 until 2006) | 454 FLX Titanium | emPCR on microbeads | Picotiterplate | Pyrosequencing | None (except for dATP, which is added as thiol derivative dATPαS) | Optical detection of light, emitted in secondary reactions initiated by release of PPi upon nucleotide incorporation | Indels in homopolymeric regions |
| | 454 FLX+ | | | | | | |
| | 454 GS Junior Titanium | | | | | | |
| Illumina (Solexa until 2007) | Illumina GAIIx | Bridge-PCR on flow cell surface | Flow cell | Reversible terminator sequencing by synthesis | End-blocked fluorescent nucleotides | Optical detection of fluorescent emission from incorporated dye-labeled nucleotides | Substitutions, in particular at the end of the read |
| | Illumina HiSeq1000 | | | | | | |
| | Illumina HiSeq1500 | | | | | | |
| | Illumina HiSeq2000 | | | | | | |
| | Illumina HiSeq2500 | | | | | | |
| | Illumina MiSeq | | | | | | |
| | Illumina NextSeq 500 | | | | | | |
| | Illumina HiSeq X Ten | | | | | | |
| Life Technologies (Agencourt until 2006, Applied Biosystems until 2008) | SOLiD 4 | emPCR on microbeads; PCR on FlowChip surface for the 5500 W models | FlowChip | Sequencing by ligation | 2-base encoded fluorescent oligonucleotides | Optical detection of fluorescent emission from ligated dye-labeled oligonucleotides | Substitutions, in particular at the end of the read |
| | SOLiD 5500 | | | | | | |
| | SOLiD 5500xl | | | | | | |
| | SOLiD 5500 W | | | | | | |
| | SOLiD 5500xl W | | | | | | |
| Life Technologies (Ion Torrent until 2010) | Ion PGM | emPCR on microbeads | Ion Chip, a semiconductor chip | Semiconductor-based sequencing by synthesis | None | Transistor-based detection of H+ shift upon nucleotide incorporation | Indels |
| | Ion Proton | | | | | | |
| Pacific Biosciences | PacBio RS | Not applied | SMRT cell | Single-molecule, real-time DNA sequencing by synthesis | Phospholinked fluorescent nucleotides | Real-time optical detection of fluorescent dye in polymerase active site during incorporation | Indels |
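The "2-base encoded" SOLiD chemistry listed in the table can be made concrete with a short sketch. The table itself does not spell out the encoding; the version below assumes the commonly described color-space scheme in which each of four colors (0 to 3) represents a pair of adjacent bases and equals the XOR of the 2-bit base codes A=0, C=1, G=2, T=3. The function names are hypothetical.

```python
# Assumed 2-bit base codes of the commonly described SOLiD color space.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_colors(seq: str) -> list[int]:
    """Encode each pair of adjacent bases as one color (0-3)."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base: str, colors: list[int]) -> str:
    """Decode a color string, given the known first base."""
    bases = [first_base]
    for color in colors:
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ color])
    return "".join(bases)


colors = encode_colors("ATGGC")
print(colors)                          # [3, 1, 0, 3]
print(decode_colors("A", colors))      # ATGGC

# A single miscalled color changes every base downstream of it when decoding.
print(decode_colors("A", [3, 2, 0, 3]))   # ATCCG
```

Since every base is interrogated in two overlapping dinucleotides, a true single-base variant changes two adjacent colors, whereas an isolated color change is more likely a measurement error. For this reason color-space reads are typically aligned in color space before being translated to bases.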
Data output of currently commercially available next generation sequencing platforms.
| Platform | Run configuration | Read length | Sequencing time | Output | Number of reads (million) |
|---|---|---|---|---|---|
| 454 FLX+ | 1 PTP with gaskets to separate 2, 4, 8 or 16 regions | FLX (modal 450 bp, max. 600 bp) | 10 h | 450 Mb | 1 per PTP (0.7 for amplicons) |
| | | FLX+ (modal 700 bp, max. 1000 bp) | 23 h | 700 Mb | 1 per PTP (0.7 for amplicons) |
| 454 GS Junior Titanium | 1 PTP | ~450 bp | 10 h | 35 Mb | 0.1 per PTP (0.07 for amplicons) |
| HiSeq 2000/2500 (High output mode) V3 kits | 8 lanes per flow cell, 1 or 2 flow cells per run | 36 bp | 2 days | 95–105 Gb | 165–185 per lane |
| | | 2 × 50 bp | 5.5 days | 270–300 Gb | |
| | | 100 bp | 5 days | 270–300 Gb | |
| | | 2 × 100 bp | 11 days | 540–600 Gb | |
| HiSeq 2000/2500 (High output mode) V4 kits | 8 lanes per flow cell, 1 or 2 flow cells per run | 36 bp | 29 h | 128–144 Gb | 250 per lane |
| | | 2 × 50 bp | 2.5 days | 360–400 Gb | |
| | | 2 × 100 bp | 5 days | 720–800 Gb | |
| | | 2 × 125 bp | 6 days | 900–1000 Gb | |
| HiSeq 2500 (Rapid run mode) V3 kits | 2 lanes per flow cell (not independent), 1 or 2 flow cells per run | 36 bp | 7 h | 18–22 Gb | 125–150 per lane |
| | | 2 × 50 bp | 16 h | 50–60 Gb | |
| | | 2 × 100 bp | 27 h | 100–120 Gb | |
| | | 2 × 150 bp | 40 h | 150–180 Gb | |
| HiSeq X Ten | 1 or 2 flow cells | 2 × 150 bp | <3 days | 1.6–1.8 Tb | 3000 per flow cell |
| MiSeq, V2 kits | 1 lane, 1 flow cell | 36 bp | 4 h | 540–610 Mb | 12–15 per flow cell |
| | | 2 × 25 bp | 5.5 h | 750–850 Mb | |
| | | 2 × 150 bp | 24 h | 4.5–5.1 Gb | |
| | | 2 × 250 bp | 39 h | 7.5–8.5 Gb | |
| MiSeq, V3 kits | 1 lane, 1 flow cell | 2 × 75 bp | 24 h | 3.3–3.8 Gb | 22–25 per flow cell |
| | | 2 × 300 bp | 55 h | 13.2–15 Gb | |
| NextSeq 500 (High output mode) | 4 lanes (not independent), 1 flow cell | 75 bp | 11 h | 25–30 Gb | 400 per flow cell |
| | | 2 × 75 bp | 18 h | 50–60 Gb | |
| | | 2 × 150 bp | 29 h | 100–120 Gb | |
| NextSeq 500 (Mid output mode) | 4 lanes (not independent), 1 flow cell | 2 × 75 bp | 15 h | 16–20 Gb | 130 per flow cell |
| | | 2 × 150 bp | 26 h | 32–39 Gb | |
| SOLiD 5500xl | 2 × 6 lanes | 75 bp | 5 days | 160 Gb | 160 per lane |
| | | 75 bp + 35 bp | 8 days | 220 Gb | |
| | | 60 bp + 60 bp | 8 days | 260 Gb | |
| SOLiD 5500xl W | 2 × 6 lanes | 50 bp | 4 days | 160 Gb | 265 per lane |
| | | 75 bp | 5 days | 240 Gb | |
| | | 2 × 50 bp | 8 days | 320 Gb | |
| Ion PGM, 314 chip v2 | 1 Chip | 200 bp mode | 2.3 h | 30–50 Mb | 0.4–0.55 per chip |
| | | 400 bp mode | 3.7 h | 60–100 Mb | |
| Ion PGM, 316 chip v2 | 1 Chip | 200 bp mode | 3.0 h | 300–600 Mb | 2–3 per chip |
| | | 400 bp mode | 4.9 h | 600 Mb–1 Gb | |
| Ion PGM, 318 chip v2 | 1 Chip | 200 bp mode | 4.4 h | 600 Mb–1 Gb | 4–5.5 per chip |
| | | 400 bp mode | 7.3 h | 1.2–2.0 Gb | |
| Ion Proton, PI chip | 1 Chip | 200 bp mode | 2–4 h | Up to 10 Gb | 60–80 per chip |
| PacBio RS II | Up to 16 SMRT cells | C2/P4 chemistry, mean read length ~8000 bp | 2–3 h per cell | 400 Mb per cell | 0.05 per SMRT cell |
“2 ×” refers to paired end runs; more run conditions in the given range are possible for Illumina instruments.
Sequencing time does not include library amplification, except for the MiSeq and NextSeq platforms.
Output for 2 flow cells per run in case of the Illumina HiSeq systems.
The two reads of a paired end read are counted as one paired end read here.
Lanes can only be independently loaded with different libraries if cluster amplification is done on the cBot.
Not yet available, dedicated to human genome sequencing.
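The output figures in the table follow from read number and read length. The sketch below shows this arithmetic, assuming that the read counts in the last column are given in millions and that, as stated in the notes above, the two reads of a paired end read count as one. The function name `expected_output_gb` is hypothetical.

```python
def expected_output_gb(reads_million: float, read_length_bp: int, paired: bool = False) -> float:
    """Approximate sequence output in Gb from read number and read length.

    For paired end runs, each counted read pair contributes two reads of
    read_length_bp each.
    """
    bases_per_read = read_length_bp * (2 if paired else 1)
    return reads_million * 1e6 * bases_per_read / 1e9


# MiSeq V3 kit, 2 x 300 bp, 22-25 million read pairs per flow cell
low = expected_output_gb(22, 300, paired=True)
high = expected_output_gb(25, 300, paired=True)
print(f"{low:.1f}-{high:.1f} Gb")   # 13.2-15.0 Gb

# PacBio RS II, ~8000 bp mean read length, 0.05 million reads per SMRT cell
print(f"{expected_output_gb(0.05, 8000):.1f} Gb")   # 0.4 Gb
```

Both results agree with the tabulated values (13.2–15 Gb for the MiSeq run and 400 Mb per SMRT cell). Real yields are lower after quality filtering and adapter trimming, so the calculation gives an upper estimate.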
Figure 2. Construction of mate pair libraries.
Metagenomic studies based on NGS technology that target the plant-associated microbiota.
| Platform | Sequencing data | Habitat | Sample | Main findings | Reference |
|---|---|---|---|---|---|
| Roche 454 | 3.2 million raw reads | Rhizosphere | Soybean ( | The rhizosphere community is selected from the bulk soil based on functions related to N, Fe, P, and K metabolism | Mendes et al., |
| | 2,472,359 filtered reads | | | | |
| | Mean read number per sample 103,014 | | | | |
| | Mean read length 523 bp | | | | |
| Roche 454 | Not specified | Rhizosphere | Barley rhizosphere samples collected from an experimental field in Ireland with 15 years of barley monoculture under low-input mineral management regime | Identification of genes and operons involved in mineral phosphate solubilization in the rhizosphere | Chhabra et al., |
| Illumina MiSeq | 15 million paired end reads | Phyllosphere | Samples from | Differences in metagenomic composition of replicate phyllosphere enrichment cultures; enrichment of | Ottesen et al., |
| | 2.6 Gbp | | | | |
| Roche 454 | Not specified | Phyllosphere | Leaves, stems, roots, flowers, and fruits from outdoor grown tomato ( | Distinct microbial communities detected on different tomato plant organs | Ottesen et al., |
| | | Rhizosphere | | | |
| Roche 454 | 8445 and 3799 filtered reads | Rhizosphere | Rhizosphere samples from greenhouse grown | Differences in microbial community composition in the rhizosphere of the differently developed plants; identification of genes related to phytic acid utilization | Unno and Shinano, |
| | Mean read length 228 and 226 bp | | | | |
| Roche 454 | 448 Mb sequence data | Phyllosphere | Leaf samples of tamarisk ( | Diverse microbial rhodopsins detected in phyllosphere bacteria | Atamna-Ismaeel et al., |
| | Mean read length 357 bp | | | Detection of genes encoding proteins involved in anoxygenic photosynthesis ( | Atamna-Ismaeel et al., |
| Roche 454 | 832 and 396 Mb of sequence data per sample | Phyllosphere | Phyllosphere and rhizosphere sample of field grown rice ( | Contrasting proteome patterns in phyllosphere and rhizosphere of rice | Knief et al., |
| | | Rhizosphere | | | |
| Roche 454 | 1,109,816 reads | Phyllosphere | Leaf samples from field grown soybean ( | High consistency in the microbial community composition and their proteomes on different host plants | Delmotte et al., |
| | 260 Mb of sequence data | | | | |
| | 235 bp mean read length | | | | |
| Roche 454 | 419,571 reads | (Phyllosphere) | Psyllid infected with the endophyte “ | Complete genome sequence of the uncultured plant pathogen and insect symbiont “ | Duan et al., |
| | 216 bp mean read length | | | | |
| | 90,813,125 bp of sequence data | | | | |