Microbial genomics and related transcriptomics methods rely on culturing techniques to obtain enough DNA suitable for high‐throughput sequencing without resorting to DNA amplification techniques. A few micrograms of DNA are needed for most common next‐generation sequencing methods. For transcriptome analysis, sufficient cDNA is needed to measure low‐abundance mRNA copies in the cell. However, the large majority of microbes on earth resist cultivation, hampering research into their relevant gene pool, ecological niche or industrial relevance. For example, many environmental or gut‐related species cannot be grown outside their natural habitat. Even if we isolate the metagenome or the metatranscriptome from these environments, this reveals only a fragmented sequence landscape that is difficult to assign to individual species. Although enrichment techniques or metatranscriptome analysis of previously unculturable species have been shown to assist in directed culturing, e.g. of a Rikenella‐like bacterium (Bomar ), the unravelling of a complex metagenome into its individual genomes and their organization is impossible using current technologies.
A major challenge is the analysis of bacteria and other organisms living inside a complex matrix, such as a biofilm. Metagenome or transcriptome analysis of microorganisms has been described for biofilms consisting of a single species by scraping the biofilm to obtain enough material (Holmes ), but for multi‐species biofilms this method results in a metagenome or metatranscriptome dataset. The solution to these challenges may be the isolation and genomic analysis of unculturable single cells isolated from such environments. Here we describe in brief the state of the art in single‐cell microbial genomics.
Single‐cell isolation
Several methods exist to extract and investigate single microbial cells from their environment. Flow cytometry or fluorescence‐activated cell sorting (FACS) has been used since the 1970s and its applications in microbiology were recognized early (Fouchet ); recent advances are described by Müller and Nebe‐von‐Caron (2010), Wang and Bodovitz (2010), and Wang ). Micromanipulation has been described by Kvist ) and more recently by Woyke ). Microfluidic device techniques are shown to be effective by combining the separation of cells and subsequently performing biochemical reactions on the device itself, thereby maximizing reaction yield (Marcy ) (Fig. 1).
Figure 1
Photograph of a single‐cell isolation and genome amplification chip capable of processing nine samples in parallel (eight cells, one positive control). A. To visualize the architecture, the channels and chambers have been filled with blue food colouring and the control lines to actuate the valves have been filled with red food colouring (scale bar 5 mm). B. Schematic diagram of the automated sorting procedure. Closed valves are shown in red, open valves are transparent. Cells are drawn in green. C. Typical result of cell sorting showing for each unit (seven with a single cell and one negative control without a cell) a colour combination of a phase contrast image (gray) and a fluorescence image (green). A green overlaid square has been placed around the cell to ease visualization, whereas a red crossed square indicates the absence of cell. Scale bar is 100 µm. Reprinted from Marcy ).
Single‐cell genome sequencing and data analysis
Whereas classical next‐generation sequencing to determine an organism's genome sequence relies on pooling DNA from 10⁶–10⁸ cells, single‐cell genomics relies on whole‐genome amplification from a single cell. Most studies rely on multiple displacement amplification (MDA), a biochemical amplification technique using random primers and ϕ29 DNA polymerase (Dean ; Raghunathan ; Zhang ; Marcy ). Other amplification techniques, such as random‐primed PCR, over‐ and under‐represent different regions of the template DNA more strongly and generate very short fragments (Dean ; Hosono ). MDA, however, yields fragments of 12–100 kb, which makes them suitable for sequencing. Although the complete microbial genome from a single cell can be amplified to amounts required for current sequencing methods without a priori sequence knowledge, early studies suggested that up to 40% of the genomic sequence was missed (Podar ; Marcy ; Woyke ) (Table 1).
Table 1
Examples of single‐cell genome sequencing.
| Microorganism | Assembled bases (Mb) | Estimated % genome recovery | Scaffolds | Contigs | GC% | Single cell separation | Isolation source | Reference |
|---|---|---|---|---|---|---|---|---|
| TM7a (new phylum) | 2.865 | ? | | 1825 | 34.3 | Microfluidics | Human mouth biofilm | Marcy et al. (2007b) |
| TM7_GTL1 (new phylum) | 0.679 | ? | | 132 | 48.5 | FISH/FACS | Soil | Podar et al. (2007) |
| Prochlorococcus MED4 | | 95 | | 755 | | FACS | Sea water; lab culture [a] | Rodrigue et al. (2009) |
| Flavobacterium MS024‐2A | 1.905 | 91 | | 17 | 36 | Flow cytometer | Coastal water, Maine, USA | Woyke et al. (2009) |
| Flavobacterium MS024‐3C | 1.505 | 78 | | 21 | 39 | Flow cytometer | Coastal water, Maine, USA | Woyke et al. (2009) |
| Cand. Sulcia muelleri DMIN | 0.244 | 100 | 1 | 1 | 22.5 | Micromanipulator | Symbiont from insect bacteriome (green sharpshooter) | Woyke et al. (2010) |
| Poribacteria | 1.885 | 66 | | 1597 | 53.4 | FACS | Symbiont from marine sponge | Siegl et al. (2011) |
| Cand. Nitrosoarchaeum limnia SFB1 | 1.690 [b] | 95 | 26 | 136 | 32.4 | Microfluidics, laser tweezer | Ammonia‐oxidizing enrichment culture; sediment water, San Francisco Bay, USA | Blainey et al. (2011) |
[a] Method validation using strain with known genome sequence.
[b] Pooled sequence data from five individual cells; see Table 2.
Table 2
Assembly statistics for sequencing of three single cells of Nitrosoarchaeum limnia SFB1, and consensus genome (reads from metagenome and five single cells).
| Assembly statistics | Cell 23 | Cell 21 | Cell 3 | Five single cells co‐assembly | Consensus single cells and metagenome |
|---|---|---|---|---|---|
| Raw read bases | 17 107 411 | 52 341 561 | 29 999 202 | 118 796 782 | 150 994 537 |
| Assembly bases | 1 094 113 | 1 039 820 | 1 041 604 | 1 690 404 | 1 769 573 |
| Scaffolds | 68 | 76 | 83 | 26 | 2 |
| Unscaffolded contigs | 287 | 177 | 265 | 110 | 29 |
| Estimated % genome coverage | 62 | 59 | 59 | 95 | 99 |
Adapted from Table 1 of Blainey ).
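As a rough worked example, the raw read totals in Table 2 can be converted into average sequencing depths. The sketch below divides raw read bases by a reference length. The choice of reference length is an assumption of this illustration (the source does not state which length was used), and the variable and function names are hypothetical. Here the consensus assembly size of 1 769 573 bases is taken as the reference.

```python
# Values copied from Table 2; dictionary and function names are illustrative only.
RAW_READ_BASES = {
    "Cell 23": 17_107_411,
    "Cell 21": 52_341_561,
    "Cell 3": 29_999_202,
}

# Assumption: the consensus assembly (single cells plus metagenome)
# approximates the full genome length.
CONSENSUS_BASES = 1_769_573


def mean_depth(read_bases: int, reference_bases: int) -> float:
    """Average depth = total sequenced bases / reference length."""
    if reference_bases <= 0:
        raise ValueError("reference_bases must be positive")
    return read_bases / reference_bases


depth_vs_consensus = {
    cell: mean_depth(bases, CONSENSUS_BASES)
    for cell, bases in RAW_READ_BASES.items()
}

for cell, depth in depth_vs_consensus.items():
    print(f"{cell}: {depth:.1f}x")
# Cell 23: 9.7x
# Cell 21: 29.6x
# Cell 3: 17.0x
```

Under this assumption the three cells fall between roughly 10× and 30×. Dividing by each cell's own, smaller assembly instead would give higher figures, so the reference length should always be stated alongside a quoted depth.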
An overview of an MDA set‐up using a microfluidic device is shown in Fig. 2, although FACS‐based methods are also often reported in the literature (Rodrigue ; Siegl and Hentschel, 2010). All DNA in the initial sample will be amplified, which renders the method very prone to DNA contamination. Another disadvantage of the initial method is uneven amplification of the genome, which results in high‐coverage sequencing of the amplified genomic regions while remaining sequences may not be sufficiently covered (Zhang ). Marcy ) demonstrated that reducing MDA reaction volumes lowers non‐specific synthesis from contaminant DNA templates and unfavourable interactions between primers. The work of Rodrigue ) demonstrated a biochemical method to normalize the products obtained in MDA reactions. They also discussed the problem of chimera formation linking non‐contiguous chromosomal regions in MDA (Dean ; Zhang ), which may hamper sequence assembly and render mate‐pair data less effective for contig positioning. Several other single‐cell techniques are described in recent reviews by Wang and Bodovitz (2010), Kalisky and Quake (2011), and Pan ). As data analysis from single‐cell amplified genomes is equally challenging, the software framework SmashCell has been developed to automate the main steps in sequence assembly, gene prediction, annotation and visualization (Harrington ).
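The effect of uneven amplification can be made concrete with a small summary of per‐base read depth. The following is a minimal sketch, not a method from the studies cited: the function name, the toy depth profiles and the comparison against a Poisson expectation (which assumes reads land uniformly at random on the genome) are all assumptions introduced for illustration.

```python
import math
from statistics import mean, pstdev


def coverage_summary(depths):
    """Summarise a list of per-base read depths.

    Returns mean depth, breadth (fraction of positions with depth > 0),
    the coefficient of variation of depth, and the breadth expected if
    reads were placed uniformly at random (Poisson model: 1 - exp(-mean)).
    """
    if not depths:
        raise ValueError("depths must not be empty")
    avg = mean(depths)
    breadth = sum(1 for d in depths if d > 0) / len(depths)
    cv = pstdev(depths) / avg if avg > 0 else float("nan")
    return {
        "mean_depth": avg,
        "breadth": breadth,
        "cv": cv,
        "expected_breadth_uniform": 1.0 - math.exp(-avg),
    }


# Toy 10-position "genomes" with the same mean depth (10x) but
# different evenness; the numbers are invented for illustration.
even = [10] * 10
skewed = [50, 30, 15, 5, 0, 0, 0, 0, 0, 0]

even_stats = coverage_summary(even)
skewed_stats = coverage_summary(skewed)
print(even_stats)
print(skewed_stats)
```

Both toy profiles have the same mean depth, yet the skewed one covers only 40% of positions, whereas uniform sampling at that depth would be expected to cover nearly everything. A large gap between observed and expected breadth is one simple indicator of amplification bias.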
Figure 2
A mixture of cells sampled from a complex microbial ecosystem is introduced into the chip. Single cells are selected using an optical trap, and are sorted into chambers for cell lysis and genome amplification. Genomes are amplified in nanolitre MDA reactions to produce larger quantities of DNA (shown are SYBR Green–stained products in microfluidic reaction chambers). Sequencing libraries are created from the amplified genomic DNA for sequencing on a high‐throughput DNA sequencer. The sequence reads are assembled to recover the genome sequence, which is annotated to identify genes and pathways present in the original cell. Reprinted by permission from Macmillan Publishers Ltd: Nature Methods (Kalisky and Quake, 2011), copyright 2011. The microfluidics image was reprinted from Leslie (2011).
Single‐cell genome sequences of uncultured microorganisms
Examples of sequencing of single amplified genomes (SAGs) are listed in Table 1. Woyke ) describe using a micro‐displacement technique to sequence a genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN, a symbiont isolated from the bacteriome of the green sharpshooter Draeculacephala minerva. This polyploid bacterium has an estimated 200–900 genome copies per cell. Of the 57 Mb of sequence generated, approximately 90% was of contaminant origin, as estimated by mapping to a previously sequenced genome of Sulcia and phylogenetic analysis with blastx and MEGAN (Mitra ). The remaining reads were assembled into a draft genome, misassemblies due to chimeras were corrected manually, and subsequent application of primer walking, sequencing PCR products and Illumina sequencing resulted in a final finished genome (Fig. 3).
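To put the contamination figure in perspective, the arithmetic can be sketched as follows. The inputs are the 57 Mb of sequence and the approximately 90% contaminant fraction quoted above, together with the 0.244 Mb assembled genome size listed for Sulcia DMIN in Table 1. The function name is hypothetical, and the calculation assumes all non‐contaminant bases belong to the target genome.

```python
def target_yield(total_bases: float, contaminant_fraction: float, genome_bases: float):
    """Return (target bases, average depth on the target genome).

    Assumes every base that is not contaminant derives from the target.
    """
    if not 0.0 <= contaminant_fraction <= 1.0:
        raise ValueError("contaminant_fraction must be between 0 and 1")
    if genome_bases <= 0:
        raise ValueError("genome_bases must be positive")
    target_bases = total_bases * (1.0 - contaminant_fraction)
    return target_bases, target_bases / genome_bases


# 57 Mb sequenced, ~90% contaminant, 0.244 Mb genome (Table 1)
target_bases, depth = target_yield(57e6, 0.90, 0.244e6)
print(f"target bases: {target_bases / 1e6:.1f} Mb, depth: {depth:.0f}x")
```

Under these assumptions about 5.7 Mb of sequence remains for the target, which still corresponds to an average depth of just over 20× on a genome this small.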
Figure 3
Sulcia cell isolation and sequence coverage, closure and polishing locations along the Sulcia DMIN single cell genome. A. Micromanipulation of the single Sulcia cell from the sharpshooter bacteriome metasample. B. Sequence coverage including closure and polishing locations along the finished, circular Sulcia DMIN. Reprinted from Woyke ). For figure details see the original article.
Siegl ) used FACS to isolate cells from the candidate phylum Poribacteria and subsequently MDA to obtain a SAG. These bacteria are almost exclusively found in marine sponges as symbionts and resist cultivation efforts. The SAG of 1.88 Mb was contained in 1597 contigs, which covered an estimated two‐thirds of the total genomic DNA based on the distribution of tRNA genes and their specificities found in the contigs. Nevertheless, a comprehensive overview of poribacterial metabolism could be deduced (Fig. 4). The extensive Sup‐type polyketide synthases found in the SAG of Poribacteria confirmed the previously proposed assignment of Sup‐PKS to this species. With the finding of a second putative PKS system showing high similarity to the lipopolysaccharide type I PKS WcbR from Nitrosomonas and Burkholderia, as well as RkpA from Sinorhizobium fredii, they suggested that Poribacteria contain at least two different types of PKS systems and their products may be involved in sponge–microbe interactions. This study showed that single‐cell genomics is highly capable of dissecting the genomic information from unculturable bacteria, shedding light on genomic organization, metabolic functions and possibly new insight in the debate on the origin of sponge bioactive compounds.
Figure 4
A schematic overview of poribacterial metabolism as deduced from SAG sequencing. Reprinted by permission from Macmillan Publishers Ltd: The ISME Journal (Siegl ), copyright 2011.
Ammonia‐oxidizing archaea (AOA) are among the most abundant microbes on Earth, and may significantly impact global nitrogen and carbon cycles. Five single cells were isolated from a low‐salinity sediment AOA‐enrichment culture using a microfluidic device and laser tweezers, and DNA was amplified and sequenced separately from each cell (Blainey ) (Tables 1 and 2). Individually, three single‐cell datasets gave assemblies of more than 1 Mb at sequencing depths of 10× to 30×, and an estimated 60% genomic coverage each; the low coverage is considered typical due to MDA amplification bias. Surprisingly, each of the single‐cell assemblies represented a different 60% of the target genome, and combining the five datasets led to a single‐cell assembly representing > 95% of the Nitrosoarchaeum limnia genome. Based on nucleotide identity comparisons, this AOA is proposed to represent a new genus of Crenarchaeota. In contrast to other described AOA, this low‐salinity archaeum appears to be motile, based on the presence of numerous motility and chemotaxis‐associated genes in the genome (Blainey ).
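Why pooling several cells helps can be illustrated with a simple probability sketch. It assumes, purely for illustration, that each cell's assembly covers an independent random 60% of the genome. This independence assumption is not taken from the study, and real MDA bias may favour the same regions in different cells. The function name is hypothetical.

```python
def expected_union_coverage(per_cell_fraction: float, n_cells: int) -> float:
    """Expected fraction of the genome covered by at least one of n cells,
    assuming each cell covers an independent random fraction of the genome."""
    if not 0.0 <= per_cell_fraction <= 1.0:
        raise ValueError("per_cell_fraction must be between 0 and 1")
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    # A position is missed only if every cell misses it.
    return 1.0 - (1.0 - per_cell_fraction) ** n_cells


union_by_cells = {n: expected_union_coverage(0.60, n) for n in range(1, 6)}
for n, fraction in union_by_cells.items():
    print(f"{n} cell(s): {fraction:.1%}")
```

Under this idealized model, three cells would already be expected to cover about 94% of the genome and five cells about 99%. The reported figure of more than 95% for the five‐cell co‐assembly is of the same order, which fits the observation that the individual cells covered largely different parts of the genome.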
Single‐cell transcriptomics, metabolomics and proteomics
Recent reports on single‐cell transcriptomics discuss mainly the analysis of polyadenylated mRNA of eukaryotes. A comprehensive overview of the technologies involved is given by Tang ). In short, the single‐cell methods exploit reverse transcription using oligo(dT) primers to convert mRNAs with poly(A) tails into cDNAs, followed by uniform amplification and sequencing (RNA‐seq). However, currently no single‐cell analysis reports are known that exploit protocols for mRNA extraction from bacterial cells, for instance using the MessageAmp II‐Bacteria Kit (Ambion) as described by Frias‐Lopez ). Single‐cell metabolome and proteome/peptidome analyses are still in their infancy, as these compounds cannot be amplified and their analysis requires technological breakthroughs in pushing the limits of detection (Rubakhin ).
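The selectivity of oligo(dT) priming can be pictured with a toy filter that keeps only transcripts ending in a run of adenines. This is an illustrative sketch only: the function name, the example sequences and the minimum tail length of 15 are hypothetical choices, not parameters from any protocol cited here.

```python
def has_poly_a_tail(transcript: str, min_tail: int = 15) -> bool:
    """Return True if the 3' end of the transcript is a run of at least
    `min_tail` adenines (the threshold is an arbitrary illustrative value)."""
    seq = transcript.strip().upper()
    tail_length = len(seq) - len(seq.rstrip("A"))
    return tail_length >= min_tail


# Invented example sequences (RNA alphabet).
polyadenylated = "AUGGCCUUCGGAUAA" + "A" * 20
non_polyadenylated = "AUGGCCUUCGGAUAA"

captured = [t for t in (polyadenylated, non_polyadenylated) if has_poly_a_tail(t)]
print(len(captured))  # 1
```

Transcripts that lack such a tail are not selected by this criterion, which is why protocols built around oligo(dT) priming do not transfer directly to mRNA without poly(A) tails.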
Future
Since the introduction of single‐cell genomics (Raghunathan ), there have been surprisingly few reports of successful reconstruction of whole genomes from single unculturable bacterial cells (Table 1). This undoubtedly reflects the extreme difficulties in the various steps of single‐cell isolation, miniaturization, DNA amplification, avoidance of contamination and data analysis. Nevertheless, the pioneering examples show that it is definitely feasible to sequence genomes of single unculturable cells isolated from complex consortia, and we expect this approach to become more widespread as miniaturization technologies improve.
Recently, it has also been recognized that isogenic microbial populations (pure cultures) contain substantial cell‐to‐cell differences in physiological parameters such as growth rate, resistance to stress and regulatory circuit output (Ingham ; Lidstrom and Konopka, 2010). In this light, adapting microfluidic single‐cell genome sequencing approaches to RNA‐seq transcriptome analysis of single cells should become increasingly important (Siezen ).