
Whole-genome sequencing in bacteriology: state of the art.

Abstract

Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, and highlights recent studies that demonstrate new applications for bacterial genomics.


Keywords: bacterial genome sequencing assembly review

Year: 2013 PMID: 24143115 PMCID: PMC3797280 DOI: 10.2147/IDR.S35710

Source DB: PubMed Journal: Infect Drug Resist ISSN: 1178-6973 Impact factor: 4.003

History of bacterial genome sequencing

The first sequenced bacterial genome was Haemophilus influenzae1 in 1995. The Genomes Online Database2 now lists 2,264 finished bacterial genomes and 4,067 permanent draft genomes (genomes that are sequenced but not completely closed). The majority of these have been deposited since 2008, after the commercial introduction of high-throughput sequencing. A number of sequencing techniques have subsequently been introduced, making bacterial genome sequencing significantly cheaper and easier. This has decreased the cost per megabase of sequence by five logs (see Figure 1), which has allowed large numbers of genomes to be sequenced and has moved the field from sequencing individual genomes to sequencing multiple strains. However, the general workflow of bacterial sequencing remains largely unchanged: sample preparation, DNA sequencing, sequence assembly, and bioinformatic analysis. This review will examine each of these steps, as well as some of the current applications of these technologies.

Figure 1

Cost per megabase of sequencing, from 2001 to 2012.

Adapted from the NIH NHGRI Genome Sequencing Program website (http://www.genome.gov/sequencingcosts/).

Abbreviations: NIH, National Institutes of Health; NHGRI, National Human Genome Research Institute.

Sample preparation

The major advance in sample preparation has been more effective isolation of small amounts of DNA, allowing genome sequencing from limited or degraded initial samples. This includes the development of multiple displacement amplification (MDA), an isothermal amplification technique. MDA uses the phi29 DNA polymerase combined with random hexamers to produce DNA fragments in the multiple-kilobase range.3 This allows genomic-scale sequencing from small starting samples of DNA. Based on studies in Anaplasma, it appears that sequencing after phi29 amplification provides genomic coverage and single nucleotide polymorphism (SNP) rates similar to those of traditional sample preparation.4 While additional chimeric sequences (a single sequence derived from two separate pieces of DNA) were generated, these did not interfere with genome assembly. MDA has been used to sequence the genome from a single unculturable intracytoplasmic symbiont of Draeculacephala minerva.5 This approach may provide a significant amount of information on genome sequences from unculturable bacteria, allowing whole genome sequencing rather than the limited information from metagenomic studies.

Sequencing technologies

The biggest revolution in genomics in the last several years has been the emergence of new sequencing technologies. These have shifted the bottleneck in genome sequencing from generation of raw sequence to bioinformatic processing of samples. Each sequencing technology has specific strengths and weaknesses, making selection of the appropriate technique important to obtaining the desired experimental results. Tables 1 and 2 give an overview of different sequencing technologies and some relative strengths and weaknesses. Individual techniques are described below. However, these technologies are continuously revised; the mean read length of pyrosequencing, for example, has grown from approximately 150 bp6 to approximately 700 bp7 in the last five years. Consulting the sequencing center early in the planning stage of an experiment helps in obtaining the best results, as its staff can provide updates on the technologies in use and tailor the sequencing runs to the needs of the experiment.

Table 1

An overview of current sequencing technologies

Platform	Run time	Sequence yield per run	Reported accuracy	Mean read length	Paired reads	Template DNA required	Reads per run
Illumina MiSeq	27 hours	8 Gb	>85% above Q30	2 × 250 bp	Yes	100 ng–1 μg	15 M
Illumina HiSeq 1500 (rapid run)	27–40 hours	60–90 Gb	>80% above Q30	2 × 150 bp	Yes	100 ng–1 μg	300 M
Illumina HiSeq 1500 (high output)	8.5 days	300 Gb	>80% above Q30	2 × 100 bp	Yes	100 ng–1 μg	1.5 B
Illumina HiSeq 2500 (rapid run)	27–40 hours	90–120 Gb	>80% above Q30	2 × 150 bp	Yes	100 ng–1 μg	600 M
Illumina HiSeq 2500 (high output)	11 days	600 Gb	>80% above Q30	2 × 100 bp	Yes	100 ng–1 μg	3 B
Illumina GAIIx	14 days	95 Gb	>80% above Q30	2 × 150 bp	Yes	100 ng–1 μg	320 M
PacBio RS II	2 hours	230 Mb	Approx 86% (Q8)	Approx 4,500 bp	No	250 ng–1 μg	50 k
Ion Torrent, Ion 314 chip v2	2.3–3.7 hours	30–100 Mb	>90% above Q20	200–400 bp	Yes	100 ng–1 μg	400–550 k
Ion Torrent, Ion 316 chip v2	3–4.9 hours	300 Mb–1 Gb	>90% above Q20	200–400 bp	Yes	100 ng–1 μg	2–3 M
Ion Torrent, Ion 318 chip v2	4.4–7.3 hours	600 Mb–2 Gb	>90% above Q20	200–400 bp	Yes	100 ng–1 μg	4–5.5 M
SOLiD 5500 W	2–7 days	80–160 Gb	90% above Q40	2 × 60 bp	Yes	10 ng–5 μg	1.2 B
SOLiD 5500xl W	2–7 days	160–320 Gb	90% above Q40	2 × 60 bp	Yes	10 ng–5 μg	2.4 B
454 GS FLX+	10–23 hours	450–700 Mb	Mostly >Q30	Up to 1 kb	Yes	700 ng–1 μg	1 M
454 GS Jr	10 hours	35 Mb	Mostly >Q30	400 bp	Yes	700 ng–1 μg	100 k

Notes: Illumina MiSeq, Illumina HiSeq 1500, Illumina HiSeq 2500, Illumina GAIIx (Illumina Inc., San Diego, CA, USA); Ion Torrent (Life Technologies Corporation, Grand Island, NY, USA); PacBio RS II (Pacific Biosciences Inc, Menlo Park, CA 94025); SOLiD 5500 W, SOLiD 5500xl W (Sequencing by Oligo Ligation Detection) (Life Technologies Corporation, Grand Island, NY, USA); 454 GS FLX+ and 454 GS Jr (Roche Inc., Branford, CT, USA). Q score = −10 × log10(P), where P is the probability of an incorrect base call.

Abbreviation: DNA, deoxyribonucleic acid.
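
The Q score definition in the notes to Table 1 can be inverted to recover the error probability behind each accuracy figure. The following is a minimal Python sketch of that conversion; the function names are illustrative and are not taken from any sequencing toolkit.

```python
import math

def q_to_error_prob(q):
    """Probability of an incorrect base call for a quality score Q."""
    return 10 ** (-q / 10)

def error_prob_to_q(p):
    """Quality score Q for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

# Quality thresholds that appear in Table 1.
for q in (8, 20, 30, 40):
    p = q_to_error_prob(q)
    print(f"Q{q}: error probability {p:.4f}, per-base accuracy {100 * (1 - p):.2f}%")
```

Read this way, "above Q30" means fewer than one expected error per 1,000 base calls, and "above Q20" means fewer than one per 100.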

Table 2

Relative strengths and weaknesses of current sequencing technologies

Platform	Strengths	Weaknesses
Illumina MiSeq	Low error rates; support for paired end sequencing	Higher indel rates; errors with GC-rich sequences
Illumina HiSeq	Low error rates; support for paired end sequencing	Relatively short read lengths
PacBio	Long read lengths; detects DNA methylation	SNP detection less sensitive due to higher individual read error rate
The Ion Personal Genome Machine® (PGM™)	SNP detection	Bias with AT-rich regions
SOLiD	High accuracy; flexible configuration	Short read lengths
454	Read length; sequencing speed	Higher indel rates; difficulty sequencing homopolymeric tracts

Notes: Illumina, MiSeq, and HiSeq (Illumina Inc., San Diego, CA, USA); Ion Torrent (Life Technologies Corporation, Grand Island, NY, USA); PacBio (Pacific Biosciences Inc, Menlo Park, CA 94025); SOLiD (Sequencing by Oligo Ligation Detection) (Life Technologies Corporation, Grand Island, NY, USA).

Abbreviations: DNA, deoxyribonucleic acid; AT, adenine and thymine; GC, guanine and cytosine; SNP, single nucleotide polymorphism.

Current technologies

Pyrosequencing (454)

Pyrosequencing (454) (Roche Inc., Branford, CT, USA) uses a “sequencing by synthesis” approach. Deoxynucleotides are added one at a time, and incorporation is detected by converting the pyrophosphate released during deoxynucleotide incorporation into a light signal that is read by the sequencer. Because of this, it tends to have difficulty with homopolymeric tracts, as the relative difference in light intensity between progressively longer nucleotide repeats becomes smaller. In general, the strengths of pyrosequencing are its relatively long read lengths and rapid turnaround time, which make it especially useful for de novo sequencing projects and for organisms with large numbers of repeats or long repetitive regions.
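
The homopolymer limitation can be illustrated with simple proportions. The sketch below assumes, purely for illustration, that the light signal scales linearly with the number of identical bases incorporated in one nucleotide flow; it is not a model of the instrument, and the function name is hypothetical.

```python
def relative_signal_step(n):
    """Relative rise in signal when a homopolymer run grows from n to n + 1 bases,
    assuming signal is proportional to the number of bases incorporated."""
    if n < 1:
        raise ValueError("run length must be at least 1")
    return ((n + 1) - n) / n

for n in (1, 2, 4, 8):
    print(f"{n} -> {n + 1} bases: signal rises by {relative_signal_step(n):.1%}")
```

Under this assumption the step between adjacent run lengths shrinks as 1/n, so telling eight bases from nine requires resolving a much smaller relative change than telling one from two.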

Sequencing by Oligo Ligation Detection

Sequencing by Oligo Ligation Detection (SOLiD) (Life Technologies Corporation, Grand Island, NY, USA) uses a “sequencing by ligation” approach. Numerous degenerate 8-mers are ligated to the single stranded DNA (ssDNA) template, with two nucleotides specific for the strand being sequenced and the remaining six bases degenerate. As the probes are ligated to the template, fluorescent dyes are cleaved off and detected by the sequencer. Every nucleotide participates in two ligation reactions, which allows for error checking of each read. This gives SOLiD an advantage for SNP detection, where it tends to be highly reliable. However, since each base call is derived from a combination of two reads (termed “colorspace”), rather than reported as a nucleotide sequence with a quality score, fully utilizing these data requires tools designed for SOLiD sequences.
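
The error-checking property of colorspace can be sketched in a few lines of Python. The example assumes the conventional two-base encoding in which A, C, G, and T are coded 0 to 3 and each color is the XOR of two adjacent base codes; the function names are illustrative only and do not come from any SOLiD software.

```python
# Two-bit base codes; the color of a dinucleotide is the XOR of its two codes.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = {code: base for base, code in BASE_TO_CODE.items()}

def encode_colorspace(seq):
    """Return the known first base plus one color (0-3) per adjacent base pair."""
    colors = [BASE_TO_CODE[a] ^ BASE_TO_CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors

def decode_colorspace(first_base, colors):
    """Rebuild the base sequence by walking the colors from the known first base."""
    bases = [first_base]
    for color in colors:
        bases.append(CODE_TO_BASE[BASE_TO_CODE[bases[-1]] ^ color])
    return "".join(bases)

reference = "ATGGCA"
first, ref_colors = encode_colorspace(reference)

# A true SNP (G -> T at the fourth base) changes two adjacent colors.
_, snp_colors = encode_colorspace("ATGTCA")
changed = [i for i, (x, y) in enumerate(zip(ref_colors, snp_colors)) if x != y]
print("colors changed by a SNP:", changed)

# A single miscalled color alters every downstream base when decoded naively.
bad_colors = list(ref_colors)
bad_colors[2] ^= 1
print(decode_colorspace(first, ref_colors), decode_colorspace(first, bad_colors))
```

Because a real substitution alters two neighboring colors while an isolated color change does not correspond to a single-base substitution, colorspace-aware tools can separate likely sequencing errors from SNPs. This is also why converting colors to bases before analysis discards useful information.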

MiSeq and HiSeq

MiSeq and HiSeq (Illumina Inc., San Diego, CA, USA) machines use a “sequencing by synthesis” technique, where individual DNA molecules are attached to the surface of flow cells and isothermal ‘bridging’ amplification is used to amplify signals. These are then sequenced using reversible fluorophore-labeled nucleotides, which are optically read from each flow cell. While these machines have high accuracy and produce large amounts of raw data, the individual read lengths tend to be shorter, which can be problematic for genomes with large repeats. Illumina’s Nextera sample preparation kit can allow for template amounts as low as 50 ng, which can be useful for organisms that are difficult to culture.

The Ion Personal Genome Machine® (PGM™)

The Ion Torrent Personal Genome Machine® (PGM™) (Life Technologies Corporation, Grand Island, NY, USA) uses a “sequencing by synthesis” approach, measuring the hydrogen ions released during deoxynucleotide incorporation. These are measured by semiconductors in the disposable chips used by the machine for sequencing. As there is no optical component in the sequencer, machine throughput can be increased by modifying the chips without additional modification of the sequencer, which has led to a tremendous increase in throughput since the initial release. This also allows selection of a chip giving the appropriate sequencing coverage for the desired application, which can make sequencing more cost-effective. While accuracy is generally high, there are difficulties with adenine-thymine (AT)-rich sequences, which can lead to gaps in coverage.8 In addition to the PGM, Life Technologies has also released the Proton system, which allows for larger chips with more sequence per run.
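
Matching a chip to an application comes down to expected coverage, which is the sequence yield divided by the genome size. The sketch below uses the per-run yields listed in Table 1; the 5 Mb genome size is a hypothetical example value, and the function name is illustrative.

```python
def expected_coverage(yield_bases, genome_size_bases):
    """Average fold coverage = total sequenced bases / genome size."""
    return yield_bases / genome_size_bases

GENOME_SIZE = 5_000_000  # hypothetical 5 Mb bacterial genome (assumption)

# Lower and upper sequence yields per run from Table 1, in bases.
ion_chips = {
    "Ion 314 chip v2": (30e6, 100e6),
    "Ion 316 chip v2": (300e6, 1e9),
    "Ion 318 chip v2": (600e6, 2e9),
}

for chip, (low, high) in ion_chips.items():
    print(f"{chip}: {expected_coverage(low, GENOME_SIZE):.0f}x to "
          f"{expected_coverage(high, GENOME_SIZE):.0f}x")
```

This estimate ignores reads lost to quality filtering and uneven coverage, so it is best treated as an upper bound when planning a run.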

PacBio RS II Single Molecule Real-time Sequencing

PacBio RS II Single Molecule Real-time Sequencing (SMRT) (Pacific Biosciences Inc., Menlo Park, CA 94025) uses a variation of “sequencing by synthesis”, using fluorescent-labeled deoxynucleotides added to a zero-mode waveguide (ZMW) with a DNA polymerase embedded in the bottom. As deoxynucleotides are added to the template, the fluorescent signals are read in real time by the sequencer. While the accuracy of individual reads tends to be low (approximately 85%), errors tend to be random, rather than due to specific DNA features, so increased coverage allows for high cumulative accuracy rates.9 The main advantage of PacBio is its long read length; while mean read lengths tend to be approximately 5 kb, reads of more than 10 kb are not uncommon. Also, as the machine observes the reaction in real time, it can detect some base modifications, such as methylation10, without additional reagents, because these modifications alter the deoxynucleotide incorporation time. In addition, attempts have been made to sequence DNA without the initial amplification step in library preparation.11
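
The effect of coverage on cumulative accuracy can be approximated with a binomial model. The sketch below makes two simplifying assumptions that are not from the source: errors are independent between reads, and a position is called correctly whenever more than half of the reads covering it are correct. The function name is hypothetical, and real consensus callers are more sophisticated.

```python
from math import comb

def majority_correct_prob(per_read_accuracy, depth):
    """Probability that more than half of `depth` independent reads are correct
    at one position (binomial model; ties count as failures)."""
    p = per_read_accuracy
    needed = depth // 2 + 1
    return sum(comb(depth, k) * p**k * (1 - p)**(depth - k)
               for k in range(needed, depth + 1))

# Per-read accuracy of 0.85, as quoted in the text, at increasing depth.
for depth in (1, 5, 15, 31):
    print(f"{depth:>2}x coverage: {majority_correct_prob(0.85, depth):.6f}")
```

The model is pessimistic in one respect, since real errors are spread over several alternative bases and insertions or deletions rather than agreeing on a single wrong call, but it shows why random errors average out while systematic errors would not.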

Future technologies

GnuBio

The GnuBio (GnuBio Inc., Cambridge, MA, USA) sequencer uses a sequencing by amplification approach, using microfluidics to combine target selection, DNA amplification, library preparation, and sequencing into one instrument.12 While it is targeted at clinical applications, it also has a number of applications for microbial sequencing. Beta testing began in April 2013.

GridION/MinION

The GridION and MinION (Oxford Nanopore Technologies Ltd, Oxford, UK) systems use nanopore technology and disposable cartridges to perform a number of possible experiments, including DNA sequencing.13 Sequencing relies on the voltage variation produced when ssDNA is fed through a nanopore by an enzyme. No amplification step is necessary, allowing DNA sequence modifications (such as methylation) to be examined directly. The GridION is meant to be an expandable, reusable system for core laboratories, while the MinION is a single-use device for individual laboratories.

Genome assemblers

While raw sequence data is useful, it is significantly more valuable after assembly into contiguous DNA sequences (contigs). There are a number of strategies for assembly, and sequences can be assembled either de novo or against a reference sequence. A number of assemblers have been used on bacterial genome sequences; some of the more common options are discussed below in alphabetical order.

ABySS

ABySS (Assembly By Short Sequences) is a de novo parallel paired-end assembler that works with Illumina, SOLiD, pyrosequencing, and Sanger reads.14 It also works with combinations of technologies by calculating the distribution of read sizes for each, so that an accurate empirical distribution can be obtained. In addition, it has been adapted for transcriptome assembly with RNA-seq data.

Celera Assembler

CABOG (Celera Assembler with the Best Overlap Graph)15 is a de novo assembler that was first developed for the original human genome project. It has subsequently been modified to assemble pyrosequencing16, Illumina, and PacBio reads. While it is primarily geared toward mammalian sequences, it can also be utilized for microbial sequences.17

Edena

Edena (Exact DE Novo Assembler) is a de novo, overlap graph-based short-read assembler.18 It requires reads to be of similar length, as it is designed for Illumina-based sequences; therefore, pyrosequencing and Sanger-based reads would need to be trimmed to a similar length to be processed. This program is specifically designed for bacterial genome assemblies.

EULER-SR

EULER-SR is a de novo assembler that uses an A-Bruijn graph technique to assemble Sanger, pyrosequencing, and Illumina reads.19 This is geared toward assembly of DNA sequences from individual organisms, as well as clustering of sequences from metagenomic analyses.

MaSuRCA

MaSuRCA (Maryland Super Read Cabog Assembler) is a new de novo genome assembler that combines de Bruijn graph and overlap-layout-consensus approaches to increase efficiency.20 It can use a combination of short (Illumina and SOLiD) and longer (pyrosequencing) reads. This assembler performed best in a recent comparison of several modern assemblers on a number of bacterial genomic data sets.21

MIRA

MIRA (Mimicking Intelligent Read Assembly) is a whole genome shotgun and expressed sequence tag (EST) assembler22 for Sanger and pyrosequencing reads, as well as Illumina, Life Technologies, and PacBio reads with the development version.23 It can perform both de novo and reference-based assemblies. It features sequence editors, allowing repair of sequencing errors and use of quality data in generating assemblies. It also will assemble to a reference sequence and call SNPs and other mutations.

SOAP suite

SOAPdenovo2 (Short Oligonucleotide Analysis Package) is made up of multiple modules that perform error correction, assembly, paired end mapping, and scaffold construction24, and is specifically designed for de novo assembly of Illumina reads. While it was designed for large genomes, it has been tested on microbial genomes and works well on them. Separate programs, SOAP2 and SOAP3, align reads to reference genomes. In addition to the assembler, the suite includes tools for SNP and indel detection.

SOPRA

SOPRA (Statistical Optimization of Paired Read Assembly) is a de novo assembler that attempts to compensate for inaccuracies in the high throughput reads.25 It accepts pyrosequencing, Illumina, and SOLiD reads, and can use data on mate pair distances to create scaffolds. It can convert SOLiD colorspace to base-space, and use that for quality checking. However, SOPRA requires contigs as input; the developers recommend Velvet as a contig assembler, but the program can use FASTA contigs generated by any program.

Velvet

Velvet is a de Bruijn graph-based de novo assembler26 that can assemble Illumina, SOLiD, pyrosequencing, and Sanger reads.27 In addition, if compiled to support colorspace, it can use colorspace assembly as well as base-space assembly. Velvet was one of the first de Bruijn graph assemblers, and has continued to be updated, including updates to allow for mixed-length assembly and paired-end assembly.28
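
The graph structure behind this class of assemblers can be shown with a toy example. The following Python sketch builds a de Bruijn graph from error-free reads and follows an unbranched path to produce a contig. It illustrates the general idea only and is not Velvet's implementation; the reads, the value of k, and the function names are all made up for the example.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Map each (k-1)-mer prefix to the list of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:          # collapse duplicate k-mers into one edge
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return graph

def walk_unbranched(graph, start):
    """Extend a contig from `start` while the current node has exactly one successor."""
    contig, node, visited = start, start, {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in visited:               # stop on a cycle, e.g. a repeat
            break
        visited.add(node)
        contig += node[-1]
    return contig

reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]   # toy overlapping, error-free reads
graph = build_de_bruijn(reads, k=4)
print(walk_unbranched(graph, "ATG"))
```

Production assemblers add the steps this sketch omits, such as handling both strands, removing error-induced tips and bubbles, and using paired-end information to resolve branches caused by repeats.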

Optical mapping

In addition to traditional assembly of sequence reads into contigs, high-resolution optical mapping has been combined with contig assembly to allow more rapid assembly of contigs and determination of gap locations.29 Software is able to take the optical map and arrange contigs, either with the assistance of a reference sequence or de novo. Whole genome mapping has been used as a scaffold to perform the initial assembly of pyrosequencing reads to better identify gaps in sequence coverage, allowing complete genome assembly without paired-end sequencing.30

Accessory programs

In addition to sequence assembly, there are a number of other computer programs that can be helpful in further processing sequence data.

Trimmomatic

Trimmomatic is useful for processing Illumina data, screening libraries for a number of quality parameters, including adapter trimming, cropping, trimming based on a minimum length, and converting quality scores.31 Sequences that do not meet quality guidelines are automatically trimmed out. Unlike a number of other methods for sequence trimming, Trimmomatic is aware of paired end data and maintains the paired end links.
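
Pair-aware trimming can be sketched as follows. This is a simplified Python illustration of the concept, not Trimmomatic's algorithm or command-line interface: quality scores are plain integers, only trailing low-quality bases are removed, and the function names and thresholds are hypothetical.

```python
def trim_trailing(seq, quals, min_q):
    """Cut bases from the 3' end until the last base has quality >= min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]

def trim_pairs(pairs, min_q=20, min_len=4):
    """Trim both mates, then keep a pair only if both mates survive.

    Returns (paired, orphans); orphans are surviving reads whose mate was dropped.
    """
    paired, orphans = [], []
    for (seq1, q1), (seq2, q2) in pairs:
        r1 = trim_trailing(seq1, q1, min_q)
        r2 = trim_trailing(seq2, q2, min_q)
        ok1, ok2 = len(r1[0]) >= min_len, len(r2[0]) >= min_len
        if ok1 and ok2:
            paired.append((r1, r2))
        elif ok1:
            orphans.append(r1)
        elif ok2:
            orphans.append(r2)
    return paired, orphans

pairs = [
    (("ACGTAC", [30, 30, 30, 30, 12, 8]), ("TTGCAA", [32, 31, 30, 30, 30, 25])),
    (("GGCATT", [30, 10, 9, 8, 7, 5]), ("CATGCC", [33, 33, 30, 30, 28, 22])),
]
paired, orphans = trim_pairs(pairs)
print(len(paired), "pair kept;", len(orphans), "orphan read")
```

Keeping the paired output in step matters because paired-end assemblers and aligners generally expect the two mates of a pair at the same position in their respective input files.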

CGView Comparison Tool

CGView Comparison Tool (CCT) is a program to visually compare multiple circular genomes32 that takes sequence alignment output and uses it to visualize the results against a reference genome. One strength over many other tools is the ability to compare thousands of genomes in the same map.

Artemis Comparison Tool

The Artemis Comparison Tool (ACT) is another tool to visualize multiple genome comparisons.33 For people who use Artemis for genome annotation, the user interface for ACT is almost identical, making it easy to use. The interactive user interface makes it useful for examination of genomes and SNP detection. In addition to the stand-alone program, a web tool (WebACT) has also been developed for online work.34

Galaxy

Galaxy is a web application that can use a variety of bioinformatics tools.35 It is also extensible, so programmers can add support for nearly any desired bioinformatics tool. While it started as primarily a method for working with text-based data, such as DNA sequences, recent developments have added data visualization tools as well.36 The main strength of Galaxy is the ability for multiple researchers to work on data sets together via web browsers. In addition to sharing datasets, researchers can also share workflows, allowing others to replicate their results and allowing editing and saving of workflows for future use. There are public servers for Galaxy, but it can also be downloaded and run locally or in the cloud to use additional storage and computing resources.

Applications of genome sequencing

Clinical medicine

One recent development has been the use of high-throughput DNA sequencing in clinical settings. As requirements for DNA template concentration and purity for genome sequencing decrease, clinical samples can be directly sequenced, allowing for organism identification and possible identification of traits such as antibiotic resistance.37 While complete genome assembly is still time consuming, high-throughput sequencing and assembly can reveal a tremendous amount of information about target organisms without obtaining the complete genome sequence.38 This may also make large numbers of clinical samples available for use in research studies, as complete genome sequences of Chlamydia trachomatis were obtained from discarded swabs after testing.39 In addition to rapidly determining genotype/phenotype association, whole genome sequencing (WGS) techniques have been used in several public health surveys, analyzing nosocomial infections in hospitals and differentiating them from non-outbreak isolates40 and in retrospective analysis to track the spread of infections through hospitals.41 WGS makes it possible to determine where an infection was acquired, and has, in some cases, revealed previously unknown bacterial reservoirs.42 This may be aided by the continued sequencing of multiple bacterial strains, as evidenced by the determination of a minimum core genome for Streptococcus suis, which allowed determination of genes unique to animal versus human strains. Other genomic analyses have detected infection with multiple strains of the same organism, revealing previously unknown transmission events.43

Another related activity is using microbial genomics and metagenomics for forensics. This has been done for cases of bioterrorism, such as the anthrax letter attack investigation, where the isolates from the letters were linked and were different from those previously suspected in the investigation.44 Microbial sequencing may also be used in the future for criminal investigation, as skin microbial populations are relatively unique and can be used to identify items handled by people up to two weeks previously.45 However, the abundance of sequence information makes bioinformatics the bottleneck in utilization of sequences in clinical samples. Future developments may help automate sequence assembly and annotation46, as well as automating bacterial typing from whole genome sequences,47 speeding analysis.

Genomic archaeology

In addition to clinical medicine, the reduction in DNA template requirements for sequencing has produced profound developments in genomic archaeology. Medieval isolates of Yersinia pestis from victims of the Black Death were sequenced using the Illumina platform, yielding 93% genome coverage.48 This has revealed that current isolates of Y. pestis appear to be descended from the medieval strain, and that the virulence of the Black Death organism does not appear to be due to bacterial genotype. In another study, multiple ancient isolates of Mycobacterium leprae were sequenced from bone lesions and compared to modern isolates.49 This is the first study to assemble a complete genome de novo from ancient sequences, rather than use a modern reference sequence for scaffolding. This has allowed tracking of the spread of leprosy from ancient times to the modern day, as well as drawing conclusions about why leprosy disappeared from Europe but persists in many developing countries today. Other studies have examined the bacterial composition of ancient dental calculus,50 allowing historical bacterial populations to be compared with modern-day oral flora and used to examine environmental factors associated with dental disease. Finally, another study examined the bacterial populations in waterlogged, preserved wood,51 which will aid in preserving historic wrecks and establishing underwater archaeological parks.

Metagenomics studies

While sequencing advances have caused a huge growth in the field of metagenomics, metagenomic studies have started to be exploited as sources of raw sequence for genome projects. The increases in DNA sequencing throughput have allowed shifting metagenomics studies from amplification of 16S to shotgun sequencing of the entire sample DNA population.52 Generally, this depends on having a predominance of small numbers of microbial genomes in the population, allowing for assembly into complete or near-complete genomes.53,54 However, one study combined multiple metagenomics studies from a population to assemble twelve near-complete or complete genome sequences55 from low-prevalence populations.

Bacterial evolution

With the large numbers of sequenced genomes, a variety of techniques from other organisms have subsequently been applied to bacteria, including genome-wide association studies. In one study, Campylobacter strains from a variety of hosts were examined to determine factors involved in host specificity.56 While most lineages were able to switch hosts, some lineages were associated with specific hosts. These were linked with vitamin B5 biosynthesis genes, and cattle isolates were able to grow better in vitamin B5-depleted media. Another study examined the microbiota in patients with and without type 2 diabetes, finding a significant decrease in butyrate-producing bacteria and an increase in opportunistic pathogens.57 Other studies have examined a number of bacteria to determine changes associated with the development of pathogenicity. One study found that pathogenic bacteria have smaller genomes, with less ribosomal RNA, fewer transcriptional regulators, and more genes for toxins and DNA replication.58 Similar reductions are detected in experimental populations with multiple generations.59 Other studies have examined multiple strains to correlate phenotypic differences with polymorphisms and transcriptional differences in bacteria that cannot be cultured.60 Other studies have examined the rate of polymorphism formation in multiple species, finding that SNPs can occur in non-random locations depending on the nature of the mutation.61 Future work will likely involve correlating genomic data with transcriptional regulatory data, metabolic pathway reconstruction, and proteomics data.62 While the ultimate goal would be to establish whole-cell models of bacterial systems, the raw data to drive these models will still be complete, edited bacterial genomes.

Conclusion

Advances in sample preparation, DNA sequencing, and assembly technology have caused an explosion in the number of sequenced bacterial genomes, and are enabling new uses for bacterial genome sequencing. As technology improves, the number of applications will only increase, making it more important to understand the range of available technologies. Collaboration will also become more important, making web tools for manipulating genomic data more useful.

54 in total

1. Short read fragment assembly of bacterial genomes.

Authors: Mark J Chaisson; Pavel A Pevzner
Journal: Genome Res Date: 2007-12-14 Impact factor: 9.043

2. Velvet: algorithms for de novo short read assembly using de Bruijn graphs.

Authors: Daniel R Zerbino; Ewan Birney
Journal: Genome Res Date: 2008-03-18 Impact factor: 9.043

3. A metagenome-wide association study of gut microbiota in type 2 diabetes.

Authors: Junjie Qin; Yingrui Li; Zhiming Cai; Shenghui Li; Jianfeng Zhu; Fan Zhang; Suisha Liang; Wenwei Zhang; Yuanlin Guan; Dongqian Shen; Yangqing Peng; Dongya Zhang; Zhuye Jie; Wenxian Wu; Youwen Qin; Wenbin Xue; Junhua Li; Lingchuan Han; Donghui Lu; Peixian Wu; Yali Dai; Xiaojuan Sun; Zesong Li; Aifa Tang; Shilong Zhong; Xiaoping Li; Weineng Chen; Ran Xu; Mingbang Wang; Qiang Feng; Meihua Gong; Jing Yu; Yanyan Zhang; Ming Zhang; Torben Hansen; Gaston Sanchez; Jeroen Raes; Gwen Falony; Shujiro Okuda; Mathieu Almeida; Emmanuelle LeChatelier; Pierre Renault; Nicolas Pons; Jean-Michel Batto; Zhaoxi Zhang; Hua Chen; Ruifu Yang; Weimou Zheng; Songgang Li; Huanming Yang; Jian Wang; S Dusko Ehrlich; Rasmus Nielsen; Oluf Pedersen; Karsten Kristiansen; Jun Wang
Journal: Nature Date: 2012-09-26 Impact factor: 49.962

4. Rapid bacterial whole-genome sequencing to enhance diagnostic and public health microbiology.

Authors: Sandra Reuter; Matthew J Ellington; Edward J P Cartwright; Claudio U Köser; M Estée Török; Theodore Gouliouris; Simon R Harris; Nicholas M Brown; Matthew T G Holden; Mike Quail; Julian Parkhill; Geoffrey P Smith; Stephen D Bentley; Sharon J Peacock
Journal: JAMA Intern Med Date: 2013-08-12 Impact factor: 21.873

5. Comparing thousands of circular genomes using the CGView Comparison Tool.

Authors: Jason R Grant; Adriano S Arantes; Paul Stothard
Journal: BMC Genomics Date: 2012-05-23 Impact factor: 3.969

6. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers.

Authors: Michael A Quail; Miriam Smith; Paul Coupland; Thomas D Otto; Simon R Harris; Thomas R Connor; Anna Bertoni; Harold P Swerdlow; Yong Gu
Journal: BMC Genomics Date: 2012-07-24 Impact factor: 3.969

7. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences.

Authors: Jeremy Goecks; Anton Nekrutenko; James Taylor
Journal: Genome Biol Date: 2010-08-25 Impact factor: 13.583

8. Determining the repertoire of immunodominant proteins via whole-genome amplification of intracellular pathogens.

Authors: Michael J Dark; Anna M Lundgren; Anthony F Barbet
Journal: PLoS One Date: 2012-04-30 Impact factor: 3.240

9. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.

Authors: Ruibang Luo; Binghang Liu; Yinlong Xie; Zhenyu Li; Weihua Huang; Jianying Yuan; Guangzhu He; Yanxiang Chen; Qi Pan; Yunjie Liu; Jingbo Tang; Gengxiong Wu; Hao Zhang; Yujian Shi; Yong Liu; Chang Yu; Bo Wang; Yao Lu; Changlei Han; David W Cheung; Siu-Ming Yiu; Shaoliang Peng; Zhu Xiaoqian; Guangming Liu; Xiangke Liao; Yingrui Li; Huanming Yang; Jian Wang; Tak-Wah Lam; Jun Wang
Journal: Gigascience Date: 2012-12-27 Impact factor: 6.524

10. Aggressive assembly of pyrosequencing reads with mates.

Authors: Jason R Miller; Arthur L Delcher; Sergey Koren; Eli Venter; Brian P Walenz; Anushka Brownley; Justin Johnson; Kelvin Li; Clark Mobarry; Granger Sutton
Journal: Bioinformatics Date: 2008-10-24 Impact factor: 6.937
