Optimizing hybrid assembly of next-generation sequence data from Enterococcus faecium: a microbe with highly divergent genome.

Yajun Wang, Yao Yu, Bohu Pan, Pei Hao, Yixue Li, Zhifeng Shao, Xiaogang Xu, Xuan Li.

Abstract

BACKGROUND: Sequencing of bacterial genomes has become an essential approach for studying pathogen virulence and the phylogenetic relationships among closely related strains. The bacterium Enterococcus faecium has emerged as an important nosocomial pathogen that is often associated with resistance to common antibiotics in hospitals. With its highly divergent gene content, it presents a challenge to next-generation sequencing (NGS) technologies, which feature high throughput and shorter read lengths. This study was designed to investigate the properties and systematic biases of NGS technologies and to evaluate critical parameters influencing the outcomes of hybrid assemblies using combinations of NGS data.
RESULTS: A hospital strain of E. faecium was sequenced using three different NGS platforms, 454 GS-FLX, Illumina GAIIx, and ABI SOLiD4.0, to approximately 28-, 500-, and 400-fold coverage depth, respectively. We built a pipeline that merged contigs from each NGS dataset into hybrid assemblies. The results revealed that each single-platform NGS assembly had a ceiling in continuity that could not be overcome by simply increasing coverage depth. Each NGS technology displayed some intrinsic properties, e.g., base-calling errors and systematic biases. The gaps and low-coverage regions of each NGS assembly were associated with lower GC content. To optimize the hybrid assembly approach, we tested varying amounts and different combinations of NGS data and obtained optimal conditions for assembly continuity. We also showed, for the first time, that SOLiD data could help produce much improved assemblies of the E. faecium genome using the hybrid approach when combined with other types of NGS data.
CONCLUSIONS: The current study addressed the difficult issue of how to most effectively construct a complete microbial genome using today's state-of-the-art sequencing technologies. We characterized the sequence data and genome assembly from each NGS technology, tested conditions for hybrid assembly with combinations of NGS data, and obtained optimized parameters for achieving the most cost-efficient assembly. Our study helped form some guidelines to direct genomic work on other microorganisms and thus has important practical implications.
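
The abstract refers to three quantities without defining them: fold coverage depth, assembly continuity, and GC content. The sketch below shows one common way each could be computed. It is illustrative only and is not the authors' pipeline. In particular, using N50 as the continuity measure is an assumption (the abstract does not name the metric), and the function names are hypothetical.

```python
from typing import Iterable, List


def fold_coverage(total_read_bases: int, genome_size: int) -> float:
    """Average coverage depth: total sequenced bases divided by genome length."""
    return total_read_bases / genome_size


def n50(contig_lengths: Iterable[int]) -> int:
    """Largest length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    lengths: List[int] = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # no contigs supplied


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous A/C/G/T bases (0.0 if none)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0
```

For example, under a hypothetical genome size of 3,000,000 bp, 84,000,000 sequenced bases would correspond to 28-fold coverage. Comparing `gc_content` of the sequence flanking assembly gaps with the genome-wide value is one way to look for the kind of GC-associated gaps the abstract describes.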

Year:  2012        PMID: 23282199      PMCID: PMC3524012          DOI: 10.1186/1752-0509-6-S3-S21

Source DB:  PubMed          Journal:  BMC Syst Biol        ISSN: 1752-0509


  9 in total

Review 1.  Bidirectional promoters: an enigmatic genome architecture and their roles in cancers.

Authors:  Sheikh Shafin Ahmad; Nure Sharaf Nower Samia; Auroni Semonti Khan; Rafeed Rahman Turjya; Md Abdullah-Al-Kamran Khan
Journal:  Mol Biol Rep       Date:  2021-08-10       Impact factor: 2.316

Review 2.  Beyond the whole genome consensus: unravelling of PRRSV phylogenomics using next generation sequencing technologies.

Authors:  Zen H Lu; Alan L Archibald; Tahar Ait-Ali
Journal:  Virus Res       Date:  2014-10-12       Impact factor: 3.303

3.  Enhanced whole genome sequence and annotation of Clostridium stercorarium DSM8532T using RNA-seq transcriptomics and high-throughput proteomics.

Authors:  John J Schellenberg; Tobin J Verbeke; Peter McQueen; Oleg V Krokhin; Xiangli Zhang; Graham Alvare; Brian Fristensky; Gerhard G Thallinger; Bernard Henrissat; John A Wilkins; David B Levin; Richard Sparling
Journal:  BMC Genomics       Date:  2014-07-07       Impact factor: 3.969

4.  Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan).

Authors:  Soma S Marla; Pallavi Mishra; Ranjeet Maurya; Mohar Singh; Dhammaprakash Pandhari Wankhede; Anil Kumar; Mahesh C Yadav; N Subbarao; Sanjeev K Singh; Rajesh Kumar
Journal:  Front Genet       Date:  2020-12-15       Impact factor: 4.599

5.  Phage Annotation Guide: Guidelines for Assembly and High-Quality Annotation.

Authors:  Dann Turner; Evelien M Adriaenssens; Igor Tolstoy; Andrew M Kropinski
Journal:  Phage (New Rochelle)       Date:  2021-12-16

6.  In Vitro and In Silico Based Approaches to Identify Potential Novel Bacteriocins from the Athlete Gut Microbiome of an Elite Athlete Cohort.

Authors:  Laura Wosinska; Calum J Walsh; Paula M O'Connor; Elaine M Lawton; Paul D Cotter; Caitriona M Guinane; Orla O'Sullivan
Journal:  Microorganisms       Date:  2022-03-24

7.  Advances in systems biology: computational algorithms and applications.

Authors:  Yufei Huang; Zhongming Zhao; Hua Xu; Yu Shyr; Bing Zhang
Journal:  BMC Syst Biol       Date:  2012-12-17

8.  An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome.

Authors:  Marco Ferrarini; Marco Moretto; Judson A Ward; Nada Šurbanovski; Vladimir Stevanović; Lara Giongo; Roberto Viola; Duccio Cavalieri; Riccardo Velasco; Alessandro Cestaro; Daniel J Sargent
Journal:  BMC Genomics       Date:  2013-10-01       Impact factor: 3.969

Review 9.  Next-generation sequence assembly: four stages of data processing and computational challenges.

Authors:  Sara El-Metwally; Taher Hamza; Magdi Zakaria; Mohamed Helmy
Journal:  PLoS Comput Biol       Date:  2013-12-12       Impact factor: 4.475

