PanTools: representation, storage and exploration of pan-genomic data.

Siavash Sheikhizadeh, M Eric Schranz, Mehmet Akdel, Dick de Ridder, Sandra Smit.

Abstract

MOTIVATION: Next-generation sequencing technology is generating a wealth of highly similar genome sequences for many species, paving the way for a transition from single-genome to pan-genome analyses. Accordingly, genomics research is going to switch from reference-centric to pan-genomic approaches. We define the pan-genome as a comprehensive representation of multiple annotated genomes, facilitating analyses on the similarity and divergence of the constituent genomes at the nucleotide, gene and genome structure level. Current pan-genomic approaches do not thoroughly address scalability, functionality and usability.
RESULTS: We introduce a generalized De Bruijn graph as a pan-genome representation, as well as an online algorithm to construct it. This representation is stored in a Neo4j graph database, which makes our approach scalable to large eukaryotic genomes. Besides the construction algorithm, our software package, called PanTools, currently provides functionality for annotating pan-genomes, adding sequences, grouping genes, retrieving gene sequences or genomic regions, reconstructing genomes and comparing and querying pan-genomes. We demonstrate the performance of the tool using datasets of 62 E. coli genomes, 93 yeast genomes and 19 Arabidopsis thaliana genomes.
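To make the graph idea concrete, here is a minimal sketch of a plain, uncompressed De Bruijn graph built over several sequences, with each k-mer node labeled by the genomes and offsets where it occurs. It is an illustration only: it is not the generalized De Bruijn graph or the construction algorithm described in the abstract, it does not use Neo4j, and the function names, the toy sequences and the choice of k are all assumptions made for this example.

```python
from collections import defaultdict


def build_labeled_de_bruijn(genomes, k):
    """Build a plain De Bruijn graph over several genomes.

    Nodes are k-mers; an edge joins two k-mers that occur consecutively
    (overlapping by k-1 bases) in at least one genome. Each node records
    where it occurs as (genome name, 0-based offset) pairs.
    """
    occurrences = defaultdict(list)  # k-mer -> [(genome, offset), ...]
    edges = defaultdict(set)         # k-mer -> set of successor k-mers
    for name, seq in genomes.items():
        previous = None
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            occurrences[kmer].append((name, i))
            if previous is not None:
                edges[previous].add(kmer)
            previous = kmer
    return occurrences, edges


def shared_kmers(occurrences, n_genomes):
    """Return, in sorted order, the k-mers present in every genome."""
    return sorted(
        kmer
        for kmer, occ in occurrences.items()
        if len({name for name, _ in occ}) == n_genomes
    )


# Toy sequences (hypothetical, not real data).
genomes = {"g1": "ACGTAC", "g2": "ACGTTC"}
occurrences, edges = build_labeled_de_bruijn(genomes, k=3)
print(shared_kmers(occurrences, len(genomes)))  # ['ACG', 'CGT']
```

In this toy case the two sequences share the k-mers ACG and CGT and then diverge, so the node CGT has two successors (GTA and GTT). That branching is how a graph representation exposes similarity and divergence between genomes at the nucleotide level.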
AVAILABILITY AND IMPLEMENTATION: The Java implementation of PanTools is publicly available at http://www.bif.wur.nl.
CONTACT: sandra.smit@wur.nl.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Year:  2016        PMID: 27587666     DOI: 10.1093/bioinformatics/btw455

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


