Encoding Data Using Biological Principles: The Multisample Variant Format for Phylogenomics and Population Genomics.

James B Pease, Benjamin K Rosenzweig.   

Abstract

Rapid progress in the fields of phylogenomics and population genomics has driven increases in both the size of multi-genomic datasets and the number and complexity of genome-wide analyses. We present the Multisample Variant Format (MVF), specifically designed to store multiple sequence alignments for phylogenomic and population genomic analysis. The signature feature of MVF is a distinctive encoding of aligned sites with specific biological information content (e.g., invariant, low-coverage). This biological pattern-based encoding of sequence data allows for rapid filtering and quality control of data and speeds up computation for many analyses. Similar to other modern formats, MVF has a simple data structure and a flexible header structure to accommodate project metadata, allowing it to also serve as an effective data publication and sharing format. We also propose several variants of the MVF format to accommodate protein and codon alignments, quality scores, and a mix of de novo and reference-aligned data. Using the MVFtools package, MVF files can be converted from other common sequence formats. MVFtools completes tasks ranging from simple transformation and filtering operations to complex genome-wide visualizations in only a few minutes, even on large datasets. In addition to presenting MVF and MVFtools, we discuss how the broader concept of using biological principles and patterns to inform sequence data encoding applies both to MVF and to other existing data formats.
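
As a rough illustration of the pattern-based site encoding the abstract describes, the Python sketch below collapses each column of a small alignment into a short code whose form reflects its biological content (invariant, no data, or variable). The function names, the `X` missing-data symbol, and the encoding rules are assumptions made for this example; they are not the MVF specification or the MVFtools API.

```python
MISSING = "X"  # assumed symbol for a sample with no usable coverage at a site


def encode_site(column):
    """Collapse one alignment column (one character per sample) into a code.

    Hypothetical rules, for illustration only:
      - every sample missing        -> "X"
      - every sample the same base  -> that single base (invariant site)
      - anything else               -> the full column, unchanged
    """
    called = {base for base in column if base != MISSING}
    if not called:
        return MISSING
    if len(called) == 1 and MISSING not in column:
        return column[0]
    return column


def encode_alignment(sequences):
    """Encode equal-length aligned sequences site by site."""
    return [encode_site("".join(col)) for col in zip(*sequences)]


def variable_sites(codes):
    """Keep only sites whose code is longer than one character."""
    return [code for code in codes if len(code) > 1]


if __name__ == "__main__":
    alignment = ["ACGT", "ACGA", "ACXT"]
    codes = encode_alignment(alignment)
    print(codes)
    print(variable_sites(codes))
```

Because invariant and empty sites shrink to a single character in this sketch, a filter such as `variable_sites` only has to check the length of each code rather than re-examine every sample, which is the kind of shortcut the abstract attributes to pattern-based encoding.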

Year:  2015        PMID: 26701894     DOI: 10.1109/TCBB.2015.2509997

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  11 in total

1.  Powerful methods for detecting introgressed regions from population genomic data.

Authors:  Benjamin K Rosenzweig; James B Pease; Nora J Besansky; Matthew W Hahn
Journal:  Mol Ecol       Date:  2016-03-31       Impact factor: 6.185

2.  The role of gene flow in rapid and repeated evolution of cave-related traits in Mexican tetra, Astyanax mexicanus.

Authors:  Adam Herman; Yaniv Brandvain; James Weagley; William R Jeffery; Alex C Keene; Thomas J Y Kono; Helena Bilandžija; Richard Borowsky; Luis Espinasa; Kelly O'Quin; Claudia P Ornelas-García; Masato Yoshizawa; Brian Carlson; Ernesto Maldonado; Joshua B Gross; Reed A Cartwright; Nicolas Rohner; Wesley C Warren; Suzanne E McGaugh
Journal:  Mol Ecol       Date:  2018-10-16       Impact factor: 6.185

3.  Phylogenomics Reveals Three Sources of Adaptive Variation during a Rapid Radiation.

Authors:  James B Pease; David C Haak; Matthew W Hahn; Leonie C Moyle
Journal:  PLoS Biol       Date:  2016-02-12       Impact factor: 8.029

4.  The origin and remolding of genomic islands of differentiation in the European sea bass.

Authors:  Maud Duranton; François Allal; Christelle Fraïsse; Nicolas Bierne; François Bonhomme; Pierre-Alexandre Gagnaire
Journal:  Nat Commun       Date:  2018-06-28       Impact factor: 14.919

5.  The genomic impact of historical hybridization with massive mitochondrial DNA introgression.

Authors:  Fernando A Seixas; Pierre Boursot; José Melo-Ferreira
Journal:  Genome Biol       Date:  2018-07-30       Impact factor: 13.583

6.  Three New Genome Assemblies Support a Rapid Radiation in Musa acuminata (Wild Banana).

Authors:  Mathieu Rouard; Gaetan Droc; Guillaume Martin; Julie Sardos; Yann Hueber; Valentin Guignon; Alberto Cenci; Björn Geigle; Mark S Hibbins; Nabila Yahiaoui; Franc-Christophe Baurens; Vincent Berry; Matthew W Hahn; Angelique D'Hont; Nicolas Roux
Journal:  Genome Biol Evol       Date:  2018-12-01       Impact factor: 3.416

7.  Assessing biological factors affecting postspeciation introgression.

Authors:  Jennafer A P Hamlin; Mark S Hibbins; Leonie C Moyle
Journal:  Evol Lett       Date:  2020-02-28

8.  The contribution of ancient admixture to reproductive isolation between European sea bass lineages.

Authors:  Maud Duranton; François Allal; Sophie Valière; Olivier Bouchez; François Bonhomme; Pierre-Alexandre Gagnaire
Journal:  Evol Lett       Date:  2020-04-15

9.  Introgression and gene family contraction drive the evolution of lifestyle and host shifts of hypocrealean fungi.

Authors:  Weiwei Zhang; Xiaoling Zhang; Kuan Li; Chengshu Wang; Lei Cai; Wenying Zhuang; Meichun Xiang; Xingzhong Liu
Journal:  Mycology       Date:  2018-05-24

10.  Phylotranscriptomic Insights into the Diversification of Endothermic Thunnus Tunas.

Authors:  Adam G Ciezarek; Owen G Osborne; Oliver N Shipley; Edward J Brooks; Sean R Tracey; Jaime D McAllister; Luke D Gardner; Michael J E Sternberg; Barbara Block; Vincent Savolainen
Journal:  Mol Biol Evol       Date:  2019-01-01       Impact factor: 16.240
