Assembly free comparative genomics of short-read sequence data discovers the needles in the haystack.

Charles H Cannon, Chai-Shian Kua, D Zhang, J R Harting.

Abstract

Most comparative genomic analyses of short-read sequence (SRS) data rely upon the prior assembly of a reference sequence. Here, we present an assembly-free analysis of SRS data that discovers sequence variants among focal genomes by tabulating the presence and frequency of 'complex' fragments in the data. Using data from nine tree species, we compare genomic diversity from populations to families. As a control, we simulated SRS data for three known plant genomes. The results provide insight into the quality and distributional bias of the sequencing reaction. Three main types of informative complexmers were identified, each possessing unique statistical properties. Type I complexmers are unique to a genome but suffer from a high false positive rate, being highly dependent on read coverage and distribution. Type II complexmers are shared between two genomes and can highlight potential copy-number differences. Type III complexmers are exclusive to a subset of genomes and can be useful for associating genetic differences with phenotypic or geographic variation. At the population level in an endangered timber species, numerous markers were identified that could potentially determine geographic origin of individuals and regulate international trade. We observed that the genomic data for the four fig species were more divergent than for stone oak species, possibly due to their complex pollination syndrome and high rates of gene flow. Our approach greatly enhances the application of SRS technology to the study of non-model organisms and directly identifies the most informative genetic elements for more detailed study and assembly.
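
The tabulation idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not define the complexity filter, so `is_complex` (at least three distinct bases, no `N`) is a hypothetical stand-in, and all function names, the value of `k`, and the toy reads are assumptions made for illustration. The three types are interpreted here as: Type I, present in exactly one sample; Type II, present in both samples of a pair, with counts kept so frequency differences can be inspected; Type III, present in every sample of a chosen group and absent from all others. Reverse complements and sequencing errors are ignored.

```python
from collections import Counter


def is_complex(kmer, min_distinct=3):
    """Hypothetical complexity filter (an assumption, not the paper's rule):
    keep k-mers with no ambiguous base and at least `min_distinct` distinct bases."""
    return "N" not in kmer and len(set(kmer)) >= min_distinct


def count_complexmers(reads, k):
    """Tabulate how often each complex k-mer occurs in one sample's reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if is_complex(kmer):
                counts[kmer] += 1
    return counts


def type_one(tables):
    """Per sample, the k-mers found in that sample and in no other."""
    presence = Counter()
    for counts in tables.values():
        presence.update(set(counts))  # +1 per sample containing the k-mer
    return {
        name: {kmer for kmer in counts if presence[kmer] == 1}
        for name, counts in tables.items()
    }


def type_two(tables, a, b):
    """k-mers found in both samples, mapped to (count in a, count in b)."""
    shared = set(tables[a]) & set(tables[b])
    return {kmer: (tables[a][kmer], tables[b][kmer]) for kmer in shared}


def type_three(tables, group):
    """k-mers found in every sample of `group` and in no sample outside it."""
    group = set(group)
    inside = set.intersection(*(set(tables[name]) for name in group))
    outside = set().union(*(set(tables[name]) for name in tables if name not in group))
    return inside - outside


# Toy reads, invented purely for illustration.
reads = {
    "A": ["ACGTACGT"],
    "B": ["ACGTTTGA"],
    "C": ["TTGACCAT"],
}
tables = {name: count_complexmers(sample, k=4) for name, sample in reads.items()}

print(type_one(tables))
print(type_two(tables, "A", "B"))
print(type_three(tables, ["B", "C"]))
```

Because presence is read directly from raw read counts, a k-mer can appear unique to one sample simply because another sample was sequenced too shallowly to cover it, which is consistent with the abstract's caution about the false positive rate of Type I complexmers.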

Year:  2010        PMID: 20331777     DOI: 10.1111/j.1365-294X.2009.04484.x

Source DB:  PubMed          Journal:  Mol Ecol        ISSN: 0962-1083            Impact factor:   6.185


  13 in total

Review 1.  Applications of next generation sequencing in molecular ecology of non-model organisms.

Authors:  R Ekblom; J Galindo
Journal:  Heredity (Edinb)       Date:  2010-12-08       Impact factor: 3.821

Review 2.  New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing.

Authors:  Kai Song; Jie Ren; Gesine Reinert; Minghua Deng; Michael S Waterman; Fengzhu Sun
Journal:  Brief Bioinform       Date:  2013-09-23       Impact factor: 11.622

Review 3.  Community genetics: what have we accomplished and where should we be going?

Authors:  Erika I Hersch-Green; Nash E Turley; Marc T J Johnson
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2011-05-12       Impact factor: 6.237

4.  Inference of Markovian properties of molecular sequences from NGS data and applications to comparative genomics.

Authors:  Jie Ren; Kai Song; Minghua Deng; Gesine Reinert; Charles H Cannon; Fengzhu Sun
Journal:  Bioinformatics       Date:  2015-06-30       Impact factor: 6.937

5.  Alignment-free sequence comparison based on next-generation sequencing reads.

Authors:  Kai Song; Jie Ren; Zhiyuan Zhai; Xuemei Liu; Minghua Deng; Fengzhu Sun
Journal:  J Comput Biol       Date:  2013-02       Impact factor: 1.479

6.  An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data.

Authors:  Huan Fan; Anthony R Ives; Yann Surget-Groba; Charles H Cannon
Journal:  BMC Genomics       Date:  2015-07-14       Impact factor: 3.969

7.  Alignment-free supervised classification of metagenomes by recursive SVM.

Authors:  Hongfei Cui; Xuegong Zhang
Journal:  BMC Genomics       Date:  2013-09-22       Impact factor: 3.969

8.  Co-phylog: an assembly-free phylogenomic approach for closely related organisms.

Authors:  Huiguang Yi; Li Jin
Journal:  Nucleic Acids Res       Date:  2013-01-18       Impact factor: 16.971

9.  Reference-free comparative genomics of 174 chloroplasts.

Authors:  Chai-Shian Kua; Jue Ruan; John Harting; Cheng-Xi Ye; Matthew R Helmus; Jun Yu; Charles H Cannon
Journal:  PLoS One       Date:  2012-11-20       Impact factor: 3.240

10.  Cnidaria: fast, reference-free clustering of raw and assembled genome and transcriptome NGS data.

Authors:  Saulo Alves Aflitos; Edouard Severing; Gabino Sanchez-Perez; Sander Peters; Hans de Jong; Dick de Ridder
Journal:  BMC Bioinformatics       Date:  2015-11-02       Impact factor: 3.169
